Disk health with >1TB/day write
I have a project which demands tons of CPU power and disk space.
In current testing, it writes 1~5TB of data to disk per day, depending on the CPU power.
Most of those writes are overwrites, so the data size on disk increases by only about 50G per day.
I think this project will need to run for several months.
Currently we are testing it on several different machines, with HDD, SSD, and NVMe storage.
Should I be worried about disk durability?
If so, is there any method of checking disk health without root access?
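One thing that can usually be read without root on Linux is the kernel's per-device I/O counters in /sys/block/<dev>/stat. This is not a health check (SMART data usually needs elevated privileges), only a rough way to confirm how much is really being written since boot. A minimal sketch, assuming a Linux machine; the function names are made up for illustration:

```python
from pathlib import Path

SECTOR_BYTES = 512  # these kernel counters are reported in 512-byte units


def bytes_written(stat_text: str) -> int:
    """Return bytes written since boot, given the contents of /sys/block/<dev>/stat."""
    fields = stat_text.split()
    # Field 7 (index 6) is the number of sectors written.
    return int(fields[6]) * SECTOR_BYTES


def read_bytes_written(device: str) -> int:
    """Read the counter for one block device, e.g. "sda" (hypothetical name)."""
    return bytes_written(Path(f"/sys/block/{device}/stat").read_text())


def main() -> None:
    root = Path("/sys/block")
    if not root.is_dir():
        print("no /sys/block here (probably not Linux)")
        return
    for dev in sorted(p.name for p in root.iterdir()):
        try:
            written = read_bytes_written(dev)
        except (OSError, IndexError, ValueError):
            continue  # device without a readable stat file
        print(f"{dev}: {written / 1e12:.3f} TB written since boot")


if __name__ == "__main__":
    main()
```

Sampling this once a day and taking the difference gives the real TB/day figure per disk, which is the number to compare against a drive's endurance rating.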
Comments
HDD would be best for this use case.
I don't think an SSD (including an NVMe SSD) is appropriate for this task, as flash memory degrades the more you write to it. If you want to use SSDs, check the TBW ("terabytes written") rating. This measures the endurance of the drive: it's the amount of writes the drive is designed to handle before it's considered to be "worn out" and out of warranty, at which point you're on your own. Cheaper SSDs may only be warrantied for 100TBW whereas higher-end SSDs are rated for thousands of TBW, but in either case 5TB/day is going to wear it out quickly.
Regular HDDs don't have the same issue. The cheap Kimsufi servers have 10-year-old heavily used HDDs in them and they're still working fine.
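To put rough numbers on the TBW point, here is a back-of-the-envelope sketch. The 1000 and 3000 TBW figures are illustrative stand-ins for "thousands of TBW", not ratings of any specific drive, and the calculation ignores write amplification, so real wear may be faster:

```python
def days_until_tbw(tbw_rating_tb: float, writes_tb_per_day: float) -> float:
    """Days of writing at a constant rate before the rated TBW is reached."""
    if writes_tb_per_day <= 0:
        raise ValueError("writes_tb_per_day must be positive")
    return tbw_rating_tb / writes_tb_per_day


# Illustrative ratings (TB written) against the 1~5TB/day from the question.
for rating in (100, 1000, 3000):
    for rate in (1, 5):
        days = days_until_tbw(rating, rate)
        print(f"{rating} TBW at {rate} TB/day: about {days:.0f} days")
```

At 5TB/day a 100TBW drive reaches its rating in about 20 days, far short of the several months the project needs.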
Daniel15 | https://d.sb/. List of all my VPSes: https://d.sb/servers
dnstools.ws - DNS lookups, pings, and traceroutes from 30 locations worldwide.
I see. The problem with HDD is that the disk becomes the bottleneck instead of the CPU.
But a slow HDD is better than a broken SSD.
How old is the data being overwritten?
If most overwritten data is from the last few hours, you don't need to commit these data to disk at all.
Instead, write them into DRAM or Optane DIMM, and then commit to disk once the data is unlikely to change.
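A minimal sketch of that idea, assuming string keys and byte values: overwrites only touch an in-memory dict, and an entry is written out once it has not been modified for a while. The class name, the flush callback, and the quiet_seconds threshold are all hypothetical, and anything still in memory is lost if the process crashes.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class WriteBackCache:
    """Hold recently written values in memory; flush only those that have gone quiet."""

    def __init__(
        self,
        flush: Callable[[str, bytes], None],
        quiet_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush          # called to persist one entry, e.g. a disk write
        self._quiet = quiet_seconds  # how long an entry must be untouched
        self._clock = clock
        self._dirty: Dict[str, Tuple[bytes, float]] = {}

    def put(self, key: str, value: bytes) -> None:
        # An overwrite replaces the in-memory copy; nothing touches the disk here.
        self._dirty[key] = (value, self._clock())

    def get(self, key: str) -> Optional[bytes]:
        entry = self._dirty.get(key)
        return entry[0] if entry is not None else None

    def flush_quiet(self) -> int:
        """Persist entries not modified for quiet_seconds; return how many were flushed."""
        now = self._clock()
        ready = [k for k, (_, t) in self._dirty.items() if now - t >= self._quiet]
        for key in ready:
            value, _ = self._dirty.pop(key)
            self._flush(key, value)
        return len(ready)
```

Calling flush_quiet() periodically means a value overwritten a hundred times in an hour costs one disk write instead of a hundred.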
Accepting submissions for IPv6 less than /64 Hall of Incompetence.
Yes, most overwritten data are from the last few hours.
There is one way to store those data in RAM instead of disk.
But previous tests show that one thread will need >8GB RAM, and those machines have only 128GB RAM.
Today I found a machine with 1TB RAM, and I'll test on that.
The method you suggested sounds good, but it requires significant modification of our current program and changes to the underlying algorithm.
I'll think about how to implement it.
Is Chia still a thing? What are you doing really? What kind of server are you going to do this on without root access? If you're not seeking a lot, HDDs intended for this sort of use (surveillance drives made to record video 24/7) might be a good bet.
Not Chia. There are databases which serve as a cache (because RAM is not large enough), and the computation needs to iterate over all elements in those databases. During the iteration it updates various elements in other databases, hence it requires seeking.
Currently using:
dual Gold5118 with 64G RAM and HDD (RAM and disks are the bottleneck),
E5 2640-v4 with 128G RAM and SSD (CPU is the bottleneck),
E5 2640-v4 with 128G RAM and NFS@>2Gbps (NFS speed is good enough, CPU is the bottleneck).
If you will be renting a server, use SSD.
If colo (your own hardware), use HDD.
♻ Amitz day is October 21.
♻ Join Nigh sect by adopting my avatar. Let us spread the joys of the end.
Lol. I don't pay for that hardware. So probably I should use SSD.
At the beginning I was afraid that if the SSD broke, I would have to start from the beginning.
At some point I realized that I can save a snapshot and back it up to another machine.
The actively used portion of your database should fit in RAM.
If it cannot fit in the RAM of a single server, you have an architectural problem, not a hardware problem.
The solution is to design it as a distributed system.
Read the Hadoop papers and you'll see the idea.
It is a hybrid of an architectural and a hardware problem. The current architecture lets each thread operate on a different database, which avoids database locks. However, it means each thread needs some amount of data, and the machines do not have enough RAM.
I agree that it should fit into RAM, and thus I'm also testing that way.
Currently I think that way is better, as long as I set the number of threads according to available RAM rather than CPU cores.
Thanks for the suggestion of Hadoop. I'll look into it.
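Sizing the thread count by RAM rather than by cores could look roughly like this. The 8GB per thread comes from the earlier tests; the 20-core count and the 8GB held back for the OS are assumptions for illustration only:

```python
def usable_threads(
    total_ram_gb: float,
    ram_per_thread_gb: float,
    cpu_cores: int,
    reserve_gb: float = 8.0,  # assumed headroom for the OS and page cache
) -> int:
    """Number of worker threads that fit in RAM, capped by the core count."""
    by_ram = int((total_ram_gb - reserve_gb) // ram_per_thread_gb)
    return max(1, min(cpu_cores, by_ram))


# 128GB machine: RAM is the limit, so some cores stay idle.
print(usable_threads(128, 8, cpu_cores=20))
# 1TB machine: RAM is no longer the limit, the core count is.
print(usable_threads(1024, 8, cpu_cores=20))
```

Since the tests showed a thread needing more than 8GB, the real per-thread figure should be measured and plugged in rather than taken from this sketch.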
Probably cheapest to buy an old server frankly.
If you're somewhere with cheap electricity then those are definitely viable
This is a hardware problem then.
You need to install more RAM per CPU core.
Major cloud providers offer "high CPU" and "high RAM" models.
Basically, you have a "balanced" or "high CPU" model but you actually need a "high RAM" model.
I cannot buy anything, and I am stuck with the currently available hardware (although there are several possible choices).
If you were to go for say 4x Kingston DC500M 3.84TB SSDs in RAID you may be ok.
These disks have a Drive Writes Per Day (DWPD) rating of 1.3, meaning that you can write about 5TB per day to each disk (3.84TB x 1.3). In RAID 10, every write goes to both disks of a mirror pair, so four disks get you about 10TB per day of writes for the drive lifespan, which is usually the warranty period. In the case of the Kingston DC500M that's 5 years.
This isn't guaranteed though, the higher the write workload the higher the risk of drive failure so you definitely want some redundancy.
This being said if you can solve the problem in software by caching hot data you can avoid the need for expensive DC grade hardware although ideally you should do both.
It may be worth looking into the durability of the SSDs you are testing on and checking their DWPD. If you need any help to spec up hardware for this application, feel free to get in touch.
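The DWPD arithmetic above can be sketched as follows. The capacity, DWPD, and warranty figures are the ones quoted in this post; the function names are made up, and write amplification and RAID overhead beyond mirroring are ignored:

```python
def daily_write_budget_tb(capacity_tb: float, dwpd: float) -> float:
    """Rated TB per day for one drive: capacity times drive writes per day."""
    return capacity_tb * dwpd


def raid10_write_budget_tb(per_drive_budget_tb: float, drives: int) -> float:
    """Logical TB per day for a RAID 10 array.

    Every logical write lands on both drives of a mirror pair, so only half
    of the drives' combined budget is available to the application.
    """
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return per_drive_budget_tb * drives / 2


per_drive = daily_write_budget_tb(capacity_tb=3.84, dwpd=1.3)
array = raid10_write_budget_tb(per_drive, drives=4)
rated_total = per_drive * 365 * 5  # per-drive total over a 5-year warranty

print(f"per drive: {per_drive:.2f} TB/day")
print(f"4-drive RAID 10: {array:.2f} TB/day")
print(f"per-drive rated total over 5 years: {rated_total:.0f} TB")
```

Against the 1~5TB/day in the question, that leaves roughly a factor of two of headroom on the array at the top of the range.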
Agreed with @yoursunny, this seems like a great place to use RAM. But then you're in a low end forum, haha
On a budget, I would probably use multiple NVMes for redundancy and then replace them once their flash wears out, as recently I've begun to not be able to stand poor performance.
I thought HDDs were still prone to writing limits? E.g:
https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-red-hdd/product-brief-western-digital-wd-red-hdd.pdf
It says "Supports up to 180TB/yr" so I thought it meant the HDD isn't meant for continuous reads/writes? It is physical spinning rust after all...
Yes, thanks.
Now I know that I need machines with high enough RAM per core, and high single core performance.
The limits are far higher for HDDs, at least comparing non-enterprise HDDs to non-enterprise SSDs (keeping in mind this is a low-end forum where not everyone runs enterprise disks). HDDs tend to last longer than the specs suggest. I used to have a Quantum Fireball 650MB IDE hard drive that still worked fine 20 years after manufacture.
Writes physically wear out the flash chips on SSDs, and the wear is more noticeable than on HDDs. HDDs are usually somewhat recoverable if something bad happens to them, whereas SSDs have a tendency to just completely break when they go bad.
Anyways, like others have said, RAM is better than any disk. If you need to write temporary files, /dev/shm is your friend. When it exists (it's an optional kernel feature), it's guaranteed to be a RAM disk using tmpfs, so any files you write to it will be all in memory rather than on disk. On the other hand, /tmp is sometimes tmpfs and sometimes a regular directory on disk.
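A small sketch of how a program might prefer /dev/shm for scratch files and check what a mount point really is. It assumes Linux and the usual /proc/mounts layout (device, mount point, filesystem type, ...); the function names are made up for illustration. Keep in mind that files in tmpfs count against RAM.

```python
import os
import tempfile
from typing import Optional, Sequence


def pick_scratch_dir(candidates: Sequence[str] = ("/dev/shm",)) -> str:
    """Return the first candidate that is an existing, writable directory,
    falling back to the platform's default temp directory."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return tempfile.gettempdir()


def mount_fstype(mount_point: str, mounts_file: str = "/proc/mounts") -> Optional[str]:
    """Return the filesystem type mounted exactly at mount_point, or None if unknown."""
    fstype = None
    try:
        with open(mounts_file) as fh:
            for line in fh:
                parts = line.split()
                if len(parts) >= 3 and parts[1] == mount_point:
                    fstype = parts[2]  # a later entry shadows an earlier one
    except OSError:
        return None
    return fstype


if __name__ == "__main__":
    scratch = pick_scratch_dir()
    print("scratch dir:", scratch, "fstype:", mount_fstype(scratch))
    print("/tmp fstype:", mount_fstype("/tmp"))
    # The file disappears when the block exits; on tmpfs it never hits the disk.
    with tempfile.NamedTemporaryFile(dir=scratch, suffix=".tmp") as fh:
        fh.write(b"scratch data")
```

A result of None for /tmp just means it is not its own mount point, in which case it is an ordinary directory on whatever filesystem holds it.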
It might help if you describe what actual problem you are trying to solve. Lots of people use databases for things that can be done with old fashioned serial i/o and sorting, or that kind of thing. Particularly if this is an offline application, you might not need all those disk seeks, in which case HDDs are fine. The old saying was "disk is tape", and often tape can in fact do the job.
@Daniel, I think you said you got the Wishosting 5950X? Would you post a benchmark, and would you recommend the VPS?
I believe you are looking for this
https://talk.lowendspirit.com/discussion/comment/75016/#Comment_75016
Posted by @snz
blog | exploring visually |
I posted a YABS here: https://talk.lowendspirit.com/discussion/comment/74102#Comment_74102 and a Monster Bench with Asia speed test results here: https://talk.lowendspirit.com/discussion/comment/74108#Comment_74108
I'm getting rid of it, but only because I got a dedicated server during Black Friday and I'm going to move everything onto that. Performance on the Wishosting box was very very good.