Data Storage Stats Cloud News Hardware Technology

8TB Drives Are Highly Reliable, Says Backblaze (yahoo.com) 209

An anonymous reader writes from a report via Yahoo News: Cloud backup and storage provider Backblaze has published its hard drive stats for Q2 2016. Yahoo News reports: "The report is based on data drives, not boot drives, that are deployed across the company's data centers in quantities of 45 or more. According to the report, the company saw an annualized failure rate of 19.81 percent with the Seagate ST4000DX000 4TB drive in a quantity of 197 units working 18,428 days. The next in line was the WD WD40EFRX 4TB drive in a quantity of 46 units working 4,186 days. This model had an annualized failure rate of 8.72 percent for that quarter. The company's report also notes that it finally introduced 8TB hard drives into its fold: first with a mere 45 8TB HGST units and then over 2,700 units from Seagate crammed into the company's Backblaze Vaults, which include 20 Storage Pods containing 45 drives each. The company moved to 8TB drives to optimize storage density. According to a chart provided in the report, the 8TB drives are highly reliable. The HGST HDS5C8080ALE600 worked for 22,858 days and only saw two failures, generating an annualized failure rate of 3.20 percent. The Seagate ST8000DM002 worked for 44,000 days and only saw four failures, generating an annual failure rate of 3.30 percent." For comparison, Backblaze's reliability report for Q1 2016 can be found here.

UPDATE 8/2/16: Corrected Seagate Model "DT8000DM002" to "ST8000DM002."
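
For readers who want to check the percentages in the summary, here is a minimal sketch of the usual drive-days calculation: failures divided by drive-years of service. The function name is hypothetical, and the formula is an assumption that happens to reproduce the quoted figures closely; small differences are likely rounding in the report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return failures per drive-year of service, as a percentage.

    Assumption: one drive-year is 365 drive days.
    """
    drive_years = drive_days / 365
    return failures / drive_years * 100


# (failures, drive days) as quoted in the summary above
hgst_8tb = annualized_failure_rate(2, 22_858)
seagate_8tb = annualized_failure_rate(4, 44_000)

print(f"HGST 8TB:    {hgst_8tb:.2f}%")
print(f"Seagate 8TB: {seagate_8tb:.2f}%")
```

Both results land within a few hundredths of a percentage point of the 3.20 and 3.30 percent quoted in the summary.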
This discussion has been archived. No new comments can be posted.

  • by by (1706743) ( 1706744 ) on Tuesday August 02, 2016 @05:13PM (#52631641)
    ...they use helium in the drives, so all your music sounds like Alvin and the Chipmunks.
    • Jokes aside, not all the 8TB drives are helium-based, IIRC.
    • to fix that, just run your 45's at 33.

      (GOML)

    • speaking of music, I have a hard drive based music player in my car and it's been in the car since about 2003. it had whatever was the best IDE (not sata, too early for sata) notebook drive of its time. I put as much mp3 fileage on that as would fit.

      in all these years of hot and cold (bay area, but still we get some hot days when the car is left out in the sun) and vibration from daily driving, I have yet to replace that drive! it could have been a samsung or ibm or maybe hitachi. funny enough, none of t

      • by jedidiah ( 1196 )

        I have a 1.5TB Seagate drive that is still spinning.

        You can manage to get lucky even with the most notorious hardware.

        • Agreed. In the last 20 years, I've had only two drives fail out of more than a dozen - the original 30GB (was either Quantum or Maxtor) in my G4 Power Mac back in 2008, and a 2TB Seagate in a home-built machine in 2012.
  • Reliability is not so great an issue with RAID systems being what they are today. What the bean counters fail to consider is the cost in manpower required to replace Seagate drives on a constant basis: not just swapping them in the racks, but processing RMAs or the proper destruction and disposal of drives which may contain sensitive data.

    I wonder how those numbers would look if other vendors were offered an equal analysis period. I know WD was mentioned but it didn't appear they got equal share.

    Also: First. :)

    • by rthille ( 8526 )

      OTOH, given SSDs and the inability to guarantee the erasure of all data on the drive, unencrypted data should never hit the drives at all, and the key should of course also never be stored on the same media (unencrypted).

      That said, only my newer systems use encrypted volumes. My old drives I take apart and shatter/melt the platters.

      • by Hylandr ( 813770 )

        the inability to guarantee the erasure of all data on the drive, unencrypted data should never hit the drives at all

        Wow, that's not something I had considered. Thanks for that bit of info!

      • My old drives I take apart and shatter/melt the platters.

        I use a drill press to bore 8 or 10 holes straight through my old drives, then for fun I hit 'em with a hammer a few times while whispering my ex-wife's name. If the CIA/NSA/FBI wants my data bad enough to recover it after that, they're welcome to it.

        • Belt sanders are fun too!
        • Had to destroy a drive that had a lot of student data on it.

          Used a FN-FAL and 147 grain FMJ bullets at about 2700 feet per second.

          • Had to destroy a drive that had a lot of student data on it.
            Used a FN-FAL and 147 grain FMJ bullets at about 2700 feet per second.

            Yeah, I used to take them out and shoot them (it's fun) but I got tired of picking up all the little bits and pieces of the hard drive. I don't want to litter my shooting areas with fragments of stuff like that. But it is fun to shoot a hard drive and watch it turn to metallic confetti. :)

      • OTOH, given the inability to guarantee the erasure of all data on any drive, unencrypted data should never hit the drives at all, and the key should of course also never be stored on the same media (unencrypted).

        FTFY.

        You are absolutely correct though -- you should never rely on making data inaccessible via erasure instead of via encryption.

        Incidentally, the ST8000DM002s that we are talking about here support OPAL, which makes it trivial to "throw away the key" by sending the drive a reset-DEK command.

      • Re:Reliability (Score:5, Insightful)

        by AK Marc ( 707885 ) on Tuesday August 02, 2016 @08:57PM (#52632815)

        OTOH, given SSDs and the inability to guarantee the erasure of all data on the drive,

        Wow, SSD even survives incinerators? Where I used to work, the policy for drives was to open them up and strip them for their magnets, then have magnet fun. The platters made good frisbees, but the problem is that they go through car windows, and the dents in cars are deep, so frisbee with care.

        • The platters made good frisbees, but the problem is that they go through car windows, and the dents in cars are deep, so frisbee with care.

          And they can hurt too. Not that I'd have any personal experience with that....
      • by AmiMoJo ( 196126 )

        HDDs are even worse. They silently remap blocks all the time, with no way to erase the data off the old partially-failed ones. At least most SSDs use encryption, and doing an ATA secure erase command will wipe and regenerate the key.

        Also, SSDs are much easier to physically destroy in a shredder or with a blowtorch. Much less metal and armour on them.

    • by Jahoda ( 2715225 )
      Clearly, something is up with the Seagate DX series, but they have thousands of the DM models with a 2.66% failure rate. That's pretty remarkable. I've personally been very pleased with this series from Seagate.
    • If you are looking at the Economics of this from a Labor / Parts issue, you're already behind the curve.

      The real value is the data on the drives, and how much you'll miss that data if and when it goes away. The problem is, nobody values their data, until it is gone. And then it is too late.

      • RaidZ2
        • by dbIII ( 701233 )
          Backblaze has redundant servers. It's the only way their "jam drives in where they will fit" pod designs make sense. At max load most of those drives will still be idle.
          IMHO raidz2 is a better idea but it's not what they are doing.
          • I was too brief. I was trying to say that anyone who thinks the "real value is the data on the drive" as the GP did and is still going to put that data on a single drive without some redundancy or at least backup has already lost the plot no matter how reliable they believe their single drive solution to be. The folks at BB have the right idea IMO.
    • by PCM2 ( 4486 )

      Reliability is not so great an issue with raid systems being what they are today.

      At the scale Backblaze is talking about, I would say it's an issue. Somebody has to keep all those drives in stock and walk back to a cage to replace them. It's not data loss we're worried about here, it's costs.

      • by Hylandr ( 813770 )

        And that's exactly what I stated in my post too.

      • by adolf ( 21054 )

        At the cost of these drives, isn't it just cheaper to rack the arrays on wheels, and shove them out the back door into the recycling trailer?

      • by jedidiah ( 1196 )

        Even for a small home array, it's terribly annoying when all of your drives die young and all at the same time (Seagate).

  • Totally not trying to be pedantic, but the Seagate model they reference should actually be the "ST8000DM002"
  • These are all platter drives, but you can only discover that in the comments at TFA.

    There are so few 8TB HGST drives, and they're so new, that the current data about them is statistically insignificant/unreliable, as is any model with fewer than 500 units and 200k drive days.

    • Why in heaven's name would anyone think that a cheap cloud backup company would be installing 8 TB SSDs in their massive storage arrays? Those things are thousands of dollars per drive!

    • Re:Not SSD Drives (Score:4, Interesting)

      by msauve ( 701917 ) on Tuesday August 02, 2016 @07:30PM (#52632409)
      "There are so few 8TB HGST drives, and they're so new, that the current data about them is statistically insignificant/unreliable"

      The numbers in the summary come from different places, because the first chart in the linked article, for the April-June quarter says:

      Seagate 8TB, 2720 drives, 35840 drive days, 3 failures (13 days average per drive, 3% annual failure rate)
      HGST 8TB, 45 drives, 3825 drive days, 0 failures (85 days average per drive, 0% annual failure rate)

      The second chart, from April 2013 through the end of June, doesn't show drive numbers, just days, failures, and rates. The numbers in the summary seem to be pulled from both.

      Assuming that the 8TB drives stay in use until they die, here's where the stats seem to come from (drive days/# of drives). Drive days pulled from the "all time" chart, # of drives from the latest quarter chart):

      22858/45= 507 days average use HGST HUH728080ALE600
      44000/2700= 16 days average use Seagate ST8000DM002

      Now, anyone experienced with Seagate wouldn't expect the 3.3% annualized failure rate to be that low in another year and a half. The HGST rate _is_ after almost a year and a half.
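
The per-drive averages in the comment above can be reproduced with a short sketch. It divides the drive days by the drive counts exactly as the comment does; mixing the all-time days with the latest-quarter counts is the commenter's own assumption, and the function name is hypothetical.

```python
def average_days_in_service(drive_days: int, drive_count: int) -> float:
    """Average number of days each drive has been running."""
    return drive_days / drive_count


# Latest-quarter chart: drive days and drive counts from the same table
seagate_q2 = average_days_in_service(35_840, 2_720)
hgst_q2 = average_days_in_service(3_825, 45)

# All-time drive days divided by latest-quarter drive counts (the commenter's mix)
hgst_all_time = average_days_in_service(22_858, 45)
seagate_all_time = average_days_in_service(44_000, 2_700)

for label, days in [
    ("Seagate 8TB, Q2", seagate_q2),
    ("HGST 8TB, Q2", hgst_q2),
    ("HGST 8TB, all time", hgst_all_time),
    ("Seagate 8TB, all time", seagate_all_time),
]:
    print(f"{label}: {days:.1f} days per drive")
```

The gap between roughly 16 days and roughly 508 days per drive is the point of the comment: the two failure rates describe very different amounts of service time.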
  • High failure rate (Score:5, Insightful)

    by JustAnotherOldGuy ( 4145623 ) on Tuesday August 02, 2016 @05:44PM (#52631839) Journal

    "...the company saw an annualized failure rate of 19.81 percent with the Seagate ST4000DX000 4TB drive"

    A failure rate of almost 20% in a data center? Geez, that's pathetic.

    A temperature-controlled environment, clean power, low shock and vibration, and 1 out of 5 still fails? Remind me never to buy Seagate. Oh, wait, I already vowed never to buy another Seagate- about 10 years ago after experiencing their unequaled propensity to die fast and hard.

    Maybe other people have had better luck with Seagate than I have, but for me they've always been disappointing.

    • by PRMan ( 959735 )
      I had one of those. 1 year warranty. And it died in 14 months.
      • I had one of those. 1 year warranty. And it died in 14 months.

        Thanks for this. I was just about to read the article, till I saw your comment that you had 1 and it failed in 14 months. Saved me all the trouble.

    • Depends on how old the drives are. When I was working in a data centre I was having a lot of hard drive failures but they were laptop SCSI drives in blades that had been running continuously for over three years so it was expected for them to be hitting their end of life. (It was around 2007 so that's why those were the drives.)

      So if the 4TB drives are a few years old getting a lot of use then I can see why they would be failing at a high rate. If they are newer then I would be worried. I'm more concern

      • Perhaps they don't keep the temperature as cool as they should in order to save a few bucks?

        That could be, but the other brands were failing at a much lower rate so it does make you wonder about the overall longevity of Seagate drives.

        • by dbIII ( 701233 )
          Yes but if the drives are failing due to heat it's not really useful information for someone else with file servers with very good airflow.
          Their results should be taken as a strong indication and not 100% reliable unless you have a situation very similar to Backblaze.
      • They are using consumer drives for data center needs, this is the big reason their failure rate is relatively high. Still, with the redundancy, it is cheaper to run this way. Rumor is that Google ran that way with off the shelf computers. Use dirt cheap commodity products that are good quality, have exceptional redundancy, throw them away as they implode.

        • by adolf ( 21054 )

          Is there a real mechanical difference between "consumer" and "enterprise" drives, these days, at the bleeding edge of the storage-per-unit curve?

          Mostly I see differences in firmware, which (IMHO) ought to be end-user selectable anyway.

          (Before anyone replies, I chose those words carefully to avoid outliers like Raptor little-drive-in-a-big-heatsink configurations, or any other stuff that puts any metric other than capacity-per-dollar as a primary criterion.)

          • by vovin ( 12759 )

            Yes. There are specific mechanical differences in build quality around stability and vibration dampening between enterprise and consumer level drives. It's more than just flashing some different firmware (but that may be a part of what differentiates drives).

            The best indicators are length of warranty and specification of purpose, in my experience.

            • by adolf ( 21054 )

              Don't you mean "damping"?

              And isn't the spindle motor still affixed firmly to the chassis, which is affixed to the enclosure?

              • Don't you mean "damping"?

                That's one of my pet peeves too, but we've lost that war and now we'll never know for sure if people mean getting things soggy or cancelling oscillations.

      • by dbIII ( 701233 )

        Perhaps they don't keep the temperature as cool as they should in order to save a few bucks?

        They have a few things about their "pod" design on their website and if you look at it you will see that you are correct. It looks like an utterly insane design until you consider that the things are mostly idle, so typically don't generate a lot of heat, and that they have distributed servers with distributed workloads so they can afford to lose one entirely for a while.

    • Having owned a lot of the Seagate 2-3TB drives, 20% is way too low from my experience. I think I have 2-3 still running out of a batch of 8. Including RMAs.

    • If you wrote off every manufacturer that hit a 20% annualized failure rate you would now be unable to buy any drives.

      • by AmiMoJo ( 196126 )

        True, but Seagate are particularly bad as they have a history of releasing unreliable drives on a regular basis and then just endlessly swapping them for more unreliable drives until the warranty expires.

        For individuals (not datacentres) HGST is the best bet. As well as being generally very reliable they do proper testing and fix their problems. They might cost a bit more, but what's a few quid here and there to avoid all that hassle?

    • Re:High failure rate (Score:5, Informative)

      by waveclaw ( 43274 ) on Tuesday August 02, 2016 @08:44PM (#52632767) Homepage Journal
      If only the Backblaze pods were even remotely like other datacenter equipment. As far as vibration is concerned they are still pretty much a torture test for anything with a spinning motor. Minimal vibration protection while being mechanically coupled to a weak foundation while crammed in as tightly as geometry allows.

      A temperature-controlled environment, clean power, low shock and vibration, and 1 out of 5 still fails

      The density and structure of a pod is only temperature-controlled in that it is going to get hot, quickly.

      Remind me never to buy Seagate.

      From the numbers Backblaze publishes, you'll actually see that you shouldn't buy one particular desktop model of hard drive for your "datacenter." Numbers like the ones Backblaze releases are quite fascinating in that you can analyze them. You can find which models at any vendor to prefer or avoid.

      Oh, wait, I already vowed never to buy another Seagate- about 10 years ago after experiencing their unequaled propensity to die fast and hard.

      Sorry to hear about your loss. I hope you kept backup copies. If not, I hope it taught you that if you don't have a copy then you don't have a backup.

      It is certainly reasonable to avoid a vendor when a lot of their products from many lines have defects at a given time. Seagate's desktop line certainly took a hit from the initial Backblaze numbers. The DM1000's huge failure rate is almost as legendary as the IBM Death Star line or Maxtor click-of-death. But stuff from before or after a given run may have better or worse quality. And of course even manufacturers can get batches of bad parts. (Hidden variables like that are one of the reasons why the singular of data isn't anecdote.)

      I also wonder if we'll ever get numbers from Backblaze on things like the actual temperature, decibels and power these drives lived through. More than just avoiding a particular model. It would be nice to know how hot, loud and nasty you can get before your commodity-class storage starts pooping out.

      • by dbIII ( 701233 )

        I also wonder if we'll ever get numbers from Backblaze on things like the actual temperature

        We got some numbers earlier from another story but they were entirely useless since they were average temperatures on machines that are idle most of the time. Maximums could tell us something useful. I won't bother linking to the earlier story because it was like a high school project. If it wasn't for their niche use of distributed archiving where their machines are unlikely to get very hot (but possibly individu

      • Brian from Backblaze here.

        > I also wonder if we'll ever get numbers from Backblaze on things like the actual temperature ... power these drives lived through.

        The raw data dump includes drive temperatures as reported by "smartctl". You can find a dump here: https://www.backblaze.com/b2/h... [backblaze.com]

        We analyzed the failures correlated with temperature in this blog post in 2014: https://www.backblaze.com/blog... [backblaze.com]

        In a conversation with some of the Facebook Open Storage people, they said hard drives have
    • They seemed fine to me until they bought Maxtor in 2006; then you never knew what you were going to get, a Maxtor w/ a Seagate badge or a HDD that might have less than a 20% annual failure rate, in the first year.

      I'd guess since then; they closed all the Seagate factories and run exclusively from the cheaper Maxtor facilities. (all of that is a guess, but MBAs always think reducing cost > * so probably in the ballpark).

    • by dbIII ( 701233 )

      A temperature-controlled environment

      Not exactly. Take a look at their web page and their "pod" design. They have jammed in drives where they will fit but they have very different loads to a normal data center so can get away with it most of the time. Unlike a normal data center they will never be running all the drives in a "pod" flat out so something that would be a smoking mess elsewhere merely has drives in the middle that cannot shed heat properly.
      That's not to excuse Seagate, it's just to point out

      • by tlhIngan ( 30335 )

        Not exactly. Take a look at their web page and their "pod" design. They have jammed in drives where they will fit but they have very different loads to a normal data center so can get away with it most of the time. Unlike a normal data center they will never be running all the drives in a "pod" flat out so something that would be a smoking mess elsewhere merely has drives in the middle that cannot shed heat properly.
        That's not to excuse Seagate, it's just to point out why failure rates are higher than what

        • by dbIII ( 701233 )
          Indeed, as I could have told you or you could have learned from their website.
          I suggest you take a look at their description of how they pack their disks in and you will understand the heat issue I mentioned above.
        • by jabuzz ( 182671 )

          If you are doing a file server, then SATA multiplexers are more than adequate, especially for what Backblaze are doing. Let me put it this way: 45 drives is 9 SATA multiplexers, which at 3Gbps SATA is a total of 27Gbps throughput, more than enough to saturate two 10Gbps Ethernet links.

          However you can get 6Gbps SATA multiplexers these days, and Backblaze's latest pods have 60 drives, so that is 72Gbps, which is nearly enough to saturate a couple of 40Gbps Ethernet links.

          People always and I mean *ALWAYS* overestimat
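
The throughput arithmetic in the comment above can be sketched as follows. The five-drives-per-multiplexer figure is an assumption inferred from "45 drives is 9 SATA multiplexers", the function name is hypothetical, and the rates are nominal SATA line rates rather than usable payload bandwidth.

```python
import math


def aggregate_sata_gbps(drives: int, drives_per_multiplexer: int, link_gbps: int) -> int:
    """Nominal aggregate host-side bandwidth, one SATA link per multiplexer."""
    multiplexers = math.ceil(drives / drives_per_multiplexer)
    return multiplexers * link_gbps


# Assumption: 5 drives behind each multiplexer (45 drives / 9 multiplexers)
older_pod = aggregate_sata_gbps(drives=45, drives_per_multiplexer=5, link_gbps=3)
newer_pod = aggregate_sata_gbps(drives=60, drives_per_multiplexer=5, link_gbps=6)

print(f"45-drive pod at 3Gbps SATA: {older_pod} Gbps vs 2 x 10GbE = {2 * 10} Gbps")
print(f"60-drive pod at 6Gbps SATA: {newer_pod} Gbps vs 2 x 40GbE = {2 * 40} Gbps")
```

On these nominal numbers the older pod exceeds two 10Gbps links (27 vs 20) and the newer pod falls a little short of two 40Gbps links (72 vs 80), matching the comment.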

        • Brian from Backblaze here.

          > I think their pods only have GigE interfaces

          Originally (up until 3 years ago) that was true, but all new pods have 10 GbE interfaces, and 100% of the pods in our "Backblaze 20 pod Vaults" have 10 GbE interfaces. And there are some really strange (and wonderful) performance twists on using 20 pods to store each file: when you fetch a 1 MByte file from a vault, we need 17 pods to respond each supplying only 60k bytes to reassemble the complete file from the Reed-Solomon shards.
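
The "60k bytes" figure above follows from splitting the file evenly across the 17 responding pods. This is a rough sketch of that arithmetic only, not of Backblaze's actual encoder; the function name is hypothetical, and both readings of "1 MByte" are shown because the comment does not say which is meant.

```python
import math


def bytes_per_pod(file_bytes: int, responding_pods: int) -> int:
    """Bytes each pod supplies if the file is split evenly (rounded up)."""
    return math.ceil(file_bytes / responding_pods)


# 17 responding pods, as stated in the comment; 20 pods store each file
decimal_megabyte = bytes_per_pod(1_000_000, 17)
binary_megabyte = bytes_per_pod(1024 * 1024, 17)

print(f"1 MB  (10^6 bytes): {decimal_megabyte} bytes per pod")
print(f"1 MiB (2^20 bytes): {binary_megabyte} bytes per pod")
```

Either reading gives roughly 60,000 bytes per pod, so each pod only has to serve a small slice of any single file.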
  • by Linsaran ( 728833 ) on Tuesday August 02, 2016 @06:32PM (#52632093) Homepage
    I presume there's some detail I'm missing here since we did not have 8 TB hard drives 120 years ago.
  • Come back in 3 or 5 years and tell me out of all the 8TB sold in 2016/2017 just how many are still functional and THEN what the failure rate is/was.

    My "prediction" is it will most likely be that there is a 70% failure rate with Seagate being the top offender.
    • Re: (Score:3, Insightful)

      by Anonymous Coward

      Come back in 3 or 5 years and tell me out of all the 8TB sold in 2016/2017 just how many are still functional and THEN what the failure rate is/was.

      My "prediction" is it will most likely be that there is an 70% failure rate with Seagate being the top offender.

      By then the data is worthless to anybody except the manufacturer. We necessarily have to accept a deficit of statistical quality to make forward predictions that are actually worth something, like knowing if I'm building a SAN, what drives I should buy.

      In 5 years, I'm not going to be buying 8TB drives, so knowing what the failure rate for some 8TB drive was is inconsequential. Either HDDs continue to improve and I buy 32TB or larger HDDs, or they don't, and I'll be filling my SAN with 8TB or larger SSDs, Xp

    • by dbIII ( 701233 )
      I've got a large cardboard box full of dead Hitachi drives for some reason, along with a smaller number of dead Seagate drives. I've stopped buying both. Maybe I just have fewer dead Seagates due to being turned off that brand earlier.
  • by dbIII ( 701233 ) on Tuesday August 02, 2016 @08:47PM (#52632777)
    If it's working for them in their packed in boxes with crap airflow and really poor heat transfer then it will work even better in conventional file servers with hot swap drives at the front and a heap of airflow.

    Take it with a grain of salt when Backblaze say a drive is crap since it may only be crap in their very hostile environment, but if they didn't break it then it's very likely to work well anywhere.
    • Take it with a grain of salt when Backblaze say a drive is crap since it may only be crap in their very hostile environment, but if they didn't break it then it's very likely to work well anywhere.

      What's the typical drive temperature in Backblaze's cases in their environment?

      • by dbIII ( 701233 )

        What's the typical drive temperature in Backblaze's cases in their environment?

        They are not saying apart from an entirely useless average for machines that are idle a lot of the time with drives spun down. I'm not entirely sure they know or care what their maximums are and how long drives are hot for.
        I suggest taking a look at their web pages that describe their pod designs to get a better idea of the situation instead of just taking my word that they shove drives in wherever they will fit without taking h

  • I had a 1st Gen Seagate 80GB SATA fail last month after 11 years and change, of 24/7 daily operation and very few power-off cycles.

  • Contrasting anecdote (Score:5, Interesting)

    by billcopc ( 196330 ) <vrillco@yahoo.com> on Wednesday August 03, 2016 @01:57AM (#52633867) Homepage

    I'm an independent white-box NAS guy, and with the exception of the truly awful 1.5TB Seagate drives from 2008-2009 or so, I have not had any significant problems with them. I've got a few thousand 3 to 8 TB drives deployed with my clients, most of them cheap consumer drives (not even the "NAS" editions), and the annual failure rate is roughly 2% across all brands. This has been consistent for many years and I factor these stats into my costs and warranty projections. I have

    The thing that bothers me about Backblaze, and the reason why I have a very hard time taking their results seriously, is the way they design their pods. They take a custom fabbed chassis, then fill it with the most ghetto components known to man: SATA port multipliers, ultra-low-end HBAs, dual "gamer" power supplies, very substandard cooling, and until recently they used super sketchy desktop boards. It's only last year that they finally changed the board for a Supermicro, primarily to get 10GbE very cheaply. For that same money, you can buy a ready-made 60-bay Supermicro chassis with redundant power and SAS - and a warranty. Hell, I bet SM would deliver directly to Backblaze's doorstep *and* give them a friendly discount.

    Anyway... epic digression aside, when people ask me which brand is better, I tell them to buy whichever has the best warranty. A hard drive *will* die, the question is when, so the only logical course of action is to plan around its inevitable demise by keeping backups and redundancies, and learning the ins and outs of the RMA process.

    • Well, think of it this way ... If a hard drive can survive in the environment provided by Backblaze, then it will certainly do better in a properly built home computer and will last longer than the hard drives that fail prematurely in the Backblaze environment. There is nothing better for testing whether hardware is weak than putting it in a hostile environment and seeing what happens
    • The thing that bothers me about Backblaze, and the reason why I have a very hard time taking their results seriously, is the way they design their pods. They take a custom fabbed chassis, then fill it with the most ghetto components known to man: SATA port multipliers, ultra-low-end HBAs, dual "gamer" power supplies, very substandard cooling, and until recently they used super sketchy desktop boards. It's only last year that they finally changed the board for a Supermicro, primarily to get 10GbE very cheap

  • by adolf ( 21054 ) <flodadolf@gmail.com> on Wednesday August 03, 2016 @03:01AM (#52634027) Journal

    What I've learned from reading the comments here is that people are just as clueless when it comes to storage reliability as they ever were, and are just as capable of throwing the baby out with the bathwater as at any other time.

    Dear Slashdot: Never change.

  • by bravecanadian ( 638315 ) on Wednesday August 03, 2016 @08:40AM (#52635221)

    the most unreliable.

    That is why you buy in the sweet spot for best value and let someone else prove new technologies and HD densities for you..

    • Yeah I always shop for best value. So I now have 8TB drives in my system.

      Oh what, you didn't realise that 8TB SMR drives were the cheapest per megabyte before posting?