Saturday, July 18, 2009

WDTLER and WDIDLE3

Western Digital states that the Caviar GP drives are not recommended for RAID arrays, and that instead you should get their enterprise RE-4 drive. But there's a $100 price difference between the two right now! ($230 vs $330 at Newegg). So I decided to risk it and build my RAIDZ array with the GP drives. Check back in a couple of years to see if I have any regrets!

In building the array I discovered two very important fixes I needed to make to the drives, in order to make them behave more like the RE-4 drives.

First was to enable Time-Limited Error Recovery (TLER). This tells the drive to NOT make ridiculous efforts to recover a sector that it's having trouble reading, and to instead quickly report back an error that the sector could not be read. See, if the drive takes too long to answer a read request, the RAID layer will assume it has gone kaput and boot it from the array. By enabling TLER, you prevent this from happening, and the RAID layer gets to handle the error itself. Use the WDTLER utility to do this.
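
The timeout race described above can be sketched as a toy model. The 8-second RAID timeout and the 7- and 60-second recovery limits are illustrative assumptions, not figures from this post or from Western Digital, and `raid_outcome` is a hypothetical helper rather than anything a real RAID implementation exposes.

```python
# Toy model: does the RAID layer drop a drive while it retries a bad sector?
# All numbers are illustrative assumptions, not vendor specifications.

RAID_TIMEOUT_S = 8.0  # assumed: how long the RAID layer waits for a reply


def raid_outcome(drive_recovery_limit_s: float,
                 raid_timeout_s: float = RAID_TIMEOUT_S) -> str:
    """Return what the RAID layer does when a drive hits an unreadable sector.

    drive_recovery_limit_s is the longest the drive keeps retrying before it
    gives up and reports a read error.
    """
    if drive_recovery_limit_s < raid_timeout_s:
        # The drive answers (with an error) in time, so the RAID layer can
        # rebuild the sector from the other drives.
        return "read error reported; RAID repairs the sector from redundancy"
    # The drive is still retrying when the timeout expires.
    return "no reply before timeout; RAID drops the drive from the array"


print("TLER on  (assumed 7 s limit): ", raid_outcome(7.0))
print("TLER off (assumed 60 s limit):", raid_outcome(60.0))
```

The point of the sketch is only the comparison: whichever side gives up first decides whether the array sees one bad sector or a whole missing drive.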

Second, the GP drives have a feature called Intellipark, which parks the drive's heads (moves them off the platters) so as to reduce air drag on the motor that spins the platters (every little power saving counts!). You can hear it clearly when it kicks in: it makes a slight clicking sound when parking. When you need to use the drive again, there's a clear delay and another clicking sound while the heads unpark.

While nice in theory, it's unfortunately rather frustrating in practice. See, modern OSes use write caching to gather up a bunch of writes in RAM, and only actually write to the hard drives in bulk, every 10-30 seconds. The GP's idle timer is 8 seconds by default (a rather poorly chosen default), which is shorter than the gap between flushes, so the heads park after nearly every write burst. As a result the drive incessantly parks and unparks as random services write a few bytes here and there. Eventually, too many such cycles (I've read in forums that 300,000 is the spec'd limit) will cause wear and tear and increase the chance of failure. This thread on the Linux Kernel mailing list gives some details. While this is a problem even in non-RAID settings, it's exacerbated by RAID because now you have N drives that park and unpark in sequence.
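
A back-of-the-envelope sketch shows how quickly those cycles could add up. It assumes one park/unpark cycle per write flush, a constant flush interval, and a drive that is powered on around the clock; the 300,000 figure is the forum-quoted number mentioned above, not a verified spec, and `days_to_reach` is just a hypothetical helper name.

```python
# Back-of-the-envelope estimate of head-park cycle accumulation.
# Assumptions: one park/unpark cycle per OS write flush, a constant flush
# interval, and the forum-quoted 300,000-cycle figure as the rated limit.

SECONDS_PER_DAY = 24 * 60 * 60
RATED_CYCLES = 300_000  # forum-quoted figure from the post, not verified


def days_to_reach(cycles: int, flush_interval_s: float) -> float:
    """Days of continuous uptime needed to accumulate `cycles` park cycles."""
    return cycles * flush_interval_s / SECONDS_PER_DAY


for interval in (10, 30):
    days = days_to_reach(RATED_CYCLES, interval)
    print(f"flush every {interval:2d} s -> about {days:.0f} days "
          f"to reach {RATED_CYCLES:,} cycles")
```

Under those assumptions the limit would be reached in roughly 35 days with a 10-second flush interval and roughly 104 days with a 30-second one. Real workloads will be less regular, so treat this as an order-of-magnitude estimate only.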

Fortunately, there's another utility called WDIDLE3 that lets you increase the timer (to a max of 25.5 seconds, which I don't think is enough) or disable it entirely, which is what I did.

If you don't run Windows and thus cannot directly run these EXEs, one simple workaround is to slipstream them into the Ultimate Boot CD as described here. Those instructions are for WDTLER specifically, but simply slip in WDIDLE3 at the same time. Keep the resulting CD accessible since you'll likely need to run it again if you have to replace any drives in your array!

As best I can tell, Western Digital does not officially support these utilities, so use them at your own risk. They both worked fine for me, on OpenSolaris, but your mileage may vary!

10 comments:

  1. latest versions of firmware don't allow TLER modifications

  2. The limit is 300000 seconds, not 25.5 seconds. From the help file:

    USAGE
    WDIDLE3 [/S[]] [/D] [/R] [/?]
    where:
    /S[] Set timer, units in seconds. Default=8.0 (8.0 seconds).
    Resolution is 0.1 seconds from 0.1 to 12.7 seconds.
    Resolution is 30 seconds from 30 seconds to 300000 seconds.
    Note, times between 12.8 and 30 seconds will be set to 30 seconds.
    /D Disable timer.
    /R Report current timer.

  3. Cheers - you helped me get there in the end. Managed to disable head parking on my Mac Pro.

  4. "Check back in a couple of years to see if I have any regrets!"

    ....So any Regrets??

  5. Wow, it has been a bit over a couple of years already!

    Absolutely no regrets: my original drives are still running just fine,
    despite fairly heavy usage over this time. I installed a hot spare in
    there, and it's never been used.

    Replies
    1. Meanwhile, I never did this when I got my drives. My drives have been running a few years and show up with 400,000+ head parks in the SMART diagnostics. 2/3 failed.

  6. Thanks for the response...
    Unfortunately, after spending the last 3 days or so reading, it looks like you can no longer run WDTLER on the newer drives.
    And with hardware RAID, it sure seems like that could cause a major issue. I was looking forward to using the WD Green drives.

  7. Regarding the second point in the article: I experienced the same on my netbook (Acer Aspire One), which also came with a WD disk. Well, it turned out that you don't even need WDIDLE3. I managed to change the settings with hdparm. Here's what I did:

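    # -B254: highest APM level that still leaves APM enabled (least aggressive power saving)
    hdparm -B254 /dev/sda
    # -S180: standby (spin-down) timeout; values 1-240 count in units of 5 s, so 180 = 15 minutes
    hdparm -S180 /dev/sda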

    Now, no more clicking and the disk has become very silent, too.

    Note that this must be executed upon each startup (/etc/rc.local) and after waking up from standby as well.

  8. "Check back in a couple of years to see if I have any regrets!"

    ....So any Regrets??

    I am going to try using the new 4TB Green drives

  9. No regrets! The drives did great, and I never hit a failure. I've now replaced that original file server, after 3 years of service, with a newer one ...
