Tuesday, November 10, 2009

Recovering from Terastation Meltdown, Without a Backup

On Sunday night, I was doing some video editing for a friend, when I realized that my Terastation (original) was not running. I glanced in my server closet and discovered 4 green HDD lights (all on solid), a diag light (flashing), and no fan or hard drive noise coming from the unit.

I cycled the power and the unit started up again. It worked for a short time (performing an array check), then quit and went back into the same state. Each time I powered it up, it lasted between 30 seconds and 3 minutes or so.

I called Buffalo tech support, who graciously offered me assistance (even though this device is long out of warranty). The tech recommended I turn it off, hold the "init" button in, and power it on while continuing to hold the init button down for 15 seconds. He recommended I wait until the device powered up and completed the array check, then perform a firmware upgrade. He seemed fairly confident that this would solve my problem, so we hung up.

No sooner had the phone disconnected than the Terastation shut down again with a "click". The init button routine seemed to do absolutely nothing. I decided I was on my own, and I attempted a firmware upgrade.

The firmware upgrade (available from Buffalo Technology's website) ran, and seemed to complete successfully. However, when I booted the Terastation back up after the upgrade, it came up in "EM mode" (Buffalo's name for the recovery firmware, stored on flash, that only allows you to upgrade/write the firmware to the hard disks). This was probably what the tech was trying to get me into when we did the INIT button thing... I again attempted the firmware upgrade. This time, while writing the firmware, I shuddered as I heard:

* click *

... Sure enough, the device had shut off, and it would no longer boot at all (not even to EM mode!). My Terastation was a brick. I searched around online and found that it's possible to recover the device with some soldering of a connector that I didn't have, a JTAG cable that I didn't have, and a firmware upgrade. At this point, I was panicking a little, as a backup of the device was another thing I didn't have (yes, yes, shame on me).

Recovering the Data

My next feat was to get the data off the drives, which were in a RAID 5 configuration, without the Terastation. I found several articles online, in particular this one from UFS Explorer. Basically, you can hook the 4 PATA drives up to a computer running Windows, use UFS Explorer to build the array virtually, and copy the data off of it. Here's the basic procedure I followed.

Obviously, do all of this at your own risk, but presuming you're doing this because you have no backup, you're probably into risk taking anyway.
What you'll need:

-We'll be running a total of 5 drives (the 4 Terastation drives, plus one Windows system drive), so you'll need a full tower PC case, with a hard drive running Windows.

-It should have at least 4 free power connectors and 2 IDE controllers on board. You can improvise, such as with an extra IDE controller card if you have one, and a 'Y' adapter drive power cable, or an external drive power supply. You'll probably need to disconnect any DVD drives to free up some power and data connectors.

-An external PATA IDE->USB converter (for the 5th drive, which will actually be drive #4 on the Terastation). Again, you can improvise with things like an extra IDE controller card, if you have one.

1. Assuming you have the Terastation "original", you'll have to pretty much completely disassemble it to get to the drives. This involves removing the outer case (many screws), a metal guard (many screws), the system board (many screws), one more small metal guard (2 screws), and finally, the drive cage (2 screws). The drives are numbered 4,3,2,1 from left (system board side) to right. You'll probably need to remove drive #4 completely, as you'll be connecting that to your USB hard drive adapter.

2. Set the jumpers on all of the drives, and connect them to power.

Note: you may be able to get away with powering the drive array with the Terastation power supply, but in my case, I later found out that the power supply was bad, which was responsible for this whole mess. It's advisable to use as little from the Terastation as possible unless you are certain you know what's wrong with it.

After a lot of futzing around to get the drives all powered and connected, and changing jumper settings on all the drives, I finally got them all to be recognized by the computer. Here's a full description of the setup, by drive:

Windows system drive:
-Primary onboard IDE controller, jumpered as master.

Terastation drive #1:
-Primary onboard IDE controller, jumpered as slave

Terastation drive #2:
-Secondary onboard IDE controller, jumpered as master

Terastation drive #3:
-Secondary onboard IDE controller, jumpered as slave

Terastation drive #4:
-USB IDE adapter, jumpered as master or cable select

Get your Windows PC all booted up after verifying the drives are all visible to the BIOS setup. They won't show up in Windows as drive letters, because they don't have recognizable file systems. Don't panic.

3. Now you'll need to download (and eventually buy, for about $75 USD) UFS Explorer Professional Recovery. It's worth the money. One major issue this gets around is that the RAID was originally built on a PPC, big-endian-based system, and you are now trying to access it on a PC, which is little-endian. If you don't know what that means, just continue.
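If you are curious what the byte-order difference looks like in practice, here is a minimal sketch in Python. It is only an illustration and not part of the recovery procedure; the four bytes are just the ASCII text "XFSB", picked as a sample value, and the same bytes come out as two different numbers depending on which byte order you read them in.

```python
import struct

# Four sample bytes: the ASCII text "XFSB" (chosen purely for illustration).
raw = b"XFSB"

# Read as one 32-bit number in big-endian order
# (most significant byte first, as on a PowerPC system).
big = struct.unpack(">I", raw)[0]

# Read the same bytes in little-endian order
# (least significant byte first, as on an x86 PC).
little = struct.unpack("<I", raw)[0]

print(hex(big))     # 0x58465342
print(hex(little))  # 0x42534658
```

Same bytes on disk, two different values. Software that assumes the wrong byte order will misread any multi-byte numbers in the on-disk structures.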

4. Follow the instructions on the UFS Explorer site, specific to the Terastation. Here are the basic steps:

-When you launch the program, you should see all 4 drives with a mess of partitions. Some will be "XFS", and some will be "Unknown".

You'll most likely need to use the "Hex View" function to establish which drive is which. This is documented on the UFS Explorer site, as well. View each of the large partitions (232GB on mine) and pay attention to the very first 4 bytes or so - they should help you identify what part of the RAID 5 you're looking at. If you connected the drives as I did above, your array order should be 2 (Superblock), 3 (iNode block), 1 (parity), 4 (parity).

-Click the "RAID Builder" button

-Choose the partition option, not the disk option (I forget the exact wording)

-Go through and add all 4 of the really big partitions to the right side.

-Use the "move to top" and "move to bottom" buttons to get them in the correct order

-Click OK and you should see a new partition on the bottom of the list. Right-click and choose "explore". If you see all of your folders on the right, you can now copy your data somewhere (e.g. to a network or USB drive). If you don't see all of the folders, or you get "error in filesystem", you probably have your drives in the wrong order. Right-click to close the partition and try again in a different order.
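If you end up guessing at the order, it helps to know how many guesses there are. With 4 drives there are 24 possible orders, but if the Hex View shows that exactly one partition starts with the XFS superblock signature (the ASCII text "XFSB"), and you assume that partition goes first (as it did in my order above), you are down to 6. Here is a rough Python sketch of that bookkeeping. The function names and the sample header bytes are made up for illustration; it does not read real disks, so you would type in whatever bytes the Hex View shows you.

```python
from itertools import permutations

# An XFS superblock starts with these four ASCII bytes.
XFS_SUPERBLOCK_MAGIC = b"XFSB"

def looks_like_superblock(first_bytes: bytes) -> bool:
    """True if the partition starts with the XFS superblock signature."""
    return first_bytes[:4] == XFS_SUPERBLOCK_MAGIC

def candidate_orders(headers: dict[int, bytes]) -> list[tuple[int, ...]]:
    """List the drive orders worth trying in the RAID builder.

    headers maps a drive number to the first bytes of its large partition.
    Assumption: if exactly one partition carries the superblock signature,
    that drive belongs first in the array.
    """
    drives = sorted(headers)
    orders = list(permutations(drives))
    superblock_drives = [d for d in drives if looks_like_superblock(headers[d])]
    if len(superblock_drives) == 1:
        orders = [o for o in orders if o[0] == superblock_drives[0]]
    return orders

# Hypothetical header bytes, as they might be copied from the Hex View.
sample_headers = {
    1: bytes.fromhex("00000000"),
    2: b"XFSB",
    3: bytes.fromhex("a1b2c3d4"),
    4: bytes.fromhex("1f2e3d4c"),
}

orders = candidate_orders(sample_headers)
print(len(orders))  # 6
for order in orders:
    print(order)
```

With the sample data above, drive 2 is pinned to the first slot and the remaining three drives are shuffled behind it, so the order that worked for me (2, 3, 1, 4) is one of the six candidates.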

After almost 2 days of copying, I have all of my data back, and I am now building a FreeNAS-based NAS, using a dual eSATA enclosure and an older Dell. Terastations are great, but I'm too strapped for cash to replace it now. Maybe someday I'll resurrect it by replacing the power supply, but right now I just don't have the time. The other thing I plan to do, ASAP, is set up Jungledisk to sync the NAS files with an offsite backup.