RESET Forums (homeservershow.com)

RAID array trashed...any suggestions before I cut my losses?


fieldhouse


The story:

I have a RAID volume on my main WHS2011 box that was getting pretty full - five 2TB drives in a RAID5 array with a sixth drive as a hot spare. I was down to around 300GB free space and debated whether I should move some stuff off or expand the array. I'm using a RocketRaid 2680 and have used OCE several times before to add drives as I moved chunks of stuff off of my EX490. So I thought, why not? I'll add another drive. I still have two unused ports on the 2680, might as well.

 

I had another 2TB drive that I had pulled out of the TR5M that used to be plugged into the EX490, but I wanted to be sure it was clean so I fired up SeaTools for DOS on an old beater box, ran diagnostics, and performed a full erase on the disk just to be on the safe side. Then I popped the drive into an open bay in the Icy Dock MB974SP-B (gotta love those catchy names Icy Dock comes up with). I started the OCE process through the GUI and after seeing it settle around 16 hours left to complete the expansion I left it to its own devices and hit the hay.

 

The next day I didn't remember to check the expansion progress before I took off for work. So when I got home I was surprised to find that, not only was the process not complete but that I couldn't connect to the 2680 controller through the GUI. WHS was ticked off because one of its volumes wasn't where it should be (obviously) but was still up and running since the OS is on a standalone drive using the controller on the motherboard. I could still access the two arrays on the other RAID controller in the system - a RocketRaid 622 - but I couldn't access the 622 through the GUI either. So, thinking I was taking the safe approach, I left things as is overnight to make sure the rebuild had time to finish on the off chance the 2680 was still working on the expansion.

 

Unfortunately, things weren't any better the next day and I decided I should go ahead and shut down / restart. When I brought things back up the 2680 claimed it was still in the process of rebuilding the array. I had a pretty good idea where things were going at this point as Windows after the restart claimed I had an 8GB partition of type RAW with a total disk size of 10TB. I went ahead and let the HighPoint finish doing what it thought needed to be done and noticed that the system seemed glitchy - becoming unresponsive at times to keyboard or mouse. After the rebuild of my 10 terabytes of nothing finished, I checked the drives. The Seagate green drive I had added was in a funky state. The 2680 claimed the status was "normal" but when I tried to view SMART info through the RAID Management Console it returned an error.

E 5/9/2012 7:31:33 PM An error occured on the disk at 'ST32000542AS-9XW0ADE5' at Controller1-Channel2.
Gee, thanks. How helpful.

For the record, of the 28 drives currently on my spreadsheet (500 GB SATA or larger) only two have failed, and both were 2TB Seagate Barracuda LPs. In fact, now that I think about it, this may be the warranty replacement for the other failed Barracuda LP. Hmmm.

 

Anyway, the only thing I thought I could do that made any sense was to pull the "new" drive and have the controller use the existing hot spare drive to rebuild the array. After waiting another eternity for that to complete I have been left with a very large, and apparently unformatted, RAID volume. Which brings me to where I am now.

 

For context, this is all running virtual on a Server 2008 R2 Hyper-v host, Core i5-750 on a Gigabyte P55-USB3 motherboard with 16 GB RAM. The RAID volume was being used as a pass-through physical disk inside the WHS 2011 VM just for data. The VM OSes are hosted on a RAID5 array on the RocketRaid 622 along with another RAID5 array that I have set up as an iSCSI target (long story). The WHS 2011 is (was) the main purpose of the hardware but I was running it virtual so I could bring up other OSes when needed and not install a bunch of crap on the WHS OS. The system itself has been stable and behaving well for quite a while (since WHS 2011 was beta) and I generally have had no complaints about the hardware. I'll also mention that all of our digital photos and documents are safe on another drive so what I'm missing is primarily my movies, tv shows, kids videos, anime, ebooks. It sucks that it's not on the drives but it's not the end of the world (we still have half a year before December 21st rolls around.) :D

 

Is there anything I could have (should have) done differently? And any suggestions on what to do now? I'm tempted to call the 7.5 TB of stuff gone and take the hardware out in the woods for target practice with an assault rifle...


Is there anything I could have (should have) done differently? And any suggestions on what to do now? I'm tempted to call the 7.5 TB of stuff gone and take the hardware out in the woods for target practice with an assault rifle...

 

I would pull the green drive and see if it's possible to do a consistency check in the RAID software.

 

I would try to do without Green, LP, and mixed-manufacturer drives in RAID arrays; try to use them for array backups instead. They're much better suited for that task.


Thanks for the feedback. What really bugs me is that I was just in the process of planning out a move of the drives/arrays to a Norco 4220 case and doing away with the TR5M eSATA enclosures because I was considering them to be the "weak link" in my storage. I guess on the positive side, I don't need to worry about the data on the large array when I move the drives from their current locations to the new chassis. <_<

I would pull the green drive and see if it's possible to do a consistency check in the RAID software.
The evil LP drive is out of the array. The rebuild after its removal supposedly completed successfully but on the off chance that it will help I've kicked off a "verify" of the array as you recommended. It ticks me off that SeaTools claims the drive in question is still good even though it very obviously is not. Placing it back in the EX490 causes the 490 to pause / hang, similar to the problems it appears the 2680 had with the drive. It's disappointing that the drive manufacturers are producing diagnostic tools that won't detect severe issues with the basic functionality of their products.
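As an aside, the vendor pass/fail tools aren't the only option. Once a suspect disk is out of the array, its raw SMART counters can be read directly with smartctl from the free smartmontools package. Here's a rough sketch in Python - assuming smartctl is installed and using the standard `smartctl -A` attribute-table layout; the device path is hypothetical:

```python
import subprocess

# SMART counters that most often flag a dying disk before it fails outright.
SUSPECT_ATTRS = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
                 "Offline_Uncorrectable"}

def parse_smart_table(output: str) -> dict:
    """Pull raw values out of a `smartctl -A` attribute table."""
    attrs = {}
    for line in output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have 10 columns;
        # the raw value is the tenth column.
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            attrs[parts[1]] = int(parts[9])
    return attrs

def looks_failing(attrs: dict) -> bool:
    """Any nonzero reallocated/pending/uncorrectable count is a red flag."""
    return any(attrs.get(name, 0) > 0 for name in SUSPECT_ATTRS)

def check_drive(device: str) -> bool:
    """Run smartctl against a device (e.g. /dev/sdb - hypothetical) and score it."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    return looks_failing(parse_smart_table(out))
```

A nonzero Reallocated_Sector_Ct or Current_Pending_Sector will usually tip you off long before a "drive is good" verdict from a vendor diagnostic gets revised.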

It is equally disappointing that HighPoint would lack whatever logic is needed to detect a non-responsive drive and work around the suspect drive. But I guess you get what you pay for and the 2680 definitely isn't the Cadillac or Mercedes of the RAID controller market.

I would try to do without Green, LP, and mixed-manufacturer drives in RAID arrays; try to use them for array backups instead. They're much better suited for that task.
Hindsight is always 20/20 and I would have preferred not mixing drive manufacturers. I am grouping relatively similar disks in my arrays - 7200 RPM 1.5 TB drives in a different RAID array from the Green / LP drives - but I'm trying to work with what I have. My philosophy with my v1 rig was to purchase as many different drives as possible to minimize multiple failures due to common issues across batches of drives. This worked really well with DE in v1 but it obviously doesn't lend itself well to a hardware RAID.

I've been burned in the past by arrays built with drives of the same model and manufacture date: a common flaw caused a cascading failure of drives, and the added stress of rebuilding the array as each drive failed eventually caused catastrophic loss of the entire array. So there are risks to any single-volume RAID solution.

 

At this point I'm planning on moving back to a solely mirrored model since that allows for recovery of data even in the event of multiple failures as opposed to this whole array corruption BS. This should also have the added benefit of being able to spin drives down when they aren't in use - saving power costs.

 

WHS 2011 has been a huge disappointment from a reliability perspective. On v1 with DE I only ever experienced data loss once early on (when it was still beta) and that was due to some logic problem in the algorithm MS was using to rebalance files during disk removal from the drive pool.

 

Given how fragile RAID volumes apparently are when performing a rebuild or RAID level migration / online expansion, if I do use RAID in future builds I will stick with fixed-size volumes and take the longer route of copying/verifying data from one RAID volume to another when space becomes an issue. It takes a lot longer and requires more disks but generally should eliminate the possibility of waking up to a massive whole-volume corruption and complete data loss.
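For what it's worth, the copy-then-verify step is easy to script so you're not trusting the copy blindly. A minimal sketch in Python (the path names here are made up - they'd be whatever your volumes are) that copies a directory tree and then re-reads both sides to compare checksums:

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in 1 MB chunks so large media files don't blow out RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_and_verify(src: Path, dst: Path) -> list:
    """Copy the src tree to dst, then re-read both sides and return any
    files that are missing from dst or whose checksums don't match."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    mismatches = []
    for f in src.rglob("*"):
        if f.is_file():
            twin = dst / f.relative_to(src)
            if not twin.is_file() or sha256_of(f) != sha256_of(twin):
                mismatches.append(f)
    return mismatches
```

An empty result means every file made the trip intact; anything in the list gets re-copied before the old volume is repurposed. (On Windows, robocopy plus a hashing tool covers the same ground.)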

 

Does anyone have a suggestion for a JBOD controller to use with a SAS expander card? I was planning on using an Intel RES2SV240 with the Norco 4220 chassis but I'm doubting the RocketRaid 2680 would be up to the task considering how it's been handling five or six drives.


First let me say, you have a gift for writing :)

 

I'm on my phone so I will reply more later.


lol. does that make me "special"?

 

thanks again. Not in any hurry (obviously, since this drive's been out of commission for several weeks now).


I do prefer somewhat higher end RAID cards: LSI, 3ware (now owned by LSI), Mylex. That said, many members are using RR cards very successfully - pcdoc in particular. The RR 2720 seems to be very popular. It's unfortunate that the higher end cards can be a bit expensive.


Ditch RAID and use StableBit DrivePool. Even if you have a problem with DrivePool you still won't lose your data. All data is in NTFS format and can be read on any system. I've been using DrivePool since early beta and haven't lost any data. I've found multiple bugs that were fixed almost immediately by the developer. It's a very impressive piece of software. I also use StableBit Scanner to scan my drives. I think you would also have better luck with SeaTools checking your drives. You'll now be able to run it on the drives without having to remove them from the RAID array.


Have you thought about one or several NAS boxes? I have three D-Link 323s mapped to my Hyper-V server. I have them set up for mirroring. I am in the planning process to build a larger 4-bay RAID 5 array but that is a huge expense.

 

It's just some thoughts, and it is easier to manage small amounts of data than a large amount.

 

http://www.qnap.com/useng/index.php?lang=en-us&sn=862&c=355&sc=688&t=695&n=3888



I do prefer somewhat higher end RAID cards: LSI, 3ware (now owned by LSI),...The RR 2720 seems to be very popular. It's unfortunate that the higher end cards can be a bit expensive.

I looked at the 2720 a bit this morning. I think if I'm going to shell out more money for a new controller it would be wise to pick up a full hardware RAID controller with onboard cache, etc.

Ditch RAID and use StableBit DrivePool. Even if you have a problem with DrivePool you still won't lose your data. All data is in NTFS format and can be read on any system. I've been using DrivePool since early beta and haven't lost any data.

I'm really tempted to head this direction now. The downside is that it uses a lot more disk capacity than a RAID5 array. If HD prices were lower I'd grab StableBit in a heartbeat, but with current prices it might actually end up costing less to get an LSI or even an Adaptec / Areca controller. Still, given the complete lack of recoverability with RAID5 this may be the best option.
I also use StableBit Scanner to scan my drives. I think you would also have better luck with SeaTools checking your drives. You'll now be able to run it on the drives without having to remove them from the RAID array.
This is actually a really attractive feature of the software solutions and something I miss from my days running v1.
Have you thought about one or several NAS boxes?
I have a couple smaller NAS boxes that I've been playing with (PogoPlug, Zyxel NSA210 w/ FFP, and recently a Synology DS-212j). From my experiences with them so far you end up paying more for the NAS than you'd pay for equivalent components in a PC. You do have the advantage of possibly lower power utilization depending on your setup. On the other hand, you're going to be using a lot more network bandwidth moving files to and fro. The QNAP looks like a nice little box and might work well with an external USB3 RAID array like the CineRAID CR-H458. But I'm still leaning towards some form of mirroring where I can pull the data off the drives even if half of my disks went up in smoke.

 

Has anyone messed around with Storage Spaces on Server 2012 much? I wonder how that compares with StableBit as far as the flexibility of configuration. I've been messing around with some different linux distros recently and have been getting spoiled with the ability to quickly throw together a directory structure made up of shares spread across multiple systems. It doesn't completely solve the drive pooling problem but it sure makes finding things easier. I've used DFS in the past but it seems like MS has gone out of their way over the years to make it more difficult to implement. Is Storage Spaces limited to local drives? Or can it include network locations as well?


Not to sound defensive, but using like drives has always been deemed a mandatory step in any RAID setup. Using different drives on "any" RAID controller can cause issues no matter the price, though some are more forgiving than others. Both the 2680 and the 2720 will do fine if a drive goes bad, will survive a reboot numerous times during a rebuild, and just about anything else related to the card itself. That said, it obviously did not handle your green drive very well. In the end, is it really the card's fault if we deviate from MFG recommendations? Just my two cents.

