RESET Forums (homeservershow.com)

Predictive Drive Failures


iandrews

I fitted schoondoggy’s drive bracket to my Gen8 on Friday. I had attached 2 SSDs to it, but they were not connected or powered as I am awaiting a breakout cable.

 

So at the time, my boot SSD was running off the B120 (in RAID mode) on port 5, and the 4 disks in the front cage were running off the P212 in RAID 5. As I had the server powered off, when I powered it back on I decided to plug it into a plug-in energy meter that I had just got, to get an indication of how much power it was using (as I didn’t want to overload the PSU with 4 HDDs, 3 SSDs (once they were all connected) and the P212).

 

The server started fine, but I could hear a kind of tick, tick, tick, click sound from one of the hard disks in the drive cage. I ran the ATTO test software and started a test on the RAID 5 disks. It ran, but seemed slow, so I aborted it halfway through, powered off the server, removed the power meter plug, and re-seated the 4 disks (in case one had come loose while I moved the server to fit the bracket).

 

On boot up I went into the HP RAID configuration and it said that the RAID 5 logical disk had failed and all data was lost, and also that 3 of the disks had Predictive Drive Failures. Seeing as I didn’t have any data on the disks, I deleted the array and re-booted in the hope the Predictive Drive Failures would go, but they didn’t. I moved the disk cable from the P212 back to the B120, but it still said Predictive Drive Failures. I then swapped the location of the 1 ok disk and a predictive disk (in case it was an issue with the bays), but the ok disk stayed ok in its new bay.

 

Re-booted, ran Intelligent Provisioning, and then ran disk diagnostics on the ok disk and a predictive disk. The ok disk passed, but the predictive disk failed right away on disk read.

 

The disks are HGST 3TB NAS drives, and there is some SMART testing software on their web site, but it can’t talk to the disks in RAID mode, so I set the controller to AHCI mode and then installed Windows 2012 (using Intelligent Provisioning) onto the disk in bay one (which was the ok disk). Once that was all installed I installed and ran the HGST software, and ran a quick test on the other 3 disks (couldn’t do the ok disk as it was the OS disk), and all came back ok. Ran a long test on one of the disks and that also came back ok.

 

Also downloaded Crystal Disk Info and ran that. It showed an error on the “ok” disk, but said the other 3 were ok.

 

(While the server rebooted in AHCI mode, during POST it showed 2 of the disks as SMART compatible, but 2 of them as not SMART compatible.)

 

Re-booted and put the controller back into RAID mode, and this time at reboot it said that all 4 disks had predictive failure.

 

Going round in circles, I decided to remove the P212 card, but that made no difference. I also connected the disks (one at a time) to a different cable plugged into the B120 (to rule out the bays / backplane), but they also showed as predictive failure.

 

Put the B120 back into AHCI mode, loaded Intelligent Provisioning, and ran disk diagnostics on the 2 disks that happened to be in the bays at the time, and they passed all the tests. Put the controller back into RAID mode, and it went back to showing predictive failure.

 

So it seems that if I have the B120 in RAID mode (or use the P212) the disks show as predictive failure, but in AHCI mode (apart from the non-SMART bit at bootup, and at one point Crystal Disk saying 1 of the disks was bad) none of the tests show any issues.
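
The pattern so far can be laid out as a small Python tabulation. This is only a sketch of the reasoning, not a diagnostic tool: the rows are a reading of the checks described above for the three disks that first showed errors, the names `observations` and `results_by_mode` are made up for illustration, and nothing here talks to a controller or a disk. The early diagnostic read failure and the Crystal Disk warning on the fourth disk are left out to keep the comparison to like-for-like checks.

```python
from collections import defaultdict

# Each row: (controller, mode, check, result), transcribed from the post above.
# These are hand-entered observations, not values read from any hardware.
observations = [
    ("P212", "RAID", "HP RAID configuration", "predictive failure"),
    ("B120", "RAID", "HP RAID configuration", "predictive failure"),
    ("B120", "AHCI", "HGST quick test", "ok"),
    ("B120", "AHCI", "HGST long test", "ok"),
    ("B120", "AHCI", "Crystal Disk Info", "ok"),
    ("B120", "AHCI", "Intelligent Provisioning diagnostics", "ok"),
]


def results_by_mode(rows):
    """Group the distinct results seen under each controller mode."""
    grouped = defaultdict(set)
    for _controller, mode, _check, result in rows:
        grouped[mode].add(result)
    return dict(grouped)


for mode, results in sorted(results_by_mode(observations).items()):
    print(mode, sorted(results))
```

Grouped this way, the result follows the controller mode rather than the controller, bay, or cable, which is why the remaining tests focus on what RAID mode reads from the disks that the AHCI-mode tools do not.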

 

I have put the controller back into AHCI mode, and I am currently installing Windows 2012 onto an old 160GB disk. I will then try the HGST / Crystal Disk utils on one disk at a time to see what that shows.

 

I want to either return the disks to Amazon (but they are only offering a refund, and currently the disks seem to be £20 more expensive each), or try to get an exchange via HGST, but I am worried that if all standard (i.e. non-RAID) SMART tests are coming back ok then they may deny (or fail to find) a fault (though I do have screenshots of the error when in RAID mode).

 

Not sure why 3 disks all “failed” in one go (and now the 4th), or whether the energy meter had anything to do with it, but I have been going round in circles all weekend trying to diagnose why RAID shows errors but AHCI doesn’t. Anyone have any ideas?

 

 

 

Link to comment
Share on other sites


First off, I like your approach. You're breaking the situation down into chunks and testing each chunk.

 

Here's a thought: with that many drives going bad all at once, I would be suspicious of the RAID controller, especially since the problem only occurs in RAID mode. I would try testing all those drives in another computer. If they come back clean then I would be even more suspicious of the RAID controller. How about putting the P212 into another computer and testing the drives?

 

I would also, if possible, put different drives into the Gen8; see if they too come back with Predictive Failures.

 

Hate to say it, but it's beginning to look like your mobo may have issues.

Link to comment
Share on other sites

Thanks for the reply. I have had the disks attached both to the P212 (they were attached to this when they went wrong) and to the B120 in RAID mode. In both cases they show Predictive Failures. I have since removed the P212 (but not tried it in another PC). I have tried a "different" disk in the Gen8 (but only attached to the B120 in RAID mode) and that seems ok. May have to put the P212 back in, and attach it to that to see what happens.

 

Just ran the HGST and Crystal utils on 2 of the disks in AHCI mode, and both disks showed ok in both.

 

So I would expect the disks to show ok in another PC in AHCI mode. Not sure I could test them in another PC in RAID mode.

Link to comment
Share on other sites

Yes, the P212 is on 6.6, and the Gen8 on 06/06/14. But as stated, at the moment the P212 is not in the server, and the drives show the same errors on the B120 in RAID mode.

Link to comment
Share on other sites

Yes, that's what I thought, but the P212 isn't supported in the Gen8 (though the errors do also show on the B120), and I also thought they only supported HP disks.

 

The HGST and Crystal tests on the other 2 disks showed ok.

 

Put the P212 back in, and connected the disk cage cable to it, and the 160GB disk shows no errors. Added in some of the 3TB disks, and they show predictive failure. Ran the HP diagnostic on 2 of the disks, and they passed with no errors.

 

So all very strange. It seems as though the disks have somehow been marked as predictive failure in a way that RAID mode can pick up, but the usual SMART tests cannot.

 

Link to comment
Share on other sites

The only thing I can think that's unchanged between your tests is the backplane cable.

 

If you have an SFF to SATA breakout cable, you could just plug the drives in directly, bypassing the backplane cable, though you would need some SATA power splitters.

Link to comment
Share on other sites
