Drives Dropping? Important fix for Sans Digital multi-bay eSATA enclosure users!


5 replies to this topic

#1 JazJon

JazJon

    HSS Star

  • Members
  • 61 posts
  • LocationSan Francisco, California

Posted 15 July 2012 - 06:36 AM

I wanted to share an important fix for Sans Digital multi-bay eSATA enclosure users!

Both my Sans Digital TR5M-B and TR5M-PB 5-bay eSATA enclosures would randomly drop out of Windows whenever I was moving a lot of files around. The only way to get them back was to reboot the server (power cycling the enclosure didn't fix it). Well, it turns out there was a simple fix all along. I wish I had researched this more 12 months ago! The fix below applies to the 4-, 5-, 8-bay, etc. enclosures.

Here's the enclosure I use the most
http://www.sansdigit...lus/tr5mbp.html

The fix:
http://www.sansdigit...10&id=4250#4296

Problem / question: Currently the disk system is set up as just single pass through disks. Anytime there is extremely heavy I/O all the disks drop out and the controller cannot see them anymore. Restarting the TowerRAID system and unplugging the cables does not help, only a reboot of the server. This is getting frustrating as when all the drives were connected to a 3ware 9550sx-12 they never had an issue. Is there another HBA I can try to solve this issue? I do not want to use raid, just single disk pass through.

SOLVED
Turn off "Link State Power Management" under PCI Express in the Power Options menu (in Control Panel).
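
For anyone who would rather script this than click through Control Panel, here is a rough sketch of the same change using `powercfg`. It is not from the Sans Digital page. It assumes Windows 7 / Server 2008 R2 or later (so WHS 2011, not WHS v1), where `powercfg /aliases` lists `SUB_PCIEXPRESS` and `ASPM`. The Python function names are just mine for illustration. By default it only prints the commands; run it from an elevated prompt with `dry_run=False` to apply them.

```python
import platform
import subprocess

# Aliases as listed by `powercfg /aliases` (assumed: Windows 7 / 2008 R2 or later).
SUBGROUP = "SUB_PCIEXPRESS"  # the "PCI Express" group in Power Options
SETTING = "ASPM"             # the "Link State Power Management" setting
OFF = "0"                    # 0 = Off, 1 = Moderate, 2 = Maximum power savings


def build_commands(scheme="SCHEME_CURRENT"):
    """Return the powercfg calls that set Link State Power Management to Off."""
    return [
        # Plugged-in (AC) value
        ["powercfg", "/setacvalueindex", scheme, SUBGROUP, SETTING, OFF],
        # On-battery (DC) value, harmless on a server without a battery
        ["powercfg", "/setdcvalueindex", scheme, SUBGROUP, SETTING, OFF],
        # Re-activate the scheme so the new values take effect
        ["powercfg", "/setactive", scheme],
    ]


def disable_link_state_power_management(dry_run=True):
    """Print the commands, or run them when dry_run is False on Windows."""
    commands = build_commands()
    for cmd in commands:
        if dry_run or platform.system() != "Windows":
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    disable_link_state_power_management(dry_run=True)
```

Afterwards you can confirm the result in Power Options, where the setting should read "Off" for the active plan.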

#2 pcdoc

pcdoc

    HSS Legend

  • Moderators
  • 3,559 posts
  • LocationLos Angeles, California

Posted 15 July 2012 - 02:18 PM

Thanks for the update. I know there are a few people here who use SD enclosures.

Main Server - WHS 2011, Core I5-2500, 12T RAID 5 (5x3T) + 2T of Mirror + 2T of backup
Second Server - 2008R2, Core I5-2500, 12T RAID 5
Main Systems - Core I7-2600k, 16 Gigs DDR3-1600, 180 Gig Intel 330 SSD Max IOPS 240 Gig Vertex 3, 2T Sata 3 for local Backup
Other systems - Core I7-2600, Core I3-530's, Core I5-2500, Core I7-920, Core I3-2100, and G620 (see System List)
My Blogs - The Docs Blog and Tablet Resource
BYOB Videos - TheBYOBPodcast
For a complete system List: Computer Systems


#3 Renny

Renny

    HSS Pro

  • Members
  • 187 posts
  • LocationBrisbane, Australia

Posted 15 July 2012 - 06:08 PM

Thanks Jaz. I have not had this problem with my TRMB4 but I will implement the fix anyhow.

#4 JazJon

JazJon

    HSS Star

  • Members
  • 61 posts
  • LocationSan Francisco, California

Posted 15 July 2012 - 06:24 PM

I've had mystery eSATA drops on my EX495 ever since WHS v1, and then still on WHS 2011. I finally found the fix and am happy.

I let Matt know about this; he's the developer of the SMART (monitoring) WHS add-in. Here is his interesting response.

"You’ve made some interesting discoveries/observations here. It’s intriguing to me that you could reproduce the problem by subjecting the enclosure to intense I/O, or by allowing SMART tools to run against it under minimal I/O. The link state power management problem seems to affect a lot of different hardware. I have an OCZ Octane 128GB SSD in both my work-issued laptop (by day I’m a Microsoft SharePoint consultant for HP) and my personal laptop. Both exhibited a peculiar behavior of freezing for 30 seconds at seemingly random times during the day. The system would always become responsive, so it was more of an annoyance than anything. The system event log would show an error along the lines of “the device \\Harddisk0\ did not respond within the timeout period.”

In both cases the guilty party was an Intel ICH SATA/RAID controller and the fix was to go into the Registry and turn off the—you guessed it—link state power management!

According to the SATA specification, there are many different commands you can send to a device, such as IDENTIFY DEVICE, SMART READ DATA, etc. If a device doesn’t recognize a command, the device should return an error code so that the program that issued the command knows the operation failed.

Something I found out in developing WindowSMART and Home Server SMART, particularly when I got to the part where I started supporting device self-testing (short, extended, conveyance), was that there are a LOT of devices that don’t conform to the specification. Rather than returning an error code, the device just seems to die silently.

As an example, you can send a command to the device and it’ll return, as a number of minutes, the length of each test it supports. If the return value is zero, the test is not supported by the device. In the example of the OCZ Octane 128, this particular SSD doesn’t support any of the tests. Of course, in the UI, I dim the button that lets you run a test if the device doesn’t support it. Out of curiosity, I did want to see what happens if I send the test command to the OCZ Octane. Doing so effectively renders the laptop inoperative. The resolution is to do a full power cycle on the laptop, and the device returns to normal operation.

The moral of the story here is that I’m guessing the enclosure and/or some (or all) of the devices within it don’t support the link state power management command, so if the host sends that command to the device(s), they die silently and stop responding to commands. And Windows eventually detects they’re no longer responding and finally takes them offline.

By the same token, when a SMART tool sends commands to the devices, it’s possible one of them doesn’t recognize a command and it locks up, and seemingly all of them follow suit. Probably because commands start queuing up in the I/O controller due to the locked-up device. And eventually the whole controller goes down.

Matt"
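
To illustrate Matt's point about test lengths, here is a small sketch of the "zero minutes means unsupported" check he describes. This is not his actual code; the function name and the sample durations are made up for illustration, and a real tool would read the values from the drive's SMART data instead of a hard-coded dictionary.

```python
def supported_self_tests(durations_minutes):
    """Return the self-tests whose reported length is non-zero.

    durations_minutes maps a test name to the length (in minutes) the
    device reports for it; zero means the device does not support it.
    """
    return [name for name, minutes in durations_minutes.items() if minutes > 0]


# Hypothetical values for a drive that reports a length for every test.
typical_drive = {"short": 2, "extended": 120, "conveyance": 5}

# A drive like the one Matt describes: every test reports zero minutes.
no_test_drive = {"short": 0, "extended": 0, "conveyance": 0}

print(supported_self_tests(typical_drive))
print(supported_self_tests(no_test_drive))
```

A UI would then enable the test button only for the names in the returned list, so a command the device cannot handle is never sent.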

Edited by JazJon, 15 July 2012 - 06:25 PM.


#5 ikon

ikon

    HSS Genius

  • Donating Member
  • 8,530 posts

Posted 15 July 2012 - 08:09 PM

This is verrry interesting info. Looks like everyone should just disable Link State Power Management by default.

If at first you don't succeed, do it like your mother told you.


#6 JazJon

JazJon

    HSS Star

  • Members
  • 61 posts
  • LocationSan Francisco, California

Posted 15 July 2012 - 08:32 PM

I am trying to have the ultimate movie library via Media Center, extenders, etc. My roommates were constantly telling me the video shares were not working, and the whole reason was this one stupid Link State setting! I dug deep into Google and there it was: finally, answers.

I guess I should have posted this in the following area below but whoops. (is it ok where it's at?)



