Question : Proliant ML110 G4 Lost RAID and Randomly gives Error 0251

I'm working on a Proliant ML110 G4 with onboard RAID 1. The server had two RAID 1 arrays running on Intel Matrix Storage and had no issues up until last week. It was running SBS 2003 R2 and was fully patched.

A SCSI tape drive went offline randomly. We opened the server and recabled the tape drive, and on POST the server gave message 0251 - System CMOS checksum bad - default configuration used. The server booted to the OS but had split both RAID 1 arrays into individual (non-RAID) drives. We shut down, booted into the BIOS and re-enabled SATA RAID, which had been reset to the default non-AHCI SATA mode. On restart we got the dreaded "NTOSKRNL.exe missing or not found" message and could not boot the server.

We tried running chkdsk /r in the Server 2003 Recovery Console after consulting with Microsoft engineers, who advised us to run it until no more errors were found. We ran it 3 times, and each run took around 3 hours to complete. 9 hours later we still did not have a bootable server.

On a lark, we decided to try booting with only one SATA drive attached. We unplugged all but one drive, one of the system drives from the original RAID 1 array. On POST we again got error 0251 and the BIOS was reset to its original non-RAID config. We don't know why it keeps giving us this message; possibly a bad CMOS battery. But we were shocked, since the server actually booted.

Many of the services weren't running and threw errors when we tried to start them. Exchange was hosed: the stores wouldn't mount and ESEUTIL /p errored with Jet Error -400. We did a system state restore from tape, and that got all the services working except for WINS. No idea why. We restored the Exchange stores from tape, and that got Exchange up and running, but for some reason we had to recreate all users' Outlook profiles since new mail wouldn't show up in Outlook. That's a mystery as well. New mail showed in OWA, but send/receive in Outlook 2003 and 2007 acted as if there was no new mail. I assume this was some Cached Exchange Mode issue; recreating the profiles fixed it.

We reattached one of the data drives in the second RAID 1 array and now everyone has access to mail and data.

So, we're back up and running, but left with a server that has no RAID enabled and that we have no confidence in. I'm trying to determine whether we should upgrade the BIOS, replace the CMOS battery, replace the board, or replace the server. The server's warranty expired in Dec. 2009, so HP was little help. They shipped us a server in Dec. 2006 with Intel Matrix Storage RAID onboard, which we used successfully for 3.5 years, but now they're telling us that they issued a SoftPaq some time in 2007 that rebranded the Intel utility as an HP RAID and that they won't support the Intel RAID. If I use their SoftPaq I'll need to recreate all the logical drives. That would have been an unreasonable thing to expect customers to do while the Intel RAID was working without issues. But since it looks like I'll need to recreate the RAID anyway, I will try their SoftPaq if it comes to that.

I'm looking for advice from other admins. Since I work hourly, it will cost the company a lot for me to take an image of the server's OS and data drives, update the BIOS, rebrand the Intel utility to HP, recreate the arrays, load the images, and test the server. My dilemma is that after all that, the server may still give error 0251, since it seems to appear at random when we disconnect and reconnect hardware. Should I just suggest replacing the board, or possibly the server itself?
   
Anyway... how would you proceed?

Answer : Proliant ML110 G4 Lost RAID and Randomly gives Error 0251

In your situation I would recommend a new server. The ML110 range is a low-end server line, so a similar new one won't cost much (in relative terms) compared to the labor costs and a possible continuing hardware problem. The company will also get a new warranty with the new hardware.

