We've been experiencing a high rate of VRM problems with our IBM x235 (8647xxx) servers. IBM indicates that the earlier versions of the system board have higher rates of VRM problems.
An 8647 is an x225, not an x235, and has the VRMs built into the planar. The x235 (8671) does have replaceable VRMs, and there are two versions: 9.05v (49p2129) for the early models (1xx, 2xx, 3xx, and 4xx) and 9.1v (49p2010) for the later models (6xx, 7xx, and 8xx). If you install the wrong part, bad things may happen.
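For anyone sorting x235 VRMs by model, the split above can be sketched as a small lookup. This is only an illustration: the function name and the "8671-6AX" style model string are assumptions, and only the two part numbers and the model series come from the post. Series not listed in the post (such as 5xx) are rejected rather than guessed.

```python
# Hypothetical helper; the "8671-xxx" model-string format is an assumption.
EARLY_SERIES = {"1", "2", "3", "4"}  # 1xx-4xx models take the 9.05v VRM
LATE_SERIES = {"6", "7", "8"}        # 6xx-8xx models take the 9.1v VRM


def vrm_part_for_x235(model: str) -> str:
    """Return the VRM part number for an x235 model string like '8671-6AX'."""
    machine_type, _, suffix = model.strip().partition("-")
    if machine_type != "8671":
        # Only the x235 (8671) has replaceable VRMs per the post;
        # the x225 (8647) has them built into the planar.
        raise ValueError(f"{machine_type!r} is not an x235 (8671)")
    if not suffix:
        raise ValueError("model suffix is missing, e.g. '8671-6AX'")
    series = suffix[0]
    if series in EARLY_SERIES:
        return "49p2129"  # 9.05v
    if series in LATE_SERIES:
        return "49p2010"  # 9.1v
    raise ValueError(f"model series {series}xx is not covered by the post")


if __name__ == "__main__":
    print(vrm_part_for_x235("8671-6AX"))
```

Raising an error for anything outside the two listed groups is deliberate, since installing the wrong part is the failure this is meant to avoid.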
My mistake, they are x225s. The VRMs are being replaced by IBM, so I'm sure they are the right part number. That is exactly the problem: if the VRMs weren't integrated on the planar, it would be simple to swap them out. As it is, it requires a complete planar swap.
Is the error you get from IBM Director "VRM power not good", but then the machine runs OK? I have a customer who has 1000 of these servers, and we have been seeing this on about 20 machines in 3 states. But the error is not constant. It will kick an error, then run fine with no errors for about 6 weeks, then kick the error 1-2 days in a row. Rich
I shall use google before asking stupid questions!
IBM Director depends on the system BIOS, ASM firmware, and driver all being current before its events can be considered reliable. Also, the change management strategy for IBM Director has changed from the eFix model used in v3.x to rolling fixes into the new version of the product for v4.x, so if you are not running the current version (v4.12), then your mileage may vary. It's pretty good practice to look for confirmation of any Director event in the service processor log. In fact, any suspected hardware fault should be confirmed in two places before any hardware action is taken. These sources could be lightpath diagnostics, service processor logs, IBM Director events, etc.
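The "confirm in two places" rule can be sketched roughly like this. It assumes you have already collected fault reports by hand or by script into (source, server, fault) tuples; the function name, the tuple layout, and the sample data are all made up for illustration, and nothing here calls IBM Director or the service processor.

```python
from collections import defaultdict


def confirmed_faults(reports, min_sources=2):
    """Return faults reported by at least `min_sources` distinct sources.

    `reports` is an iterable of (source, server, fault) tuples, a
    hypothetical layout chosen for this sketch.
    """
    sources_by_fault = defaultdict(set)
    for source, server, fault in reports:
        # A set means repeated events from one source count only once.
        sources_by_fault[(server, fault)].add(source)
    return {
        key: sorted(sources)
        for key, sources in sources_by_fault.items()
        if len(sources) >= min_sources
    }


if __name__ == "__main__":
    # Made-up sample data, not real logs.
    sample = [
        ("director", "srv01", "vrm power not good"),
        ("director", "srv01", "vrm power not good"),
        ("director", "srv02", "vrm power not good"),
        ("sp_log", "srv02", "vrm power not good"),
    ]
    for (server, fault), sources in confirmed_faults(sample).items():
        print(server, fault, sources)
```

In the sample, srv01 has two Director events but no second source, so only srv02 would qualify for hardware action.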
One of my service calls on these servers was cancelled today, and I have no idea why. After pulling RSA logs on 4 of these machines, it was determined that their RSA firmware and system BIOS were way out of date. But I was told that on the machines where they did the updates, it was of no help. I was also told by IBM support that all calls regarding this customer were to have the system boards replaced for this issue, and then my call was cancelled. Looks like a lot of phone calls tomorrow. Rich
*update* OK, the data center has cancelled all calls to replace the system boards and shipped a server to IBM product engineering, since system boards were replaced at some locations but the error came back. Rich