hcclnoodles
IS-IT--Management
Hi there,
Just a quick question. I have been getting various soft and hard errors on disks on various machines (as I'm sure we all do). Other than checking "iostat -En" for the number and frequency of errors, and "prtdiag" for general hardware status, what else can I do to investigate disk errors further without taking the box down to run diagnostics? These boxes are live and downtime is almost impossible. Any additional tools or tips would be great. For example, I have this error generated on a SPARC box:
Error for Command: write(10) Error Level: Retryable
scsi: [ID 107833 kern.notice] Requested Block: 3603180 Error Block: 36031808
scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 0327A23Y8
scsi: [ID 107833 kern.notice] Sense Key: Hardware Error
scsi: [ID 107833 kern.notice] ASC: 0x19 (defect list error), ASCQ: 0x0, FRU: 0x2
To me this looks serious, and "iostat" is reporting 1 hard error, but where do I go from here? What additional tools can I use?
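In case it helps with tracking how often these errors recur, here is a rough sketch of how the fields in messages like the one above could be pulled out of a log. It assumes Python is available on the box. The function name parse_scsi_error and the field names are made up for illustration, and the patterns only cover the message layout shown in this post, so other drivers or releases may format things differently.

```python
import re

# Hypothetical field names mapped to patterns for the message layout shown above.
PATTERNS = {
    "error_level": re.compile(r"Error Level:\s*(\w+)"),
    "requested_block": re.compile(r"Requested Block:\s*(\d+)"),
    "error_block": re.compile(r"Error Block:\s*(\d+)"),
    "vendor": re.compile(r"Vendor:\s*(\S+)"),
    "serial": re.compile(r"Serial Number:\s*(\S+)"),
    "sense_key": re.compile(r"Sense Key:\s*(.+?)\s*$"),
    "asc": re.compile(r"ASC:\s*(0x[0-9a-fA-F]+)"),
    "ascq": re.compile(r"ASCQ:\s*(0x[0-9a-fA-F]+)"),
}


def parse_scsi_error(lines):
    """Collect the first occurrence of each known field from a group of log lines."""
    record = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and name not in record:
                record[name] = match.group(1)
    return record


# The lines quoted in this post, used as sample input.
SAMPLE = [
    "Error for Command: write(10) Error Level: Retryable",
    "scsi: [ID 107833 kern.notice] Requested Block: 3603180 Error Block: 36031808",
    "scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 0327A23Y8",
    "scsi: [ID 107833 kern.notice] Sense Key: Hardware Error",
    "scsi: [ID 107833 kern.notice] ASC: 0x19 (defect list error), ASCQ: 0x0, FRU: 0x2",
]

if __name__ == "__main__":
    for key, value in parse_scsi_error(SAMPLE).items():
        print(key, "=", value)
```

Grouping the results by serial number and sense key over a few weeks would at least show whether one drive is getting steadily worse or whether the errors are spread across several disks.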