imminent death of disk ?

Status
Not open for further replies.

hcclnoodles

IS-IT--Management
Jun 3, 2004
123
GB
Hi there

I have been receiving these messages regularly from one of our boxes and wanted to know whether this means the disk is on its way out. What I really want to know is what this message actually tells you, other than that there has been a kern.warning on this particular disk. How would I investigate this? Message below...


3 in 3:10:39: Sep 21 02:19:19 my.machine.com scsi: [ID 107833 kern.warning] WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):


any help on this would be greatly appreciated
Gary
 
With my limited knowledge...
iostat -En will tell you the type of errors, either hard, soft or transport and whether these are recoverable or not.
If this disk is part of a mirror then metastat can give you the state of the mirror.
These messages could be an indication of problems ahead but do not always cause failure.

Bear in mind that the output of iostat only clears after a reboot.
 
OK thanks, I do seem to have a problem on one of my 4 disks. Here is the iostat -En output for the problematic disk (it's part of a metadisk mirror):


c1t0d0 Soft Errors: 180 Hard Errors: 35 Transport Errors: 0
Vendor: SEAGATE Product: ST373307LSUN72G Revision: 0507 Serial No: 3HZ7B7RJ00007442
Size: 73.40GB <73400057856 bytes>
Media Error: 28 Device Not Ready: 0 No Device: 3 Recoverable: 180
Illegal Request: 0 Predictive Failure Analysis: 0
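
For anyone wanting to keep an eye on these counters over time, here is a rough sketch of pulling them out of iostat -En style text with Python. The sample is the output posted above; the function names (parse_iostat_en, worrying) and the rule "any hard or media error is worth a look" are my own assumptions for illustration, not anything Sun documents as a threshold.

```python
import re

# Sample copied from the post above. On a live box this text would come
# from running `iostat -En`.
IOSTAT_SAMPLE = """\
c1t0d0 Soft Errors: 180 Hard Errors: 35 Transport Errors: 0
Vendor: SEAGATE Product: ST373307LSUN72G Revision: 0507 Serial No: 3HZ7B7RJ00007442
Size: 73.40GB <73400057856 bytes>
Media Error: 28 Device Not Ready: 0 No Device: 3 Recoverable: 180
Illegal Request: 0 Predictive Failure Analysis: 0
"""

COUNTERS = (
    "Soft Errors", "Hard Errors", "Transport Errors",
    "Media Error", "Device Not Ready", "No Device",
    "Recoverable", "Illegal Request", "Predictive Failure Analysis",
)


def parse_iostat_en(text):
    """Return {device: {counter_name: int}} from iostat -En style text."""
    disks = {}
    current = None
    for line in text.splitlines():
        # A new device stanza starts with the device name, then "Soft Errors:".
        start = re.match(r"^(\S+)\s+Soft Errors:", line)
        if start:
            current = disks.setdefault(start.group(1), {})
        if current is None:
            continue
        for name in COUNTERS:
            found = re.search(re.escape(name) + r":\s*(\d+)", line)
            if found:
                current[name] = int(found.group(1))
    return disks


def worrying(disks):
    """Hypothetical rule: list devices with any hard or media errors."""
    return sorted(
        dev for dev, c in disks.items()
        if c.get("Hard Errors", 0) > 0 or c.get("Media Error", 0) > 0
    )


if __name__ == "__main__":
    parsed = parse_iostat_en(IOSTAT_SAMPLE)
    print(parsed["c1t0d0"])
    print("Worth a look:", worrying(parsed))
```

Since the counters only reset at reboot, saving the parsed numbers each day and comparing them shows whether the hard and media error counts are still climbing or are left over from an old event.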


metastat -i comes back with everything "state = okay", but I've been getting these messages since Tuesday night.

Do you reckon this is gonna die, or is the metastat -i result more important? Any advice would be great.
 
Hi, it is not unusual to see errors of this kind repeat themselves over certain periods. Check /var/adm/messages for the times these are happening to see if you can isolate a particular process that may be reading from or writing to this disk. A backup maybe?
Do a man on iostat. I believe you can set this up to take a snapshot of the I/O at a defined interval, say every 1 minute. The interval is given in seconds, so iostat -xtcn 5 (I think) will poll every 5 seconds, and iostat -xtcn 60 once a minute.
This may give you an idea of the disk activity.
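
To make the "what time do these happen" check a bit less manual, here is a small sketch that counts SCSI kern.warning lines per device and hour of day. The first sample line is the warning from the original post; the other two are made up purely for illustration, and the function name warnings_by_hour is hypothetical. On a real box you would feed it the lines of /var/adm/messages instead.

```python
import re
from collections import Counter

# Finds the syslog timestamp, then a scsi kern.warning naming a device such
# as (sd1). It searches anywhere in the line, so a prefix added by a log
# summariser does not get in the way.
WARN_RE = re.compile(
    r"[A-Z][a-z]{2}\s+\d{1,2}\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+"
    r"scsi:.*kern\.warning\].*\((?P<dev>sd\d+)\)"
)


def warnings_by_hour(lines):
    """Return a Counter keyed by (device, hour_of_day)."""
    counts = Counter()
    for line in lines:
        hit = WARN_RE.search(line)
        if hit:
            counts[(hit.group("dev"), int(hit.group("hour")))] += 1
    return counts


SAMPLE_LINES = [
    # From the original post:
    "Sep 21 02:19:19 my.machine.com scsi: [ID 107833 kern.warning] "
    "WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):",
    # Made up for illustration:
    "Sep 22 02:21:03 my.machine.com scsi: [ID 107833 kern.warning] "
    "WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):",
    "Sep 22 14:05:41 my.machine.com sshd[412]: [ID 800047 auth.info] "
    "Accepted password",
]

if __name__ == "__main__":
    for (dev, hour), n in sorted(warnings_by_hour(SAMPLE_LINES).items()):
        print(f"{dev} {hour:02d}:00  {n} warning(s)")
```

If the warnings pile up in the same hour every night, that points towards a scheduled job such as a backup hitting a bad area of the disk.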

I'm sorry I can't help much more. I am fairly new to UNIX troubleshooting myself, but am very used to seeing these errors, and only occasionally does the disk actually fail.
Besides this, if you have a good mirror then it is easily recoverable.

Alternatively, get out your SUN contract and get a new disk in.

HTH...even if it is only a little bit.
 
Sorry, just realised I didn't answer a part of your question.
The output of metastat is the greater truth.
Iostat only clears after a reboot, so some of the errors showing may be older than you believe.
As long as the metastat is showing OKAY and not NEEDS MAINTENANCE then you have good mirrors and consequently redundant data.
If this changes then you can run format as the root user and it will tell you if the disk has failed.
These are the tools I use when I have similar problems.
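
If you want to check the mirror state from a script rather than by eye, something along these lines might do. The sample text is my own approximation of what metastat prints for a mirror (the metadevice names d0, d10 and d20 are invented), so check it against the real output on your box before relying on it; submirror_states and needs_attention are hypothetical names.

```python
import re

# Illustrative only: an approximation of metastat output for one mirror.
METASTAT_SAMPLE = """\
d0: Mirror
    Submirror 0: d10
      State: Okay
    Submirror 1: d20
      State: Needs maintenance
"""


def submirror_states(text):
    """Return {submirror_name: state} by pairing each Submirror line
    with the first State line that follows it."""
    states = {}
    current = None
    for line in text.splitlines():
        sub = re.match(r"\s*Submirror \d+:\s*(\S+)", line)
        if sub:
            current = sub.group(1)
            continue
        state = re.match(r"\s*State:\s*(.+?)\s*$", line)
        if state and current is not None:
            states[current] = state.group(1)
            current = None
    return states


def needs_attention(states):
    """List submirrors whose state is anything other than Okay."""
    return sorted(name for name, s in states.items() if s.lower() != "okay")


if __name__ == "__main__":
    found = submirror_states(METASTAT_SAMPLE)
    print(found)
    print("Needs attention:", needs_attention(found))
```

Run from cron against the real metastat output, a check like this could mail you the moment a submirror leaves the Okay state, instead of waiting for someone to notice.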

Do a man on metadb (look for the -i switch) and search this forum or the internerd for state database replicas information.

Hopefully (for both of us) when the bigger UNIX brains on this forum sign in we may get a more learned / experienced answer.

Cheers
 