
HELP! default install of RH 7.3 kernel panics on restart!!!!!


shadedecho (Programmer), Oct 4, 2002
Summary of my system:
Dual Xeon processor system (2.2 GHz each) with 1024 MB RAM. The hard drive is actually an Adaptec hardware-based RAID 5 array of 4 SCSI disks. The array checks out fine, according to diagnostics.

I popped in the install CD for RH 7.3 and away it went. I first tried a custom install where I selected my own packages, but after this error happened I decided I'd rather try the standard "out-of-the-box" installation, so I chose "Server" as the type of install. I manually partitioned the drive (array) with Disk Druid just fine, setting up /, /usr, /tmp, /var, /home, and a swap partition. I set all the other options, and the installation completed just fine. Also, I chose GRUB as the boot loader and put it in the MBR, which appears to work fine.

On reboot after that install (choosing the default multi-processor boot selection, since it's an SMP system and Linux creates one entry for the multi-processor kernel and one for a single-processor boot), the following is reported, every time, even after reinstalling:

error: quote mismatch for Loading scsi_moD moDule
Inmod /lib/scsi_mod.o
(last line in /startup.rc missing \n -- skipping)

kmod: failed to exec /sbin/modprobe -s -k block-major-8, errno=2

VFS: Cannot open root devide "sda6" or 08:06
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 08:06
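
(For reference, the "root=" option the panic mentions lives in the GRUB entry; pressing "e" on the boot selection at the GRUB menu shows it, and for my layout the kernel line ought to read something like the following, with sda6 being my / partition. The exact image name is from memory:)

kernel /boot/vmlinuz-2.4.18-3smp ro root=/dev/sda6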

What I've done so far:
I rebooted to "linux rescue" from the CD, and it booted and mounted my drive just fine. As root, I went snooping, trying to find some of this stuff on the filesystem to see if I would have to fix things manually.

First, I tried to find the startup.rc referred to in the error messages; no such file exists that I could find. No luck on that front.

Next, I tried running "/sbin/modprobe *", which reported that it couldn't find "/lib/2.4.18-3BOOT/".

So I looked in lib, and there were both "/lib/2.4.18-3/" and "/lib/2.4.18-3smp/". I renamed "2.4.18-3" to "2.4.18-3BOOT", since that's what modprobe was looking for, and then tried to run modprobe again... this time it reported that it couldn't find "var". No luck down that rabbit trail.

So with that directory renamed, I tried a reboot and crossed my fingers. I got the same kernel panic; notably missing from the errors this time, however, was the whole set of messages about "Loading scsi_moD moDule" and "startup.rc". Improvement? I don't know.

Still, though, I have the "failed to exec /sbin/modprobe" error and the kernel panic about "VFS: Unable to mount root fs". I'm now thoroughly lost and have no clue how to make a simple install of RH actually boot!

PLEASE, can someone help me? Has anyone seen this behavior before?
 
Oops, one correction of note: my references to the path "/lib/2.4.18..." were not complete; it was actually "/lib/modules/2.4.18...". Sorry for any confusion.
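
(For anyone retracing my steps, the snooping above boiled down to roughly this from the rescue shell; the error text is paraphrased from memory:)

ls /lib/modules/                    # shows 2.4.18-3 and 2.4.18-3smp, but no 2.4.18-3BOOT
/sbin/modprobe *                    # complains it can't find /lib/modules/2.4.18-3BOOT/
mv /lib/modules/2.4.18-3 /lib/modules/2.4.18-3BOOT
/sbin/modprobe *                    # now complains about "var" instead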
 
Did you try installing with a separate set of hard disks? It may be that your drives are bad and Linux is hitting a snowball effect of errors.
 
First of all, the disks are brand new from the factory. Secondly, I have installed another OS on them just fine (Windows 2000) and then wiped it clean (including wiping out the partitions and such). I am almost completely positive the issue is not "bad hardware". I at first thought it might have been some driver problem between Linux and the SCSI RAID array, but clearly the driver RH was trying to use to communicate with it worked well enough to partition, format, and install to the disk(s). I think something else is the root cause, not this, but thank you for your suggestion.

BTW, the install CDs have been used before on other systems and worked just fine.
 
Hmmm... my experience with Adaptec RAID cards is limited, but the experience I've had with them hasn't been very good; that's why we moved to another vendor. From what I remember, I had to use an Adaptec-specific driver. It may be included with the boxed software, or I believe you can download it from their website.

I need some more detailed information about your install. Could you possibly e-mail me your install.log, which is located in /root? To do so, boot the system into linux rescue, mount a floppy, and copy the file to it, then e-mail it to me as an attachment. To mount the floppy in rescue mode:

mkdir /mnt/floppy
mount /dev/fd0 /mnt/floppy
cp /root/install.log /mnt/floppy
umount /mnt/floppy

 
An addendum to my last note: before you do the cp line, do a:

chroot /mnt/sysimage

If that doesn't work, skip the chroot and replace the cp line with:

cp /mnt/sysimage/root/install.log /mnt/floppy
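
Putting both notes together: if the chroot gives you trouble, the simplest route is probably to skip it entirely, since the floppy gets mounted in the rescue environment and the installed system is already visible under /mnt/sysimage:

mkdir /mnt/floppy
mount /dev/fd0 /mnt/floppy
cp /mnt/sysimage/root/install.log /mnt/floppy
umount /mnt/floppy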

 
Well, for some reason I'm unable to mount that floppy drive; I'll deal with that problem later. Besides, I looked at the install.log file, and there's nothing out of the ordinary, just about 400 lines of text listing each package being installed and its version. There were no errors reported in it at all.

Oh, and BTW, the SCSI adapter appears to be loading just fine: it is reported as dpt_i2o, and during both the "linux rescue" boot AND the normal (troubled) boot there are messages about that driver being loaded successfully.
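
(If anyone wants to double-check that on their own box, something along these lines from the rescue shell should show it; dpt_i2o is just the driver name my boot messages report:)

lsmod | grep dpt_i2o                # is the driver module currently loaded?
dmesg | grep -i dpt                 # what did the kernel say about it at boot?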

Here's my current line of thinking:
I've been researching modprobe (and its associated depmod). Since the errors refer to not being able to find a dependency file because of an incorrect path, I figured I would start there, and I found out why, by default, it points to the wrong module dependency file path.

What I found is this: my kernel name (as reported by uname -r) is "2.4.18-3BOOT", which got me wondering whether there was a connection between that and the path modprobe complains it can't find the dependency file in, "/lib/modules/2.4.18-3BOOT/". Sure enough, I found a webpage which states that the /etc/modules.conf file can be used to tailor the behavior of depmod and modprobe; in particular, it can be used to specify a path to the dependency file. If that path is NOT specifically stated, the default is to look in "/lib/modules/" + uname -r. So that mystery is solved.

So I decided to go into modules.conf and manually specify the path to my dependency file. I pointed it at the /lib/modules/2.4.18-3smp directory (since I am trying to boot the SMP selection from GRUB) using the "depfile=" directive described on that page.
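
(For the record, the line I added was along these lines; if I'm reading the docs right, the directive wants the full path to the modules.dep file, and without it the default works out to /lib/modules/`uname -r`/modules.dep:)

depfile=/lib/modules/2.4.18-3smp/modules.dep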

Also, BTW, the modules.conf file that I modified DOES reference my dpt_i2o driver as the SCSI device handler.

I rebooted, to no avail: same behavior. Which is curious to me, but it just means I'm not really on the right track, even if I am solving problems along the way. Any more thoughts, anyone??
 
During your install, did you hit F3, F4, and F5 to view your hardware status? I'm convinced that it is an incorrectly loaded module of some sort. Of course, like I said, my experience with the Adaptec card is limited, but the problems you are describing are the same problems I was having when I was using an Adaptec card. To fix them, I had to use a driver other than the Red Hat-provided one. Of course, your setup is different from mine, so I may be wrong. Have you done a fresh reinstall since your most recent posting?

Sorry I can't help out any more; I'd have to sit down and look at the box, but that obviously isn't going to happen.

good luck! :)
 

You've made the classic mistake, mate.

You *cannot* have drivers for essential boot-up devices as modules.
What happens is that the kernel wants to mount the root partition, which is on the RAID array. To do this it has to load the driver for the RAID adapter, and that driver lives on a disk behind the very same RAID adapter... Hmmmm... See??

You have to compile a kernel with built-in support for the RAID adapter.
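
A rough sketch of what that involves on a 2.4 source tree (the exact option name for the DPT/Adaptec I2O driver is from memory, so double-check it in the SCSI section of menuconfig, and the image name is just a placeholder):

make menuconfig                     # set SCSI support and the DPT I2O (dpt_i2o) driver to "y", not "m"
make dep bzImage modules modules_install
cp arch/i386/boot/bzImage /boot/vmlinuz-custom
# then add a GRUB entry in /boot/grub/grub.conf pointing at the new image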

Cheers
Henrik Morsing
Certified AIX 4.3 Systems Administration
& p690 Technical Support
 
OK, in response to Morsing's response:

1. I did not do anything except pop in the RH 7.3 CDs and try to do a standard install. Adaptec claims that the dpt_i2o SCSI disk driver bundled with 7.3 IS IN FACT the Adaptec driver to be used with my SCSI system. I didn't tell RH to load things in any particular order, or as modules as opposed to being built in, and I wasn't building a custom kernel; I just wanted an out-of-the-box install.

2. The hardware itself is taking care of the RAID part of things. I went into the SCSI BIOS setup utility, created my RAID 5 array, and configured all the SCSI IDs and all that jazz. This was before I even TOUCHED the system with an OS CD.

3. The RH installation program CLEARLY has NO trouble communicating with my hard drive(s) THROUGH the SCSI subsystem (therefore I think it must have the correct driver), because the RH installer asks you to create and format your partitions (/, /var, etc.), and to do so it is recognizing one big hard drive. That confirms to me that the hardware is doing the job of abstracting the 4 drives in the array into 1 big drive and presenting that to the OS.

4. The "rescue linux" kernel has NO trouble communicating with AND mounting and booting to that scsi drive array. I am able to go in and navigate around my fs (including ALL my partitions) and make changes to configuration files.


So, bottom line, I don't understand why such a popular OS as RH 7.3, on a standard SCSI RAID 5 array system, would install out-of-the-box (using defaults) in such a funky way that I couldn't even get a clean boot. Anyone have any thoughts on this? Is my ONLY option to learn how to compile my own custom kernel?

Thanks.
 
I have never even seen a RAID device before, but I did a few Googles, and the prevailing "well, try this then" advice is to go with "native" Linux software RAID instead of hardware RAID.
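
From what I turned up, software RAID on that era of Red Hat means describing the array in /etc/raidtab and creating it with mkraid, roughly like the sketch below. This is completely untested by me, the device names are placeholders, and it would also mean the controller exposing the four disks individually instead of one hardware array:

raiddev /dev/md0
    raid-level              5
    nr-raid-disks           4
    nr-spare-disks          0
    persistent-superblock   1
    chunk-size              64
    device                  /dev/sda1
    raid-disk               0
    device                  /dev/sdb1
    raid-disk               1
    device                  /dev/sdc1
    raid-disk               2
    device                  /dev/sdd1
    raid-disk               3

Then: mkraid /dev/md0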

Best I could come up with... --
JR
 

Shadedecho, you're missing the point. To address your comments:

1) You can't do it any other way, it's not your fault.

2) It's not a hardware problem.

3) Of course not; it has its own root file system and can load the correct driver.

4) Of course not; it has its own root file system and can load the correct driver.

5) You're completely mistaken!! And yes, unless you can choose a server install with the driver built in, the only way out is to build your own kernel.

I can tell that you don't really believe me, but you won't get anywhere before you've understood my point.
I didn't mean that you personally had done anything wrong, other than failing to see what the problem was.

Try reading my message again, slowly. There's no mystery, and it's not a funky way to do it (really), because 99.99% of people don't boot from a RAID adapter.

There's no way any OS in the world (even the fantastic MS ones) can load a driver for an adapter from a disk sitting behind that same adapter. It is simply not doable.

Cheers
Henrik Morsing
Certified AIX 4.3 Systems Administration
& p690 Technical Support
 