Where to find logs for nfs?

Status
Not open for further replies.

khalidaaa

Technical User
Jan 19, 2006
2,323
BH
Hi all,

I have an NFS mount of a directory on the mainframe. This NFS mount sometimes goes down for an unknown reason, so I thought of looking for the cause in the NFS logs, but I don't know if there is a log that records NFS events.

When I ran df, this is what I got:

Code:
mainframe:cdb.test.mf.data.aa 
df: /cdb/test_mf_data_aa: A file, file system or message queue is no longer available

So is there a way to find out why this happens? Is there a log to look at, or a way to stop it?

Thanks

Regards,
Khalid
 
I do not know of a log, but the primary reason why this occurs on our servers is an incorrect setup of the NFS file system, such that it is not set to mount on reboot. This can be verified in the smitty menu: smitty nfs will get you there.
 
Thanks costiles for your reply, but I've already remounted the NFS using smitty nfs.

What I really need to know is the reason it gets lost in the first place.
 
Look at the command generated by smitty nfs and replicate this in your mount.
 
Thanks KenCunningham, but I've already remounted the NFS and got past the problem. My question was: if this problem happens again, where should I look for logs and what should I do?

I don't want to remount this every time it goes down!
 
IMHO your problem lies on the NFS server side: the filesystem you are trying to access is no longer available. Speak with your "mainframe" colleagues. Perhaps they restarted TCPIP and/or NFS on their server and they need to re-export the filesystem. See if they can configure it so that the filesystem is auto-exported on starting TCPIP/NFS on the mainframe.


HTH,

p5wizard
 
p5wizard,

The problem occurred again :(

Last time, when I told the mainframe guy, he said that he didn't do anything with TCPIP.

Moreover, this particular NFS export works on the other servers, so why does it fail on this server only? Besides, I'm mounting two exports from the mainframe and only one of them drops, even though one is just a subdirectory of the other. The subdirectory disconnects but the parent doesn't.

see

Code:
mainframe:cdb.prod.mf.data   19531248    156248  100%   180000    90% /cdb/prod_mf_data
mainframe:cdb.prod.mf.data.aa 
df: /cdb/prod_mf_data_aa: A file, file system or message queue is no longer available.

Any help is appreciated.

Regards,
Khalid
 
Do the MF guys perhaps remove/recreate that MF file? If so, then they have to re-export it and you should not have to do anything on the AIX side.


HTH,

p5wizard
 
Anything in errpt? Surely if a filesystem goes missing, even an NFS mount, AIX should post something when the mount fails.
 
p5wizard,

This file is located on the mainframe (cdb.prod.mf.data); the other mount point is just a subdirectory inside this share (cdb.prod.mf.data.aa).

So if nothing has been done to cdb.prod.mf.data, nothing should have changed in cdb.prod.mf.data.aa.

That's what the MF guy told me and it kind of makes sense.

DukeSSD... no, nothing in the errpt :(

Regards,
Khalid
 
Set up a cron job that checks every minute via "df" or "ls" whether the mount is still accessible and writes the result together with a "date" to a logfile (or add a sleep and have it loop within the minute to get a finer resolution). You will then be able to determine when the mount fails, and you may notice that it always occurs at the same time(s). Then you might be able to associate it with something running on your system or on the MF.
Maybe add "nfsstat" to the little script triggered by cron to see if any errors/stats give a clue about what happens.
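
A minimal sketch of such a check script, in case it helps. The names MOUNTS, LOGFILE and check_mount, the log path and the log line format are only assumptions for illustration; the mount points are the ones from the df output earlier in the thread. It uses "ls" as the probe, which fits this case because df returns an error instead of hanging; on a hard mount that hangs, the probe itself could block.

```shell
#!/bin/sh
# Hypothetical defaults: adjust the mount points and log path to your system.
MOUNTS="${MOUNTS:-/cdb/prod_mf_data /cdb/prod_mf_data_aa}"
LOGFILE="${LOGFILE:-/tmp/nfs_mount_check.log}"

# Append one line per check: timestamp, mount point, OK or FAILED.
check_mount() {
    mp=$1
    if ls "$mp" >/dev/null 2>&1; then
        status=OK
    else
        status=FAILED
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') $mp $status" >> "$LOGFILE"

    # On failure, also record the NFS client statistics if nfsstat is present.
    if [ "$status" = FAILED ] && command -v nfsstat >/dev/null 2>&1; then
        nfsstat -c >> "$LOGFILE" 2>&1
    fi
}

for mp in $MOUNTS; do
    check_mount "$mp"
done
```

Saved as, say, /usr/local/bin/nfs_mount_check.sh (a made-up path), a crontab line such as "* * * * * /usr/local/bin/nfs_mount_check.sh" would run it every minute. The first FAILED line in the log then gives the time to compare against errpt and against whatever runs on the MF.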

laters
zaxxon
 
Oh yeah, that's a possible way of monitoring it. Thanks zaxxon, I will try that and let you know :)

Regards,
Khalid
 
Are you talking about a partitioned data set (i.e. aa is a member of cdb.prod.mf.data)? If not, then the two datasets on the MF, cdb.prod.mf.data and cdb.prod.mf.data.aa, don't have anything to do with each other, except that they are named alike.

Now I am by no means an OS/390 specialist, but I do know that MF datasets are not in a directory like in a UNIX filesystem (though you can also create an HFS, a hierarchical file system, on the MF). A dataset on the MF is allocated on a disk and is cataloged with a name (up to five 8-character strings separated by periods). A PDS (partitioned dataset) is a file containing different members (with 8-character names), so you can go down to six levels:
MF.DATA.SET.NAME.ONE(MEMBER01)

So my suggestion about a re-created or re-allocated dataset on the MF might still be a valid explanation. But by all means try zaxxon's advice to pinpoint when exactly the mount fails...


HTH,

p5wizard
 
Thanks p5wizard.

aa is just a subdirectory within cdb.prod.mf.data on the MF.

I'm not an MF admin either; that's what the MF admin told me, that aa is just a subdirectory under cdb.prod.mf.data.

Anyway, yesterday I lost both of the mounts! I checked errpt -a and found the following error message:

Code:
LABEL:          TS_LOC_DOWN_ST
IDENTIFIER:     173C787F

Date/Time:       Wed Jun 14 23:25:17 SAUST 2006
Sequence Number: 2216
Machine Id:      00CF359F4C00
Node Id:         s1cdbp
Class:           S
Type:            INFO
Resource Name:   topsvcs         

Description
Possible malfunction on local adapter

Probable Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

Failure Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

        Recommended Actions
        Verify adapter configuration
        Verify network connectivity

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.39.1.2,4822              
ERROR ID 
6zV5DL.h05Y2/Gol1K4U1/0...................
REFERENCE CODE
                                          
Adapter interface name
en0
Adapter offset
           0
Adapter IP address
10.1.1.150

I believe this was the reason behind losing the NFS mounts.

Any idea?

Regards
Khalid
 
Maybe you have the adapter configured for auto-negotiation while your LAN switches prefer fixed speeds, or something along those lines. I think there are several possibilities.
If you can't find any local problem, ask the LAN guys whether they can see something going wrong at the switch and the port your AIX box is attached to.
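
For the local side, a small sketch of how the adapter's speed setting could be checked on the AIX box. It assumes that the interface en0 from the errpt entry sits on adapter ent0 (the usual AIX naming, but verify it on your system); ADAPTER and show_media_speed are made-up names, and the uname guard is only there so the script does nothing on a non-AIX host.

```shell
#!/bin/sh
# Assumption: interface en0 from the errpt entry maps to adapter ent0.
ADAPTER="${ADAPTER:-ent0}"

show_media_speed() {
    adapter=$1
    # lsattr and entstat are used here as AIX commands; skip on other systems.
    if [ "$(uname -s)" != "AIX" ]; then
        echo "Not an AIX host; skipping $adapter"
        return 0
    fi
    # Configured setting (auto-negotiation or a fixed speed/duplex).
    lsattr -El "$adapter" -a media_speed
    # Media speed lines reported by the adapter statistics.
    entstat -d "$adapter" | grep -i "media speed"
}

show_media_speed "$ADAPTER"
```

If the configured value and what the switch port expects disagree, that is something concrete to take to the LAN guys.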

laters
zaxxon
 
Yes, both ents are configured as "Auto Negotiation" and I believe that was done for a reason. I don't remember whether it has something to do with HACMP or with our switch... I will check with the LAN guys tomorrow as you said, because no one is here right now.

Thanks Zaxxon again :)

Regards,
Khalid
 