We are in a hosting environment. Every day, an FTP server on the client side follows a predefined schedule to get files from a file server on the hosting side.
On the file server, the files are originally saved in the /data folder. After the remote FTP server gets them, they are moved to the /archive folder, and a timestamp is appended to each filename to show when the file was successfully processed by the FTP server.
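To make the archive step concrete, here is a minimal sketch of what we understand it to do. This is not the hosting side's actual script (we have never seen it); the function name, the timestamp format, and the dot separator before the timestamp are all assumptions for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_file(src: Path, archive_dir: Path, processed_at: datetime) -> Path:
    """Move src into archive_dir and append a timestamp to its filename.

    Hypothetical helper: the real naming scheme on the hosting side is unknown.
    """
    # Assumed timestamp layout: YYYYMMDDhhmmss
    stamp = processed_at.strftime("%Y%m%d%H%M%S")
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / f"{src.name}.{stamp}"
    # shutil.move renames in place when possible and copies across filesystems
    shutil.move(str(src), str(dest))
    return dest
```

A call such as `archive_file(Path("/data/report.dat"), Path("/archive"), datetime.now())` would leave /data without the file and /archive with a timestamped copy, which matches what we see on the hosting side.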
Everything was OK until the day after the time change, which was Nov 6th. Starting from Nov 7th, we began to see files like .nfsxxxx in the /data folder. They have the same size as the good data files, they accumulate on both the file server and the FTP server, and they consume a lot of space; we don't know whether there is any impact on the data files. We did our homework on those NFS files, but as far as we know, no process or userid is trying to open the files while they are being transferred by the FTP server. We also can't figure out why the problem started right after the time change.
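For anyone who wants to measure the same thing, here is a rough sketch of how the leftover files could be tallied. It assumes the leftovers can be recognized purely by the ".nfs" filename prefix; the function names are made up for this post.

```python
from pathlib import Path
from typing import List, Tuple


def find_nfs_leftovers(directory: Path) -> List[Tuple[str, int]]:
    """Return (name, size in bytes) for every regular file named .nfs* in directory."""
    leftovers = []
    for entry in directory.iterdir():
        # Assumption: a ".nfs" prefix is enough to identify the leftover files
        if entry.is_file() and entry.name.startswith(".nfs"):
            leftovers.append((entry.name, entry.stat().st_size))
    return sorted(leftovers)


def total_bytes(leftovers: List[Tuple[str, int]]) -> int:
    """Sum the sizes reported by find_nfs_leftovers."""
    return sum(size for _, size in leftovers)
```

Running `find_nfs_leftovers(Path("/data"))` once a day and comparing the results would at least show how fast the files accumulate and whether their sizes keep matching the data files.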
The second problem started yesterday. The client FTP server can no longer get data files. The log shows "No Such File". However, on the hosting side, the files were moved to /archive with a timestamp, as if they had been transferred successfully by the FTP server.
Below is what happened before the problems started; hopefully it gives some clues:
1. On the client side, on Nov 6th, the date configuration for the FTP transfer cycle was modified so that the transfer time stays the same after the time change. E.g., a file that was transferred at 6:00am before the time change is still transferred at 6:00am after it. We made the same config change twice before, in Nov 2010 and March 2011, with no problems.
2. On the hosting side, things are usually not transparent to us. We only know the file server runs Solaris. We requested that "lsof" be run against a .nfsxxxx file; the result pointed to kernel patch 144488-17, applied on Oct 16th.
Now no one admits it's their problem. My theory is that the kernel patch somehow doesn't work correctly after the time change, but I can't prove it. I reviewed the links below but can't tell what the potential problem could be.
Thanks in advance for any input!
Max