Unix File Status - when has an ftp to unix finished

Status
Not open for further replies.

UnixClueless

Technical User
Sep 13, 2001
10
AU
Hi All,

Currently we have quite large files being ftp'd to our Unix server. The server runs a process which checks whether the file exists and, if it does, processes it.

How can we be sure that the file transfer has completed and that the process is actually processing the completely copied file?

Is there any way of telling what the status of the file is?

Any assistance would be appreciated.
 
Could your ftp process be amended to create a flag file on the server once it is completed? Something along the lines of a file called 'finished' or similar would do. Your cron job could then check for the existence of this file and continue processing if it is present, but bail out if it isn't. It would probably be best to delete the 'finished' file at the beginning of the ftp process too, to prevent old files being processed if the ftp fails for some reason.

I'm sure someone had a more elegant solution than this recently, but haven't been able to find it in any of the forums. Perhaps someone else will remember.
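For what it's worth, the flag-file check described above could be sketched roughly like this on the receiving side. This is only an illustration: the function names and the `process` callback are made up for the example, and it assumes the sender uploads the flag file only after the data file has been fully sent.

```python
from pathlib import Path


def ready_to_process(data_file: Path, flag_file: Path) -> bool:
    """Return True only when both the data file and its flag file exist."""
    return data_file.is_file() and flag_file.is_file()


def process_if_ready(data_file: Path, flag_file: Path, process) -> bool:
    """Run process(data_file) once the sender has dropped the flag file.

    The flag is removed afterwards so the same upload is not handled twice.
    `process` is a hypothetical callback supplied by the caller.
    """
    if not ready_to_process(data_file, flag_file):
        return False  # transfer not finished (or never started): bail out
    process(data_file)
    flag_file.unlink()
    return True
```

A cron job would call `process_if_ready` on each run and simply do nothing until the flag appears.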
 
I would suggest watching for the FTP connection to close. Once the ftp has started you can check the connection with netstat -an. When the ftp session is closed (transfer complete), the state will change from ESTABLISHED to TIME_WAIT and then disappear from the netstat output altogether.
Another option is to "watch" the size of the file that is being transferred. Let's say the file is checked every 30 seconds. If the size stays the same for 2 minutes, you could presume that the transfer is done.
If the transfer is always from the same system, starting about the same time, I would go with the first suggestion.

crowe
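The size-watching idea above might look something like the sketch below. The function name and parameters are invented for the example; the defaults (poll every 30 seconds, four unchanged polls in a row) are just one way to get the "same size for 2 minutes" rule. Bear in mind it is a heuristic: a stalled transfer looks the same as a finished one, and `os.path.getsize` raises an error if the file does not exist yet.

```python
import os
import time


def wait_until_stable(path, interval=30.0, checks=4, sleep=time.sleep):
    """Block until the size of path is unchanged for `checks` consecutive polls.

    Returns the final size in bytes. `sleep` is injectable so the loop can be
    exercised without really waiting.
    """
    last_size = -1
    unchanged = 0
    while unchanged < checks:
        size = os.path.getsize(path)
        if size == last_size:
            unchanged += 1
        else:
            # Size changed (or first poll): start counting again.
            unchanged = 0
            last_size = size
        if unchanged < checks:
            sleep(interval)
    return last_size
```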
 
You could also use fuser in a loop. From the fuser man pages:

NAME
fuser - identify processes using a file or file structure

SYNOPSIS
/usr/sbin/fuser [ - [c | f ] ku ] files [ [ - [c | f ] ku ] files ] ...

DESCRIPTION
fuser displays the process IDs of the processes that are
using the files specified as arguments.

Each process ID is followed by a letter code. These letter
codes are interpreted as follows: if the process is using
the file as

c Indicates that the process is using the file as
its current directory.

m Indicates that the process is using a file mapped
with mmap(2). See mmap(2) for details.

o Indicates that the process is using the file as an
open file.

r Indicates that the process is using the file as
its root directory.

t Indicates that the process is using the file as
its text file.

y Indicates that the process is using the file as
its controlling terminal.

So that

<file>: <pid>o

would indicate that <file> is currently open by process <pid>, i.e. it is most likely still being written.

Cheers, NEIL
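A rough sketch of wrapping fuser in a check, in case it helps. It assumes `fuser` is on the PATH (on some systems it lives in /usr/sbin) and relies on fuser printing the PIDs on standard output and everything else on standard error, so check the man page on your platform. The function name and the `run` parameter are made up for the example, and fuser may not show other users' processes unless it is run with enough privilege.

```python
import subprocess


def file_in_use(path, run=subprocess.run):
    """Return True if fuser reports at least one process using path.

    `run` defaults to subprocess.run and is only a parameter so the check
    can be tried out without a real fuser binary.
    """
    result = run(["fuser", path], capture_output=True, text=True)
    # PIDs are expected on stdout; the file name and letter codes on stderr.
    return bool(result.stdout.strip())
```

Calling this in a loop with a short sleep, and processing the file once it returns False, gives the "fuser in a loop" approach.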
 
Hi Guys,

Thanks for all the feedback, it's been a great help. Will look into the options.

Cheers
 
I do it the lazy way :)


Say I want to create a file, called big_file.txt.

I don't. I create a file called big_file.tmp and then, in the ftp script when the transfer has finished, I rename it to big_file.txt.

That way you never get a partially written file, because it doesn't exist until it's finished.

Mike
michael.j.lacey@ntlworld.com
Email welcome if you're in a hurry or something -- but post in tek-tips as well please, and I will post my reply here as well.
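If the sending side happens to be scripted in Python, the upload-then-rename trick above could be sketched with the standard `ftplib` module as below. The function name is made up for the example, and `ftp` is assumed to be an already connected and logged-in `ftplib.FTP` object.

```python
from ftplib import FTP


def upload_then_rename(ftp: FTP, local_path, temp_name, final_name):
    """Upload under a temporary name, then rename once the transfer succeeds."""
    with open(local_path, "rb") as source:
        ftp.storbinary("STOR " + temp_name, source)
    # Only reached if storbinary did not raise, i.e. the upload went through.
    ftp.rename(temp_name, final_name)
```

For the example in the post this would be called as `upload_then_rename(ftp, "big_file.txt", "big_file.tmp", "big_file.txt")`, so the receiving process never sees big_file.txt until the whole file has arrived.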
 
Mike may be lazy, but his is probably the most elegant solution. We have used both his suggested method and the flag file (like big_file.finished) in production applications.

Checking the ftp connection is fun for academics and for expanding your Unix know-how, but in the real world, stick with the simple, reliable methods.

- Steve StevieW85@go.com
 
Lazy=Good :)

attributes of a good programmer (c) Larry Wall

1 - lazy, can't be bothered to do anything too complicated, or to do anything twice

2 - impatient, it's got to run just that *bit* faster...

3 - proud, you don't want someone looking at your code in a year and saying "what was he *thinking*?!"

Mike
 
I have had problems in the past where files have been incomplete.
If the file has a unique last line you could tail the file and check for that line, or if you know the line count you could check for that as well.
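Both checks could be sketched along these lines. The function names, the `trailer` value, and the 4096-byte tail window are assumptions for the example; the trailer check only works if the last line is shorter than the window.

```python
import os


def ends_with_trailer(path, trailer, tail_bytes=4096):
    """Return True if the last non-empty line of path equals trailer (bytes)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Read only the end of the file, like tail does.
        f.seek(max(0, size - tail_bytes))
        lines = f.read().splitlines()
    # Ignore blank lines left by a trailing newline.
    lines = [line for line in lines if line.strip()]
    return bool(lines) and lines[-1] == trailer


def has_line_count(path, expected):
    """Return True if path contains exactly `expected` lines."""
    with open(path, "rb") as f:
        return sum(1 for _ in f) == expected
```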

Ged Jones

Top man
 
MikeLacey and StevieW85 both have the simplest solution. However, you should be careful if the file you rename to is not on the same filesystem as the transient file. In that scenario, the rename is no longer an atomic event (it becomes a copy followed by a delete), and you could again face the problem of seeing a partial file.

Cheers, NEIL
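If the rename is done by a local script rather than by the ftp client, one way to guard against the cross-filesystem case is to compare device numbers first. This is a sketch only, with made-up function names. As far as I know, Python's `os.rename` will generally fail by itself across filesystems on Unix, whereas tools like mv fall back to copying, so the explicit check mainly makes the intent obvious.

```python
import os


def same_filesystem(temp_path, final_dir):
    """Return True if temp_path and final_dir live on the same filesystem."""
    return os.stat(temp_path).st_dev == os.stat(final_dir).st_dev


def publish(temp_path, final_path):
    """Rename temp_path to final_path, refusing to do so across filesystems."""
    final_dir = os.path.dirname(os.path.abspath(final_path))
    if not same_filesystem(temp_path, final_dir):
        raise OSError("temporary and final locations are on different filesystems")
    os.rename(temp_path, final_path)
```

The simplest way to stay safe is to upload the temporary file into the same directory as the final name.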
 