Korn file processing script

Status
Not open for further replies.

heprox

IS-IT--Management
Dec 16, 2002
178
US
I have a filesystem called "datafiles" that occasionally receives files named "upload.0002", "upload.0003", etc. I need to create a Korn shell script, run via cron, that does the following:

1. Check my "datafiles" directory for any files that have those names.
2. Ensure that any files found with that naming structure have completely uploaded.
3. Look at the extension that the file has (e.g. "0002", "0003", "0004") and capture it in a variable.
4. Rename the file to the "str####.asc" structure (i.e. upload.0002 would become str0002.asc).
5. Run a command called "prep" followed by the correct variable for that file, so "str0002.asc" would have "/prep 0002" run for it and "str0003.asc" would have "/prep 0003" run for it.
6. Wait 300 seconds and repeat, until the process is killed by cron.

Something like:

processFile ()
{
   local fileName=$1
   local fileExtension=$2
   # doStuffWithFile
}

# Main Processing

for fileName in uploadDirectory/upload.[0-9][0-9][0-9][0-9]
do
   fileExtension=${fileName#*.}
   fileTest=/tmp/$fileName.$(date +%d%m%y_%H%M%S)
   cp -f $fileName $fileTest
   sleep 60
   test "$fileName" != "$(find $fileName -newer $fileTest)" > /dev/null && processFile fileName fileExtension
done


...should check for the file and make sure it has finished uploading (thanks Damian...), but I'm puzzled about how to get the script to capture the extension and run commands against a file using that extension as a variable.
 
Where are you stuck? In the # doStuffWithFile part?
Re-name the file to the "str####.asc" structure (i.e. upload.0002 would become str0002.asc)
mv $fileName ${fileName%/*}/str$fileExtension.asc
Run a command called "prep" followed by the correct variable for that file, so "str0002.asc" would have "/prep 0002" run for it and "str0003.asc" would have "/prep 0003" run for it...
/prep $fileExtension

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884
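
For anyone following along, here is a minimal sketch of the two parameter expansions used above. The paths are example values only, not real files. One thing worth noting: "#*." removes the shortest prefix ending in a dot, so it gives the wrong answer if a directory name contains a dot, while "##*." removes the longest one.

```shell
# Sketch: derive the numeric suffix and the new name from an upload path.
# /datafiles/upload.0002 is an example value.
fileName=/datafiles/upload.0002

fileExtension=${fileName##*.}   # strip the longest prefix ending in "."
fileDir=${fileName%/*}          # strip the shortest suffix starting at "/"
fileNewName=${fileDir}/str${fileExtension}.asc

echo "$fileExtension"           # 0002
echo "$fileNewName"             # /datafiles/str0002.asc

# Why ## rather than #: a dot in the directory name confuses the short form.
dotted=/data.files/upload.0002
echo "${dotted#*.}"             # files/upload.0002
echo "${dotted##*.}"            # 0002
```

The scripts in this thread only use /datafiles, which has no dot in it, so "#*." happens to work there.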
 
Something like this ?

Code:
processFile ()
{
   local fileName=$1
   local fileExtension=$2
   local fileNewName="str${fileExtension}.asc"
   mv $fileName $fileNewName      # Add status check
   /prep $fileExtension
}

# Main Processing

while :
do
   for fileName in uploadDirectory/upload.[0-9][0-9][0-9][0-9]
   do
      fileExtension=${fileName#*.}
      fileTest=/tmp/$fileName.$(date +%d%m%y_%H%M%S)
      cp -f $fileName $fileTest
      sleep 60
      test "$fileName" != "$(find $fileName -newer $fileTest)" > /dev/null && processFile fileName fileExtension
   done
   sleep 300
done

Jean Pierre.
 
Why the sleep? It is not necessary. Do check the exit status after the cp, because it will fail if the file is locked or busy, or if there is no space left.

You could run "cp -f $fn $ft &" as a child process and then wait until it has completed.
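
A rough sketch of that suggestion, assuming nothing beyond standard cp and wait. The function name copy_checked is made up for illustration.

```shell
# Sketch: run cp in the background, then collect its exit status with wait.
# copy_checked is a hypothetical helper name.
copy_checked() {
    # $1 = source file, $2 = destination
    cp -f "$1" "$2" &
    cp_pid=$!
    # wait returns the exit status of the background job it waited for
    if wait "$cp_pid"; then
        return 0
    fi
    echo "cp of $1 failed" >&2
    return 1
}
```

Called as "copy_checked $fn $ft || continue" inside the loop, this would skip a file whose copy failed instead of going on to compare against a snapshot that does not exist.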
 
After placing the following in /usr/bin and running it from cron, I'm getting these errors:

cp: /tmp//datafiles/upload.0100.090304_100854: A file or directory in the path name does not exist.
find: 0652-015 Cannot access file /tmp//datafiles/upload.[0-9][0-9][0-9][0-9].090304_100754.
mv: 0653-401 Cannot rename fileName to strfileExtension.asc: A file or directory in the path name does not exist.

...here is the modified script:

#!/bin/ksh

processFile ()
{
   local fileName=$1
   local fileExtension=$2
   local fileNewName="str${fileExtension}.asc"
   mv $fileName $fileNewName # Add status check
   cd /datafiles
   prep /datafiles/$fileNewName $fileExtension
}

# Main Processing

while :
do
   for fileName in /datafiles/upload.[0-9][0-9][0-9][0-9]
   do
      fileExtension=${fileName#*.}
      fileTest=/tmp/$fileName.$(date +%d%m%y_%H%M%S)
      cp -f $fileName $fileTest
      sleep 30
      test "$fileName" != "$(find $fileName -newer $fileTest)" > /dev/null && processFile fileName fileExtension
   done
   sleep 300
done


...it looks like the path to the filename is interfering with the test process for a "new" file?
 
heprox, take a look at these man pages:
man cd
man dirname
man basename
man test (option -f)
man ksh (difference between a variable's name and its value)

Hope This Help, PH.
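
To illustrate two of those hints, here is a small sketch. show_args is a made-up stand-in for processFile that only reports what it was given. The point is that arguments need the "$" to pass a value rather than a name, and that "test -f" skips the literal pattern the shell leaves behind when no file matches.

```shell
# Sketch: pass variable values (with $), not names, and skip an unmatched glob.
show_args() {
    # Hypothetical stand-in for processFile: reports the arguments it received.
    echo "name=$1 ext=$2"
}

list_uploads() {
    # $1 = directory to scan
    for f in "$1"/upload.[0-9][0-9][0-9][0-9]
    do
        # With no matching files, $f is the literal pattern; -f rejects it.
        [ -f "$f" ] || continue
        show_args "$f" "${f##*.}"
    done
}
```

Calling "show_args f ext" without the dollar signs would print "name=f ext=ext", which is the same kind of mistake behind the "Cannot rename fileName to strfileExtension.asc" message.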
 
heprox, run your script from the shell before running it from cron.

If you have problems, run the script with the -x option:
ksh -x your_script

To debug the processFile function, insert the following statement at the beginning of the function:
set -x

Try this modification:
fileTest=/tmp/$(basename $fileName).$(date +%d%m%y_%H%M%S)

cdlvj, in most cases you can copy a file while it is in use by another process.
Try this:
> grep 'TEST' > file_to_copy
^Z     (suspends the process; the file stays open)
> cp file_to_copy the_copy   # success
>

Jean Pierre.
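
A quick sketch of why the basename change matters: with an absolute source path, "/tmp/$fileName" becomes "/tmp//datafiles/...", a directory that does not exist. snapshot_path is a made-up helper name, and the timestamp is just the one from the error message above.

```shell
# Sketch: build the snapshot path from the base name only.
snapshot_path() {
    # $1 = source file, $2 = timestamp string
    echo "/tmp/$(basename "$1").$2"
}

snapshot_path /datafiles/upload.0100 090304_100854
# prints /tmp/upload.0100.090304_100854
```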
 
Got it working with the following:

#!/bin/ksh

processFile ()
{
   local fileName=$1
   local fileExtension=$2
   local fileNewName="/datafiles/str${fileExtension}.asc"
   mv $fileName $fileNewName # Add status check
   cd /datafiles
   chmod 667 $fileNewName
   prep $fileNewName $fileExtension
}

# Main Processing

while :
do
   for fileName in /datafiles/upload.[0-9][0-9][0-9][0-9]
   do
      fileExtension=${fileName#*.}
      fileTest=/tmp/$(basename $fileName).$(date +%d%m%y_%H%M%S)
      cp -f $fileName $fileTest
      sleep 10
      test "$fileName" != "$(find $fileName -newer $fileTest)" > /dev/null && processFile $fileName $fileExtension
   done
   sleep 60
done

...but it still has a couple of problems:

1. When I leave the "sleep 10" command inside the loop, it sometimes processes new files incorrectly. If someone is placing a new "upload.####" file in the /datafiles directory during that 10-second pause, the script creates a /tmp/upload.[0-9][0-9][0-9][0-9]."date"."time" file from the partially uploaded file.

2. If I remove the "sleep 10" command, it no longer creates the /tmp/upload.[0-9][0-9][0-9][0-9]."date"."time" files, but it ends up processing some files that are only partially uploaded to the directory.

...the script needs to verify that a file is no longer uploading and then process it accordingly, and it has to do this while other "upload.####" files are being created at the same time.


 
man fuser

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
The function 'is_file_arrived' verifies that a file is not in use by another process and that its size is stable.

Code:
#
# Function : is_file_arrived file
# Arg(s)   : file = file to verify
# Output   : None
# Status   : 0 = yes file arrived, 1 = no
# Env.     : IFA_WAIT : interval (secs) for file size check (def=5)
#
is_file_arrived() {
   [ -z "$1" ] && return 1
   local file=$1
   local arrived=1
   local size1 size2
   if [ -f "$file" -a -z "$(fuser $file 2> /dev/null)" ] ; then
      size1=$(ls -l $file 2>/dev/null | awk '{print $5}')
      sleep ${IFA_WAIT:-5}
      size2=$(ls -l $file 2>/dev/null | awk '{print $5}')
      [ ${size1:-1} -eq ${size2:-2} ] && arrived=0
   fi
   return $arrived
}

processFile ()
{
   local fileName=$1
   local fileExtension=$2
   local fileNewName="str${fileExtension}.asc"
   mv $fileName $fileNewName      # Add status check
   /prep $fileExtension
}

# Main Processing

while :
do
   for fileName in uploadDirectory/upload.[0-9][0-9][0-9][0-9]
   do
      fileExtension=${fileName#*.}
      is_file_arrived "$fileName" && processFile fileName fileExtension
   done
   sleep 300
done

Jean Pierre.
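
For systems where fuser is missing or behaves differently, the size comparison on its own can be sketched as below. This is only the second half of the check above, not a replacement for it. size_is_stable is a made-up name, and the sketch avoids "local" because not every ksh provides it.

```shell
# Sketch: treat a file as "arrived" once its byte count stops changing.
# The fuser test from the function above is deliberately left out.
size_is_stable() {
    # $1 = file, $2 = seconds between the two measurements (default 5)
    [ -f "$1" ] || return 1
    s1=$(wc -c < "$1") || return 1
    sleep "${2:-5}"
    s2=$(wc -c < "$1") || return 1
    [ "$s1" -eq "$s2" ]
}
```

A slow or stalled upload can still look stable for a few seconds, so the wait interval is a trade-off between latency and safety.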
 
I just tried this on AIX. I FTP'd a large file and tried to do a cp. It reported "cannot open a file containing a running program" and returned a status of 1.

So you need to run a test and see if your UNIX works the same way.

Then you can add a while loop that retries the cp, with a sleep between attempts, until the transfer is done and the file is released.
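
Such a retry loop might look like the sketch below. It assumes cp really does fail while the upload holds the file, which, as noted, depends on the UNIX in question. copy_with_retry and its arguments are made up for illustration.

```shell
# Sketch: retry cp until it succeeds or a retry limit is reached.
copy_with_retry() {
    # $1 = source, $2 = destination
    # $3 = maximum attempts (default 5), $4 = seconds between attempts (default 10)
    attempt=1
    while :
    do
        cp -f "$1" "$2" 2>/dev/null && return 0
        if [ "$attempt" -ge "${3:-5}" ]; then
            return 1            # give up after the last allowed attempt
        fi
        attempt=$((attempt + 1))
        sleep "${4:-10}"
    done
}
```

The attempt limit keeps a file that never becomes readable from blocking the rest of the loop.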

 
The function worked like a charm, thanks for your help. Now all I have to do is get the environment set correctly for CRONTAB to run this and call the "prep" script by name. Thanks again.
 
heprox, for the benefit of the board, what fixed the problem?
 
I apologize, I'm still wrestling with the crontab environment issue. The final script is:

#!/bin/ksh


#
# Function : is_file_arrived file
# Arg(s)   : file = file to verify
# Output   : None
# Status   : 0 = yes file arrived, 1 = no
# Env.     : IFA_WAIT : interval (secs) for file size check (def=15)
#

is_file_arrived() {
   [ -z "$1" ] && return 1
   local file=$1
   local arrived=1
   local size1 size2
   if [ -f "$file" -a -z "$(fuser $file 2> /dev/null)" ] ; then
      size1=$(ls -l $file 2>/dev/null | awk '{print $5}')
      sleep ${IFA_WAIT:-15}
      size2=$(ls -l $file 2>/dev/null | awk '{print $5}')
      [ ${size1:-1} -eq ${size2:-2} ] && arrived=0
   fi
   return $arrived
}


processFile ()
{
   local fileName=$1
   local fileExtension=$2
   local fileNewName="/datafiles/str${fileExtension}.asc"
   mv $fileName $fileNewName # Add status check
   chmod 666 $fileNewName
   prep $fileNewName $fileExtension
}

# Main Processing

while :
do
   for fileName in /datafiles/upload.[0-9][0-9][0-9][0-9]
   do
      fileExtension=${fileName#*.}
      is_file_arrived "$fileName" && processFile $fileName $fileExtension
   done
   sleep 60
done

...I had to put the "$" in front of the arguments to the processFile function and modify the permissions on the "upload" files with chmod to get "prep" to run correctly. The only problem I'm having now is that "prep" only runs correctly when the script is run with my environment, not crontab's.
 
The only problem I'm having now is it will only run "prep" correctly when the script is run with my environment, not crontab's?
Assign and export the needed environment variables at the top of your script.

Hope This Help, PH.
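
As a sketch of what that could look like: cron typically starts jobs with a minimal environment and does not read your login profile, so the script has to set what "prep" needs itself. The directory /usr/local/prep/bin is purely a placeholder, since the thread never says where "prep" lives or which variables it needs.

```shell
# Sketch: set and export the environment at the top of the script.
# PREP_DIR is a hypothetical location for the prep command.
PREP_DIR=/usr/local/prep/bin
PATH=$PATH:$PREP_DIR
export PATH

# Child processes started from here on inherit the exported PATH.
sh -c 'echo "$PATH"'
```

Comparing the output of "env" from a login shell with the output of "env" run from a cron job is one way to find out which variables are missing.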
 