
Korn shell script question 2

Status
Not open for further replies.

rbeech23

Programmer
Dec 22, 2002
14
GB
Hello, I've written a script in Korn shell which does the following:

I’ve got a process that’s pulling data from a database via middleware but missing the odd record. The only apparent way of checking is via a log file in which the record number increments by one for every new record, so something like:
4010203040
4010203041
4010203042
4010203043
4010203044
4010203045
4010203046
4010203047
4010203048
4010203049

When I get missing records the log file looks something like:
4010203040
4010203041
4010203043
4010203044
4010203047
4010203048
4010203049

Looking at it manually, it is clear which records are missing, but the log file can contain hundreds of thousands of records. What I’ve been doing is splitting the file into chunks of 10000 records and doing a simple sum to check whether each chunk holds the expected number of records, based on last record minus first record. This works okay and is fast, but the bit that identifies which records are missing in each file is taking ages to run.
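
The count check described above can be sketched roughly as below. This is only an illustration, assuming the file is sorted, holds one record number per line and has no duplicates; `count_missing` is a made-up helper name, not part of the original script.

```shell
# count_missing FILE
# Hypothetical helper: prints how many record numbers are absent from FILE.
# Assumes FILE is sorted ascending, one number per line, no duplicates.
count_missing() {
    first=$(head -n 1 "$1")
    last=$(tail -n 1 "$1")
    actual=$(wc -l < "$1")
    # With nothing missing, the file would hold last - first + 1 records.
    awk -v f="$first" -v l="$last" -v a="$actual" \
        'BEGIN { printf "%.0f\n", l - f + 1 - a }'
}
```

A result of 0 means the chunk is complete; anything else is the number of gaps still to be located.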

Could anyone provide a clue as to the fastest utility to trawl through a file and highlight the missing records when each number in the sequence is supposed to increment by 1?

Cheers
Rob
 
I am assuming the values will be in sorted order. If not, sort the file in ascending order first. Let's call the sorted file mylogfile.txt.

Then you can write a sequence generator as given below and generate a file of sequences and call it seqs.txt.

Code:
#!/bin/ksh
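# Print every integer from START to FINISH, one per line.
# printf "%.0f" keeps large values from being printed in exponent notation by some awks.
awk -v START="$1" -v FINISH="$2" 'BEGIN {
    for (i = START; i <= FINISH; i++) printf "%.0f\n", i
}'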
If you called the above script seqgen.ksh you could invoke it as

Code:
ksh seqgen.ksh 4010203040 4010203049 > seqs.txt

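Then do a grep as below (-F treats each line of mylogfile.txt as a fixed string rather than a regular expression, and -x matches whole lines only, so a short number cannot match inside a longer one):
Code:
grep -vxFf mylogfile.txt seqs.txt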

The above will print out sequence numbers that are missing in mylogfile.txt
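
As a possible alternative to the grep step, the same comparison can be sketched with comm, which walks two sorted files in step instead of loading a pattern list. This is an illustration under assumptions: the log is sorted, every record number has the same number of digits (so numeric and lexical order agree), and `missing_by_comm` is a hypothetical function name.

```shell
# missing_by_comm LOGFILE START FINISH
# Hypothetical helper: prints numbers in START..FINISH that are not in LOGFILE.
# Assumes LOGFILE is sorted and all numbers have the same digit count.
missing_by_comm() {
    # Generate the full sequence on stdin, then keep only lines unique to it:
    # -1 drops lines found only in LOGFILE, -3 drops lines found in both.
    awk -v s="$2" -v e="$3" \
        'BEGIN { for (i = s; i <= e; i++) printf "%.0f\n", i }' |
        LC_ALL=C comm -13 "$1" -
}
```

It could be called as `missing_by_comm mylogfile.txt 4010203040 4010203049`, which also avoids writing seqs.txt to disk.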

I hope this helps you. Some of the gurus may come up with more elegant solutions.
 
A starting point:
Code:
awk 'NR==1{o=$1;next}$1!=o+1{for(i=o+1;i<$1;++i)printf "%9.0f\n",i}{o=$1}' /path/to/logfile >missings
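
For readers who find the one-liner dense, here is the same idea spread out with comments. It is a sketch rather than a drop-in replacement: `find_gaps` and `prev` are names chosen for illustration, the input is assumed to be sorted ascending, and the output is printed without the width padding used above.

```shell
# find_gaps LOGFILE
# Hypothetical wrapper: prints every number missing between consecutive lines.
# Assumes LOGFILE is sorted ascending with one number per line.
find_gaps() {
    awk '
        # Remember the first record; there is nothing to compare it with yet.
        NR == 1 { prev = $1; next }
        # A jump of more than 1 means a gap: print each number inside it.
        $1 > prev + 1 {
            for (i = prev + 1; i < $1; i++) printf "%.0f\n", i
        }
        # The current record becomes the reference for the next line.
        { prev = $1 }
    ' "$1"
}
```

Because it reads the log once and keeps only the previous value in memory, there should be no need to split the file into chunks first.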

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Great replies, my problem is solved. Thanks very much
 