help - A better way?

fidge · Nov 21, 2006

Hello

I have the following script which, although it works, runs very slowly. Can anybody help with a better/faster approach?
I'm trying to find the concurrent rate of transactions.

Thanks.

input file
==========
start date/time , end date/time
0610060000580,0610060001000
0610060000580,0610060001000
0610060000590,0610060001000
0610060001020,0610060001040
0610060001030,0610060001040
0610060001040,0610060001060
0610060001040,0610060001060

Script
=======
cc=1
oldstart=o
for REC in $(cat $1) ; do
start=$(echo ${REC} | cut -d"," -f1)
end=$(echo ${REC} | cut -d"," -f2)

if [ $start != $oldstart ]
then
cc=1
fi

if [ $start = $end -o $start = $oldstart ]
then
let cc=cc+1
fi
echo "Concurrent rate at $start = $cc" >> $1.res
oldstart=$(echo ${start})
done

fidge · Nov 21, 2006

Sorry, wrong forum. Moved to Unix scripting.

Annihilannic · Nov 21, 2006

I would use awk for this:

Code:

awk -F, '
        BEGIN { cc=1 }
        $1 != oldstart { cc=1 }
        $1 == $2 || $1 == oldstart { cc=cc+1 }
        { print "Concurrent rate at " $1 " = " cc ;  oldstart=$1 }
' inputfile

Annihilannic.

fidge · Nov 21, 2006

Thanks, that worked in under 30 seconds for 90k records.

olded · Nov 21, 2006

It might be an interesting technical exercise to change the script and let the shell do the parsing, rather than running echo and cut for every record:

Code:

#!/bin/ksh

cc=1
oldstart=o
while IFS="," read start end
do
        if      [ $start != $oldstart ]
        then
                cc=1
        fi

        if      [ $start = $end -o $start = $oldstart ]
        then
                let cc=cc+1
        fi
        echo "Concurrent rate at $start = $cc"  >> $1.res
        oldstart=$(echo ${start})
done < "$1"

Annihilannic · Nov 21, 2006

You might want to change this:

oldstart=$(echo ${start})

to this:

oldstart=${start}

I would also quote all of the variables in the tests to make it 'safer', e.g.

[ "$start" = "$end" -o "$start" = "$oldstart" ]

Annihilannic.
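
For reference, a minimal sketch of olded's loop with both suggestions applied (direct assignment and quoted tests). The function name concurrent_rate is hypothetical and not from the thread. Two further changes are my own assumptions rather than anything posted above: the counter uses $(( )) arithmetic instead of let, and the results go to standard output so the caller can redirect once instead of appending on every record.

```shell
# Sketch only: olded's loop with quoted tests and a direct assignment.
# concurrent_rate is a hypothetical function name, not from the thread.
concurrent_rate() {
        cc=1
        oldstart=o
        # IFS="," applies to this read only; -r keeps backslashes literal.
        while IFS="," read -r start end
        do
                # A new start time resets the counter.
                if      [ "$start" != "$oldstart" ]
                then
                        cc=1
                fi

                # Same start as the previous record (or start equals end): count it.
                if      [ "$start" = "$end" -o "$start" = "$oldstart" ]
                then
                        cc=$((cc + 1))
                fi
                echo "Concurrent rate at $start = $cc"
                oldstart=$start
        done < "$1"
}

# Usage (inputfile is a placeholder name):
# concurrent_rate inputfile > inputfile.res
```

This still runs the comparison logic line by line in the shell, so it should not be expected to match awk on large files.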

fidge · Nov 22, 2006

Thanks olded and Annihilannic, but I'm going to stick with the awk option. I need to run this against 30 files, each containing between 80k and 120k records, and awk is so much faster.
Thanks again.
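
One way to run the awk version over a batch of files might look like the sketch below. The function name rate_all is hypothetical; the .res suffix follows the original script, and the file names in the usage line are placeholders. Starting a fresh awk process per file means cc and oldstart are reset for each file, so counts do not carry over from one file to the next.

```shell
# Sketch only: apply Annihilannic's awk program to every file given.
# rate_all is a hypothetical name, not from the thread.
rate_all() {
        for f in "$@"
        do
                # One awk process per file; output goes to <file>.res.
                awk -F, '
                        BEGIN { cc=1 }
                        $1 != oldstart { cc=1 }
                        $1 == $2 || $1 == oldstart { cc=cc+1 }
                        { print "Concurrent rate at " $1 " = " cc ;  oldstart=$1 }
                ' "$f" > "$f.res"
        done
}

# Usage (placeholder file names):
# rate_all file1 file2 file3
```

Unlike the original script, this overwrites each .res file rather than appending to it, so rerunning it does not duplicate results.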
