Hello
I write AIX Korn shell scripts with awk (no nawk/gawk) and am trying to further my understanding of awk arrays, as awk is so good at data manipulation. How might I deal with the problem below using awk?
I have two files, proposed and actual.
The files are timings with associated results for a group of students.
How do I compare the two files?
The actual file has complete time data and complete results.
The proposed file may have gaps in the timing data.
I need to look at the timing (fields 2, 3 and 4) for each student in the actual file
and:
if the timing fields are in the proposed file, I need to copy the proposed line out.
if the timing fields are not in the proposed file, I need to copy the actual line out.
STUDENT|JOHN|
GRADES|2002-04-11 00:00|2002-04-11 00:30|3|100|100|100|10.2|
GRADES|2002-04-11 00:00|2002-04-11 00:30|2|15|15|18.23|10.23|
GRADES|2002-04-11 00:00|2002-04-11 00:30|1|144|144|18.22|10.22|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-1|-1|-1|18.21|8.21|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-2|-55|-55|18.20|7.85|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-3|-500|-500|18.19|7.85|
GRADES|2002-04-11 00:30|2002-04-11 01:00|3|475|475|31.70|10.32|
GRADES|2002-04-11 00:30|2002-04-11 01:00|2|15|15|18.31|10.31|
GRADES|2002-04-11 00:30|2002-04-11 01:00|1|73|73|18.30|10.30|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-1|-1|-1|18.29|8.29|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-2|-127|-127|18.28|7.93|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-3|-500|-500|18.27|7.93|
STUDENT|MARY|
GRADES|2002-04-11 00:00|2002-04-11 00:30|3|485|485|31.66|10.49|
GRADES|2002-04-11 00:00|2002-04-11 00:30|2|1|1|18.48|10.48|
GRADES|2002-04-11 00:00|2002-04-11 00:30|1|1|1|18.47|10.47|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-1|-1|-1|18.46|8.46|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-2|-1|-1|18.45|7.96|
GRADES|2002-04-11 00:00|2002-04-11 00:30|-3|-500|-500|18.44|7.46|
GRADES|2002-04-11 00:30|2002-04-11 01:00|3|485|485|31.66|10.57|
GRADES|2002-04-11 00:30|2002-04-11 01:00|2|1|1|18.56|10.56|
GRADES|2002-04-11 00:30|2002-04-11 01:00|1|1|1|18.55|10.55|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-1|-1|-1|18.54|8.54|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-2|-1|-1|18.53|8.04|
GRADES|2002-04-11 00:30|2002-04-11 01:00|-3|-500|-500|18.52|7.54|
I attempted the code below, but all it does is print out the actual data, not the proposed. Thanks in advance.
Ash
Code:
awk ' BEGIN { FS = "|" ; OFS = "|" ; RS = "\n"} {
    if ( $1 ~ /^STUDENT/ ) {
        newstudent = $2
    }
    if ( $1 ~ /^GRADES/ ) {
        a = split ($0, myarr, "|" )
        s = myarr[1]"|"myarr[2]"|"myarr[3]"|"myarr[4]"|"
    }
    while ((getline line < "actual") > 0) {
        b = split (line, newline, "|" )
        if (newline[0] = STUDENT ) {
            origstudent = newline[2]
        }
        if ( myarr[0] ~ /^GRADES/ ) {
            t = newline[1]"|"newline[2]"|"newline[3]"|"newline[4]"|"
        }
        if ( ( s == t ) && ( origstudent == newstudent ) ) {
            printf ("%s\n" , myarr[a])
            count = 1
        }
        else
            count = 1
        for (item in newline) {
            count++
        }
        count = count - 2
        for ( i = 1;i<count;i++ ) {
            printf ("%s|" , newline[i] )
        }
    }
}' proposed
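For comparison, here is a minimal sketch of one way the requirement could be met with an awk array instead of a getline loop. It is only an illustration under a few assumptions: the awk in use accepts two input files, FNR, and the `(i, j) in array` test (POSIX awk does); a timing matches only when the student name and fields 2, 3 and 4 are all equal; and the small proposed and actual files written by the script are hypothetical samples, with made-up values in the proposed line. One likely reason the attempt above prints only actual data is that the getline loop reads the whole of actual while processing the first proposed record and the file is never closed, so later records have nothing left to read.

```shell
#!/bin/sh
# Work in a scratch directory so the demo files do not clobber real ones.
workdir="${TMPDIR:-/tmp}/awkdemo.$$"
mkdir -p "$workdir" && cd "$workdir" || exit 1

# Hypothetical proposed file: only one timing for JOHN, nothing for MARY.
cat > proposed <<'EOF'
STUDENT|JOHN|
GRADES|2002-04-11 00:00|2002-04-11 00:30|3|111|111|99.99|9.99|
EOF

# Hypothetical actual file: complete timings for both students.
cat > actual <<'EOF'
STUDENT|JOHN|
GRADES|2002-04-11 00:00|2002-04-11 00:30|3|100|100|100|10.2|
GRADES|2002-04-11 00:00|2002-04-11 00:30|2|15|15|18.23|10.23|
STUDENT|MARY|
GRADES|2002-04-11 00:00|2002-04-11 00:30|3|485|485|31.66|10.49|
EOF

result=$(awk -F'|' '
    # Track the current student in whichever file is being read.
    $1 == "STUDENT" { student = $2 }

    # First file (proposed): remember each GRADES line under
    # student + fields 2, 3 and 4, then skip to the next record.
    # Caveat: NR == FNR is also true for the second file if proposed is empty.
    NR == FNR {
        if ($1 == "GRADES") prop[student, $2, $3, $4] = $0
        next
    }

    # Second file (actual): print the proposed line when the key exists.
    $1 == "GRADES" && ((student, $2, $3, $4) in prop) {
        print prop[student, $2, $3, $4]
        next
    }

    # Otherwise print the actual line unchanged (STUDENT lines included).
    { print }
' proposed actual)

printf '%s\n' "$result"
```

With these sample files, JOHN's first GRADES line comes from proposed, while his second line and all of MARY's lines come from actual, because proposed has no matching timing for them.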