awkerfeller
IS-IT--Management
I'm having an issue parsing some data. Here is what my data looks like:
[pre]Jul 2014: data disk -delim :
0:Sample_0:0:mapsnline:0:Size 40GB15k:20.00GB:segment:3:location::A000000000000030:1:1:empty:1:no:0
1:Sample_1:0:mapsnline:0:Size 40GB15k:20.00GB:segment:4:location::A000000000000031:1:1:empty:1:no:0
2:Sample_2:0:mapsnline:0:Size 40GB15k:20.00GB:segment:5:location::A000000000000032:1:1:empty:1:no:0
Jul 2014: data network -delim :
0:Sample_3:0:mapsnline:0:Size 60GB15k:10.00GB:segment:3:location::A000000000000030:1:1:empty:1:no:0
1:Sample_4:0:mapsnline:0:Size 60GB15k:10.00GB:segment:4:location::A000000000000031:1:1:empty:1:no:0
2:Sample_5:0:mapsnline:0:Size 60GB15k:10.00GB:segment:5:location::A000000000000032:1:1:empty:1:no:0[/pre]
The MOST important piece to note is "data disk". I want to go through a file (usually about 6k lines long) and say: "if you see this section header (data disk), read that section from beginning to end and give me the sum of field 8". For example:
[pre]awk 'BEGIN { 2014 = ""} { if ($8 == "[0-9]GB") size = sum += $8"GB"; else … blah blah blah [0-9]MB}'[/pre]
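One way to express "only sum inside the data disk section" is a flag that is switched on or off at each header line. The sketch below assumes a header is any line containing `-delim`, and that the size sits in field 7, which is where `20.00GB` lands when the sample above is split on `:` (field 8 is `segment`). The `sample` and `total` names are only for illustration, the sample is shortened, and units are ignored here: `$7 + 0` keeps just the leading number.

```shell
sample='Jul 2014: data disk -delim :
0:Sample_0:0:mapsnline:0:Size 40GB15k:20.00GB:segment:3:location::A000000000000030:1:1:empty:1:no:0
1:Sample_1:0:mapsnline:0:Size 40GB15k:20.00GB:segment:4:location::A000000000000031:1:1:empty:1:no:0
Jul 2014: data network -delim :
0:Sample_3:0:mapsnline:0:Size 60GB15k:10.00GB:segment:3:location::A000000000000030:1:1:empty:1:no:0'

total=$(printf '%s\n' "$sample" | awk -F: '
  # Header line: set the flag for "data disk", clear it for anything else.
  /-delim/ { in_disk = ($0 ~ /data disk/); next }
  # Data line inside a disk section: "20.00GB" + 0 evaluates to 20.
  in_disk  { sum += $7 + 0 }
  END      { print sum + 0 }
')
echo "$total"
```

With the two disk lines in this shortened sample the sketch prints 40, and the network line is skipped.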
I thought about using RS, e.g.:
[pre]awk -v RS="2014" -F":" '/data disk/{ sum += $8 }' myfile[/pre]
But that would never work, because the amount may be in GB, TB, or MB. Any thoughts on this?
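Mixed units can be handled by splitting each value into its number and its suffix, then scaling to one common unit before adding. The sketch below is one possible approach in plain POSIX awk. The `to_gb` helper is a hypothetical name, and the factor of 1000 per step (rather than 1024) is an assumption that should be matched to however the tool producing the data reports sizes.

```shell
totals=$(printf '%s\n' 500.00MB 20.00GB 1.50TB | awk '
  # Convert a value such as "20.00GB" to a number of gigabytes.
  function to_gb(s,    n, unit) {
    n = s + 0                      # leading numeric part
    unit = s
    sub(/^[0-9.]+/, "", unit)      # what remains is the suffix
    if (unit == "MB") return n / 1000
    if (unit == "TB") return n * 1000
    return n                       # GB, or no suffix at all
  }
  { sum += to_gb($1) }
  END { printf "%.2fGB\n", sum }
')
echo "$totals"
```

Normalizing inside a function keeps the summing rule itself to a single line, so the same helper could be called on whichever field holds the size.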
[pre]gawk -F: '
    $1 ~ /^[[:alpha:]]+ [[:digit:]]+$/ {
        if (sum) print sum
        sum = 0
        do_sum = ($2 ~ /disk/)
        if (do_sum) printf "%s", $0
    }
    /^[[:blank:]]*$/ {next}
    do_sum {
        match($8, /([[:digit:].]+)(|GB|TB)/, a)
        if (a[2] == "GB") { sum += a[1]*1000 }
        else if (a[2] == "TB") { sum += a[1]*1000*1000 }
        else { sum += a[1] }
    }
    END {if (do_sum) print sum}
' << END
MYDATA
END[/pre]
The gawk script above also does not work. Any pointers?
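A quick way to check what the script is actually matching against is to print the numbered fields of one data line. Running the sketch below on the first sample line shows `20.00GB` in field 7 and `segment` in field 8, so `match($8, ...)` finds no digits and nothing is added to `sum`. If the real file is laid out like the sample, pointing the script at `$7` may be all it needs. The `line` and `fields` names are only illustrative.

```shell
line='0:Sample_0:0:mapsnline:0:Size 40GB15k:20.00GB:segment:3:location::A000000000000030:1:1:empty:1:no:0'

# Print fields 6 to 8 with their index to see where the size lands.
fields=$(printf '%s\n' "$line" | awk -F: '{ for (i = 6; i <= 8; i++) printf "$%d=%s\n", i, $i }')
echo "$fields"
```

The same loop can be widened to `NF` to list every field when the layout of the real file differs from the sample.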