Hello everyone,
I'm very new to awk, and as I was reading up on it I wondered whether awk could be the solution to my problem.
First, a sample of my XML file:
-- xml --
<dts:dtservices xmlns:dts="uri:dts_namespace">
<dts:tally_output>
<dts:tally dts:name="test_lrfs">
<dts:tally dts:name="cState">
<dts:tally dts:name="dc_range">
<dts:value>00-06 </dts:value>
<dts:count>1</dts:count>
</dts:tally>
<dts:tally dts:name="dc_range">
<dts:value>07-12 </dts:value>
<dts:count>2</dts:count>
</dts:tally>
<dts:tally dts:name="dc_range">
<dts:value>13-24 </dts:value>
<dts:count>3</dts:count>
</dts:tally>
<dts:value>AK</dts:value>
<dts:count>6</dts:count>
</dts:tally>
<dts:tally dts:name="cState">
<dts:tally dts:name="dc_range">
<dts:value>07-12 </dts:value>
<dts:count>5</dts:count>
</dts:tally>
<dts:tally dts:name="dc_range">
<dts:value>13-24 </dts:value>
<dts:count>6</dts:count>
</dts:tally>
<dts:value>AL</dts:value>
<dts:count>11</dts:count>
</dts:tally>
<dts:tally dts:name="cState">
<dts:tally dts:name="dc_range">
<dts:value>00-06 </dts:value>
<dts:count>3</dts:count>
</dts:tally>
<dts:tally dts:name="dc_range">
<dts:value>07-12 </dts:value>
<dts:count>2</dts:count>
</dts:tally>
<dts:tally dts:name="dc_range">
<dts:value>13-24 </dts:value>
<dts:count>1</dts:count>
</dts:tally>
<dts:value>AR</dts:value>
<dts:count>6</dts:count>
</dts:tally>
</dts:tally>
</dts:tally_output>
</dts:dtservices>
---end---
My plan is to extract data from this file, ordered by "cState", producing seven comma-separated columns:
col 1 = cState
col 2 = the count for the 00-06 dc_range
col 3 = % (value from col 2 divided by the sum of all 00-06 counts, multiplied by 100)
col 4 = the count for the 07-12 dc_range + value from col 2
col 5 = % (value from col 4 divided by the sum of all 00-06 and 07-12 counts, multiplied by 100)
col 6 = the count for the 13-24 dc_range + value from col 4
col 7 = % (value from col 6 divided by the sum of all 00-06, 07-12 and 13-24 counts, multiplied by 100)
Finally, list the totals at the end of the file.
Therefore, based on the above XML, the output should look something like:
AK,1,25.00,3,23.08,6,26.09
AL,0,0.00,5,38.46,11,47.83
AR,3,75.00,5,38.46,6,26.09
Total: 4,13,23
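To make the column arithmetic concrete, here is a minimal sketch in Python (not awk) that follows the column definitions above literally. The function name `tally_report` and the fixed list of three ranges are assumptions made for illustration only, and a state with no entry for a range is treated as having a count of 0. Note that in the sample data the 00-06 counts sum to 1 + 0 + 3 = 4, so that is the first total this sketch divides by.

```python
import xml.etree.ElementTree as ET

NS = "{uri:dts_namespace}"            # namespace bound to the dts: prefix
RANGES = ["00-06", "07-12", "13-24"]  # assumed fixed set of dc_range values


def tally_report(xml_text):
    """Return the CSV lines (one per cState, sorted) plus a totals line."""
    root = ET.fromstring(xml_text)
    rows = {}  # state name -> cumulative counts, one per range
    for state in root.iter(NS + "tally"):
        if state.get(NS + "name") != "cState":
            continue
        counts = dict.fromkeys(RANGES, 0)  # a missing range counts as 0
        for rng in state.findall(NS + "tally"):
            # values in the sample carry trailing spaces, e.g. "00-06 "
            key = (rng.findtext(NS + "value") or "").strip()
            if key in counts:
                counts[key] += int(rng.findtext(NS + "count"))
        # cols 2, 4, 6 are running sums across the ranges
        running, cumulative = 0, []
        for key in RANGES:
            running += counts[key]
            cumulative.append(running)
        rows[state.findtext(NS + "value").strip()] = cumulative

    # denominators for cols 3, 5, 7: column sums of the cumulative counts
    totals = [sum(c[i] for c in rows.values()) for i in range(len(RANGES))]

    lines = []
    for name in sorted(rows):
        fields = [name]
        for value, total in zip(rows[name], totals):
            pct = 100.0 * value / total if total else 0.0
            fields += [str(value), "%.2f" % pct]
        lines.append(",".join(fields))
    lines.append("Total: " + ",".join(str(t) for t in totals))
    return lines
```

It could be called as `print("\n".join(tally_report(open("tally.xml").read())))`, where `tally.xml` is a placeholder file name. The percentages need the totals first, so the sketch collects every state before printing anything; an awk version would presumably need the same two-phase shape (accumulate per line, print in an END block).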
Is this possible to achieve?
By the way, I have tried to process the XML via an XSLT transform, but it is very slow.
Thanks in advance for any input and assistance.
daula