first time with Python

Status
Not open for further replies.

subok

MIS
Feb 21, 2005
37
BE
Hi,

I am trying to write a script in Python to parse an XML file.
The XML file contains measurements taken every 5 minutes. I need a script to parse the XML file and convert it to CSV-formatted text.

Any help would be highly appreciated.

Example input file; I need to parse it and take the values in the <measInfo measInfoId="TrafficMS"> elements.

<measInfo measInfoId=" ControlCPM">
<granPeriod duration="PT300S" endTime="2013-12-06T10:40:37+01:00" />
<measType p="1">r5Ipv4Peers</measType>
<measType p="2">r5Ipv6Peers</measType>
<measType p="3">r8Ipv4Peers</measType>
<measType p="4">r8Ipv6Peers</measType>
<measType p="5">rxIpv4Peers</measType>
<measValue measObjLdn="KCI=ControlPlane,GroupName=CPM,slot=11">
<r p="1">3</r>
<r p="2">0</r>
<r p="3">0</r>
<r p="4">0</r>
<r p="5">0</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId=" ControlVP">
<granPeriod duration="PT300S" endTime="2013-12-06T10:40:37+01:00" />
<measType p="1">ripUtilization</measType>
<measType p="2">numOfHssStaticSubscribersPerVprn</measType>
<measValue measObjLdn="KCI=ControlPlane,GroupName=CPM,slot=11,vprnRouterName=Base,vrId=1">
<r p="1">43.44</r>
<r p="2">0</r>
<suspect>false</suspect>
</measValue>
<measValue measObjLdn="KCI=ControlPlane,GroupName=CPM,slot=11,vprnRouterName=vprn765,vrId=5">
<r p="1">36.01</r>
<r p="2">0</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="SystemCP">
<granPeriod duration="PT300S" endTime="2013-12-06T10:45:37+01:00" />
<measType p="1">CpuUtilization</measType>
<measType p="2">MemoryUtilization</measType>
<measType p="3">ripv4SdfUtilization</measType>
<measType p="4">ripv6SdfUtilization</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=6,mda=1">
<r p="1">2</r>
<r p="2">28</r>
<r p="3">9.88</r>
<r p="4">0.00</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="ManagementCP">
<granPeriod duration="PT300S" endTime="2013-12-06T10:45:37+01:00" />
<measType p="1">Successful</measType>
<measType p="2">Failures</measType>
<measType p="3">FromNonSuccessful</measType>
<measType p="4">FromNonFailures</measType>
<measType p="5">WithPiggyBackingSuccessful</measType>
<measType p="6">mmProcedureSuccessful</measType>
<measValue measObjLdn="KPI=BearerManagement,GroupName=CP-ISA,group=1,slot=6,mda=1">
<r p="1">0</r>
<r p="2">0</r>
<r p="3">0</r>
<r p="4">0</r>
<r p="5">0</r>
<r p="6">0</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="TrafficMS">
<granPeriod duration="PT300S" endTime="2013-12-06T10:45:37+01:00" />
<measType p="1">r5uGiUlPackets</measType>
<measType p="2">r5uGiUlBytes</measType>
<measType p="3">r5uGiUlDropPackets</measType>
<measType p="4">riS5uDlPackets</measType>
<measType p="5">riS5uDlBytes</measType>
<measType p="6">riS5uDlDropPackets</measType>
<measValue measObjLdn="KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,mda=1">
<r p="1">9874210</r>
<r p="2">1527737770</r>
<r p="3">0</r>
<r p="4">11283018</r>
<r p="5">10144475091</r>
<r p="6">205558</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="SystemCPM">
<granPeriod duration="PT300S" endTime="2013-12-06T10:50:37+01:00" />
<measType p="1">CpuUtilization</measType>
<measType p="2">maxCpuUtilization</measType>
<measType p="3">MemoryUtilization</measType>
<measType p="4">maxMemoryUtilization</measType>
<measType p="5">priFlashUtilization</measType>
<measType p="6">recFlashUtilization</measType>
<measValue measObjLdn="KPI=System,GroupName=CPM,slot=11">
<r p="1">2</r>
<r p="2">3</r>
<r p="3">43</r>
<r p="4">43</r>
<r p="5">0</r>
<r p="6">0</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="TrafficMS">
<granPeriod duration="PT300S" endTime="2013-12-06T10:50:37+01:00" />
<measType p="1">r5uGiUlPackets</measType>
<measType p="2">r5uGiUlBytes</measType>
<measType p="3">r5uGiUlDropPackets</measType>
<measType p="4">riS5uDlPackets</measType>
<measType p="5">riS5uDlBytes</measType>
<measType p="6">riS5uDlDropPackets</measType>
<measValue measObjLdn="KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,mda=1">
<r p="1">8874210</r>
<r p="2">2227737880</r>
<r p="3">0</r>
<r p="4">99283018</r>
<r p="5">33144475091</r>
<r p="6">445558</r>
<suspect>false</suspect>
</measValue>
</measInfo>

Output needed (CSV):
endtime,r5uGiUlPackets,r5uGiUlBytes,r5uGiUlDropPackets,riS5uDlPackets,riS5uDlBytes,riS5uDlDropPackets
2013-12-06T10:45:37+01:00,9874210,1527737770,0,11283018,10144475091,205558
2013-12-06T10:50:37+01:00,8874210,2227737880,0,99283018,33144475091,445558


 
Hi

First of all, is that XML a fragment from a single file, or are there 7 distinct XML files?

On this site we help fellow members solve their problems, but we do not do their job for them. So please post what you have tried so far and tell us where you are stuck.

Next time please post your preformatted text between [pre] and [/pre] (or [code] and [/code]) TGML tags.


Feherke.
feherke.ga
 
Hi Feherke,

Sorry, my bad. I did not know how to start earlier, but I tried.
What I have now extracts the data I want from the XML file, but I am stuck formatting it the way I want.

Thanks a lot for any help or advice.

Python:
import xml.sax
collection = []
class HandleCollection (xml.sax.ContentHandler):
    def __init__ (self):
        self.flag = 0
    def startElement (self, name, attributes):
        if ((name == 'measInfo') and (attributes.get('measInfoId') == "KPIBearerTrafficMSM")):
            self.flag = 1    
    def endElement (self, name):
        if name == 'measInfo':
            self.flag = 0   
    def characters (self, content):
        if (self.flag):
            print (content)
parser = xml.sax.make_parser()
parser.setContentHandler (HandleCollection())
parser.parse ('super40.xml')

My output for this code is as follows:

[pre]






U.r5uGiUlPackets



U.r5uGiUlBytes



U.r5uGiUlDropPackets



U.gtR5uDlPackets



U.gtR5uDlBytes



U.gtR5uDlDropPackets






9974677



1589104745



0



11464368



10224276856



202108



false












U.r5uGiUlPackets



U.r5uGiUlBytes



U.r5uGiUlDropPackets



U.gtR5uDlPackets



U.gtR5uDlBytes






9974677



1589104745



0



11464368



10224276856



202108



false







[/pre]

But I am stuck trying to turn that output into a CSV file with the "endTime" in the first column:

[pre]
endTime,U.r5uGiUlPackets,U.r5uGiUlBytes,U.r5uGiUlDropPackets,U.gtR5uDlPackets,U.gtR5uDlBytes,U.gtR5uDlDropPackets
2013-12-06T10:05:37+01:00,9974677,1589104745,0,11464368,10224276856,202108,false
2013-12-06T10:10:37+01:00,9974677,1589104745,0,11464368,10224276856,202108,false
[/pre]

The XML file
[pre]
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="DataCollection.xsl"?>
<measCollecFile
xmlns=" <fileHeader fileFormatVersion="32.401 V8.0.0"
vendorName="SUPER40">
<fileSender
localDn="MCC=222,MNC=15,ManagedElement=SUPER40"
elementType="net instance 1" />
<measCollec beginTime="2013-12-06T10:00:37+01:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=222,MNC=15,ManagedElement=SUPER40"
swVersion="C-9.0.S8" />
<measInfo measInfoId="ISA">
<granPeriod duration="PT300S" endTime="2013-12-06T10:05:37+01:00" />
<measType p="1">U.avgCpuUtilization</measType>
<measType p="2">U.avgMemoryUtilization</measType>
<measType p="3">U.ipv4SdfUtilization</measType>
<measType p="4">U.ipv6SdfUtilization</measType>
<measValue measObjLdn="KPI=System,GroupName=ISA,group=1,slot=6,da=1">
<r p="1">2</r>
<r p="2">28</r>
<r p="3">9.79</r>
<r p="4">0.00</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="KPIBearerTrafficMSM">
<granPeriod duration="PT300S" endTime="2013-12-06T10:05:37+01:00" />
<measType p="1">U.r5uGiUlPackets</measType>
<measType p="2">U.r5uGiUlBytes</measType>
<measType p="3">U.r5uGiUlDropPackets</measType>
<measType p="4">U.gtR5uDlPackets</measType>
<measType p="5">U.gtR5uDlBytes</measType>
<measType p="6">U.gtR5uDlDropPackets</measType>
<measValue measObjLdn="KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,da=1">
<r p="1">9974677</r>
<r p="2">1589104745</r>
<r p="3">0</r>
<r p="4">11464368</r>
<r p="5">10224276856</r>
<r p="6">202108</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="KPISystemCPM">
<granPeriod duration="PT300S" endTime="2013-12-06T10:05:37+01:00" />
<measType p="1">U.avgCpuUtilization</measType>
<measType p="2">U.maxCpuUtilization</measType>
<measType p="3">U.avgMemoryUtilization</measType>
<measType p="4">U.maxMemoryUtilization</measType>
<measType p="5">U.priFlashUtilization</measType>
<measType p="6">U.secFlashUtilization</measType>
<measValue measObjLdn="KPI=System,GroupName=CP,slot=11">
<r p="1">2</r>
<r p="2">3</r>
<r p="3">43</r>
<r p="4">43</r>
<r p="5">0</r>
<r p="6">0</r>
<suspect>false</suspect>
</measValue>
</measInfo>
<measInfo measInfoId="KPIBearerTrafficMSM">
<granPeriod duration="PT300S" endTime="2013-12-06T10:10:37+01:00" />
<measType p="1">U.r5uGiUlPackets</measType>
<measType p="2">U.r5uGiUlBytes</measType>
<measType p="3">U.r5uGiUlDropPackets</measType>
<measType p="4">U.gtR5uDlPackets</measType>
<measType p="5">U.gtR5uDlBytes</measType>
<measValue measObjLdn="KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,da=1">
<r p="1">9974677</r>
<r p="2">1589104745</r>
<r p="3">0</r>
<r p="4">11464368</r>
<r p="5">10224276856</r>
<r p="6">202108</r>
<suspect>false</suspect>
</measValue>
</measInfo>
</measData>
<fileFooter>
<measCollec endTime="2013-12-06T11:00:37+01:00" />
</fileFooter>
</measCollecFile>
[/pre]
 
Hi

A SAX parser. Not my number one choice for such small XML files, but not a problem.
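For comparison, a tree parser such as xml.etree.ElementTree is one possible alternative for files this small. The sketch below is only an illustration under a few assumptions: Python 3, one measValue per measInfo, and helper names (`local`, `meas_rows`, `print_csv`) that I made up for this example. The `local` helper drops any namespace prefix from tag names, because the posted file declares an xmlns on its root element.

```python
import xml.etree.ElementTree as ET


def local(tag):
    # Drop a '{namespace}' prefix, if any, so lookups work whatever xmlns is declared.
    return tag.rsplit('}', 1)[-1]


def meas_rows(source, meas_info_id):
    """Yield a (header, row) pair for each measInfo whose measInfoId matches.

    source is a file name or a binary file object.
    Assumes a single measValue per measInfo (hypothetical simplification).
    """
    for info in ET.parse(source).getroot().iter():
        if local(info.tag) != 'measInfo' or info.get('measInfoId') != meas_info_id:
            continue
        end_time, names, values = '', [], []
        for child in info.iter():
            tag = local(child.tag)
            if tag == 'granPeriod':
                end_time = child.get('endTime', '')
            elif tag == 'measType':
                names.append((child.text or '').strip())
            elif tag == 'r':
                values.append((child.text or '').strip())
        yield ['endTime'] + names, [end_time] + values


def print_csv(source, meas_info_id):
    """Print the header once (taken from the first match), then one line per match."""
    header_sent = False
    for header, row in meas_rows(source, meas_info_id):
        if not header_sent:
            print(','.join(header))
            header_sent = True
        print(','.join(row))
```

Calling `print_csv('super40.xml', 'KPIBearerTrafficMSM')` would print the header followed by one line per matching block. Unlike the SAX version, this sketch leaves the suspect value out of the row. Since you started with SAX, the version below keeps your approach.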
Python:
import xml.sax

class HandleCollection(xml.sax.ContentHandler):

    def __init__(self):
        self.flag = 0          # 1 while inside a wanted measInfo element
        self.row = []          # fields collected for the current CSV line
        self.header_sent = 0   # 1 once the header line has been printed

    def startElement(self, name, attributes):
        if name == 'measInfo' and attributes.get('measInfoId') == "KPIBearerTrafficMSM":
            self.flag = 1
        elif self.flag and name == 'granPeriod':
            # remember the timestamp; the measType names that follow build the header
            self.end_time = attributes.get('endTime')
            self.row = ['endTime']
        elif self.flag and name == 'measValue':
            # all measType names have been read by now: print the header only once
            if not self.header_sent:
                self.header_sent = 1
                print(','.join(self.row))
            # start the data line with the timestamp
            self.row = [self.end_time]

    def endElement(self, name):
        if self.flag and name == 'measInfo':
            self.flag = 0
            print(','.join(self.row))

    def characters(self, content):
        # ignore the whitespace-only text between elements
        if self.flag and content.strip():
            self.row.append(content.strip())

parser = xml.sax.make_parser()
parser.setContentHandler(HandleCollection())
parser.parse('super40.xml')
Note that you may prefer to implement quoting and/or escaping for values that contain a comma ( , ).
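One way to get that quoting without writing it by hand is the standard csv module. The following is a minimal sketch assuming Python 3; `write_rows` is a hypothetical helper name, and the measObjLdn column is added only to show a value that really contains commas.

```python
import csv
import io


def write_rows(rows, out):
    """Write rows as CSV; fields containing commas, quotes or newlines get quoted."""
    writer = csv.writer(out, lineterminator='\n')
    for row in rows:
        writer.writerow(row)


# Demo on an in-memory buffer; a file opened with newline='' works the same way.
buffer = io.StringIO()
write_rows(
    [['endTime', 'measObjLdn'],
     ['2013-12-06T10:05:37+01:00',
      'KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,da=1']],
    buffer)
print(buffer.getvalue(), end='')
# prints:
# endTime,measObjLdn
# 2013-12-06T10:05:37+01:00,"KPI=BearerTraffic,GroupName=MSM,group=1,slot=6,da=1"
```

In the handler, each ','.join(self.row) call could then be swapped for a writer.writerow(self.row) call on a csv.writer created once up front.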


Feherke.
feherke.ga
 