GufranVanjara
Programmer
Hello,
I have an XML file that I need to analyze, but I would like to remove some records from it first. The sample XML below shows what I am trying to achieve.
As you can see below, the V3Document element encompasses other nested elements and attributes. Sometimes the nested element "Case" appears more than once within a V3Document. Those are the records (everything between <V3Document> and </V3Document>) that I would like to remove before importing the file into Excel for analysis. Since I am new to XML, what process or tools would you recommend for this?
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<V3MetadataAttributes xmlns="Some Namespace">
<V3Document dms_doc_id="12345">
<DocumentId>123456</DocumentId>
<Case case_id="12345">
<CaseNumber>12345</CaseNumber>
<CaseCategory>Some Category</CaseCategory>
<CaseSecurityLevel>1</CaseSecurityLevel>
<CaseHistorySecurityLevel>1</CaseHistorySecurityLevel>
</Case>
<DocumentSecurityLevel>1</DocumentSecurityLevel>
<ExternalSourceCode>Some Source code</ExternalSourceCode>
<ExternalSourceId>12345</ExternalSourceId>
<FiledDate>Some date</FiledDate>
<DocumentSource>Some source</DocumentSource>
</V3Document>
<V3Document dms_doc_id="12345">
<DocumentId>12345</DocumentId>
<Case case_id="12345">
<CaseNumber>12345</CaseNumber>
<CaseCategory>Sample</CaseCategory>
<CaseSecurityLevel>1</CaseSecurityLevel>
<CaseHistorySecurityLevel>2</CaseHistorySecurityLevel>
</Case>
<Case case_id="12345">
<CaseNumber>12345</CaseNumber>
<CaseCategory>Small Claims</CaseCategory>
<CaseSecurityLevel>1</CaseSecurityLevel>
<CaseHistorySecurityLevel>2</CaseHistorySecurityLevel>
</Case>
<DocumentName>sample</DocumentName>
<DocumentSecurityLevel>1</DocumentSecurityLevel>
<ExternalSourceCode>Source</ExternalSourceCode>
<ExternalSourceId>12345</ExternalSourceId>
<FiledDate>Date</FiledDate>
<DocumentSource>Source</DocumentSource>
</V3Document>
</V3MetadataAttributes>
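For what it's worth, one possible way to do this kind of filtering is a short Python script using the standard library's xml.etree.ElementTree. This is only a sketch under a few assumptions: the namespace URI is the "Some Namespace" placeholder from the sample, the inline SAMPLE string is a trimmed stand-in for the real file, and the helper name drop_multi_case_documents is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace copied from the sample; replace with the real URI.
NS = "Some Namespace"

# Trimmed stand-in for the real file (assumption: same structure as the sample).
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<V3MetadataAttributes xmlns="Some Namespace">
<V3Document dms_doc_id="12345">
<DocumentId>123456</DocumentId>
<Case case_id="12345"><CaseNumber>12345</CaseNumber></Case>
<DocumentSecurityLevel>1</DocumentSecurityLevel>
</V3Document>
<V3Document dms_doc_id="12345">
<DocumentId>12345</DocumentId>
<Case case_id="12345"><CaseNumber>12345</CaseNumber></Case>
<Case case_id="12345"><CaseNumber>12345</CaseNumber></Case>
<DocumentSecurityLevel>1</DocumentSecurityLevel>
</V3Document>
</V3MetadataAttributes>"""


def drop_multi_case_documents(root, ns=NS):
    """Remove each V3Document child of root that has more than one Case child.

    Returns the number of V3Document elements removed.
    """
    doc_tag = "{%s}V3Document" % ns
    case_tag = "{%s}Case" % ns
    removed = 0
    # Iterate over a copy so removing children does not disturb the loop.
    for doc in list(root):
        if doc.tag != doc_tag:
            continue
        case_count = sum(1 for child in doc if child.tag == case_tag)
        if case_count > 1:
            root.remove(doc)
            removed += 1
    return removed


# Keep the default namespace un-prefixed when the tree is serialized again.
ET.register_namespace("", NS)

root = ET.fromstring(SAMPLE.encode("utf-8"))
removed = drop_multi_case_documents(root)
filtered_xml = ET.tostring(root, encoding="unicode")
```

For a real file, the same function could be applied to ET.parse(path).getroot(), and the result written back out with ElementTree.write before importing it into Excel. Elements are matched by their namespace-qualified tag because the sample declares a default namespace on the root.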