sorting paragraphs - using specific field in the paragraph

Status
Not open for further replies.

arunrr

Programmer
Oct 2, 2009
103
US
Input is similar to the following...

------------------------------------------------------------
<item>
<title>Pietersen set for England return</title>
<description>England batsman Kevin Pietersen is likely to be fit for both the one-day and Test series in South Africa as he recovers from an Achilles problem.</description>
<link><pubDate>Fri, 30 Oct 2009 18:16:52 GMT</pubDate>
</item>
<item>
<title>Lee joins Australia injury list</title>
<description>Fast bowler Brett Lee becomes Australia's latest injury casualty after an elbow injury rules him out of the remainder of the one-day series against India. </description>
<link><pubDate>Wed, 28 Oct 2009 14:38:56 GMT</pubDate>
</item>
<item>
<title>Discarded Tudor bids to play on</title>
<description>Ex-England paceman Alex Tudor, released by Surrey along with Pedro Collins, bids to continue his cricketing career.</description>
<link><pubDate>Thu, 29 Oct 2009 18:01:08 GMT</pubDate>
</item>
<item>
<title>Powell agrees move to Lancashire</title>
<description>West Indies Test bowler Daren Powell joins Lancashire on a two-year contract - subject to visa and registration clearance.</description>
<link><pubDate>Thu, 29 Oct 2009 08:09:24 GMT</pubDate>
</item>
<item>
<title>Wage demands hamper Worcs coach</title>
<description>Worcestershire's director of cricket Steve Rhodes says his attempts at signing new players are being hampered by their wage demands.>
</description>
<link><pubDate>Fri, 30 Oct 2009 09:45:09 GMT</pubDate>
</item>
-----------------------------------------------------------

The paragraphs are delimited by the <item> and </item> tags. I need to sort the above (most recent first) based on the date shown within the <pubDate> tags. This is a sample input. The normal input has several paragraphs.

Thanks,
AR
 
This would be an ad-hoc Java solution:
Code:
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.*;
import java.io.*;

import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

/**
<item>
	<title>Wage demands hamper Worcs coach</title>
	<description>Worcestershire's director of cricket Steve Rhodes says his attempts at signing new players are being hampered by their wage demands.>
	</description>
	<link>http://news.bbc.co.uk/go/rss/-/sport1/hi/cricket/counties/worcestershire/8331304.stm</link>
	<pubDate>Fri, 30 Oct 2009 09:45:09 GMT</pubDate>
</item>
*/
class Item implements Comparable <Item>
{
	public static final SimpleDateFormat sdf = new SimpleDateFormat ("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);

	String title;
	String description;
	String link;
	Date pubDate;

	public String toString ()
	{
		return title + "\n\t" + description + "\n\t" + link + "\n\t" + pubDate;
	}

	public void setPubDate (String s)
	{
		try
		{
			if (s != null) 
				pubDate = sdf.parse (s);
		}
		catch (ParseException e)
		{
			pubDate = null;
			// e.printStackTrace();
			System.err.println (e.getMessage ());
		}
	}

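	@Override
	public int compareTo (Item o)
	{
		// Most recent first; items whose date is missing or failed to parse sort last.
		if (pubDate == null) return (o.pubDate == null) ? 0 : 1;
		if (o.pubDate == null) return -1;
		return o.pubDate.compareTo (pubDate);
	}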
}

public class SampleXMLparser extends DefaultHandler
{
	public static void main (String argv[])
	{
		if (argv.length != 1)
		{
			System.err.println ("Usage: java SampleXMLparser filename.xml");
			System.exit (1);
		}
		DefaultHandler handler = new SampleXMLparser ();
		SAXParserFactory factory = SAXParserFactory.newInstance ();
		try
		{
			SAXParser saxParser = factory.newSAXParser ();
			saxParser.parse (new File (argv[0]), handler);
		}
		catch (Throwable t)
		{
			t.printStackTrace ();
		}
		System.exit (0);
	}

	private StringBuffer textBuffer;
	private Item item;
	private List<Item> items = new ArrayList<Item> (50);

	@Override
	public void startElement (String namespaceURI, String simple, String qualified, Attributes attrs) throws SAXException
	{
		String eName = qualified; // not namespace-aware
		if (eName.equals ("item")) item = new Item ();
	}

	@Override
	public void endElement (String namespaceURI, String simple, String qualified) throws SAXException
	{
		String s = getText ();
		String eName = qualified; // not namespace-aware
		if (eName.equals ("item"))
			items.add (item);
		else if (item != null)
		{
			if (eName.equals ("title"))
				item.title = s;
			else if (eName.equals ("description"))
				item.description = s;
			else if (eName.equals ("link"))
				item.link = s;
			else if (eName.equals ("pubDate")) item.setPubDate (s);
		}
	}
	
	@Override
	public void characters (char buf[], int offset, int len) throws SAXException
	{
		String s = new String (buf, offset, len);
		if (textBuffer == null)
			textBuffer = new StringBuffer (s);
		else
			textBuffer.append (s);
	}
	
	// Returns the trimmed text collected since the last call, or null if there was none.
	private String getText () throws SAXException
	{
		if (textBuffer == null) return null;
		String s = textBuffer.toString ();
		textBuffer = null;
		return s.trim ();
	}
	
	@Override
	public void endDocument () throws SAXException
	{
		// Sort with Item.compareTo, then print each item via Item.toString.
		Collections.sort (items);
		for (Item i : items)
		{
			System.out.println (i);
		}
	}
}
You will have to transform the output, I guess.
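
If the real input looks exactly like the sample (several <item> blocks with no single root element, and a <link> tag that is never closed), an XML parser will most likely reject it as not well-formed. One possible workaround is to treat each <item>...</item> block as plain text, pull the date out with a regular expression, and print the blocks unchanged in sorted order, so no output transformation is needed. The sketch below assumes Java 9 or later, that the blocks are never nested, and that the dates are in the RFC 1123 style shown in the sample. The names ItemBlockSorter, sortItems and pubDateOf are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ItemBlockSorter
{
	// One <item>...</item> block; DOTALL lets '.' match line breaks too.
	private static final Pattern ITEM = Pattern.compile ("<item>.*?</item>", Pattern.DOTALL);
	private static final Pattern PUB_DATE = Pattern.compile ("<pubDate>(.*?)</pubDate>", Pattern.DOTALL);

	// Returns the publication date of a block, or null if it is missing or unparsable.
	static ZonedDateTime pubDateOf (String block)
	{
		Matcher m = PUB_DATE.matcher (block);
		if (!m.find ()) return null;
		try
		{
			return ZonedDateTime.parse (m.group (1).trim (), DateTimeFormatter.RFC_1123_DATE_TIME);
		}
		catch (DateTimeParseException e)
		{
			return null;
		}
	}

	// Returns all item blocks, most recent first; blocks without a usable date go last.
	static List<String> sortItems (String input)
	{
		List<String> blocks = new ArrayList<> ();
		Matcher m = ITEM.matcher (input);
		while (m.find ()) blocks.add (m.group ());

		// Dates are re-parsed on every comparison; simple, but not tuned for huge inputs.
		blocks.sort ((a, b) -> {
			ZonedDateTime da = pubDateOf (a);
			ZonedDateTime db = pubDateOf (b);
			if (da == null) return (db == null) ? 0 : 1;
			if (db == null) return -1;
			return db.compareTo (da);
		});
		return blocks;
	}

	public static void main (String[] args) throws Exception
	{
		// Reads the whole input from stdin and prints the blocks in sorted order.
		String input = new String (System.in.readAllBytes (), StandardCharsets.UTF_8);
		for (String block : sortItems (input)) System.out.println (block);
	}
}
```

It could be run as: java ItemBlockSorter < input.txt. Anything outside the <item> blocks (such as the dashed separator lines) is dropped from the output.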
