parsing a file

Status
Not open for further replies.

megabyte214

Programmer
Nov 7, 2001
40
US
I am using StringTokenizer to parse a text file. The file contains items with different attributes, delimited by commas.

An example line is:

0001,"Abbot, Baker, and Miller",4444,"fiction",,06182000

I have 2 questions. First, using StringTokenizer, is there a way to differentiate between a comma in an attribute and a comma between attributes?

Secondly, StringTokenizer doesn’t seem to like it when there are 2 commas together. This happens when an attribute is blank. If I put a space between the commas, it is fine, but unfortunately, I cannot change the file.

I realize that I can write my own parser to do all of this, but I am trying to take advantage of an existing class if there is one that can help me.
Would StreamTokenizer be better for what I am trying to do? I have never used it before.

Thanks in advance.
 
import java.util.StringTokenizer;

public class str
{
    public static void main(String args[])
    {
        String inputString = "this,,is,,a,test"; // read from file
        // Put a space after every comma that is directly followed by another comma,
        // so each empty field becomes " ". The lookahead also covers runs of three
        // or more commas, which replaceAll(",,", ", ,") would miss.
        String inputString2 = inputString.replaceAll(",(?=,)", ", ");
        StringTokenizer st = new StringTokenizer(inputString2, ",");
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}
 
Another solution to the ',,' problem is to tell the tokenizer to return the separators too (normally they are skipped):

st = new StringTokenizer(inputString, ",", true);
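
To make that concrete, here is a rough sketch of how the returned separators could be used to spot empty fields. The class name DelimTokens and the helper fields() are made up for illustration, and it still treats every comma as a separator, so quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not from the thread: splits on commas and keeps empty fields.
class DelimTokens {
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        // Third argument true: the "," separators are returned as tokens too.
        StringTokenizer st = new StringTokenizer(line, ",", true);
        boolean lastWasDelim = true; // start of line counts as "just after a separator"
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (tok.equals(",")) {
                if (lastWasDelim) {
                    out.add(""); // two separators in a row: empty field between them
                }
                lastWasDelim = true;
            } else {
                out.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            out.add(""); // line ended with a separator (or was empty)
        }
        return out;
    }

    public static void main(String[] args) {
        // Six fields, two of them empty.
        System.out.println(fields("this,,is,,a,test"));
    }
}
```

Tracking whether the previous token was a separator is what lets the loop tell "a,,b" (three fields) from "a,b" (two fields).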

The embedded commas aren't solved by returning the separators, and escaped quotes inside an attribute, as in
0004, "Evender \"the real deal\" Holiday", ... might be an additional problem.
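
For the embedded commas and escaped quotes, a small hand-written scanner may be the simplest route. The following is only a sketch under some assumptions: quotes are escaped with a backslash as in the line above (many CSV files double the quote instead), surrounding quotes are dropped from the value, and whitespace is not trimmed. The names QuotedFields and parse() are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical quote-aware splitter, not part of any library.
class QuotedFields {
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < line.length()) {
                // Assumed escape rule: inside quotes, a backslash keeps the next character literally.
                cur.append(line.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes; // quote opens or closes a quoted section; not stored
            } else if (c == ',' && !inQuotes) {
                fields.add(cur.toString()); // separator outside quotes ends the field
                cur.setLength(0);
            } else {
                cur.append(c); // ordinary character, or a comma inside quotes
            }
        }
        fields.add(cur.toString()); // last field (may be empty)
        return fields;
    }

    public static void main(String[] args) {
        String line = "0001,\"Abbot, Baker, and Miller\",4444,\"fiction\",,06182000";
        for (String f : parse(line)) {
            System.out.println("[" + f + "]");
        }
    }
}
```

Because the empty attribute simply produces an empty string, this also covers the ',,' case without touching the file.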
 