streamtokenizer class

Status
Not open for further replies.

Guigui

IS-IT--Management
Mar 7, 2001
FR
Hi,
I have a file with lines like the following:
0.12 45.2 56.0 48.2
 45.2 56.1 48.1 10.0...
So the file contains numbers separated by white space. I want to extract each number separately, but I don't know how to indicate that the separator is white space.
Can someone help me?
Thanks
 
Is this input file one that you built yourself? If so, you could change the delimiter to ¦¦ or some other odd character sequence. If not, as I suspect, you could use a StringTokenizer object to parse each line:

StringTokenizer t = new StringTokenizer(line, " ");

where line is the current line being read from the file.

Call hasMoreTokens() and nextToken() on the StringTokenizer in a while loop to pull out each number, and repeat for every line until there is no more data left to read in the file.

Hope this helps.

Troy Williams B.Eng.
fenris@hotmail.com
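
The StringTokenizer approach described above can be sketched as follows. This is only an illustration: the class name LineSplitter, the method readNumbers, and the sample data in main are made up for the example, and it assumes every token on a line is a valid decimal number.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the original thread.
class LineSplitter {
    // Reads the input line by line and splits each line on spaces.
    static List<Double> readNumbers(Reader in) throws IOException {
        List<Double> numbers = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            StringTokenizer t = new StringTokenizer(line, " ");
            // Leading and repeated spaces are skipped by the tokenizer.
            while (t.hasMoreTokens()) {
                numbers.add(Double.parseDouble(t.nextToken()));
            }
        }
        return numbers;
    }

    public static void main(String[] args) throws IOException {
        String sample = "0.12 45.2 56.0 48.2\n 45.2 56.1 48.1 10.0\n";
        // prints [0.12, 45.2, 56.0, 48.2, 45.2, 56.1, 48.1, 10.0]
        System.out.println(readNumbers(new StringReader(sample)));
    }
}
```

For a real file, a FileReader could be passed in place of the StringReader.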
 
StreamTokenizer should work fine for this, and is probably a better fit than StringTokenizer (where you'd have to assemble the string before even beginning the parsing). However, I think any kind of whitespace is a delimiter by default in StreamTokenizer. Please let me know what kind of errors you're getting, and I should be able to help.

Liam Morley
lmorley@wpi.edu
"light the deep, and bring silence to the world.
light the world, and bring depth to the silence."
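
A minimal sketch of the StreamTokenizer route for the whitespace-separated numbers in the original question, relying only on the tokenizer's default settings (whitespace as delimiter, number parsing enabled). The class name NumberReader, the method readNumbers, and the sample data are invented for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original thread.
class NumberReader {
    // Collects every numeric token from the stream, using default syntax.
    static List<Double> readNumbers(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);
        List<Double> numbers = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_NUMBER) {
                numbers.add(st.nval); // nval holds the parsed double
            }
        }
        return numbers;
    }

    public static void main(String[] args) throws IOException {
        String sample = "0.12 45.2 56.0 48.2\n 45.2 56.1 48.1 10.0\n";
        // prints [0.12, 45.2, 56.0, 48.2, 45.2, 56.1, 48.1, 10.0]
        System.out.println(readNumbers(new StringReader(sample)));
    }
}
```

No delimiter has to be declared here, because spaces and line breaks already count as white space by default.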
 
Liam, you're right: any kind of whitespace is a delimiter by default for StreamTokenizer. From the doc: "The stream tokenizer is initialized to the following default state: ... All byte values '\u0000' through '\u0020' are considered to be white space."

Everyone, can you help? I **don't** want whitespace to be a delimiter. I want "," to be the delimiter, and I want the white spaces kept as part of the "words" I get.
(A line in my file looks like: "the name, the surname, the address ..." and I don't want to get: "the/name/the/surname...")

I've tried (st here is a StreamTokenizer):
st.quoteChar(',');
or st.whitespaceChars(',', ',');
to declare that "," should be a delimiter, and this works fine.

st.wordChars(' ', ' ');
to declare that whitespace should be part of the "tokens", but it doesn't seem to be working ...

So then I added st.resetSyntax() before those calls, but it resets so thoroughly that I get every single char as a token :-(

I'm sure it's an easy and quick thing to do, but I just can't figure out how.
Oh, and I've tried to modify the source file to "the name", "the surname", ... but then I get only ", ", " ...
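
On the resetSyntax() behaviour described above: after that call every character is "ordinary", so each one comes back as its own token until word and whitespace ranges are declared again. A rough sketch of rebuilding the table by hand follows. The class name ResetFields, the method readFields, the chosen character ranges, and the trim() call are all assumptions made for this example, not something taken from the thread.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original thread.
class ResetFields {
    static List<String> readFields(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);
        st.resetSyntax();               // every character is now ordinary
        st.wordChars(' ', 255);         // space and above are word characters
        st.whitespaceChars(',', ',');   // comma separates fields
        st.whitespaceChars('\n', '\n'); // line breaks separate fields too
        st.whitespaceChars('\r', '\r');
        List<String> fields = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                // The space after each comma is part of the word here,
                // so it is trimmed off explicitly.
                fields.add(st.sval.trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        String sample = "the name, the surname, 12 Main St\n";
        // prints [the name, the surname, 12 Main St]
        System.out.println(readFields(new StringReader(sample)));
    }
}
```

Because number parsing is also switched off by resetSyntax(), a field such as "12 Main St" stays a single word in this setup.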

 
Sorry, I finally fixed it.
But I still don't understand exactly *how*, because here are the remaining lines, which are enough to do what I wanted:
st.quoteChar(',');
st.wordChars(' ', ' ');
st.whitespaceChars(',', ',');
That's what I was using at first!! Does anyone know whether the order matters here, or is it pure magic that it didn't work and now it does??
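
On the question of order: according to the StreamTokenizer Javadoc, quoteChar and whitespaceChars each clear any other attribute of the characters they are applied to, so when both are called for ',' the later call wins and the earlier quoteChar(',') has no lasting effect. wordChars(' ', ' ') appears to add the word attribute to the space without removing its default whitespace attribute (this is how the OpenJDK source reads; treat it as an assumption for other implementations), which would explain why spaces inside a field are kept while the space after a comma is still skipped. The sketch below uses only the two calls that matter; the class name CommaFields, the method readFields, and the sample line are invented for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original thread.
class CommaFields {
    static List<String> readFields(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);
        st.wordChars(' ', ' ');       // a space may now continue a word
        st.whitespaceChars(',', ','); // a comma ends a token
        List<String> fields = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                fields.add(st.sval);
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        String sample = "the name, the surname, the address\n";
        // prints [the name, the surname, the address]
        System.out.println(readFields(new StringReader(sample)));
    }
}
```

Two caveats with this setup: a space directly before a comma stays attached to the end of the field, and a field that starts with a digit is still read as a number, because the default number parsing remains active.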
 