
reading data from a file, array manipulation

Status
Not open for further replies.

lmbylsma

Programmer
Nov 11, 2003
33
US
So basically what I'm trying to do is read data from a file and organize it into a table that I can later open in Excel. I wrote some code that I thought would work, but it's just not doing anything, and I'm not getting any error messages, so I have no idea what's wrong.

Here's a shortened version of what the data file looks like:

"INFORMATION"
"b2jun0302"
""
""
""
""
""

"SUMMARY"
"13" "Evt-" "correct" 1
"14" "Evt-" "incorrect" 1
"19" "Evt-" "rcor" 1
"20" "Evt-" "rincor" 1
"32" "Marker" "untitled"

"CHANNEL" "13"
"Evt-"
"No comment"
"correct"

49.89185
53.72692
57.82880
61.91400



"CHANNEL" "14"
"Evt-"
"No comment"
"incorrect"

214.23345
238.11100
287.04997
379.90897


So what I want for the output would look like this:

49.89185 1
53.72692 1
57.82880 1
61.91400 1
214.23345 0
238.11100 0
287.04997 0
379.90897 0


(All the ones from Channel 13 have a 1 in the 2nd column, the ones from Channel 14 have a 0 in the 2nd column)


Here's the code I have so far:


use diagnostics;


$datafile='monkey6-03.txt';

open(DATA, $datafile);


$/ = "";

while (<DATA>) {

if (index ($_, '"CHANNEL" "13"') > -1)
{ @Array13 = split (/\n/, <DATA>); }


if (index ($_, '"CHANNEL" "14"') > -1)
{ @Array14 = split (/\n/, <DATA>); }

}


$"= "\n";

print "Array 13:\n@Array13\n\nArray 14:\n@Array14\n\n";

push (@Final, map ("$_ 1", @Array13), map ("$_ 0", @Array14));


@Final = sort { $a <=> $b } @Final;

print "FINAL:\n@Final";



But when it prints out the arrays, it just prints the headers like "Final:" but not the numbers underneath. Anybody have any idea what's wrong?
 
Here's what I got when I ran your code, with the data as
you posted it.

Array 13:

"CHANNEL" "14"
"Evt-"
"No comment"
"incorrect"

214.23345
238.11100
287.04997
379.90897

Array 14:


FINAL:
"CHANNEL" "14" 1
"Evt-" 1
"No comment" 1
"incorrect" 1
1
1
214.23345 1
238.11100 1
287.04997 1
379.90897 1

I'm surprised you say you didn't get any error messages.
I did, all having to do with using a numeric comparison
operator ("<=>") in the sort when you should have used
"cmp" instead, but granted, that doesn't really tell you
why the program's not working.

$/ = "" sets the input record separator to "paragraph
mode": a record ends at one or more blank lines. A "blank
line" cannot contain spaces or tabs. The first truly blank
line in your data is before "CHANNEL 14". (The earlier
apparently blank lines DO contain spaces or tabs.)

The first read of <DATA> takes everything up until the blank
line before "CHANNEL 14", not just the numbers you're after.
Then you say
{ @Array13 = split (/\n/, <DATA>); }
The <DATA> here performs a new read from the UNREAD portion
of your data file, so that is what ends up in @Array13.
You probably meant to say split (/\n/, $_),
which would have stored everything READ until that point,
but since that means everything up until that blank line
before "CHANNEL 14", that still wouldn't have been all
that great.

Your data file is now exhausted, since the <DATA>
part of your first split statement read it to the
end. This is why Array14 ends up empty. There's nothing
left to read, and so we've dropped out of the while loop
before the second if statement is ever executed.
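
For reference, here is a minimal sketch of how paragraph mode behaves when the record that was read is split, rather than a fresh read from the filehandle. It is an illustration only, not a fix for the script: the sub name paragraphs_from and the sample text are made up, and it reads from an in-memory string so it runs without the data file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the records of $text as read in paragraph mode.
sub paragraphs_from {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
    local $/ = "";    # paragraph mode: a record ends at one or more empty lines
    my @records;
    while (defined(my $record = readline($fh))) {
        chomp $record;    # in paragraph mode, chomp removes all trailing newlines
        push @records, $record;
    }
    close $fh;
    return @records;
}

# Made-up sample: the line between "a" and "b" holds a single space,
# so it does NOT end a paragraph; the truly empty line after "b" does.
my $sample  = "a\n \nb\n\nc\nd\n";
my @records = paragraphs_from($sample);
print scalar(@records), " records\n";
for my $record (@records) {
    my @lines = split /\n/, $record;    # split the record that was READ
    print scalar(@lines), " lines\n";
}
```

The point of the sample is the line holding one space: it looks blank but does not count as a paragraph break, which matches the behaviour described above.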

After that the only error I see is using the numeric
comparison operator "<=>" instead of the string operator
"cmp", but by then the damage is done.

Here's my fix on the code, which I think does what you
want.

Code:
#!perl  
use diagnostics; 

$datafile='monkey6-03.txt';
 
open(DATA, $datafile) || die qq(Can't open "$datafile" for input\n);

while (<DATA>) {
  chomp;  #remove newline from $_

  #set $col2 depending on whether we've seen '"CHANNEL" "13"' or '"CHANNEL" "14"'
  #this tells us which array to push to, and is also the data we want in the second column

  if (/^"CHANNEL" "13"/) {
    $col2 = 1;
  } elsif (/^"CHANNEL" "14"/) {
    $col2 = 0;
  }

  #if we've seen proper numeric pattern, push $_ to appropriate array based on $col2

  if (/^\d+\.\d+/) {
    if ($col2) {
      push @Array13, "$_\t$col2";
    } else {
      push @Array14, "$_\t$col2";
    }
  }
}


$"= "\n";

print "Array 13:\n@Array13\n\nArray 14:\n@Array14\n\n";

push (@Final, @Array13, @Array14);

print "FINAL:\n@Final";

As you can see, I've given up on the $/ = ""
"paragraph mode" bit, and am just reading the file
line-by-line. If I see "CHANNEL" "13" or "CHANNEL" "14", I set the variable $col2 appropriately. Then, if I see the numeric pattern I'm interested in, I use $col2 to tell me which array I should push to. $col2 is also, conveniently, the data I want to use in the second column of the output. I've also gone with a tab "\t" as the delimiter between the number from the file and the "1" or "0", since this will import more easily into Excel, which is where you say you want to put it.

I didn't bother with the sorting bit, since the list
is already sorted the way you want it.

Hope this helps.




 
Thanks for your help. That almost works, but I'm not getting the second column with the 0's and 1's. Also, that's a shortened version of the data file; the one I will actually use is much longer, with longer lists of numbers plus some other irrelevant info afterwards. When I run this script on the longer version of the data file, I get error messages like this:

Use of uninitialized value in concatenation (.) or string at monkey5.pl line
25, <DATA> line 22 (#1)
Use of uninitialized value in concatenation (.) or string at monkey5.pl line
25, <DATA> line 23 (#1)
Use of uninitialized value in concatenation (.) or string at monkey5.pl line
25, <DATA> line 24 (#1)
Use of uninitialized value in concatenation (.) or string at monkey5.pl line
25, <DATA> line 25 (#1)
Use of uninitialized value in concatenation (.) or string at monkey5.pl line

Then the printed arrays afterwards.

Also, I do need the sort. This was just an example data file, and it won't always happen to be in order already. You said I did the sort wrong the first time, so how would I do that properly?


Thanks,

Lauren
 
The "uninitialized value" is probably $col2, which I
didn't initialize outside the loop. I recommend running
your scripts with "use strict" and declaring variables
with "my" before they're used, but didn't do it here since
I figured that was best left for another time, and I didn't
want to change your script more than I had to.
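
As a rough sketch of what that could look like (not the script as posted): the version below declares everything with "my", leaves $col2 undefined until a wanted CHANNEL line has been seen, and skips lines until then. The sub name tag_lines and the sample lines are invented for illustration, and resetting $col2 on any other "CHANNEL" header is an assumption about the longer data file.

```perl
use strict;
use warnings;

# Hypothetical helper: tag each numeric line with 1 (channel 13) or 0 (channel 14).
sub tag_lines {
    my @lines = @_;
    my $col2;    # stays undefined until a wanted CHANNEL header has been seen
    my @tagged;
    for my $line (@lines) {
        if    ($line =~ /^"CHANNEL" "13"/) { $col2 = 1; }
        elsif ($line =~ /^"CHANNEL" "14"/) { $col2 = 0; }
        elsif ($line =~ /^"CHANNEL"/)      { undef $col2; }    # assumption: ignore other channels
        next unless defined $col2;    # skip everything outside channels 13 and 14
        push @tagged, "$line\t$col2" if $line =~ /^\d+\.\d+/;
    }
    return @tagged;
}

# Made-up sample lines; the leading 12.5 arrives before any CHANNEL header.
my @sample = ('12.5', '"CHANNEL" "13"', '49.89185', '"CHANNEL" "14"', '214.23345');
print "$_\n" for tag_lines(@sample);
```

With $col2 checked by "defined", a number that shows up before the first header is dropped instead of triggering the "uninitialized value" warning.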

If you're going to sort strings, you should use "cmp"
rather than "<=>" as the comparison operator.
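
One caveat worth checking before switching operators: "cmp" compares character by character, so "214.23345" sorts ahead of "49.89185". If the rows should end up in numeric order of the first column, a sketch along these lines may be closer to what's wanted. It assumes tab-separated rows like the ones the script builds; the sub name sort_rows_numerically and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: sort "number<TAB>flag" rows by the leading number.
sub sort_rows_numerically {
    my @rows = @_;
    return map  { $_->[1] }                        # 3. keep only the original row
           sort { $a->[0] <=> $b->[0] }            # 2. compare the extracted numbers
           map  { [ (split /\t/, $_)[0], $_ ] }    # 1. pair each row with its first field
           @rows;
}

# Made-up rows, deliberately out of order.
my @Final = ("214.23345\t0", "49.89185\t1", "61.91400\t1");
print "$_\n" for sort_rows_numerically(@Final);
```

Pulling the number out first means "<=>" only ever sees a plain number, so it should not complain about non-numeric arguments.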

If you're getting that message about "uninitialized value",
there's probably at least one line before the first
occurrence of "CHANNEL" "13" or "CHANNEL" "14" that matches
that numeric pattern /^\d+\.\d+/. This wasn't the case
in your example data, where the numbers were always
preceded by one or the other.

I changed your script to make it work with the data you
posted. Here's the output of the script running on that
data:

Array 13:
49.89185 1
53.72692 1
57.82880 1
61.91400 1

Array 14:
214.23345 0
238.11100 0
287.04997 0
379.90897 0

FINAL:
49.89185 1
53.72692 1
57.82880 1
61.91400 1
214.23345 0
238.11100 0
287.04997 0
379.90897 0

I think that was exactly what you said you wanted. If
you're now running it on different data, it's not going to
work the same.

Look into using "my" and declaring all your variables
before you use them. Then look at the data you're
actually using and see how you can modify the code I
posted to make it work with that data.
 
Oh, I see what it's doing now. The 0's and 1's are there, but it didn't put the tab in, so it looks like this:

Array 13:
49.891851
53.726921
57.828801
61.914001

Array 14:
214.23340
238.11100
287.04990
379.90897 0

FINAL:
49.891851
53.726921
57.828801
61.914001
214.23340
238.11100
287.04990


Strangely it did put the tab in for one of them though. I see you have the \t in there so I don't know why it's not doing the tab in between the columns?


Thanks,

Lauren
 
Turns out putting 2 tabs works (\t\t), except for that one number at the end of Array 14, which prints out with 2 tabs now. This is fine for me because I only really care about the Final array anyway, but I wish I understood why it's doing that...

 
It looks like it's because the tab stop is in the 9th column: 379.90897 is 9 characters, so it puts the 0 at the next stop. Using the printf function would help, unless you're planning on writing this out to a tab-delimited file.
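
A small sketch of the printf idea, for on-screen output only. It uses sprintf, which takes the same format string as printf, so the result can be stored or printed. The sub name format_row and the width of 12 are arbitrary choices for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the number to a fixed width so the flag column lines up.
sub format_row {
    my ($number, $flag) = @_;
    return sprintf("%-12s%d", $number, $flag);    # 12 is an arbitrary column width
}

for my $row (["49.89185", 1], ["379.90897", 0]) {
    print format_row(@$row), "\n";
}
```

For a file that will be opened in Excel, a single "\t" between the columns is probably still the better delimiter, since fixed-width padding is only for display.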
 