Split problem

Status
Not open for further replies.

lmbylsma

Programmer
Nov 11, 2003
33
US
I have a data file that looks something like this:

1 ABABAB 123456 2223 0.626903553299492
1 ABABAB 123465 60 0.0169204737732657
2 ABABAB 123546 60 0.0169204737732657
1 ABABAB 123564 9 0.00253807106598985
7 AABBAB 123645 8 0.00225606316976875
1 ABABAB 123654 13 0.00366610265087422
10 AAABBB 124356 131 0.0369430344049633

What I want to do is put all this information into an array, splitting it on \t but then further splitting columns 2 and 3 (e.g. ABABAB, 123456) by character, so that each letter and digit ends up in a separate element of the array.

I know how to use split to break everything up at the tabs, @array = split(/\t/, $line), or by character, @array = split(//, $line), but I can't figure out how to tell it to split differently depending on which column it's on.

Is there a simple way to do this?
 
It is simple to do, but not in a single operation. I would do something like this:

my ($col1,$col2,$col3,$col4,$col5) = split/\s+/,$line;
push @array, $col1;
push @array, split//,$col2;
push @array, split//,$col3;
push @array, $col4;
push @array, $col5;
print "$_\n" foreach @array;
 
That would work if I just wanted to print it out, but I want to be able to work with the individual elements later on. That example pushes everything from every line onto one flat, one-dimensional array, but I need to refer to specific elements, like the "A" in the first line, as $array[0]->[1], etc.
 
Sure, so just do this then:

push @{ $array[1] }, split //, $col2;
push @{ $array[2] }, split //, $col3;

Push the split elements onto an array held inside the array: an array of arrays.


You may have to twiddle that syntax a bit; it's conceptual, I did not test it.
 
How about splitting the lines in the usual way and then
splitting the relevant columns of that array later (or
even using substr) to access the individual characters?

Example:
Code:
#!perl
use strict;

open (FILE, "ibys.txt") || die qq(Can't open "ibys.txt" for input\n);

my @array;

while (defined($_=<FILE>)) {
	chomp;
	push @array, [ split /\s+/, $_ ];
}

for (@array) {
	#print chars in 2nd col., comma-separated
	print join(",", split(//, $_->[1])), "\n";

	#print chars in 3rd col., blank-separated, a
	#different way (length() wraps only the string)
	for my $i (0..length($_->[2]) - 1) {
		print substr($_->[2], $i, 1), " ";
	}
	print "\n";
}

Output using example code and data you posted:
A,B,A,B,A,B
1 2 3 4 5 6
A,B,A,B,A,B
1 2 3 4 6 5
A,B,A,B,A,B
1 2 3 5 4 6
A,B,A,B,A,B
1 2 3 5 6 4
A,A,B,B,A,B
1 2 3 6 4 5
A,B,A,B,A,B
1 2 3 6 5 4
A,A,A,B,B,B
1 2 4 3 5 6

 
mikevh, Sure that works for printing but that's not what I want to do with it. I'm going to be doing some more complicated things for which I need everything to be in one array where I can access specific elements.

What Siberian suggested is too simple to work, but if I went along those lines I'd actually have to do something like this because each piece that is split off of $col2 and $col3 needs to go into a separate element:

my @array;
my $i = 0;

while (<>) {

chomp;

my ($col1,$col2,$col3,$col4,$col5) = split/\t/,$_;
my @col2=split //, $col2;
my @col3=split //, $col3;

$array[$i]->[0]=$col1; #first column

$array[$i]->[1]=$col2[0]; #2nd column split into 6 elements
$array[$i]->[2]=$col2[1];
$array[$i]->[3]=$col2[2];
$array[$i]->[4]=$col2[3];
$array[$i]->[5]=$col2[4];
$array[$i]->[6]=$col2[5];

$array[$i]->[7]=$col3[0]; #3rd column split into 6 elements
$array[$i]->[8]=$col3[1];
$array[$i]->[9]=$col3[2];
$array[$i]->[10]=$col3[3];
$array[$i]->[11]=$col3[4];
$array[$i]->[12]=$col3[5];

$array[$i]->[13]=$col4; #4th column
$array[$i]->[14]=$col5; #5th column

$i++;

}


Which seems really inefficient. It works for what I want to do, but isn't there a simpler, more efficient way? If I had a lot more columns than this, it would be really hard to do in this manner.
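
One way the column-by-column assignments above might be generalized, as a minimal sketch rather than a tested answer from the thread: keep a lookup of which columns should be split into characters, then build each flat row in a loop. The name %split_by_char and the hard-coded @lines are assumptions made for illustration; the two sample lines are taken from the data posted above.

```perl
#!perl
use strict;
use warnings;

# Hypothetical setting: zero-based indexes of the columns to split by character.
my %split_by_char = map { $_ => 1 } (1, 2);

# Two sample lines from the posted data, joined with tabs.
# With real input, read lines from a file and chomp them instead.
my @lines = (
    join("\t", 1, 'ABABAB', 123456, 2223, '0.626903553299492'),
    join("\t", 7, 'AABBAB', 123645, 8,    '0.00225606316976875'),
);

my @array;
for my $line (@lines) {
    my @cols = split /\t/, $line;

    # Build one flat row: chosen columns contribute one element per
    # character, every other column contributes a single element.
    my @row;
    for my $i (0 .. $#cols) {
        push @row, $split_by_char{$i} ? split(//, $cols[$i]) : $cols[$i];
    }
    push @array, \@row;
}

print "$array[0]->[1]\n";     # first letter of column 2 on the first line
print "$array[1]->[13]\n";    # column 4 of the second line
```

Adding more columns then only means adding indexes to %split_by_char; the row layout stays the same flat one used in the code above, so $array[0]->[1] is still the first "A".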
 
You may find this a bit neater.
Code:
#!perl
use strict;

my @array;
my $i=0;

while (<>) {
 
 chomp;
 
 my ($col1,$col2,$col3,$col4,$col5) = split/\s+/,$_;
 
 $array[$i]->[0]=$col1;  #first column
 
 $array[$i]->[1]=[ split //, $col2 ]; #2nd col split into 6 elems 
 
 $array[$i]->[2]=[ split //, $col3 ]; #3rd column split into 6 elements
 
 $array[$i]->[3]=$col4; #4th column
 $array[$i]->[4]=$col5; #5th column
 
 $i++;

}

for (@array) {
	#print contents of 2nd column
	for (@{$_->[1]}) {
		print qq($_,);
	}
	print "\n";
	#print contents of 3rd column
	for (@{$_->[2]}) {
		print qq($_ );
	}
	print "\n";
}

There's a nice module called Data::Dumper which I've recently discovered. If you have this on your machine,
you can say

use Data::Dumper;

near the top of your program, and

print Dumper(@array);

to see what @array looks like. It's a very helpful debugging tool when you're working with complicated data
structures.
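
A small sketch of the Data::Dumper idea, assuming the nested layout used in the code above. Passing the array itself dumps each row as its own $VAR1, $VAR2, and so on; passing a reference (\@array) shows the whole structure as one nested $VAR1, which may be easier to read. The two rows are hard-coded here from the posted sample data for illustration.

```perl
#!perl
use strict;
use warnings;
use Data::Dumper;

# Two rows in the nested layout: columns 2 and 3 are array references
# holding one character per element.
my @array = (
    [ 1, [ split //, 'ABABAB' ], [ split //, '123456' ], 2223, '0.626903553299492' ],
    [ 7, [ split //, 'AABBAB' ], [ split //, '123645' ], 8,    '0.00225606316976875' ],
);

# Dumping a reference keeps the whole structure under a single $VAR1.
my $dump = Dumper(\@array);
print $dump;
```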
 
Hmm, even more concise and perhaps also clearer:
Code:
while (<>) {
 chomp; 
 my ($col1,$col2,$col3,$col4,$col5) = split/\s+/,$_;
 
 push @array, [ 
                $col1, 
                [ split //, $col2 ],
                [ split //, $col3 ],
                $col4,
                $col5,
               ]; 
}

or, with $i:
Code:
my $i=0;

while (<>) {
 chomp;
 my ($col1,$col2,$col3,$col4,$col5) = split/\s+/,$_;
 
 $array[$i++] = [ 
                  $col1, 
                  [ split //, $col2 ],
                  [ split //, $col3 ],
                  $col4,
                  $col5,
                 ]; 
}
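
Once the rows are stored this way, individual elements can be reached by chaining indexes. The following is a minimal sketch that builds a single row from the first sample line (hard-coded here as an assumption, instead of reading from <>) and picks out a few values; the variable names are only illustrative.

```perl
#!perl
use strict;
use warnings;

# One row built the same way as in the loops above.
my ($col1, $col2, $col3, $col4, $col5) =
    split /\s+/, "1\tABABAB\t123456\t2223\t0.626903553299492";
my @array = ( [ $col1, [ split //, $col2 ], [ split //, $col3 ], $col4, $col5 ] );

my $first_letter = $array[0][1][0];              # row 0, column 2, first character
my $last_digit   = $array[0][2][-1];             # row 0, column 3, last character
my $count        = $array[0][3];                 # row 0, column 4, unsplit
my $letters      = join '', @{ $array[0][1] };   # column 2 put back together

print "$first_letter $last_digit $count $letters\n";
```

The arrows between subscripts are optional, so $array[0][1][0] and $array[0]->[1]->[0] refer to the same element.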

 