convert format

Status
Not open for further replies.

noyk

MIS
Feb 20, 2009
23
MY
Hi,
I need to change this output

821 940 0.31 - 0 id "g1";
989 1205 0.53 - 1 id "g1";
2141 2621 0.37 - 2 id "g1";
2681 2847 0.89 - 1 id "g1";
2921 2925 0.71 - 0 id "g1";
5897 6076 0.55 + 0 id "g2";
6129 6202 0.58 + 0 id "g2";
6265 6381 0.57 + 1 id "g2";

into something like this, i.e. I need to rearrange the first two columns in this way when the id is identical. Can someone suggest how this can be done?

g1 (821..940,989,1205,2141..2621,2681..2847,2921..2925)
g2 (5424..5801,5897..6076,6129..6202,6265..6381)

I am thinking of something along these lines, but couldn't really get it to work. Thanks for any help.

my $file = shift;
open FILE, $file;
while (my $line = <FILE>)
{
    my $id = "id \"(.*)\"";
    @line = split /\t/,$line;
    if ($line =~ m/$id/)
    {
        our $coord = "$line[0]..$line[1],";
        print "$coord";
    }
}
 
Hi

As far as I understand your requirement, this would be enough:
Code:
perl -F'/[\s";]+/' -nae 'if($l ne$F[6]){print")\n"if$l;$l=$F[6];print"$l ("}else{print","}print "$F[0]..$F[1]";END{print")\n"}' /input/file

Feherke.
 
According to the output of this code, there is a typo in line 1 of your expected output...
Perl:
use strict;
use warnings;

my %ids;

while (<DATA>) {
    chomp;
    my ($from, $to, undef, undef, undef, $id) = split(/[\s+\-\"]+/);
    push @{$ids{$id}}, $from . '..' . $to;
}

print "$_\t(" . join(',', @{$ids{$_}}) . ")\n" foreach (sort keys %ids);

__DATA__
821    940    0.31    -    0    id "g1";
989    1205    0.53    -    1    id "g1";
2141    2621    0.37    -    2    id "g1";
2681    2847    0.89    -    1    id "g1";
2921    2925    0.71    -    0    id "g1";
5897    6076    0.55    +    0    id "g2";
6129    6202    0.58    +    0    id "g2";
6265    6381    0.57    +    1    id "g2";
which results in the following (note 989..1205, where your expected output had 989,1205):
Code:
g1      (821..940,989..1205,2141..2621,2681..2847,2921..2925)
g2      (5897..6076,6129..6202,6265..6381)
Note that it assumes your data are already sorted so the accumulated ranges build up nicely in ascending sequence. If not, you will have to sort the collected data in the arrays...
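If the input is not sorted, a minimal sketch of the extra sorting step could look like the following. It is only an illustration: the helper name group_sorted_ranges and the capturing regex are assumptions of mine rather than part of the code above, and it stores each range as a [from, to] pair so the sort can compare numerically.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): groups ranges by id and sorts
# each group numerically by start, then by end, before joining them.
sub group_sorted_ranges {
    my @lines = @_;
    my %ids;
    for my $line (@lines) {
        # Capture the first two columns and the quoted id; skip other lines.
        my ($from, $to, $id) = $line =~ /^\s*(\d+)\s+(\d+)\s.*\bid\s+"([^"]+)"/
            or next;
        push @{ $ids{$id} }, [ $from, $to ];
    }
    my @out;
    for my $id (sort keys %ids) {
        my @sorted = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] }
                     @{ $ids{$id} };
        push @out, "$id\t(" . join(',', map { "$_->[0]..$_->[1]" } @sorted) . ")";
    }
    return @out;
}

# Deliberately shuffled rows taken from the sample data.
my @sample = (
    '2141 2621 0.37 - 2 id "g1";',
    '821 940 0.31 - 0 id "g1";',
    '6129 6202 0.58 + 0 id "g2";',
    '989 1205 0.53 - 1 id "g1";',
    '5897 6076 0.55 + 0 id "g2";',
);
print "$_\n" for group_sorted_ranges(@sample);
```

Keeping the pairs numeric until the final join avoids sorting strings such as "989..1205" lexically, which would put them after "2141..2621".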

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)
 
feherke, while respect is obviously due, I really can't handle your cryptic one-liners at this time of the morning.

I clearly need some more coffee... :)

Steve

 
Hi

Steve said:
I really can't handle your cryptic one-liners at this time of the morning.
There is no big difference; mine is just less elegant than your code. My code is faster on huge files, as it does not store the data in memory. In exchange, I had to assume that the data is sorted by id.
Perl:
while (<>) {
  @field=split /[\s";]+/;

  if ($last ne $field[6]) {
    print ")\n" if $last;
    $last=$field[6];
    print "$last (";
  } else {
    print ",";
  }

  print "$field[0]..$field[1]";
}

print ")\n";
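For anyone who prefers to run this under strict and warnings, here is a rough sketch of the same streaming idea wrapped in a sub. The name stream_ranges, the filehandle arguments and the in-memory sample are my own assumptions for illustration; it still relies on the data being grouped by id, and it uses defined checks so that an empty input prints nothing and an id of "0" is not mistaken for "no previous id".

```perl
use strict;
use warnings;

# Hypothetical wrapper (not from the thread): reads lines from $in and writes
# one "id (from..to,...)" line to $out per run of identical ids.
sub stream_ranges {
    my ($in, $out) = @_;
    my $last;
    while (my $line = <$in>) {
        my @field = split /[\s";]+/, $line;
        next unless @field >= 7;                 # skip blank or short lines
        if (!defined $last || $last ne $field[6]) {
            print {$out} ")\n" if defined $last; # close the previous group
            $last = $field[6];
            print {$out} "$last (";
        } else {
            print {$out} ",";
        }
        print {$out} "$field[0]..$field[1]";
    }
    print {$out} ")\n" if defined $last;         # close the final group, if any
}

# Small in-memory sample so the sketch runs without an input file.
my $data = <<'END';
821 940 0.31 - 0 id "g1";
989 1205 0.53 - 1 id "g1";
5897 6076 0.55 + 0 id "g2";
END
open my $in, '<', \$data or die "cannot open in-memory handle: $!";
stream_ranges($in, \*STDOUT);
```

Only the current id is kept between lines, so memory use stays flat no matter how large the file is.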

Feherke.
 