Placing while loop

Status
Not open for further replies.

arun10427 (Programmer, US), Oct 24, 2009
Hey all,
I need help with this problem:
1. I have a huge data file which consists of millions of records.
2. There is a chunk column that groups the records into chunks of counters.

Example:

Chunk Processor
1 1
1 2
1 2
1 3
1 0
1 2
f 1
f 2
f 4
f 4
s 6
s 3
s 2
s 1
f 3
g 2
g 6

I want to do some calculations on the processor column based on the chunk column. That is, for all the 1s I perform a function on the processor column, for all the fs I perform a function on the processor column, and so on. I am confused as to where to place the while loop:
while (the data belongs to one chunk)
{ }

Could you please help me with it? I would really appreciate it. :)
 
I'm not sure what you want to do with your data, but from what I understand, you want to parse the file and perform different functions depending on the value in the chunk column:

Code:
while (<FILE>) {
     chomp;
     my ($chunk, $processor) = split(/\t/,$_);
     if ($chunk = 1) {
          #Do something
     }
}

If you need to perform a function which incorporates multiple lines, you'll need to parse your data into a hash table.

It usually helps us if you provide an example of your expected output.
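
Since the file holds millions of records, one alternative to loading everything into a hash is to keep the while loop over the lines and run the per-chunk calculation whenever the chunk value changes. This is a minimal sketch, assuming rows of the same chunk sit next to each other (in the sample, f appears in two separate runs, so each run is handled on its own). The names process_chunk and process_runs are hypothetical, and the sum is only a placeholder for the real calculation:

```perl
use strict;
use warnings;

# Hypothetical helper: called once per run of consecutive lines sharing a chunk value.
sub process_chunk {
    my ($chunk, $values) = @_;
    my $sum = 0;
    $sum += $_ for @$values;          # placeholder calculation
    return "$chunk: sum=$sum";
}

# Reads "chunk processor" pairs from a filehandle and returns one result per run.
sub process_runs {
    my ($fh) = @_;
    my (@results, $current, @values);
    while (my $line = <$fh>) {
        chomp $line;
        my ($chunk, $processor) = split ' ', $line;
        next unless defined $processor;   # skip blank or malformed lines
        if (defined $current && $chunk ne $current) {
            # The chunk value changed: finish the previous run first.
            push @results, process_chunk($current, \@values);
            @values = ();
        }
        $current = $chunk;
        push @values, $processor;
    }
    # Do not forget the last run once the file is exhausted.
    push @results, process_chunk($current, \@values) if defined $current;
    return @results;
}

# In-memory sample standing in for the real file.
my $sample = "1 1\n1 2\nf 1\nf 2\nf 4\ns 6\nf 3\n";
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print "$_\n" for process_runs($fh);
close $fh;
```

Only one run is held in memory at a time, which matters for a file of this size. If all rows for a chunk must be combined regardless of position, a hash keyed by chunk is the better fit.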

Chris
 
This line:
if ($chunk = 1) {

Should have been
if ($chunk == 1) {

Or possibly:
if ($chunk eq '1') {

Chris
 
He'll have to use eq, because the chunk column mixes numbers and letters and so has to be compared as a string.
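
To illustrate the point, here is a small sketch of what goes wrong with == on this data. The function action_for and the action names are hypothetical placeholders, not part of the original code:

```perl
use strict;
use warnings;

# Hypothetical dispatch: pick an action name from the chunk label, compared as a string.
sub action_for {
    my ($chunk) = @_;
    return 'numeric-chunk action' if $chunk eq '1';
    return 'f-chunk action'       if $chunk eq 'f';
    return 'default action';
}

{
    # With ==, non-numeric strings such as 'f' and 's' are both treated as 0,
    # so they compare equal (the "isn't numeric" warnings are silenced here).
    no warnings 'numeric';
    print "'f' == 's' is ", ('f' == 's' ? 'true' : 'false'), "\n";
}
print "'f' eq 's' is ", ('f' eq 's' ? 'true' : 'false'), "\n";
print "$_ -> ", action_for($_), "\n" for qw(1 f s);
```

With warnings enabled, Perl reports the numeric comparison of 'f' as a non-numeric argument, which is a useful hint that eq was intended.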

Travis
 
You're right. I'm not sure why I even mentioned comparing numerical values with ==, let alone saying it was possibly eq when it most definitely is eq.

Chris
 
I've modified Zhris's code to collect the processors into a hash of arrays. All you need to do now is write your processor function for the calculations that you want to do...
Perl:
use strict;
use warnings;
use Data::Dumper;

my %chunks;

while (<DATA>) {
    chomp;
    my ($chunk, $processor) = split;
    push @{$chunks{$chunk}}, $processor;
}

print Dumper(\%chunks);   # pass a reference so the hash is dumped as one structure

foreach my $chunk (sort keys %chunks) {
   # perform your function on the processor array for that chunk here...
}
__DATA__
1 1
1 2
1 2
1 3
1 0
1 2
f 1
f 2
f 4
f 4
s 6
s 3
s 2
s 1
f 3
g 2
g 6

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)
 
Thanks Steve for the reply!

But what do you mean by <DATA> in the while statement?
 
__DATA__ (and the <DATA> construct) are a way of letting Perl read some data that's embedded in the source file. Anything after the __DATA__ is treated as data, so I just used this as a way of posting a complete working example for you.
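
For the real data file, the same loop can read from an ordinary filehandle instead of DATA. This is a rough sketch: read_chunks is a hypothetical wrapper around the loop above, and records.txt is an assumed file name used only in the comment:

```perl
use strict;
use warnings;

# Builds the hash of arrays from any readable filehandle.
sub read_chunks {
    my ($fh) = @_;
    my %chunks;
    while (my $line = <$fh>) {
        chomp $line;
        my ($chunk, $processor) = split ' ', $line;
        next unless defined $processor;   # skip blank or malformed lines
        push @{ $chunks{$chunk} }, $processor;
    }
    return \%chunks;
}

# For a real file, assuming a hypothetical name records.txt:
#   open my $fh, '<', 'records.txt' or die "Cannot open records.txt: $!";
#   my $chunks = read_chunks($fh);
#   close $fh;

# Here an in-memory string stands in for the file so the sketch runs as-is.
my $sample = "1 1\n1 2\nf 4\ng 6\n";
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $chunks = read_chunks($fh);
close $fh;
print "$_: @{ $chunks->{$_} }\n" for sort keys %$chunks;
```

Passing \*DATA to read_chunks would behave like the posted example, so the embedded data and a file on disk can share the same code.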

Steve

 
Thank you, Steve. But how is $processor accessed inside the foreach loop, given that it's global?
 
$processor is a lexical variable declared with my inside the while loop, not a global, so as you guessed, you would not be able to access it outside that loop.

Here is an example of accessing the contents of the hash stevexff constructed:

Code:
foreach my $chunk (sort keys %chunks) {
        # perform your function on the processor array for that chunk here...
        print "values for chunk $chunk are:";
        foreach my $value (@{$chunks{$chunk}}) {
                print " $value";
        }
        print "\n";
}

Annihilannic.
 
Because %chunks is declared outside the while loop, it stays in scope for the rest of the file, so its scope is effectively global. Each hash key has a value that is a reference to an array of processor values. Although $processor is declared inside the loop and goes out of scope when the loop ends, the transient values it holds on each iteration get added to the correct arrays held by the keys of %chunks.

The reason I threw in the print Dumper statement was so that you could see the structure that was being created. Perhaps I should have explained that earlier...
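
As a sketch of the step left open in the foreach loop, the following computes a sum and mean for each chunk. The function summarize is a hypothetical stand-in for whatever calculation is actually needed, and the hash is written out by hand in the same shape that the while loop builds from the sample data:

```perl
use strict;
use warnings;

# Hypothetical calculation: sum and mean of one chunk's processor values.
sub summarize {
    my ($values) = @_;
    my $n   = @$values;               # number of values in this chunk
    my $sum = 0;
    $sum += $_ for @$values;
    my $mean = $n ? $sum / $n : 0;    # avoid dividing by zero on an empty chunk
    return ($sum, $mean);
}

# Same shape as the structure Dumper shows: chunk => [processor values]
my %chunks = (
    1 => [1, 2, 2, 3, 0, 2],
    f => [1, 2, 4, 4, 3],
    s => [6, 3, 2, 1],
    g => [2, 6],
);

foreach my $chunk (sort keys %chunks) {
    my ($sum, $mean) = summarize($chunks{$chunk});
    printf "chunk %s: n=%d sum=%d mean=%.2f\n",
        $chunk, scalar @{ $chunks{$chunk} }, $sum, $mean;
}
```

Each call receives the array reference stored under one key, so $processor itself is never needed once the while loop has finished.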

Steve

 