GREPing large amounts of data.

Status
Not open for further replies.

ericse

Programmer
Apr 19, 2007
32
US
Hi all-

I'm basically trying to write a script that will generate statistics for each configuration setting we have in our application.

There are about 70 settings and about 300 configuration files. I need to generate a report for each of the 70 settings. For example: Setting A is "true" in 125 of the 300 configuration files and "false" in the other 175.

And so forth. My biggest concern isn't how to accomplish this, which is relatively painless for any individual setting, but how to get the aggregate efficiently.

Anyone have any ideas?

Thanks
~Eric
 
I'm not sure I understand your question :)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Travis - Those Who Say It Cannot Be Done Are Usually Interrupted by Someone Else Doing It; Give the wrong symptoms, get the wrong solutions;
 
Well. I have 300 different config files. Right now, the easiest way to get my answer is something like this:

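my $cnt = 0;

# grep -c prints the number of matching lines in the file
my @results = `grep -c 'HidePrivacyStatement] t' /some/file/here.config`;
foreach (@results) {
    # count each result whose match count ends in 1
    $cnt++ if /1$/;
}

print "$cnt\n";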

But this seems costly to run in a loop 70 times over (once for each setting)...

I'd like some input on another way to approach this.

Thanks
~Eric
 
How about just using Unix grep and word count?

grep -c 'HidePrivacyStatement] t' /some/file/here.config | grep '1$' | wc -l

That should be more efficient than building an array in Perl and looping over it.
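
For comparison, the same count can be done without starting an external grep process at all. This is only a minimal sketch, assuming the goal is "how many files contain at least one matching line" (the pipeline above instead counts results whose match count ends in 1). The sub name count_matching_files is hypothetical, and the file list is taken from the command line because the thread does not give the real paths.

```perl
use strict;
use warnings;

# Count how many of the given files contain at least one line matching $pattern.
sub count_matching_files {
    my ($pattern, @files) = @_;
    my $cnt = 0;
    FILE: foreach my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            if ($line =~ $pattern) {
                $cnt++;
                next FILE;    # one hit is enough; $fh is closed when it goes out of scope
            }
        }
    }
    return $cnt;
}

# Runs only when file names are passed on the command line.
print count_matching_files(qr/HidePrivacyStatement\] t/, @ARGV), "\n" if @ARGV;
```

This still reads every file once per setting, so it removes the process overhead but not the 70 passes.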

thanks,
 
Is it one setting per line, in a format like this?
[configuration] value

Use a hash of hashes:

Pseudo-code:
Code:
my %confighash;
foreach config file {
  foreach line {
    /^[(.*)]\s+(.*)$/;
    $confighash{$1}{$2}++;
  }
}



 
Bad regexing by me there: I didn't escape the square brackets. The pattern should be /^\[(.*)\]\s+(.*)$/;

But the idea is that you only scan each line once.
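
A runnable version of that idea might look like the following. It is a sketch under the same unconfirmed assumption that each line has the form "[Setting] value". The sub names tally_settings and print_report are hypothetical, and the config files are passed on the command line.

```perl
use strict;
use warnings;

# Read every file once and count each (setting, value) pair.
sub tally_settings {
    my @files = @_;
    my %confighash;
    foreach my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            # Brackets are escaped; [^\]]+ stops at the first closing bracket.
            # Trailing whitespace, including the newline, is left out of the value.
            if ($line =~ /^\[([^\]]+)\]\s+(.*?)\s*$/) {
                $confighash{$1}{$2}++;
            }
        }
        close $fh;
    }
    return \%confighash;
}

# Print one line per (setting, value) pair with its count.
sub print_report {
    my ($tally) = @_;
    foreach my $setting (sort keys %$tally) {
        foreach my $value (sort keys %{ $tally->{$setting} }) {
            print "$setting = $value: $tally->{$setting}{$value}\n";
        }
    }
}

# Runs only when file names are passed on the command line.
print_report(tally_settings(@ARGV)) if @ARGV;
```

If each setting appears at most once per file, each count is also the number of files with that value, so all 70 reports come out of a single pass over the 300 files.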
 
As brigmar suggests, just use a complex data structure, adding up the cumulative values one config file at a time.

I'd suggest that you use one of the many Config modules out there for parsing. Not knowing your exact format, we can't give specifics. However, I'm betting the following would work:

Code:
use Config::Tiny;
use Data::Dumper;

use strict;

my @configfiles = ('file.conf');

my %totals;

foreach my $file (@configfiles) {
	my $config = Config::Tiny->read( $file );

	while (my ($section, $settings) = each %$config) {
		while (my ($key, $value) = each %$settings) {
			$totals{$section}{$key}{$value}++;
		}
	}
}

# Print Totals
print Dumper(\%totals);

1;

__END__

Tested using the following for file.conf

# In your configuration file
rootproperty=blah

[section]
one=twp
three= four
Foo =Bar
empty=
------------------------------------------------------------
Pragmas (perl 5.8.8) used :
  • strict - Perl pragma to restrict unsafe constructs
Core (perl 5.8.8) Modules used :
  • Data::Dumper - stringified perl data structures, suitable for both printing and eval
Other Modules used :
  • Config::Tiny
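
To get from a nested totals structure to the kind of report asked for at the top of the thread, something like the following could work. It is a minimal sketch, assuming the $totals{section}{key}{value} layout used above. The sub name report_lines and the sample data are hypothetical (the numbers are the ones from the original question), and it does not need Config::Tiny itself.

```perl
use strict;
use warnings;

# Turn $totals->{section}{key}{value} counts into readable report lines.
sub report_lines {
    my ($totals, $file_count) = @_;
    my @lines;
    foreach my $section (sort keys %$totals) {
        foreach my $key (sort keys %{ $totals->{$section} }) {
            my $values = $totals->{$section}{$key};
            foreach my $value (sort keys %$values) {
                push @lines, sprintf(
                    '[%s] %s is "%s" for %d out of %d configuration files',
                    $section, $key, $value, $values->{$value}, $file_count
                );
            }
        }
    }
    return @lines;
}

# Hypothetical sample data, not real results.
my %sample = ( section => { SettingA => { true => 125, false => 175 } } );
print "$_\n" for report_lines(\%sample, 300);
```

The file count is passed in separately because the totals hash only records how often each value was seen, not how many files were read.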

- Miller
 