How to Grep array element substrings using Hash element patterns?

Status
Not open for further replies.

makwise

Technical User
Apr 24, 2008
US
Hi.

I'm new to Perl and not an expert in any particular programming language.

I've written a script that does the following:
1) Takes a list in FileA as input to a hash (%GrepPatternHash).
2) Takes a list in FileB as input to an array (@TargetList).
3) Uses the elements of that hash as the search pattern to Grep for in the array.
4) Outputs matched strings to FileC.

Problem:
Grep returns matches only if the whole element in the hash matches the whole element in the array precisely.
I need it to return matches of substrings within the array elements, e.g.

abc in FileA matches/returns abc in FileB.
abc in FileA does not match/return zabcd in FileB.

I need both of these conditions to return a match.

Code:
@GrepResults = grep ($GrepPatternHash{$_}, @TargetList);

Efforts so far:
a) I've tried concatenating regex metacharacters such as '.*' and '\b' to $GrepPatternHash{$_}. They returned errors or many false matches, respectively.
b) I've tried including the same metacharacters in FileA ahead of time. They seem to have been treated as literals rather than as metacharacters.

It seems like there should be an easy way to do this, but I've had no luck finding a solution.

Any help is greatly appreciated.

makwise.
 
try:

Code:
grep (/$GrepPatternHash{$_}/, @TargetList);

I think what you have is equivalent to:

Code:
grep ($GrepPatternHash{$_} eq $_, @TargetList);

which must be an exact match string to string.

------------------------------------------
- Kevin, perl coder unexceptional!
 
Thanks Kevin.

I tried your suggestion and it returned inverse results to what I was getting, i.e. I got everything back except precise matches.
It behaved as though I had specified ! like so:

Code:
grep (! $GrepPatternHash{$_}, @TargetList);

So, for kicks, I tried your suggestion again, but with ! i.e.

Code:
grep (!/$GrepPatternHash{$_}/, @TargetList);

I ended up with the same problem I started with, i.e. only precise matches.

Any other thoughts?

Thanks.

Mark.
 
It returned everything "except" precise matches? You mean it returned zabcd but didn't return abc ? Maybe you need to show us some real data.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Rather than posting the actual data at this time, how about this:

filea contains...
abc
def
ghi

fileb contains...
abc
zabcd
yza

When my original script is run, it returns:
abc

With Kevin's suggested modification,

Code:
grep (/$GrepPatternHash{$_}/, @TargetList);

...it returns...
zabcd
yza

The desired return is...
abc
zabcd

Below is my original script up to the grep statement. Print statements have been removed for conciseness.

Code:
#!/usr/bin/perl

###Set scalar variables for input filenames.###
open (GrepPatternListFileName, "filea");
open (TargetListFileName, "fileb");

###Read Contents of Input Files into Arrays.###
@GrepPatternList = <GrepPatternListFileName>;
@TargetList = <TargetListFileName>;

###Read Contents of GrepPatternList Array into GrepPatternHash.###
%GrepPatternHash = ();
foreach (@GrepPatternList){
	$GrepPatternHash{$_}=1;
};

###Grep GrepPatternHash elements against TargetFileList.###
@GrepResults = grep ($GrepPatternHash{$_}, @TargetList);

Thanks,

Mark.
 
Don't you need something like

Code:
for my $key (keys %GrepPatternHash) {
    my @GrepResults = grep /$GrepPatternHash{$key}/, @TargetList;
    print "@GrepResults\n";
}

instead of

Code:
###Grep GrepPatternHash elements against TargetFileList.###
@GrepResults = grep ($GrepPatternHash{$_}, @TargetList);

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Are you chomping the lines from filea before you load them into the hash? A trailing newline on each one would give the symptoms you are seeing...
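
A minimal sketch of that symptom, assuming the lines are used exactly as read (no chomp). The data is inlined here for illustration instead of being read from filea and fileb, and the variable names other than @TargetList are made up for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One pattern line as it would come from filea, newline still attached.
my $raw_pattern = "abc\n";

# Target lines as they would come from fileb, newlines still attached.
my @TargetList = ("abc\n", "zabcd\n");

# Unchomped: the pattern is "abc" followed by a newline, so "zabcd\n"
# cannot match because "abc" is followed by "d" there.
my @unchomped_hits = grep { /$raw_pattern/ } @TargetList;

# Chomped: the pattern is just "abc", which matches as a substring.
chomp(my $clean_pattern = $raw_pattern);
my @chomped_hits = grep { /$clean_pattern/ } @TargetList;

print "unchomped: ", scalar(@unchomped_hits), " match(es)\n";   # 1
print "chomped:   ", scalar(@chomped_hits),   " match(es)\n";   # 2
```

With the newline left on, only the line that is exactly abc matches, which looks like a whole-string comparison even though a regex is being used.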

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)
 
I'm not sure how your code finds anything; all of the search patterns are assigned the number one:

Code:
foreach (@GrepPatternList){
    $GrepPatternHash{$_}=1;
};

change the above to:

Code:
foreach (@GrepPatternList){
    $GrepPatternHash{$_}=$_;
};

and hopefully that works. You should also "chomp" the input files, as has been mentioned.
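
For anyone following along, here is a small sketch that reproduces the two results reported earlier in the thread, assuming the hash is built as in the posted script (unchomped lines as keys, every value set to 1). The sample data is inlined, and @lookup_hits and @regex_hits are names invented for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keys are whole lines (newline included); every value is 1.
my %GrepPatternHash = map { ($_ => 1) } ("abc\n", "def\n", "ghi\n");
my @TargetList      = ("abc\n", "zabcd\n", "yza\n");

# Original form: a hash lookup keyed by the whole target line,
# so only a line identical to a key is kept.
my @lookup_hits = grep { $GrepPatternHash{$_} } @TargetList;

# Suggested form: the hash VALUE becomes the pattern. For "abc\n" the
# pattern is /1/, which fails. For the other lines the lookup is undef,
# which interpolates as an empty pattern and matches.
my @regex_hits;
{
    no warnings 'uninitialized';   # silence the undef-in-regex warning
    @regex_hits = grep { /$GrepPatternHash{$_}/ } @TargetList;
}

print "lookup: @lookup_hits";   # abc
print "regex:  @regex_hits";    # zabcd, yza
```

In both forms $_ is the current element of @TargetList, not a pattern from filea, so neither one ever tests a pattern against a different line. Looping over the patterns themselves avoids that.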

------------------------------------------
- Kevin, perl coder unexceptional!
 
Thank you all for your help. I really appreciate it.

The following got me where I needed to be...

Code:
foreach $an ( keys %GrepPatternHash ) {
 chomp($an);                                # strip the trailing newline read from filea
 @GrepResults = grep (/$an/, @TargetList);  # substring match; overwritten for each pattern
}

Mark.
 
There ya go, that does essentially the same thing my suggestion would have because now you are using the keys in the hash instead of the values to find the patterns. But now if you do it that way you really don't even need to use a hash, you could just use an array, unless the hash is used for other things somewhere else in the script. The array looping should also be faster than the hash looping.
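
A rough sketch of the array-only version, under a few assumptions: the data is inlined instead of being read from filea and fileb, the patterns are meant as literal substrings (hence quotemeta), and all patterns are combined into one regex so that @GrepResults collects matches for every pattern in a single pass. $any_pattern is a name made up for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for the lines read from filea and fileb.
my @GrepPatternList = ("abc\n", "def\n", "ghi\n");
my @TargetList      = ("abc\n", "zabcd\n", "yza\n");

chomp(@GrepPatternList);
chomp(@TargetList);

# Drop blank pattern lines (an empty alternative would match everything)
# and escape regex metacharacters so each pattern is a literal substring.
my $alternation = join '|',
                  map  { quotemeta($_) }
                  grep { length }
                  @GrepPatternList;
my $any_pattern = qr/$alternation/;

# Keep every target line that contains at least one pattern.
my @GrepResults = grep { /$any_pattern/ } @TargetList;

print "$_\n" for @GrepResults;   # abc, then zabcd
```

If the patterns in filea are meant to be real regular expressions, drop the quotemeta call. Results come back in @TargetList order, and a line matching several patterns is listed once.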

------------------------------------------
- Kevin, perl coder unexceptional!
 