
Lost among hashes of arrays 2


blues77

Programmer
Jun 11, 2002
230
0
0
CA
OK, I'm having absolutely no luck trying to get this hash of arrays thing to work. As you can see from the code below, I've tried several different ways to store several values for a specific key when the $keyExists == 1 and $valueExists == 0 conditions are met. I thought it might be working and that I just wasn't printing the values out properly, so it only looked like nothing was in the hash, but I'm not sure. I've tried several different techniques but none seem to work, and I'm now back to the point where I just have a normal hash. Below is the code I have, with the different attempts I've tried. Can anyone suggest how I can get this to work?

Thanks

Code:
use strict;

#print "$ARGV[0]";
open(MY_FILE1,$ARGV[0]); 
open(OUTPUT, ">>output.txt");

while(<MY_FILE1>)
{
	chomp;
	my $line1 = $_;
	my %hash = ();
	my $flag = 0;
	my $view = 0;
	my $noCAASsource = 0;

	open(MY_FILE2,$ARGV[1]);
	while(<MY_FILE2>)
	{	
		chomp;
		my $line2 = $_;

		if($line2 =~ /^\s*$/)
		{
			#do nothing
		}
		elsif($line2 =~ /^($line1)\s+/)
		{
			if($line2 =~ /^($line1)\s+CAAS.+\.([^.]+)\.([^.]+)/ )
			{
				$flag = 1;
				my $keyExists = 0;
				my $valueExists = 0;
				while ( my ($key, $value) = each(%hash) ) 
				{
					if($key eq $2)
					{
						$keyExists = 1;
						if($value eq $3)
						{
							$valueExists = 1;
						}
					}
				}
				
				if($keyExists == 1 && $valueExists == 0)
				{
					print "key exists adding new value \n";
					#push @{ $hash{$2} },  $3;
					$hash{$2} = $3;
				}
				if($keyExists == 0 && $valueExists == 0)
				{
					print "nothing exists adding new value\n";
					$hash{$2} = $3;
				}
				
				$keyExists = 0;
				$valueExists = 0;
				
				#$hash{$2} = $3;
				#print ("Found the string $line1 in the file $ARGV[1]\n");			
				#print ("Segment = $2 and COBOL Field Name = $3\n");
			}
			if($line2 =~/^($line1)\s+View/)
			{
				$view = 1;
			}
			if($line2 =~/^($line1)\s+NO CAAS Source/)
			{
				$noCAASsource = 1;
			}
		}
		
	}
	if($flag == 0 && $view == 0 && $noCAASsource == 0)
	{
		print "nothing found\n";
		print OUTPUT "nothing found\n";
	}
	elsif($flag == 0 && $view == 1)
	{
		print "view\n";
	}
	elsif($flag == 0 && $view == 0 && $noCAASsource == 1)
	{
		print "NO CAAS Source\n"
	}
	elsif($flag == 1 && $view == 0)
	{
		print keys %hash, ;
		print "\n";
		print "$line1 has a segment and COBOL field\n";
	}
	elsif($flag == 1 && $view == 1)
	{	
		foreach (keys %hash) 
		{
			#remove all whitespace using search and replace
			$_ =~ s/ //g;
			chomp;
			print OUTPUT "$_,";
		}
		
		print OUTPUT "\t";

		#foreach $key ( keys %hash ) 
		#{
		#     print OUTPUT "@{$hash{$key}}\n"
		#}
		

		#foreach (keys %hash)
		#{
		#	my $key = $_;
		#	foreach (0 .. @{ $hash{$key} } - 1) 
		#	{
		#		print "$hash{$key}[$_]\n";
		#	}
		#}
		
			
		foreach (values %hash) 
		{
			#remove all whitespace using search and replace
			$_ =~ s/ //g; 
			chomp;
			print OUTPUT "$_,";
		}
		print OUTPUT "\n";
		
		print "View $line1 has a segment and COBOL field\n";
	}

	$noCAASsource = 0;
	$flag = 0;
	$view = 0;
	close MY_FILE2;
}

close OUTPUT;
close MY_FILE1;
close MY_FILE2;
 
I think that you're going around the houses a bit. Perl data structures auto-vivify, which means that if they don't exist when referenced, they spring into existence.

You seem to want to add something to a list whether or not the list exists. That's easy: assume it exists and add to it. If it did, you've added your value. If it didn't, it gets created as an empty list and your value still gets added.

Let's use %h as a hash of array references.
Code:
my %h;
push @{ $h{'key_one'} }, $value1;

When perl sees [tt]$h{'key_one'}[/tt] it checks %h for the key [tt]key_one[/tt] and creates it if it doesn't exist. From the context ([tt]@{ .. }[/tt]) it can tell that [tt]$h{'key_one'}[/tt] is being used as an array reference, so it creates a new array and points [tt]$h{'key_one'}[/tt] to it.

Only then does [tt]push[/tt] get to see its arguments, which are, by now, simply an array and a scalar. You didn't need to check for existence of the key at all and have saved two variables, $keyExists and $valueExists.
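
A minimal sketch of that point, with made-up values (the keys and numbers here are illustrative, not taken from the original data):

```perl
use strict;
use warnings;

my %h;

# No existence check: the first push creates the array for the key,
# later pushes append to it.
push @{ $h{'key_one'} }, 12;
push @{ $h{'key_one'} }, 22;
push @{ $h{'key_two'} }, 17;

for my $key (sort keys %h) {
    print "$key: @{ $h{$key} }\n";
}
```

Each key ends up holding an array reference, so one key can carry any number of values.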

If you ever do need to check for existence, there's no need to loop through the keys. Use the syntax
Code:
if ( exists $h{'key_one'} ) {
  ...
It's easier on the eyes as well as the CPU, and it doesn't create [tt]$h{'key_one'}[/tt] if it didn't exist. Unlike [tt]defined()[/tt], it is also true for a key whose value is undef.
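
A small sketch of what [tt]exists[/tt] reports, assuming a hash with one key whose value is undef (the data is invented for illustration):

```perl
use strict;
use warnings;

my %h = ( key_one => undef );

# exists asks "is the key there?"; defined asks "is the value defined?"
my $one_exists  = exists  $h{'key_one'} ? 1 : 0;   # key is present
my $one_defined = defined $h{'key_one'} ? 1 : 0;   # but its value is undef

# Testing a missing top-level key with exists does not create it.
my $two_exists = exists $h{'key_two'} ? 1 : 0;
my $key_count  = scalar keys %h;

print "exists=$one_exists defined=$one_defined missing=$two_exists keys=$key_count\n";
```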

Yours,

fish



"As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs."
--Maur
 
Oh my goodness. ROLF. I can't understand why I've never heard that one before.

Next thing you'll be telling me I'm a star, fish. geddit? starfish? geddit?

oh never mind

Have a good weekend, mate.

f

 
Hi, thanks for the tips. I understand what you're saying. The reason I did it that way is because I'm reading my values from a text file that might look like this.

abc keyA.12
efg keyB.17
abc keyA.12
abc KeyA.22

The string abc in this example appears three times. The string following the whitespace on each line holds the key, which is the part before the period, and the value for that key, which is the part after the period. The outer while loop goes through a file that contains strings like 'abc' and 'efg', and the inner while loop goes through another file and uses a regular expression to see whether that string is found. If it is, I want to check whether there is already a key (i.e. KeyA) that contains that value (i.e. 12). If so, I don't want to add it to the hash again. So in the example above, when the script reaches the line with the second 'abc', I don't want anything to be added. In other words, I don't want the key (KeyA) and the value (12) added to the hash a second time.
That was what I was trying to accomplish with the $keyExists and $valueExists variables. I'll try the exists function you recommended. But will that check all the key/value pairs in a hash of arrays? Further to my previous question: in the example above, when the script comes across the third 'abc', I WOULD want to add the value (i.e. 22) to the hash under the key KeyA. I hope I've explained this well.

Thanks again to both of you for all your outstanding help. It is greatly appreciated!
 
The exists function you suggested looks like it will only check for the existence of the key in the hash. Will it also check for the existence of a key/value pair when a key may have multiple values, since I will be using a hash of arrays?
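
On its own, exists only answers whether the key is present. One way to test for a key/value pair in a hash of arrays is to scan that key's array with grep. The sketch below assumes the sample lines above as data (with the key spelled KeyA throughout); add_pair is a hypothetical helper name, not something from the original script.

```perl
use strict;
use warnings;

my %hash;

# add_pair is a hypothetical helper: push $value onto the array for $key
# unless that array already holds it. Returns 1 if added, 0 if skipped.
sub add_pair {
    my ($key, $value) = @_;
    my @seen = exists $hash{$key} ? @{ $hash{$key} } : ();
    return 0 if grep { $_ eq $value } @seen;
    push @{ $hash{$key} }, $value;
    return 1;
}

add_pair('KeyA', '12');
add_pair('KeyB', '17');
add_pair('KeyA', '12');   # duplicate pair, skipped
add_pair('KeyA', '22');   # same key, new value, appended

print "$_: @{ $hash{$_} }\n" for sort keys %hash;
```

The grep scan is linear in the number of values stored under the key, which is fine for short lists but worth keeping in mind for long ones.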
 
Ok here is what I have now but it still doesn't appear to be adding more than one value for a particular key.

[code]
if($line2 =~ /^($line1)\s+CAAS.+\.([^.]+)\.([^.]+)/ )
{
$flag = 1;

if (exists $hash{$2}[$3] )
{
#do nothing
}
elsif (exists $hash{$2})
{
push @{ $hash{$2} }, $3;
}
else
{
$hash{$2} = $3;
}

}
if($line2 =~/^($line1)\s+View/)
{
$view = 1;
}
if($line2 =~/^($line1)\s+NO CAAS Source/)
{
$noCAASsource = 1;
}
[/code]
 
I noticed another strange behavior in this code:
Code:
	if($line2 =~ /^($line1)\s+CAAS.+\.([^.]+)\.([^.]+)/ )
			{
				$flag = 1;
			
				if (exists $hash{$2}[$3] ) 
				{
					print "key value pair already exists for element $line1\n";
					#do nothing
				}
				elsif (exists $hash{$2})
				{
					print "We're adding a second value.  Examining with $line1\n";
					push @{ $hash{$2} }, $3;
				}
				else
				{
					print "We're adding a brand new key\n";
					$hash{$2} = $3;
				}

			}
			if($line2 =~/^($line1)\s+View/)
			{
				$view = 1;
			}
			if($line2 =~/^($line1)\s+NO CAAS Source/)
			{
				$noCAASsource = 1;
			}

The third condition, the one where the last else is executed, is never being met.
Code:
	else
				{
					print "We're adding a brand new key\n";
					$hash{$2} = $3;
				}

This is even though the hash starts off empty, so the else should run once for each new key that is encountered. Instead, the second part
Code:
	elsif (exists $hash{$2})
				{
					print "We're adding a second value.  Examining with $line1\n";
					push @{ $hash{$2} }, $3;
				}
is being run when a new key is being added. I know this from the print statements. Any idea why this is happening? This may be the reason why I'm only getting keys that hold one value at a time instead of several, which is what I'm shooting for.
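
A likely explanation, sketched on a stand-alone hash rather than the script above: to evaluate [tt]exists $hash{$key}[$index][/tt], Perl has to dereference [tt]$hash{$key}[/tt] first, and that autovivifies it as an empty array reference. By the time the elsif runs, the key is already there, so the final else can never be reached. The key and index below are made up.

```perl
use strict;
use warnings;

my %hash;
my ($key, $index) = ('KeyA', 12);   # illustrative values

my $before = exists $hash{$key} ? 1 : 0;            # hash starts empty

# Looking one level down creates $hash{$key} as an empty array ref,
# even though the element itself is not created.
my $element = exists $hash{$key}[$index] ? 1 : 0;

my $after = exists $hash{$key} ? 1 : 0;             # key now exists
my $type  = ref $hash{$key};                        # kind of reference stored

print "before=$before element=$element after=$after type=$type\n";
```

Note too that [tt][$3][/tt] treats the captured text as an array index, not as a value to search for, so that first test does not check for a key/value pair in any case.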

 
This might be what you want but I haven't tested it:

Code:
if($line2 =~ /^($line1)\s+CAAS.+\.([^.]+)\.([^.]+)/ )
            {
                $flag = 1;
            
                if (exists $hash{$2} and defined $hash{$2} )
                {
                    print "We're adding to an existing key/value pair\n";
                    push @{$hash{$2} },  $3;
                }
                else
                {
                    print "We're adding a new key/value pair.  Examining with $line1\n";
                    push @{ $hash{$2} }, $3;
                }
            }
            if($line2 =~/^($line1)\s+View/)
            {
                $view = 1;
            }
            if($line2 =~/^($line1)\s+NO CAAS Source/)
            {
                $noCAASsource = 1;
            }

If you really wanted to check whether the key had a value, you could do that separately too.
 
I understand your problem better now. After checking for the key, you then want to check for the value. I'd junk the hash of arrayrefs plan and go for a hash of hashrefs instead.

Code:
if($line2 =~ /^($line1)\s+CAAS.+\.([^.]+)\.([^.]+)/ ) {
   $flag = 1;
   # no exists check needed: the inner hash is created on first use
   $hash{$2}{$3} = 1; # or anything else
...

This marks the value as seen under its key, creating the key's inner hash the first time the key turns up. A repeated key/value pair just sets the same entry to 1 again, so nothing is duplicated. You would then use [tt]keys %{ $hash{keyA} }[/tt] rather than [tt]@{ $hash{keyA} }[/tt] to recover the list of values.

What we're doing here is using the keys of a hash as a special sort of array that only stores distinct values, which is a common trick. I set the value to 1 but could just as easily have used undef or the string "present" as all I want to do is create the key (using undef would probably save a few bytes of memory for each key, but it reads as if you are turning something off rather than turning it on, which I don't like).
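
A sketch of the hash-of-hashes idea on the sample data from earlier in the thread, assuming the key is spelled KeyA throughout. It assigns unconditionally and lets autovivification create each inner hash; the @pairs list stands in for the regex captures.

```perl
use strict;
use warnings;

my %hash;

# (key, value) pairs standing in for $2 and $3 from the regex
my @pairs = ( ['KeyA', '12'], ['KeyB', '17'], ['KeyA', '12'], ['KeyA', '22'] );

for my $pair (@pairs) {
    my ($key, $value) = @$pair;
    $hash{$key}{$value} = 1;   # repeating a pair just sets the same entry again
}

for my $key (sort keys %hash) {
    my @values = sort keys %{ $hash{$key} };
    print "$key: @values\n";
}
```

The repeated KeyA/12 pair collapses into a single entry, while KeyA/22 is stored alongside it.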

Does this get you anywhere?

f

 
Yeah, that does help. So using this technique, if I later encounter the same key with the same value, then when the code $hash{$2}{$3} = 1; runs nothing will be added because the value already exists. Is this correct? This is a nice little trick, thanks!
 
You've got it, and it's a great trick to have in the armoury. For example, to remove duplicates in an array, you could create a hash with the array values as keys and then use the output of keys() as the unique list.
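
For instance, a minimal sketch of the de-duplication idea (the list of numbers is invented for illustration):

```perl
use strict;
use warnings;

my @values = qw(12 17 12 22 17);   # illustrative data with duplicates

my %seen;
$seen{$_} = 1 for @values;          # duplicates collapse onto one key

# keys() returns the hash keys in no particular order, so sort them
my @unique = sort { $a <=> $b } keys %seen;

print "@unique\n";
```

If the original order of the array matters, this approach loses it, since hash keys come back unordered.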

Yours,

fish

 