
Please help me to understand how Dumper() works.

Status
Not open for further replies.

whn

Programmer
Oct 14, 2007
265
US
First, please take a look at this piece of code:
Code:
#!/usr/bin/perl

use Data::Dumper;

my @a1 = ('a','b');
my %h1;
for(my $i = 0; $i <= $#a1; $i++) {
    $h1{$i} = \@a1;
}
my @a = (\%h1);
print Dumper(\@a);
exit;

A test run:
Code:
% tt.pl
$VAR1 = [
          {
            '1' => [
                     'a',
                     'b'
                   ],
            '0' => $VAR1->[0]{'1'}
          }
        ];

What I expected:
Code:
$VAR1 = [
          {
            '1' => [
                     'a',
                     'b'
                   ],
            '0' => [
                     'a',
                     'b'
                   ]
          }
        ];

Question: Why is the output different from what I expected? Is it a bug in Dumper()? Or is my expectation wrong?

Thanks.
 
So, first, in order to get the result you expect, you could use this instead:

Code:
for(my $i = 0; $i <= $#a1; $i++) {
    $h1{$i} = [@a1]; #\@a1;
}

The difference there is that $h1{$i} is assigned a reference to an anonymous array that contains all the elements in @a1 (so there will be a unique anonymous array for each element in %h1.)

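In the original code (using \@a1), $h1{$i} holds a reference to @a1 itself, so @a1, @{$h1{0}} and @{$h1{1}} are all the same array at the same memory address. Dumper() is not buggy: it notices that the same reference appears twice, prints the array in full the first time, and prints the second occurrence as a path back to the first ($VAR1->[0]{'1'}).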
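If you want Dumper() itself to print shared references in full, a minimal sketch along these lines may help. It relies on the documented Data::Dumper settings $Data::Dumper::Deepcopy and $Data::Dumper::Sortkeys; the variable names $shared and $expanded are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;        # stable key order, so the output is repeatable

my @a1 = ('a', 'b');
my %h1 = (0 => \@a1, 1 => \@a1);    # both values are the same reference

# Default behaviour: the second occurrence is printed as a path to the first.
my $shared = Dumper([\%h1]);

# With Deepcopy set, repeated references are written out in full
# (only genuine cycles are still cross-referenced).
my $expanded;
{
    local $Data::Dumper::Deepcopy = 1;
    $expanded = Dumper([\%h1]);
}

print $shared, $expanded;
```

Note that this only changes how the structure is printed. The two hash values still refer to one array in memory.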
 
Wow, I never realized that [@a1] and \@a1 are different.

I tested it again using the following piece of code:

Code:
my @a1 = ('a','b');
my (%h1, %h2);
for(my $i = 0; $i <= $#a1; $i++) {
    $h1{$i} = \@a1;
    $h2{$i} = [@a1];
    print "$i, \$h1{$i} = #$h1{$i}#, \$h2{$i} = #$h2{$i}#\n";
}
exit;

And the output:
Code:
./tt.pl 
0, $h1{0} = #ARRAY(0x1d621c0)#, $h2{0} = #ARRAY(0x1d62130)#
1, $h1{1} = #ARRAY(0x1d621c0)#, $h2{1} = #ARRAY(0x1e28808)#

The addresses are different! Thank you, rharsh, for pointing this out to me.

Now I have another related question. In practice, which is better, [@a1] or \@a1? Or does it not really matter?

Or in the cases below:

I. sub getList {return \@list;} # I always use this way.
II. sub getList {return [@list];} # I have never used this way.

Is there a reason to prefer one over the other?
 
What are you trying to achieve?

Do you want to create a new anonymous array ref that contains the values in @a1, so that you have two separate arrays? Or do you want to pass a reference to @a1, so that there is only one array with two variables referencing it?

Do whatever is most efficient and meets the requirements of what you are trying to achieve.

Never create duplicate arrays or hashes for the sake of it; it uses memory unnecessarily.

I use anonymous arrays as return/input if I have scalar values and the argument / return wants an array ref...

E.g.:

Code:
sub my_sub {
    my $array_ref = shift;
}

my_sub(['a','b','c']);

Also, if you pass references you can use 'byref' parameter logic, but it depends on whether you want to allow a method / sub to alter the content of the variable being passed in.
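To make the trade-off concrete for the getList case asked about earlier, here is a small sketch. The sub names get_list_shared and get_list_copy are hypothetical, and @list simply stands in for whatever data the sub owns.

```perl
use strict;
use warnings;

my @list = ('a', 'b');                    # stands in for the sub's own data

sub get_list_shared { return \@list }     # caller receives the real array
sub get_list_copy   { return [@list] }    # caller receives a fresh copy

my $copy = get_list_copy();
push @$copy, 'c';                         # only the copy grows
my $after_copy = scalar @list;            # @list is unchanged here

my $shared = get_list_shared();
push @$shared, 'c';                       # @list itself grows
my $after_shared = scalar @list;

print "after copy: $after_copy, after shared: $after_shared\n";
# prints "after copy: 2, after shared: 3"
```

So returning \@list is cheaper but lets the caller change your data, while returning [@list] costs a copy and protects it.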
 
Thank you, 1DMF, for your explanation.

I guess I did not make myself clear.

Before I read the answers in this thread, I did not know there was a difference between [@a] and \@a. I thought they were identical.

So my question #1:
When I pass an array into a function:
Code:
&my_sub([@a]);
vs.
Code:
&my_sub(\@a);
Which one is the better practice? I have always used &my_sub(\@a), but I guess you would use &my_sub([@a])?
If so, would you please explain once more why you would prefer &my_sub([@a]) to &my_sub(\@a)?

Question #2:
When I need to return an array from a function:
Code:
sub my_sub {return [@a];}
vs
Code:
sub my_sub {return \@a;}
Which way would you recommend and why?

Many thanks for your instructions!

 
Hi Whn,

Best practice to achieve what?

Do you want to pass an array to your sub and allow it to alter it so when the code returns from your sub, the array is different?

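The first thing I recommend is that you stop putting an ampersand in front of subroutine calls. It is not formally deprecated, but it is discouraged in modern Perl: it bypasses prototypes, and calling &my_sub; with no parentheses passes the caller's current @_ straight through.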
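A short sketch of the @_ pass-through, in case it is not obvious. The sub names count_args, call_plain and call_amp are made up for this example.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub call_plain { return count_args() }   # explicit empty argument list
sub call_amp   { return &count_args }    # no parentheses: reuses the current @_

my $plain = call_plain(1, 2, 3);
my $amp   = call_amp(1, 2, 3);

print "plain: $plain, amp: $amp\n";
# prints "plain: 0, amp: 3"
```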

Personally, if I am passing the array to a sub that wants a ref and then returning it again, I use the reference syntax.

Repeatedly wrapping your array in anonymous array refs creates copies of the same array all over the place, which is not very efficient.

You also imply that your subroutine takes an array ref, converts it to an array, then returns it as an array ref again.

I tend not to do that. I pass in the array ref, work with it as it is (de-referencing as required) and pass back the reference.

E.g.:

Code:
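my $a = my_sub([1,2,3,4,5]);
print "@{$a}";

sub my_sub 
{
    my $a = shift;    # array ref; the array behind it is modified in place
    
    for(my $i=0;  $i<@{$a}; $i++)
    {
       $a->[$i]++;    # single element; @{$a}[$i] would be a one-element slice
    }

    $a;
}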

The above alters the value of the array passed in!

Though of course you could use...

Code:
my @a = (1,2,3,4,5);
my_sub(\@a);
print "@a";

sub my_sub 
{
    my $a = shift;    # reference to the caller's @a
    
    for(my $i=0;  $i<@{$a}; $i++)
    {
       $a->[$i]++;    # increments the caller's element directly
    }
}

You don't even have to use array references or anonymous array refs (but you are then creating multiple arrays):

Code:
my @a =(1,2,3,4,5);
my @b = my_sub(@a);

print "a = @a\n";
print "b = @b\n";

sub my_sub 
{
    my @a = @_;
    
    foreach(@a)
    {
      $_++;       
    }
    
    @a;
}

Are you trying to keep to some paradigm, so don't want to alter input arguments but still use references?

Perhaps you could use...

Code:
my @a = (1,2,3,4,5);
my @b = @{my_sub(\@a)};

print "a = @a\n";
print "b = @b\n";

sub my_sub 
{
    my $a = shift;  
    my $b = [@$a];  
    
    foreach(@$b)
    {
       $_++;       
    }
    
    $b;
}

or maybe, to keep it simple, with the minimum number of variables / arrays and no return...
Code:
my @a = (1,2,3,4,5);
my_sub(\@a);
print "a = @a\n";

sub my_sub 
{
    my $a = shift;
    
    foreach(@$a)
    {
       $_++;       
    }    
}

but be careful: foreach aliases the loop variable to each element of the array, so ++$i in the code below changes @a as well as filling @c, which may not be what you expect...
Code:
my @a =(1,2,3,4,5);
my $d = my_sub(\@a);

print "a = @a\n";
print "d = @$d\n";

sub my_sub 
{
    my $b = shift;
    my @c;
    
    foreach my $i (@{$b})
    {     
       push (@c,++$i);       
    }
    
    \@c;
}

yet simply copying the element into a new scalar first fixes it:
Code:
my @a =(1,2,3,4,5);
my $d = my_sub(\@a);

print "a = @a\n";
print "d = @$d\n";

sub my_sub 
{
    my $b = shift;
    my @c;
    
    foreach my $i (@{$b})
    {     
       my $x = $i;
       push (@c,++$x);
    }
    
    \@c;
}
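As one more option for the case above, here is a minimal sketch that builds the incremented list with map instead of a loop. Because the expression $_ + 1 never assigns to $_, the input array is left alone. The sub name incremented is hypothetical.

```perl
use strict;
use warnings;

sub incremented {
    my ($in) = @_;                       # array ref supplied by the caller
    return [ map { $_ + 1 } @$in ];      # new anonymous array, input untouched
}

my @a = (1, 2, 3, 4, 5);
my $d = incremented(\@a);

print "a = @a\n";     # a = 1 2 3 4 5
print "d = @$d\n";    # d = 2 3 4 5 6
```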

As you can see as this is Perl... TIMTOWTDI = There is more than one way to do it!

Just think about what your subroutine parameters are, how they will operate on the values they receive and what they may return.

Consider whether they should alter the argument values byref or use them byval, and try not to keep creating duplicate arrays unnecessarily, which is what [@array] does.
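One caveat worth knowing if you do copy with [@array]: the copy is shallow. A small sketch, assuming an array of array refs (the names @rows and $copy are only for illustration):

```perl
use strict;
use warnings;

my @rows = ([1, 2], [3, 4]);
my $copy = [@rows];        # new outer array, same inner array refs

push @$copy, [5, 6];       # affects the outer copy only
$copy->[0][0] = 99;        # inner array is shared, so @rows sees this

print scalar(@rows), " rows, first element $rows[0][0]\n";
# prints "2 rows, first element 99"
```

So [@array] protects the top level only. Nested structures need their own copies if they must stay independent.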

Hope that helps.

1DMF
 
Thank you so much, 1DMF, for your detailed explanation.

I understand the difference now.
 