Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Parsing a file describing the structure of a navigation system 2

Status
Not open for further replies.

Zhris

Programmer
Aug 5, 2008
254
GB
Hello,

I am trying to create a navigation system which parses a text file that describes the structure using tabs at the beginning of each line.

I don't understand how I can use perl to identify when there is one extra tab, to add another dimension to the hash, and when there one less tab, to go back to the previous dimension. The dimension of the text file is limitless.

I was hoping somebody could provide me with something to get me started as this is currently beyond me and seems far more complex than I originally thought.

Code:
______________________________________

#Input file...

One
	One
	Two
		One
		Two
	Three
Two
	One
Three
Four
	One
		One
		Two
		Three
	Two
	Three
	Four
		One
		Two
		Three
	Five
	Six
Five
	One
	Two

______________________________________

#Expected hash [first 10 lines]...

L1 - $hash{'One'}
L2 - $hash{'One'}{'One'}
L3 - $hash{'One'}{'Two'}
L4 - $hash{'One'}{'Two'}{'One'}
L5 - $hash{'One'}{'Two'}{'Two'}
L6 - $hash{'One'}{'Three'}
L7 - $hash{'Two'}
L8 - $hash{'Two'}{'One'}
L9 - $hash{'Three'}
L10 - $hash{'Four'}

______________________________________

Thank you,

Chris
 
You'll want to put some more error checking in this, but something like this might work for you.

Code:
my %h;
{
	my @temp;
	while (<DATA>) {
		chomp;
		my $count = s/\t//g;
		$temp[$count] = $_;
		&store_values(\%h, \@temp, 0, ($count || 0));
	}
}
use Data::Dumper;
print Dumper(\%h);

sub store_values {
	my ($h_ref, $a_ref, $current_i, $last_i) = @_;
	if ($current_i != $last_i) {
		unless (defined $h_ref->{$a_ref->[$current_i]}) {
			$h_ref->{$a_ref->[$current_i]} = {};
		}
		&store_values($h_ref->{$a_ref->[$current_i]}, $a_ref, $current_i+1, $last_i);
	} else {
		$h_ref->{$a_ref->[$current_i]} = undef unless $h_ref->{$a_ref->[$current_i]};
	}
}
 
Hi rharsh,

Sorry for a late response, but just from briefly checking the code, that looks exactly what I was looking for. The data per line will include e.g. text to display, url, target etc and I will throw in any error checks then. Didn't expect anyone to write the full code, but thank you very much for taking the time to provide a full example. Ill let you know how I get on.

Chris
 
Is it correct to use hashes with multiple levels of indexing? I had a go at a solution for this but rharsh beat me to it... but the reason I ask is that while doing so, with use strict enabled, this code gave me:

Code:
use strict;
my %hash;
$hash{'a'} = 1;
$hash{'b'} = 2;
$hash{'b'}{'c'} = 3;

Code:
Can't use string ("2") as a HASH ref while "strict refs" in use at ./Zhris line 6.

Or is it just one of those things that you can allow yourself to do as long as you realise you're doing it. :)

Annihilannic.
 
Annihilannic, using hashes for multiple levels of indexing (as long as the order isn't important) works just fine. The code I supplied runs correctly with both the warnings and strict pragmas in use.

Looking at the code you posted: [ul]
[li][highlight]$hash{'b'} = 2[/highlight] stores a scalar (2) in $hash{'b}[/li]
[li][highlight]$hash{'b'}{'c'} = 3[/highlight] presupposes $hash{'b'} contains a reference (or it would create one with autovivification) to the second hash where the key 'c' is going to be stored. Instead, it's trying to use '2' as a hash ref -- which leads into the soft references discussion which has come up many times before.[/li][/ul]

That's not a very good explanation, but does it help enough?
 
Yep, that makes perfect sense, thanks!

Armed with that information I've tried to understand your code above but my brain is just tying itself in knots...

Annihilannic.
 
Recursion can be tough to wrap your head around, have you had any luck?
 
Yes, I find recursion much easier to write than to read! :) Here's my interpretation:

Code:
[url=http://perldoc.perl.org/functions/sub.html][black][b]sub[/b][/black][/url] [maroon]store_values[/maroon] [red]{[/red]
    [url=http://perldoc.perl.org/functions/my.html][black][b]my[/b][/black][/url] [red]([/red][blue]$h_ref[/blue], [blue]$a_ref[/blue], [blue]$current_i[/blue], [blue]$last_i[/blue][red])[/red] = [blue]@_[/blue][red];[/red]
    [gray][i]# if we have not reached the depth of the current value[/i][/gray]
    [olive][b]if[/b][/olive] [red]([/red][blue]$current_i[/blue] != [blue]$last_i[/blue][red])[/red] [red]{[/red]
        [gray][i]# if this level is not already defined, create a new hash reference[/i][/gray]
        [olive][b]unless[/b][/olive] [red]([/red][url=http://perldoc.perl.org/functions/defined.html][black][b]defined[/b][/black][/url] [blue]$h_ref[/blue]->[red]{[/red][blue]$a_ref[/blue]->[red][[/red][blue]$current_i[/blue][red]][/red][red]}[/red][red])[/red] [red]{[/red]
            [blue]$h_ref[/blue]->[red]{[/red][blue]$a_ref[/blue]->[red][[/red][blue]$current_i[/blue][red]][/red][red]}[/red] = [red]{[/red][red]}[/red][red];[/red]
        [red]}[/red]
        [gray][i]# descend into the new or existing hash[/i][/gray]
        [maroon]&store_values[/maroon][red]([/red][blue]$h_ref[/blue]->[red]{[/red][blue]$a_ref[/blue]->[red][[/red][blue]$current_i[/blue][red]][/red][red]}[/red], [blue]$a_ref[/blue], [blue]$current_i[/blue]+[fuchsia]1[/fuchsia], [blue]$last_i[/blue][red])[/red][red];[/red]
    [red]}[/red] [olive][b]else[/b][/olive] [red]{[/red]
        [gray][i]# this is the depth of the current value, create the key[/i][/gray]
        [gray][i]# unless it already exists[/i][/gray]
        [blue]$h_ref[/blue]->[red]{[/red][blue]$a_ref[/blue]->[red][[/red][blue]$current_i[/blue][red]][/red][red]}[/red] = [url=http://perldoc.perl.org/functions/undef.html][black][b]undef[/b][/black][/url] [olive][b]unless[/b][/olive] [blue]$h_ref[/blue]->[red]{[/red][blue]$a_ref[/blue]->[red][[/red][blue]$current_i[/blue][red]][/red][red]}[/red][red];[/red]
    [red]}[/red]
[red]}[/red]

Annihilannic.
 
I'm also struggling to get to grips with whats actually going on 1 step at a time, although i'm getting there. I'm also not sure if I quite understand soft referencing and how I can get around it, as I would like to store other data at each level. Maybe simplifying the hash so that it doesn't exist on multiple levels E.g.

Code:
my $key = 'One|Two|One';
$hash{$key} = 'target=_blank|url=http://www.url.com';

Chris
 
Annihilannic, looks good to me!

Chris, here's the code modified to deal with storing multiple values (in this case, only a URL) for each 'node' -- the 'child' keys are references to the child nodes.

Code:
sub store_values {
	my ($h_ref, $a_ref, $current_i, $last_i) = @_;
	my ($text, $url) = split /\|/, $a_ref->[$current_i];
	
	if ($current_i != $last_i) {
		unless (defined $h_ref->{$text}->{child}) {
			$h_ref->{$text}->{url} = $url;
			$h_ref->{$text}->{child} = {};
		}
		&store_values($h_ref->{$text}->{child}, $a_ref, $current_i+1, $last_i);
	} else {
		unless (defined $h_ref->{$text}) {
			$h_ref->{$text}->{url} = $url;
			$h_ref->{$text}->{child} = undef;
		}
	}
}

__DATA__
One|One URL
	One|One URL
	Two|Two URL
		One|One URL
		Two|Two URL
	Three|Three URL
Two|Two URL
	One|One URL
Three|Three URL
Four|Four URL
	One|One URL
		One|One URL
		Two|Two URL
		Three|Three URL
	Two|Two URL
	Three|Three URL
	Four|Four URL
		One|One URL
		Two|Two URL
		Three|Three URL
	Five|Five URL
	Six|Six URL
 
Thank you,

Thats cleared up everything. I find the new subroutine more readable and I have a better understanding of its inner workings.

One final question, if I wanted to access a specific level given an array i.e. qq(One Two One), because the dimension of the hash is unlimited, would eval be practicle e.g.

Code:
my @input = ('One', 'Two', 'Three');
my $evalstring = '$h';
my $inputcount = @input;
my $currentcount = 1;
foreach (@input) {
	my $childorurl = ($currentcount == $inputcount) ? 'url' : 'child';
     $evalstring .= "{$_}{$childorurl}";
     $currentcount++;
}
print eval($evalstring);

Or would another rucursive subroutine be more practicle?

Chris
 
If I were planning on printing the entire data structure, I'd probably use a recursive algorithm to do it.

If you know a specific node/hash/object you want to access, you can do so with notation similar to this:
Code:
print $h{One}->{child}->{Two}->{child}->{One}->{url}, "\n";
 
I understand that I could access a specific level using the code above. I should of really had a go before asking that question. I think that what I require could be done within your store_values subroutine (I need to think about the best way to do it). What I meant was if I were to click a link i.e. script.pl?a="One|Two|One" or script.pl?a="One|Three|Five|One|Six" recieved via param (any given number of elements), then I would need build the string $h{One}->{child}->{Three}->{child}->{Five}->{child}->{One}->{child}->{Six}->{url}. Only the most inner levels are urls, whilst their parents are sub menu links to access their inner level(s).

Chris
 
I think a similar routine called get_values would be a better way to go about it.

Annihilannic.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top