
sorting an array 2

Status
Not open for further replies.

marsd

IS-IT--Management
Apr 25, 2001
2,218
US
Hi,
Sorting algorithms are not my strong point.

Is there a way to do something like this:

proc by_matching_elem {arr} {
    upvar #0 $arr loc_arr
    set s_list {}
    foreach N [array names loc_arr c:/*] {
        foreach X [array names loc_arr $loc_arr(*,file)] {
            if {[string match $X $N]} {
                lappend s_list "$N=[gtime $loc_arr($X)]"
                # gtime is a function that returns an index-derived value
            }
        }
    }
    return
}

Any help would be appreciated; a sample sort
that would produce an acceptable result would be even better ;)

TIA
M
 
Here is some code to sort an array:

Code:
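array set a { one 1 two 2 three 3 four 4 five 5 six 6 }
# sorted list of the array's element names
set l [lsort [array names a]]
# flat list of name/value pairs, in sorted-name order
set l2 {}
foreach i $l { lappend l2 $i $a($i) }
# print straight from the list: copying it back into an array
# would lose the order, because arrays are unordered
puts [join $l2 \n]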

The l list is a sorted list of all the names of the array.
The foreach line creates a second list with paired names and values in the sorted order.
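
If all you need is to visit the elements in sorted order, the intermediate list can be skipped. A minimal sketch, using the same sample data (the variable name lines is just for illustration):

```tcl
# Sample data, same as in the snippet above.
array set a {one 1 two 2 three 3 four 4 five 5 six 6}

# Walk the array in sorted-name order and collect "name value" lines.
set lines {}
foreach name [lsort [array names a]] {
    lappend lines "$name $a($name)"
}
puts [join $lines \n]
```

The names come out in plain ASCII order here (five, four, one, six, three, two), because lsort was called without options.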

ulis
 
Thanks Ulis,
I'd already experimented with that idea in hopes that
the sort would result in matched values.

Basically what I want is to match the contents of
[array names loc_arr c:/*] and the contents of
[array names loc_arr *,file].

If I do the same kind of thing with:
array set a1 {
    b 12
    c 13
    d 14
}

array set a2 {
    f 13
    g 12
    h 14
}

proc comp_arr {arr1 arr2} {
    upvar #0 $arr1 arr_loc1
    upvar #0 $arr2 arr_loc2
    foreach name [array names arr_loc2] {
        foreach elem [array names arr_loc1] {
            if {[string match $elem $name]} {
                puts "$elem:$name."
            }
        }
    }
    return
}
I can get the results I want without resorting to
reformatting the arrays as lists and sorting them.

My problem is that I am comparing:
if {[string match $elem $arr_loc($name)]}
This compares an index against a value, and it does not seem to
work correctly. I just cannot figure out why.
 
I must admit, I'm getting rather confused as to what the keys and values in your array look like. Without a firm grip on that, I'm at a loss in coming up with an algorithm for you. And bringing up sorts is confusing me completely, because I can't see where sorting fits into it at all.

If you could provide a small example of the format of your array, I might be able to come up with something for you.

- Ken Jones, President
Avia Training and Consulting
866-TCL-HELP (866-825-4357) US Toll free
415-643-8692 Voice
415-643-8697 Fax
 
#key:value1
arra(c:/filename) [file mtime file]
#keyvalue2
arra(1,file) [c:/filename]
so:
foreach val [array name arra c:/*] {
foreach val2 [array name arra *,file] {
if {[string match $val $arr($val2)]} {
do_next_thing...

One record is generated from the current fs state: the
other is loaded into the array by parsing a db file.


The alpha sort simply helped me with troubleshooting the
matching; it revealed that my algorithm here is no
good.
So: if I have to convert to lists and lsearch through them,
that's fine, but it seems to me there should be a way
to make this work.

Thank You.
 
Hm. I'm still a little confused as to what's up. Your general matching approach seems okay, although I've noticed minor syntax errors (misspelled variable names and the like) in each example you posted. I assumed that they were simply transcription errors, but that might be your problem.

Here's a cleaned-up example of your previous code fragment, demonstrating that it matches up the keys properly:

[tt]% parray arra
arra(1,file) = c:/filename
arra(2,file) = c:/file2
arra(3,file) = c:/abcfile
arra(4,file) = c:/toast
arra(5,file) = c:/waffles
arra(6,file) = c:/pancakes
arra(c:/abcfile) = 5434534
arra(c:/file2) = 58098245
arra(c:/filename) = 1234123
arra(c:/pancakes) = 12465436
arra(c:/toast) = 6345
arra(c:/waffles) = 47313
% foreach val [array names arra c:/*] {
    foreach val2 [array names arra *,file] {
        if {[string match $val $arra($val2)]} {
            puts [format "%-7s %-12s : %s" $val2 $val $arra($val)]
        }
    }
}

3,file  c:/abcfile   : 5434534
6,file  c:/pancakes  : 12465436
5,file  c:/waffles   : 47313
1,file  c:/filename  : 1234123
4,file  c:/toast     : 6345
2,file  c:/file2     : 58098245[/tt]

Now, if what you're really trying to do is process the files "1,file 2,file...", you should approach this a bit differently. First, I'd use array names to get the list of "*,file" keys. However, because Tcl arrays are unordered data structures (the information is actually stored in a hash table internally), if you want to process these values in the order "1,file 2,file...", you need to pass this list to lsort. This is a case where you should really use the lsort -dictionary option, which sorts the numerical prefix as a number rather than simply treating the numerals as ASCII characters.
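
To see the difference, here is a small sketch with a made-up list of keys (the variable names are only for illustration):

```tcl
# Hypothetical keys, deliberately out of order and running past 9.
set keys {10,file 2,file 1,file 11,file}

# Default (ASCII) sort compares character by character.
set ascii [lsort $keys]
puts $ascii    ;# 1,file 10,file 11,file 2,file

# -dictionary treats embedded digits as numbers.
set dict [lsort -dictionary $keys]
puts $dict     ;# 1,file 2,file 10,file 11,file
```

With fewer than ten files the two sorts happen to agree, so the problem only shows up once the index reaches 10.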

For each index, you can then test to see if a corresponding element exists in your array using info exists, and if it does, process it. Putting all that together, we've got:

[tt]% foreach key [lsort -dictionary [array names arra *,file]] {
    if {[info exists arra($arra($key))]} {
        puts [format "%-7s %-12s : %s" $key $arra($key) $arra($arra($key))]
    }
}

1,file  c:/filename  : 1234123
2,file  c:/file2     : 58098245
3,file  c:/abcfile   : 5434534
4,file  c:/toast     : 6345
5,file  c:/waffles   : 47313
6,file  c:/pancakes  : 12465436[/tt]

Did that accomplish more or less what you were looking for? If not, feel free to clarify, and I'll give it another go.

And now a couple more general comments...

In general, if the order of elements is important, I tend to store information in a list rather than an array. In Tcl, a list is an ordered structure by nature, whereas imposing order on array elements requires additional work. As of Tcl 8.0, list access performance is roughly the same as array access until you start dealing with thousands of elements. Tcl lists are also "first class" data types that you can pass by value to procedures, whereas Tcl arrays aren't, and require the use of upvar like you used above. But sometimes the benefits of using arrays (for example, being able to use the glob pattern argument to array names to filter elements) outweigh the inconveniences.
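
To make the pass-by-value point concrete, here is a minimal sketch. The proc names total_list and total_array and the sample data are made up for illustration. Note that it uses upvar 1 (the caller's scope) where your code uses upvar #0 (global scope); either works for a global array called from the top level.

```tcl
# A list is passed by value: the proc simply receives the contents.
proc total_list {values} {
    set sum 0
    foreach v $values { incr sum $v }
    return $sum
}

# An array cannot be passed by value. Pass its name instead and
# link a local variable to it with upvar.
proc total_array {arrName} {
    upvar 1 $arrName arr
    set sum 0
    foreach key [array names arr] { incr sum $arr($key) }
    return $sum
}

set sizes {10 20 30}
array set sizeOf {a 10 b 20 c 30}

puts [total_list $sizes]     ;# 60
puts [total_array sizeOf]    ;# 60
```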

Another comment: for simple string comparisons, use string equal instead of string match (as long as you're using Tcl version 8.1.1 or later, where string equal was introduced). It's faster, and you don't need to worry about wildcard characters like "*" and "?" possibly appearing in your comparison string. Use string match only when you really want glob (shell-style) pattern matching.

- Ken Jones, President
Avia Training and Consulting
 
The problem is very simple.
The initial proc is recursive and builds from the entire FS. The search pattern c:/* only matches root filenames.
I had assumed that c:/* would match all pathnames...

Any ideas or examples of how to resolve this are welcome.
 
That's what I would have assumed as well. In fact, it works on my system:

[tt]% parray arra
arra(1,file) = c:/filename
arra(2,file) = c:/file2
arra(3,file) = c:/abcfile
arra(4,file) = c:/toast
arra(5,file) = c:/waffles
arra(6,file) = c:/pancakes
arra(7,file) = c:/program files/tclpro1.4
arra(8,file) = c:/tcltk/intro/solutions
arra(c:/abcfile) = 5434534
arra(c:/file2) = 58098245
arra(c:/filename) = 1234123
arra(c:/pancakes) = 12465436
arra(c:/program files/tclpro1.4) = 415134
arra(c:/tcltk/intro/solutions) = 6234
arra(c:/toast) = 6345
arra(c:/waffles) = 47313
% foreach val [lsort [array names arra *,file]] {
    foreach val2 [array names arra c:/*] {
        if {[string equal $arra($val) $val2]} {
            puts [format "%-7s %-12s : %s" $val $val2 $arra($val2)]
        }
    }
}

1,file  c:/filename  : 1234123
2,file  c:/file2     : 58098245
3,file  c:/abcfile   : 5434534
4,file  c:/toast     : 6345
5,file  c:/waffles   : 47313
6,file  c:/pancakes  : 12465436
7,file  c:/program files/tclpro1.4 : 415134
8,file  c:/tcltk/intro/solutions : 6234[/tt]

I'm quite puzzled. So, some speculation here... In all of your examples, I noticed that you showed "/" as the directory delimiter. However, I also noticed the "c:" volume reference, which indicates that this is running on Windows. The usual Windows directory delimiter is "\". If, in fact, the directory delimiters in your actual code are "\", then you're going to run into problems with the glob patterns in array names (and string match, if you're still using that). Although Tcl can handle platform-native directory delimiters (such as "\") in addition to the Tcl-native ones ("/"), the "\" is used as an escape character by both the Tcl interpreter and the glob pattern matcher.
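
Here is a small sketch of what goes wrong. The path and the variable names are made up for illustration:

```tcl
# A hypothetical Windows-style path, braced so Tcl leaves the backslashes alone.
set native {c:\temp\file.txt}

# In a glob pattern, \* means "a literal asterisk", so this does not match.
set r1 [string match {c:\*} $native]
puts $r1    ;# 0

# A literal backslash has to be written as \\ in the pattern.
set r2 [string match {c:\\*} $native]
puts $r2    ;# 1

# With Tcl-native delimiters there is nothing to escape, and * also
# matches across "/" characters, so subdirectories are included.
set r3 [string match c:/* c:/temp/file.txt]
puts $r3    ;# 1
```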

If in fact this is the problem, you can convert your paths to Tcl-native just by giving them as the only argument to file join:

[tt]% file join {C:\Program Files\TclPro1.4}
C:/Program Files/TclPro1.4[/tt]

The return value of file join is always Tcl-native. So if you pass only one argument to it, it is converted (if necessary) to the Tcl-native representation.

By the way, to convert a Tcl-native path to a platform-native path (for example, if you need to display it to the user or write it to a file), use the file nativename command:

[tt]% file nativename C:/tcltk/intro/solutions
C:\tcltk\intro\solutions[/tt]

- Ken Jones, President
Avia Training and Consulting
 
Excellent ideas. I have been using the Tcl-native delimiters,
as you can see.
The problem may be in the read from the db file.
It may be that the array was not getting populated
with all the elements that I can see via puts.

Here is the function.
proc do_build_recs {arr db} {
    set x 0
    upvar #0 $arr perts
    while {[gets $db line]} {
        if {[regexp "^.*\t" $line fname] && [regexp "\t.*" $line time]} {
            puts "$fname:$time."

            set perts([incr x],file) $fname
            set perts($x,time) $time
        } else {
            puts "Non standard line. AT: [tell $db]."
            seek $db 16 current
            if {[gets $db line] > 0} {
                continue
            } else {
                puts "Assuming <EOF>"
                return
            }
        }
    }
    return
}

I can see the elements as they unwind so maybe the array
assignment is buggy. The kludge with the file read is
to beat some nasty gaps for testing in the db. It seemed
to work.

Thanks for your help!
 
Okay, we might be onto something with your regexp commands in your if condition:

[tt]regexp "^.*\t" $line fname[/tt]
[tt]regexp "\t.*" $line time[/tt]

In both of these statements, "\t" is part of your pattern, and so will be included in the characters stored in fname and time respectively. I suspect that this is causing problems with your later comparisons.
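
A quick sketch makes the stray tab visible. The sample line is made up, assuming your db file holds a filename, a tab, and a time on each line:

```tcl
# Hypothetical db line: filename, tab, mtime.
set line "c:/filename\t1234123"

regexp "^.*\t" $line fname
regexp "\t.*" $line time

# fname is the 11-character path plus the trailing tab.
puts [string length $fname]    ;# 12
# time is the leading tab plus the 7 digits.
puts [string length $time]     ;# 8

# So a comparison against the bare filename fails...
puts [string equal $fname c:/filename]                  ;# 0
# ...and succeeds only once the whitespace is stripped.
puts [string equal [string trim $fname] c:/filename]    ;# 1
```

Since puts shows a tab as ordinary whitespace, the values look fine on screen even though they never compare equal to the array keys.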

A better way to parse the line with regexp is to use subpatterns. Subpatterns are portions of the regular expression contained within parentheses, and can serve several functions:
[ul][li]Making a quantifier apply to a group of patterns
[tt]% regexp "(na)+" "Banana" text
1
% puts $text
nana[/tt][/li]
[li]Delimiting a set of alternation branches
[tt]% regexp "(this|that) car" "Don't take that car!" text
1
% puts $text
that car[/tt][/li]
[li]Capturing substrings of characters matching the entire regular expression[/li][/ul]
When using subpatterns to capture substrings of the entire matching pattern, you provide additional variable names as arguments to your regexp command. The first variable provided gets the entire matching string; the characters matching the first subpattern are stored in the second variable, the characters matching the second subpattern are stored in the third variable, etc. Thus:

[tt]% regexp "(this|that) car" "Don't take that car!" text word
1
% puts $text
that car
% puts $word
that[/tt]

Applying this technique to your code, you could change the if condition to the following:

[tt]if {[regexp "^(.*)\t(.*)" $line match fname time]} ...[/tt]

- Ken Jones, President
Avia Training and Consulting
 
That helps. I'm kind of embarrassed that I missed that; I use regexp matches like that with expect all the time...

Okay, now it works.
I just have to optimize it with the sort now so the search
is reasonably fast. The lsort -dictionary should do this for
me. Just a few minor tweaks and a couple of text files and I have a decent tripwire clone in less than a hundred lines of code. Very nice. Thanks again.

M



 
It happens. It's incredible how many "unsolvable" bugs I immediately figure out as soon as I try to explain my code to someone. "I do this, then this, and... then this obviously doesn't work. Thank you very much." :)

Hope you didn't find my explanations too pedantic. In this kind of forum, it can be difficult to assess the experience of a person who requests help, so I tend to err on the side of providing too much explanation rather than too little.

And, should you know of anyone who's looking for some Tcl/Tk or Expect training or consulting, you know who to send them to... ;-)

- Ken Jones, President
Avia Training and Consulting
 
No, it's quite all right. The explanations are very helpful,
especially those that deal with performance issues and
more conservative, efficient ways to do things.
As far as the other, you have my vote ;)
 