Searching within Files

dcomit · Nov 3, 2005

I have three files which I need to merge into one. I need to read a unique id from the first file and then search the other two files for entries which match the unique id (they may or may not exist).

It works ok for the first record but goes haywire with subsequent records.

Here is a fragment of my code:

. . open all files first
set str1 [read $hrs]
set str2 [read $hry]
# Read file rz line by line
while {[gets $hrz linrz] >= 0} {
    set rzid [lindex $linrz 0]
    # Look for entries in file rs
    set inds [string first $rzid $str1]
    incr inds
    seek $hrs $inds start
    set anyrs [gets $hrs rsdata]
    if {$anyrs <= 0} {
        set rsapp $sep$sep
    } else {
        set rsapp [lindex $rsdata 0]$sep[lindex $rsdata 5]$sep[lindex $rsdata 6]
    }
    # Look for entries in file ry
    set indy [string first $rzid $str2]
    incr indy
    seek $hry $indy start
    set anyry [gets $hry rydata]
    if {$anyry <= 0} {
        set ryapp $sep$sep
    } else {
        set ryapp [lindex $rydata 0]$sep[lindex $rydata 4]$sep[lindex $rydata 5]
    }
}

I would appreciate any help.
Thanks,
Dave

Bong · Nov 3, 2005

Well, here's what it appears to me you're doing:
. . open all files first
set str1 [read $hrs]
set str2 [read $hry]
Now the entire contents of two of the files (opened as 'hrs' and 'hry') are in two strings, 'str1' and 'str2', respectively.

# Read file rz line by line
while {[gets $hrz linrz] >=0} {
set rzid [lindex $linrz 0]
Here you treat each line of 'rz' (= 'hrz'?) as a list, assuming that it consists of words separated by spaces.
The variable 'rzid' is set to the first word of the line.

# Look for entries in file rs
set inds [string first $rzid $str1]
Now you have searched for the first occurrence of that word in 'str1' (i.e., the contents of the 'hrs' file).

incr inds
seek $hrs $inds start
Now you've gone back into the file 'hrs' and set the file pointer to just at that word. But why? You already have the entire file contents in a string. Why go back and read the file again?
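Going back into the file this way also has some edge cases that might explain the erratic results, though this is only a guess since the data files are not shown. `string first` returns -1 when the id is absent, and the following `incr` turns that into 0, so the `seek` would land at the start of the file and `gets` would return the first line as if it were a match. The sketch below illustrates this with a made-up in-memory string standing in for the file contents; the ids and fields are hypothetical.

```tcl
# Hypothetical stand-in for the contents of the 'hrs' file.
set str1 "A100 x y\nA200 p q\n"

# An id that is present: string first gives its character offset.
set found [string first "A200" $str1]

# incr moves one character past the start of the id, so text read
# from there begins with "200 p q" rather than "A200 p q".
set shifted [string range $str1 [expr {$found + 1}] end]

# An id that is absent: string first gives -1 ...
set missing [string first "Z999" $str1]
# ... and incr turns that into 0, i.e. the very start of the data.
incr missing

# Substring hits are another risk: "A10" is found inside "A100".
set partial [string first "A10" $str1]
```

Note also that `string first` counts characters while `seek` counts bytes, so the two can disagree if the channel translates line endings or uses a multi-byte encoding.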

set anyrs [gets $hrs rsdata]
Here you read from that position to the next linefeed into a string.

if {$anyrs <=0} {
set rsapp $sep$sep
} else {
set rsapp [lindex $rsdata 0]$sep[lindex $rsdata 5]$sep[lindex $rsdata 6]
}
Here you pull specific words out of that string and join them with some separator string.

# Look for entries in file ry
set indy [string first $rzid $str2]
incr indy
seek $hry $indy start
set anyry [gets $hry rydata]
if {$anyry <=0} {
set ryapp $sep$sep
} else {
set ryapp [lindex $rydata 0]$sep[lindex $rydata 4]$sep[lindex $rydata 5]
}
}
Now you've done the same thing for 'hry'.

I assume that when you read 'rz' the next time, you get another first word and go looking for it in str1 and str2. I don't see why it shouldn't work, but it may have something to do with the file structure. Alternatively, it might be easier to code, and easier to see what's going wrong, if you don't go back into the files 'hrs' and 'hry'. You can use "lsearch" on the strings str1 and str2 (Tcl will treat each one as a list of words), which returns the index of the first element that matches the search string. Then you can "lindex" the elements after that index that you want (as you do now with your "set rsapp ..." and "set ryapp ...").
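A minimal sketch of that in-memory approach, assuming the file contents form a well-formed Tcl list of whitespace-separated words (no unbalanced braces or quotes) and that each record starts with its id. The sample data, the `|` separator, and the field offsets are made up for illustration. `-exact` is used because `lsearch` defaults to glob-style matching.

```tcl
# Hypothetical stand-in for: set str1 [read $hrs]
set str1 "A100 a1 a2 a3 a4 a5 a6\nA200 b1 b2 b3 b4 b5 b6\n"
set sep "|"          ;# assumed separator
set rzid "A200"

# Treat the whole file as one flat list of words and find the id.
set inds [lsearch -exact $str1 $rzid]
if {$inds == -1} {
    # No matching record: emit empty fields.
    set rsapp $sep$sep
} else {
    # Fields are addressed relative to the position of the id.
    set rsapp [lindex $str1 $inds]$sep[lindex $str1 [expr {$inds + 5}]]$sep[lindex $str1 [expr {$inds + 6}]]
}
```

One caveat with a flat search: it can also hit the id where it happens to appear as an ordinary field in another record, so it is safest when ids cannot collide with field values.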

_________________
Bob Rashkin

dcomit · Nov 4, 2005

Bob,

I've got it working now. Thanks very much for your help. Here's a fragment:

# Look for entries in rs
set inds [lsearch $str1 $rzid]
if {$inds == -1} {
    set rsdata $sep$sep
} else {
    set rsdata [lindex $str1 [expr {$inds+5}]]$sep[lindex $str1 [expr {$inds+6}]]
}

Dave
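A possible variation on the fragment above, not what was posted: index each lookup file by the first word of every line, so that a match cannot come from a field in the middle of another record and each lookup does not rescan the whole list. It assumes the id is the first word of each line and that every line parses as a Tcl list. The proc name `indexById`, the sample data, and the `|` separator are hypothetical.

```tcl
# Build an array mapping id -> full line from the text of a file.
# arrName is the name of an array in the caller's scope.
proc indexById {text arrName} {
    upvar 1 $arrName arr
    foreach line [split $text "\n"] {
        if {[string trim $line] eq ""} continue   ;# skip blank lines
        set arr([lindex $line 0]) $line
    }
}

# Hypothetical stand-in for: set str1 [read $hrs]
set str1 "A100 a1 a2 a3 a4 a5 a6\nA200 b1 b2 b3 b4 b5 b6\n"
set sep "|"          ;# assumed separator
indexById $str1 rs

# Look up one id that exists and one that does not.
set out {}
foreach rzid {A200 Z999} {
    if {[info exists rs($rzid)]} {
        set rsdata [lindex $rs($rzid) 5]$sep[lindex $rs($rzid) 6]
    } else {
        set rsdata $sep$sep
    }
    lappend out $rsdata
}
```

If an id occurs on more than one line, the last line wins in this sketch; whether that is acceptable depends on the data.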
