Reading directory in windows.

Status
Not open for further replies.

fabien

Technical User
Sep 25, 2001
299
AU
Hi,

I have noticed the following on Windows. If I open a file that contains a directory name like:
C:\MyDocuments\Programming\TclTk
and read it with:
while {[gets $fd l] != -1} {
# only take the first column & store in a list
set swdirlist($nbdir) [lindex $l 0]
set nbdir [expr $nbdir+1]
}
then in swdirlist I get C:MyDocumentsProgrammingTclTk, without the backslashes. I could modify the input file and add a "/" in front of each "\", but this would not work in UNIX. How can I get around this?

Thanks!
 
I don't think you are getting what you ask for anyway.
If you do [llength $l] I think you'll find that your
list is 1 element long.

You could try this:
set num 0
while {[gets $fd l] > -1 } {
set arra([incr num]) [file split [file nativename $l]]
}

Good Luck.


 
Hm. I'm seeing several things going on here, not all of which are related to your question...

For starters, let's look at your code comment "# only take the first column & store in a list". However, you're not storing the information in a list, you're storing it in a Tcl array, which is quite a different data structure. You might find it instructive to read my comments from thread287-250855 "Array uniq, without list translation," to learn more about the distinctions between a list and an array in Tcl. (marsd should find this a familiar thread. :) )

Continuing to review your code, I notice that you're trying to store the filenames in an integer-indexed array. If you read the thread I referenced above, you'll realize that this is a situation where a true Tcl list is far more efficient, and so if at all possible, you'll want to convert your code to using a true list instead of an array.
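
To make the contrast concrete, here is a minimal sketch that stores the same three values both ways. The names dirArray and dirList and the sample values are made up for illustration; they are not taken from the original code.

```tcl
# Illustrative values only; dirArray and dirList are hypothetical names.
set values {alpha beta gamma}

# Integer-indexed array: you maintain the counter yourself.
set n 0
foreach v $values {
    set dirArray($n) $v
    incr n
}

# True list: lappend appends in order, and llength reports the size.
set dirList {}
foreach v $values {
    lappend dirList $v
}

puts "array size: [array size dirArray]"
puts "list size:  [llength $dirList]"
puts "second item via array: $dirArray(1)"
puts "second item via list:  [lindex $dirList 1]"
```

With the list there is no separate counter to keep in sync, and the whole collection can be passed around as a single value.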

Another comment, I notice that you used the following to increment a variable:

Code:
set nbdir [expr $nbdir+1]

As marsd showed in his code, it's much better to use Tcl's incr command to increment the value of a variable (as long as both the original value and the increment are integers). The "gotcha" to look out for with incr is that it takes the name of a variable as an argument (no $ in front of the variable name). So the following is equivalent to the line above:

Code:
incr nbdir

(The code fragment provided by marsd also takes advantage of the fact that the return value of incr is the new value that was assigned to the variable.)
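
A short sketch of both points, the variable name argument and the return value. The array name names is hypothetical and only there to show incr used as an index.

```tcl
set nbdir 0

# incr takes the variable NAME (no $) and returns the new value.
set result [incr nbdir]
puts "nbdir=$nbdir result=$result"   ;# nbdir=1 result=1

# An optional second argument gives the increment.
incr nbdir 5
puts "nbdir=$nbdir"                  ;# nbdir=6

# Because incr returns the new value, it can serve directly as an array index.
set names([incr nbdir]) "example"
puts [array names names]             ;# 7
```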

As for the core question you asked, you're encountering a classic problem of trying to treat an arbitrary string as a list. In Tcl, a list is a whitespace-separated sequence of elements, and an element can be any string value. So what do you do if an element contains whitespace characters? You quote it, using the exact same quoting rules as when quoting a Tcl command string. So:

Code:
France Germany United Kingdom

is a 4 element list, but:

Code:
France Germany {United Kingdom}

is a 3 element list, where the third element is the string "United Kingdom". Notice that the quoting characters themselves are not considered part of the element value:

Code:
% set countries {France Germany {United Kingdom}}
France Germany {United Kingdom}
% lindex $countries 2
United Kingdom

However, all of the Tcl quoting rules apply to lists. So look what happens in the following case:

Code:
% set dir [file nativename [pwd]]
D:\tcltk
% lindex $dir 0
D: cltk

In this case, the "\" in front of the "t" was treated as a Tcl escape character, which caused the "\t" sequence to be translated into a literal Tab character when parsed as a list. You've also got problems if the string isn't a well-formed Tcl list, for example, if it contains unbalanced quoting characters:

Code:
% set test "This {is a test"
This {is a test
% lindex $test 0
unmatched open brace in list

Using your current approach, you'd also run into problems if you have spaces in your file names. So, if at all possible, you should really come up with another way of extracting the data from your file. But it's difficult to suggest alternatives without knowing what your data format looks like.
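
As one possible alternative, and only under the assumption that the columns are separated by single whitespace characters and that no file name contains a space, the split command can turn the line into a real list first. Unlike list parsing, split does not interpret backslashes or braces. The sample line below is made up for illustration.

```tcl
# A line as gets might return it (braces keep the backslashes literal here).
set line {C:\MyDocuments\Programming\TclTk sys global}

# Treating the raw string as a list applies backslash substitution.
puts [lindex $line 0]        ;# C:MyDocumentsProgrammingTclTk

# split breaks on whitespace without interpreting backslashes or braces,
# and the result is a well-formed list that lindex can index safely.
set fields [split $line]
set dir [lindex $fields 0]
puts $dir                    ;# C:\MyDocuments\Programming\TclTk
```

Note that split produces empty elements when two separators are adjacent, so this sketch does not cope with runs of spaces or tabs between columns.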

Once you've got a value that you'd like to add to a list, there are a few Tcl commands you could use to add the value as an element of a list. You might find lappend useful, which adds one or more elements to the end of a list already stored in a variable. The first argument is the variable name (once again, you won't want a "$" in front of the variable name), and each subsequent argument is added as a distinct element at the end of the list. A nice feature of lappend is that it automatically creates the variable for you if it doesn't already exist. So, here's a quick example:

Code:
% lappend vals France Germany "United Kingdom"
France Germany {United Kingdom}
% lappend vals "bad {value"
France Germany {United Kingdom} bad\ \{value
% lindex $vals 3
bad {value

- Ken Jones, President
Avia Training and Consulting
866-TCL-HELP (866-825-4357) US Toll free
415-643-8692 Voice
415-643-8697 Fax
 
Hi, Avia,
Yeah, I noticed this last behavior (path separators) also, a little late to be useful.
Thanks for the concise illustration of the problem.
 
Thanks Marsd and Avia for your suggestions. I have tried Marsd's:
set fd [ open $owdirdat "r"]
AddText "Reading dir.dat..\n"
set nbdir 0
set swdirlist(0) ""
while {[gets $fd l] != -1} {
set swdirlist([incr nbdir]) [file split [file nativename $l]]
# debug##
AddText "PRj: $nbdir : $swdirlist($nbdir)\n"
##
}

and I got the following answer:

PRj: 1 : C:/ MyDocuments Programming TclTk {sgyloader }

My file only contains one line which is:
C:\MyDocuments\Programming\TclTk\sgyloader

Any ideas why?

Thanks!
 
Well, I've got a few ideas of how to approach the problem, but I don't know which might work without knowing the data format. If you could describe the format of the data you're reading, I might be able to offer some suggestions for parsing it.

- Ken Jones, President
 
Hi Avia,

The input file is really simple, one directory per line, i.e.,

C:\MyDocuments\Programming\TclTk\sgyloader
C:\MyDocuments\Programming\TclTk\toto
C:\MyDocuments\Programming\TclTk\titi
...
and I want to store those paths into an array swdirlist()



Thanks,

Fabien
 
If each line contains only a file name, it's really easy. Personally, I'd still advise using a Tcl list rather than a Tcl array, but since you've asked for an array, that's what I'll show you. I'll also demonstrate using Tcl's string trim command to trim off any extraneous whitespace characters that accidentally get inserted before or after the file name. Here it is:

Code:
set fid [open $file r]
set index 0

while {[gets $fid line] >= 0} {
  set swdirarray([incr index]) [string trim $line]
}
close $fid
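
The string trim step may also explain the stray "{sgyloader }" element shown earlier. That is only a guess, since the thread never confirms it, but a trailing space or carriage return on the line would produce exactly that kind of element. A small sketch with a made-up line:

```tcl
# Hypothetical line with a trailing space and carriage return, as might be
# left over from a file with Windows line endings (an assumption).
set line "C:\\MyDocuments\\Programming\\TclTk\\sgyloader \r"

# With no extra argument, string trim strips whitespace from both ends.
set trimmed [string trim $line]

puts "before: [string length $line] characters"
puts "after:  [string length $trimmed] characters"
puts $trimmed
```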

By the way, to get all of them into a single Tcl list (Have I beaten this dead horse long enough? :) ), you could do the following:

Code:
set fid [open $file r]
set swdirlist [split [read $fid] "\n"]
close $fid

Or, if you wanted to be paranoid about stripping off any extraneous whitespace characters before or after the filenames:

Code:
set fid [open $file r]
while {[gets $fid line] >= 0} {
  lappend swdirlist [string trim $line]
}
close $fid
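
One detail about the split/read version: if the file ends with a newline, the resulting list ends with an empty element. The sketch below uses a string as a stand-in for the file contents, so the values are illustrative only. In real code, read accepts a -nonewline option that discards a final newline.

```tcl
# Stand-in for the result of [read $fid] on a file whose last line ends in a newline.
set contents "/p1\n/p2\n/p3\n"

set withEmpty [split $contents "\n"]
puts [llength $withEmpty]   ;# 4: the last element is an empty string

# Dropping the trailing newline first avoids the empty element. This is
# similar to what [read -nonewline $fid] would hand to split.
set clean [split [string trimright $contents "\n"] "\n"]
puts [llength $clean]       ;# 3
```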
- Ken Jones, President
 
Actually, I just realized that my input file can have multiple columns, up to three. The above works fine for one column. What is the best approach if there are several columns and I only want to select the first, second, or third element of each line?

Thanks again.
 
Well, two primary questions spring to mind in that case: 1) can the file names contain spaces (since this is a Windows system, I'm assuming the answer is "yes"); and 2) how are the columns separated?

I'm sure you see where I'm going with this. If the columns are white-space separated, it's going to be very tricky determining whether a white-space character is part of a file name or part of a column separator. So, how can you describe to me unambiguously that a particular column is complete? What are all the different potential data formats the file can contain? What is the information in the file? Which pieces of information will you want to extract? At this point, there are several different tools I could suggest, but the problem is too nebulous for me to know which might work.

- Ken Jones, President
 
Here is an example of a file. This one is from Unix, but it will be similar under Windows (except that the file paths will use \ instead of /):

/data/seismic1/R98 sys global
/data/seismic2/R98 global
/p1 global
/p2 global
/p3 global
/p4 global
/p5 global
/p6 global
/p7 global
/p8 global
/p9 global
/p10 global
/p11 global
/p12 global
/p13 global
/p14 global
/p15 global
/p16 global
/p17 global
/p18 global
/p19 global
/p20 global
/p21 global
/p22 global
/p23 global
/p24 global
/p25 global
/p26 global
/p27 global
/p28 global
/p29 global
/p30 global
/q1 global
 
Hm. I've got a possible solution. Can you at least guarantee that there will be no whitespace characters in the file name following the final file separator (i.e., the final "\" on Windows, the final "/" on Unix, or the final ":" on Macintosh)? And can you guarantee that the file separator character won't appear following the file name anywhere on the line? For example, you won't get any lines like:

/data/seismic2/R98 sys/global

If so, I think we can safely do this with a regular expression. We'll write one that goes to the last file separator, then all characters following that until the first whitespace character. That will be the file name. Then, after any whitespace characters following the file name, we'll capture everything else into another variable I'll call flags. I'm going to use regular expression syntax introduced in Tcl 8.1, so you'll need that or later to use the following. You can do it in an earlier version of Tcl, but we'd need to use a different regexp. Anyway, here it is:

Code:
set fid [open $file r]
set pat {(.*[:/\\]\S*)(?:\s+)?(.*)}
while {[gets $fid line] >= 0} {
  if {![regexp $pat $line -> file flags]} {
    puts stderr "Invalid data format: $line"
  } else {
    # Do whatever you like with the filename
    # stored in "file" and the other data
    # stored in "flags".
  }
}
close $fid
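
As a quick check, here is the same pattern applied to two lines taken from the sample file above, without the file handling. It also indexes into flags with lindex, which is safe here because these flags contain no backslashes or quoting characters.

```tcl
set pat {(.*[:/\\]\S*)(?:\s+)?(.*)}

# Two lines copied from the sample file shown earlier in the thread.
foreach line {
    "/data/seismic1/R98 sys global"
    "/p1 global"
} {
    if {[regexp $pat $line -> file flags]} {
        puts "file=<$file> flags=<$flags> first flag=<[lindex $flags 0]>"
    } else {
        puts stderr "Invalid data format: $line"
    }
}
```

For the first line this reports the file as /data/seismic1/R98 and the flags as "sys global"; for the second, /p1 and "global".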

Then, if the information following the filename is always a sequence of whitespace-separated "words," where each word is guaranteed not to contain a backslash or any quoting characters, you can safely use the contents of the flags variable as a list with Tcl commands like lindex.

- Ken Jones, President
 