buffering, eof, read quirks

cptk · Feb 10, 2004

Two problems:
1.) I can't seem to ever get the eof value to be 1...it's
always 0 (i.e. - not eof)!

set fid [open "| c_prgm" "r+"]
fconfigure $fid -buffering line
fconfigure $fid -blocking no
fileevent $fid readable "get-data $fid"
.
.
.
proc get_data {fid} {
while {![eof $fid]} {
set line [read $fid]
}

set first_elem [lindex $line 0]
set last_elem [lindex $fid end]
}

2.) When I only use one fflush(stdout) call in the c_prgm, I get the correct "first_elem" (i.e. START) and "last_elem" (i.e. END). When I use an fflush after each fprintf, I don't get the right results. I played around with the fconfigure -buffering value (auto, none), but it seems that the solution is NOT to do multiple fflush's. Although I got it to work for small data output coming from the c_prgm, I'm concerned about what happens when I start outputting large amounts of data from the c_prgm. My goal though is always the same: I want to have access to the first and last value of any output from the c_prgm, regardless of the size of the output.

example of c_prgm:
fprintf(stdout, "START\n");
fprintf(stdout, "A\n");
fprintf(stdout, "B\n");
fprintf(stdout, "C\n");
fprintf(stdout, "END\n");
fflush(stdout);

Can someone elaborate on this whole phenomenon with buffering, eof, the read cmd, and the gets cmd? Thanks!!!

smugindividual · Feb 12, 2004

I think I see a problem with your fileevent procedure.
You currently have:

proc get_data {fid} {
while {![eof $fid]} {
set line [read $fid]
}

set first_elem [lindex $line 0]
set last_elem [lindex $fid end]
}

and I think it should look more like this:

proc get_data {fid} {
    if {[eof $fid]} {
        catch {close $fid}
    } else {
        set line [read $fid]
        set first_elem [lindex $line 0]
        set last_elem [lindex $line end]
    }
}

fileevent is set up to automatically watch the pipe for you. You don't need to use the while loop. This may be part of your problem.

cptk · Feb 12, 2004

I tried your recommendation, but I'm still having problems!

My Goal: Capture all the data that's outputted from the c_prgm into either one string or one list (for now I don't care, because I can't get all the output data together!)

I know the fileevent watches the pipe, but the problem is that it never knows when there's no more data, i.e. the eof never gets to be "1"! Thus I can't set my first_elem and last_elem values. I've tried using "read" and "gets", but to no avail. I've also played with the c_prgm, varying the fflush(stdout) calls I do, whether after each fprintf or only after the last one.

I'm doing this on a small amount of output. I suspect that I will run into another problem (i.e. - buffering problem) when I expand it to the "real" data (lots of output), but I got to crawl before I can run, huh?

I've about exhausted all the various permutation attempts to get this to work -- where am I going wrong?

smugindividual · Feb 12, 2004

Is it possible that your C program errors out before anything is written to the pipe? Have you tried running the C program on its own? I've found that if I'm using a pipe to run another application, errors that occur in that application aren't reported and the pipe will close as if there was no data. If possible, run your C program on its own to make sure it gets through. If you can't, try adding some print statements at crucial points of the C program so that when it runs you can see its progress. It'll slow it down a little, but once you're done with troubleshooting you can remove the print statements.

good luck

cptk · Feb 12, 2004

No, the C program works fine. Let me further explain that in my C program I go into an infinite loop:

while (c == 0)
{
    scanf("%d", &tk_request);
    switch (tk_request)
    {
        case 1:
            /* run this program */
            break;
        case 2:
            /* ...etc. */
    }
}

When I run the C program on its own (without tcl/tk's front-end), it basically sits waiting for user input. When I enter a correct value (e.g. 1), it does what I expect: it processes data and spits a bunch of output to the screen.

It's this output that I want to capture when I run with my tcl/tk front-end script. Don't get me wrong, I do get data when in the tcl/tk mode, it's just that I can't seem to get all the data into one string or list variable. Also, I can't get to a successful "eof" value. I played with varying the # of fflush's after each fprintf, changed the fconfigure fid -buffering to line, none, auto; I tried fconfigure fid -blocking set to on and off.

Even when I simply output just one fprintf line with a value of say "abc", back in the tcl/tk script, I can't get the eof value to be 1. I tried using both gets and read, nothing seems to work. What the hell am I missing here?

AviaTraining · Feb 12, 2004

Just a note that you'll detect an EOF condition only when the pipe closes. (And even then, Tcl's eof command returns True only if the previous attempt to read from the channel via gets or read detected an EOF condition.) Flushing the channel doesn't close the channel, it just sends any data that is currently buffered. So, if you were flushing stdout in your C program hoping to trigger an EOF in your Tcl script, you're not going to get it.
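The difference between flushing and closing can be seen without any C program at all. The following is a minimal sketch, assuming Tcl 8.6 or later (where chan pipe is available to create an in-process pipe); the variable names are only illustrative.

```tcl
# Assumption: Tcl 8.6+, which provides [chan pipe].
# rd is the read side, wr is the write side of an in-process pipe.
lassign [chan pipe] rd wr
fconfigure $wr -buffering line

# Write one line; line buffering flushes it at the newline.
puts $wr "abc"
gets $rd line

# The data arrived, but the writer is still open, so no EOF yet.
set eofAfterFlush [eof $rd]

# Closing the write side is what ends the stream.
close $wr

# This read finds nothing left and detects EOF (gets returns -1).
set len [gets $rd line]
set eofAfterClose [eof $rd]
close $rd

puts "eof after flush: $eofAfterFlush, eof after close: $eofAfterClose"
```

Note that eof only reports 1 after the gets that follows the close, which matches the point about the previous read having to detect the EOF condition.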

If you've got line-oriented textual data, like your example seems to indicate, I'd probably use a gets command to read a line at a time, and buffer the information read in a variable until I received the "END" marker in the data stream.

Here's the basic code skeleton that I use to read line-oriented textual data from a non-blocking channel:

Code:

set sock [socket $host $port]

# Or: set sock [open "| $cmd" r+]

fconfigure $sock -buffering line -blocking 0
fileevent $sock readable [list ReadLine $sock]

proc ReadLine {sock} {
    if {[catch {gets $sock line} len] || [eof $sock]} {

        # If we encounter an error reading with gets,
        # or if we detect EOF, close the channel.

        catch {close $sock}

    } elseif {$len >= 0} {

        # If gets returned 0 or greater, we successfully
        # read a line of data. Hand the line off to
        # ProcessLine for actual processing.

        ProcessLine $sock $line

    }

    # We reach this point if there wasn't a complete line
    # to read. We'll drop back into the event loop and
    # wait for more data to arrive on the channel.

}

proc ProcessLine {sock line} {

    # Process the line we just read in whatever manner
    # is appropriate.

    # In your case, you could do something like:

    global buffer  ;# A buffer array, one element for
                   ;# each channel.

    switch -- $line {

        START {
            # Clear the data buffer

            set buffer($sock) ""
        }

        END {
            # We've got all the data. Hand it off to
            # another procedure to use it any way we like.

            ProcessData $sock $buffer($sock)
        }

        default {
            # Append the data we just read, along with
            # the newline that gets stripped off, to
            # the end of our data buffer.

            append buffer($sock) $line \n
        }
    }
}
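
The skeleton above calls ProcessData but leaves it undefined. A minimal sketch of one possible version follows, assuming the goal stated earlier in the thread (getting at the first and last values of the output). The body is hypothetical, and because ProcessLine does not store the START and END markers, "first" and "last" here mean the first and last data lines between them.

```tcl
# Hypothetical ProcessData: returns the first and last data lines
# collected between the START and END markers.
proc ProcessData {sock data} {
    # The buffer holds newline-terminated lines; drop the trailing
    # newline so split does not produce an empty final element.
    set lines [split [string trimright $data \n] \n]

    set first_elem [lindex $lines 0]
    set last_elem  [lindex $lines end]
    return [list $first_elem $last_elem]
}

# Example with the data lines from the sample c_prgm output.
set result [ProcessData dummyChannel "A\nB\nC\n"]
puts "first and last: $result"
```

Since the whole block is available as a list at this point, the size of the output does not matter; only the END marker has to arrive.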

- Ken Jones, President, ken@avia-training.com
Avia Training and Consulting,

http://www.avia-training.com

866-TCL-HELP (866-825-4357) US Toll free
415-643-8692 Voice
415-643-8697 Fax

cptk · Feb 12, 2004

Ahhh, Ken I was glad to see that you responded to my post.

First off, I didn't know what you said about eof only being set after the pipe is closed. That solves one of my nagging problems ... THANKS!!

Second, I was just heading in that direction about using an actual END marker (coming from c_prgm) to indicate when all the output has been processed, but I was sort of reluctant, thinking it was a "kludge" -- but it seems that's not the case.

After all that's been said, another question arises:
Do I have to close the pipe after each use, which would then require me to reissue my open cmd? I would think not, since that would defeat the purpose of using the fileevent readable cmd.

AviaTraining · Feb 12, 2004

I don't know exactly what your C program is doing, but I suspect you don't want to keep closing and reopening the pipe. Keep in mind that when you open a pipe in Tcl, the open actually starts the other program running and connects up the pipe channels for you to use. So if you do something like:

Code:

set fid [open "| myprog" r+]

the myprog program isn't running until Tcl executes the open command. In other words, open can't connect to a program that's already running. (Not directly. On Unix systems, you could use named pipes, but that's something completely different, and I'm Not Going There in this post.) If you want to do something like that, you'll be better off using sockets for interprocess communication.

When you close a pipe, the result is that the stdin and/or stdout channels (depending on the mode you used to open the pipe) of the program you started get closed. I'm not certain how your C program is written, but many programs -- particularly those designed to accept input on their stdin channel -- exit when their stdin channel closes. So, depending on how you wrote your C program, closing the pipe to it in Tcl could exit your C program.

Even if your C program doesn't exit when its stdin channel closes, it's still unlikely that you'll want to repeatedly open and close the pipe in Tcl. Why? Because each open command would start a separate copy of your C program running. Probably not what you had in mind.

So, in summary, I think you'll probably want to open a pipe to your program once, perform all interactions with it that you want, and then close the pipe once you no longer need to interact with the program.
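
As a rough illustration of the open-once pattern, here is a sketch that uses a small sh loop as a stand-in for the C program. It assumes a Unix-like system with sh on the PATH; the stand-in script, the request proc, and the START/END framing of its replies are assumptions made for this example, and it uses blocking reads to keep it short.

```tcl
# Stand-in for c_prgm (assumption): reads a request per line and
# answers with a START ... END framed reply.
set script {while read r; do echo START; echo "got $r"; echo END; done}

# Open the pipe ONCE; this is what starts the child process.
set fid [open |[list sh -c $script] r+]
fconfigure $fid -buffering line

# Hypothetical helper: send one request, collect the reply lines
# up to the END marker.
proc request {fid req} {
    puts $fid $req
    set lines {}
    while {[gets $fid line] >= 0} {
        if {$line eq "END"} break
        if {$line ne "START"} { lappend lines $line }
    }
    return $lines
}

# Several interactions over the same pipe, same child process.
set r1 [request $fid 1]
set r2 [request $fid 2]

# Close once, when no further interaction is needed.
close $fid
puts "$r1 / $r2"
```

Here closing the pipe closes the child's stdin, which ends its read loop, so the stand-in exits at that point.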


cptk · Feb 12, 2004

Ken - You're the man ... thanks!

cptk · Feb 12, 2004

Just to clarify, in your code above, stmt.
} elseif {$len >= 0} {

it should be either ...

} elseif {$len = 0} {
or
} elseif {$line >= 0} {

...right?

Since $len will be 0 if the catch is successful, or, if you use $line >= 0, that checks whether the gets at least read some data. What happens if gets fails; what does it return? -1?

AviaTraining · Feb 12, 2004

Nope, it's correct as written. Note how len gets set:

Code:

catch {gets $sock line} len

So catch stores the error message generated in len if an error occurs, or the return value of gets if there is no error. (I'll address what gets returns in a moment.) We then test the return value of catch in the if statement:

Code:

if {[catch {gets $sock line} len] || [eof $sock]} ...

The return value of catch is 0 if it detects no errors executing its script, and non-0 if it detects an abnormal situation, such as an error. I'm implicitly using the return value of catch as a boolean: 0 (false) meaning no error and non-0 (true) meaning error.

Note that in expressions containing the boolean operators || and &&, Tcl evaluates only as much of the expression as is needed. So, in the case of an "or" (||), if the first expression is True, Tcl doesn't bother evaluating the second expression. In this case, that means that if catch indicates an error, Tcl doesn't bother executing the eof subcommand. Alternately, if catch doesn't signal an error (boolean false), that means that the gets command executed successfully, and so when Tcl evaluates the eof command, it's reporting "current" information.

Okay, let's get back to what gets returns. It's a little different with non-blocking channels (like we've got in this case) than with blocking channels (the default).

* For both blocking and non-blocking channels, if there is at least a complete line of data to read, gets reads those characters, strips off the end-of-line character(s) (\r, \n, or \r\n, depending on the -translation setting for the channel -- see the fconfigure reference page), stores the characters excluding the end-of-line in the variable provided (line in this case), and returns the number of characters read excluding the end-of-line.

* For both blocking and non-blocking channels, gets returns -1 if there is no data to read on the channel and it encounters EOF.

* For both blocking and non-blocking channels, if there is an incomplete line but also an EOF condition, gets reads the characters and returns the number of characters read. The EOF condition isn't reported until the next attempt to read from the channel.

* For a blocking channel, if there is not a complete line of data to read but no EOF condition, gets waits ("blocks") until there is a complete line to read (or an EOF condition). For a non-blocking channel, if there is not a complete line of data to read but no EOF condition, gets doesn't read any of the characters, and immediately returns with a return value of -1.

Notice that with non-blocking channels, gets can return -1 either if there is EOF on the channel or there is not a complete line of data to read. To determine which is the case, you need to call either the eof command, which we've seen, or the lesser-used fblocked, which returns 1 if the gets "would have blocked if it were a blocking channel" -- that is, there wasn't a complete line to read -- and 0 otherwise.
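
The two meanings of -1 on a non-blocking channel can be told apart as in the following sketch. It assumes Tcl 8.6 or later for chan pipe, which stands in for the real pipe or socket; the variable names are illustrative only.

```tcl
# Assumption: Tcl 8.6+, which provides [chan pipe].
lassign [chan pipe] rd wr
fconfigure $rd -blocking 0
fconfigure $wr -buffering none

# Case 1: only part of a line has arrived, writer still open.
puts -nonewline $wr "par"
set len1     [gets $rd line]   ;# -1: no complete line yet
set blocked1 [fblocked $rd]    ;# 1: gets "would have blocked"
set eof1     [eof $rd]         ;# 0: not end of file

# Case 2: the rest of the line arrives, so gets now succeeds.
puts $wr "tial"
set len2  [gets $rd line]      ;# length of the completed line
set line2 $line

# Case 3: the writer closes with nothing left to read.
close $wr
set len3 [gets $rd line]       ;# -1 again, but for a different reason
set eof3 [eof $rd]             ;# 1: this time it is EOF
close $rd

puts "partial: len=$len1 fblocked=$blocked1 eof=$eof1"
puts "complete: len=$len2 line=$line2"
puts "closed: len=$len3 eof=$eof3"
```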

So in summary, the code above is correct. If gets raises an error, we close the channel. If eof returns True, the call to gets just encountered EOF so we close the channel. Otherwise, if len -- the return value of gets -- is 0 or greater, we've successfully read a line (or the last characters we're going to receive from the channel before EOF), and we process them. Otherwise, gets must have returned -1, but since we already tested for EOF, it must be telling us that it couldn't read a complete line, and so we drop back into the event loop to wait for more data.

As you can see, using a non-blocking channel can be a little confusing at first. That's why a lot of people just use blocking channels and simplify their handler to something like this:

Code:

set sock [socket $host $port]

# Or: set sock [open "| $cmd" r+]

fconfigure $sock -buffering line
fileevent $sock readable [list ReadLine $sock]

proc ReadLine {sock} {
    if {[catch {gets $sock line} len] || ($len == -1)} {

        # If we encounter an error reading with gets,
        # or if we detect EOF, close the channel.

        catch {close $sock}

    } else {

        # If gets returned 0 or greater, we successfully
        # read a line of data. Hand the line off to
        # ProcessLine for actual processing.

        ProcessLine $sock $line

    }
}

The downside to blocking channels is that it is possible, particularly with sockets, to receive incomplete lines. (The underlying network protocol often can "chunk" the data so that a line gets split across a couple of chunks.) And if our gets command blocks because of an incomplete line, our program is frozen until the rest of the line arrives. For GUI programs or servers handling multiple connections, that can be a big problem. That's why I always use non-blocking channels when possible. And once you get used to the way they work, they're really not that bad.

