compromise output vs speed

Status
Not open for further replies.

tokra

Technical User
Feb 20, 2003
Recently I've had to write a couple of apps that parse text files of a couple of gigabytes apiece. I wanted to show the user that some progress is being made, but I don't want to do it on every line. I've come up with two solutions and was wondering which one is closer to whatever the generally accepted best practice is.
Code:
//version 1
#include <fstream>
#include <iostream>

std::ifstream in("somefile.txt");
long buffersize=10000;//size of text buffer
char *tempc=new char[buffersize];
unsigned long recordsparsed=0;
int threshold=500;//only output every threshold records

//read from the file stream (not cin); testing getline itself instead of
//eof() avoids parsing a stale buffer after the last line
while(in.getline(tempc,buffersize-1))
{
     parseline(tempc);//do stuff
     recordsparsed++;
    
     //*************************************************************
     //begin relevant code
     
     if (recordsparsed%threshold==0)
     {
         std::cout << recordsparsed << " records parsed" <<std::endl;
     }
     
     //end relevant code
     //*************************************************************
}
delete [] tempc;
tempc=NULL;



//&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^&^
//version 2
std::ifstream in("somefile.txt");
long buffersize=10000;//size of text buffer
char *tempc=new char[buffersize];
unsigned long recordsparsed=0;
int threshold=500;//only output every threshold records
while(in)//stops once a read has failed (end of file or error)
{
      
       //*************************************************************
      //begin relevant code
      
      long countdown=threshold;//variable to countdown since last output
      //test the countdown first so no line is read and then dropped;
      //read from the file stream (not cin)
      while(countdown!=0&&in.getline(tempc,buffersize-1))
      
      //end relevant code
      //*************************************************************
      {
            parseline(tempc);//do stuff
            recordsparsed++;
            countdown--;
      }
      std::cout << recordsparsed << " records parsed" <<std::endl;
}
delete [] tempc;
tempc=NULL;

WR
 
I think the first solution is faster and more elegant.
You could change the threshold to 1000 or pass it on the command line as a parameter.
If it is absent or less than 1000, set it to 1000.
Also add
std::cout << recordsparsed << " records parsed" <<std::endl;
after the loop in the first solution to print the final message.
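
A rough sketch of how those suggestions might fit together, in case it helps. The names parse_threshold, parse_with_progress and progress_report are made up for illustration, counting a line stands in for the real parseline call, and std::getline on a std::string is swapped in for the fixed char buffer. Treat it as one possible shape under those assumptions, not the definitive way to do it.

```cpp
#include <cassert>
#include <cstdlib>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: turn an optional command-line argument into a threshold.
// A missing, malformed, or too-small value falls back to 1000.
unsigned long parse_threshold(const char* arg)
{
    const long minimum = 1000;
    if (arg == nullptr) return minimum;
    char* end = nullptr;
    const long value = std::strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || value < minimum) return minimum;
    return static_cast<unsigned long>(value);
}

// Version 1 style loop: report every `threshold` records, then once more
// after the loop so the final count is always shown.
unsigned long parse_with_progress(std::istream& in, std::ostream& out,
                                  unsigned long threshold)
{
    assert(threshold > 0);
    unsigned long recordsparsed = 0;
    std::string line;
    while (std::getline(in, line)) {
        // the real parseline(line) call would go here (assumption)
        ++recordsparsed;
        if (recordsparsed % threshold == 0)
            out << recordsparsed << " records parsed\n";
    }
    out << recordsparsed << " records parsed (done)\n";
    return recordsparsed;
}

// Hypothetical wrapper for trying the loop on in-memory text.
std::string progress_report(const std::string& text, unsigned long threshold)
{
    std::istringstream in(text);
    std::ostringstream out;
    parse_with_progress(in, out, threshold);
    return out.str();
}
```

In a real program you would presumably call parse_threshold(argc > 1 ? argv[1] : nullptr) from main and hand parse_with_progress the std::ifstream and std::cout.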

-obislavu-
 
Also there's a cool trick you can do if the output is just going to a console window and not to a file or pipe. Rather than outputting std::endl every line, output character '\r' (carriage return).

This causes the next line to draw over the previous line. Just remember to write the same number of characters each time, or if you write fewer, then write extra space ' ' characters to draw over what the previous line had.

You can make one of the characters on the line alternate between '-', '/', '|', '\', and back to '-' for an old-school "animation" effect. :)
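
For anyone who wants to try the carriage-return trick, here is a minimal sketch. The function name progress_line and the default field width of 10 are assumptions for illustration. Padding the count with std::setw keeps every line the same length as long as the number fits in the field, so nothing from the previous line is left showing.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: build one progress line that starts with '\r' so it
// overwrites the previous one. `tick` selects the spinner character.
std::string progress_line(unsigned long records, unsigned long tick, int width = 10)
{
    assert(width > 0);
    static const char spinner[] = {'-', '/', '|', '\\'};
    std::ostringstream out;
    out << '\r' << spinner[tick % 4] << ' '
        << std::setw(width) << records   // right-aligned, padded with spaces
        << " records parsed";
    return out.str();
}
```

Inside the loop this would be used roughly as std::cout << progress_line(recordsparsed, recordsparsed / threshold) << std::flush; since no std::endl is written, the explicit flush is what makes the line show up right away. Write a final std::endl after the loop so the shell prompt does not land on top of the last progress line.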
 
teriviret- thanks for the tip, I didn't know you could do that.
obislavu- I like the first solution too, because there's less room for screwing it up when you code it. It all comes down to how fast the % operator is.
With the first one, recordsparsed%threshold must be computed on every iteration.
With the second one, a test for inequality with zero and a decrement on a long have to be performed on every iteration.

I don't know enough about how they are implemented to be sure which one is cheaper. The only reason I tried the second one is because decrement and testing for inequality with zero just "feel" like they should be simple and fast.
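
One way to settle it is to measure. The sketch below isolates the two counting schemes with no file I/O at all. The names reports_modulo, reports_countdown and time_microseconds are invented for illustration, and the countdown here is a flattened single-loop variant of version 2 rather than the nested loops. Timings will vary by compiler and machine, and an optimizer may simplify loops this small, so take any numbers as a rough hint only. In practice the cost of either check is likely to be tiny next to reading and parsing a line.

```cpp
#include <cassert>
#include <chrono>

// Version 1 style: compute parsed % threshold for every record.
unsigned long reports_modulo(unsigned long records, unsigned long threshold)
{
    assert(threshold > 0);
    unsigned long reports = 0;
    for (unsigned long parsed = 1; parsed <= records; ++parsed)
        if (parsed % threshold == 0) ++reports;
    return reports;
}

// Version 2 style: decrement a countdown and reset it when it reaches zero.
unsigned long reports_countdown(unsigned long records, unsigned long threshold)
{
    assert(threshold > 0);
    unsigned long reports = 0;
    unsigned long countdown = threshold;
    for (unsigned long parsed = 1; parsed <= records; ++parsed) {
        if (--countdown == 0) {
            ++reports;
            countdown = threshold;
        }
    }
    return reports;
}

// Hypothetical timing helper: runs f once and returns elapsed microseconds.
template <typename F>
long long time_microseconds(F f, unsigned long records, unsigned long threshold)
{
    const auto start = std::chrono::steady_clock::now();
    volatile unsigned long sink = f(records, threshold); // keep the result alive
    (void)sink;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_microseconds(reports_modulo, n, 500) and time_microseconds(reports_countdown, n, 500) for a large n gives a side-by-side comparison on your own setup.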

WR
 