StreamReader: why is reading until EndOfStream so slow?

Status
Not open for further replies.

DayLaborer

Programmer
Jan 3, 2006
347
US
I was asked to review and rewrite some code that takes a StreamReader and turns it into a StringBuilder. The original code was this:
Code:
for (int i = 1; i <= dataSize; i++)
{
  if (streamReader.Peek() > 0)
  {
   streamReader.Read(c, 0, c.Length);
   sbXML.Append(c);
  }
}
When I saw that it was processing each character separately, I thought I was making it far more efficient by changing it to this:
Code:
while (!streamReader.EndOfStream)
{
 sbXML.AppendLine(streamReader.ReadLine());
}
What I truly don't understand is that the time to update 1,000 records jumped from 17 seconds with the original code to over 2 minutes with my new, improved code!

Why?! Isn't the way I rewrote it better? Is there something I'm missing to make it faster?

Thanks,
Eliezer
 
Can you show us more code?
I can't see the declaration of the variable 'c'.
 
Also, what is the value of dataSize? Maybe dataSize is less than the total length of the file.

Reviewing the snippet above, there doesn't seem to be a need to loop through the reader at all. All you are doing is appending the text from the file to a StringBuilder. For that matter, you might as well append the entire file in one go.
Code:
sbXml = streamReader.ReadEntireFile();
or whatever the name of the member is.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
 
When I saw that it was processing each character separately, I thought I was making it far more efficient by changing it to this:
It seems that the variable -c- is just a 1-element array, as you needed the for{} statement to loop up to -dataSize- (most likely the file.Length).

The original code is ideal for reading blocks of data. The second version reads line by line (a line being just a series of characters ended by a line terminator). On Windows, this terminator, Environment.NewLine, is chr(13) + chr(10).

Let's say your text file is 500,000 characters long and contains 100 lines, each line varying in length.

If the original code reads in blocks of 4,096 characters (4 KB), it will take about 123 iterations to read everything (500,000 / 4,096 is roughly 122.1).
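To make the iteration count concrete, here is a minimal sketch that counts the Read calls needed to drain a reader. The class and method names (BlockReadCount, CountReads) are made up for illustration, and it uses a StringReader over a hypothetical 500,000-character string so the result is deterministic. A real StreamReader is allowed to return fewer characters than requested, so the count on an actual file may come out higher.

```csharp
using System;
using System.IO;

static class BlockReadCount
{
    // Counts how many Read calls it takes to reach the end of the reader
    // when each call asks for up to blockSize characters.
    public static int CountReads(TextReader reader, int blockSize)
    {
        char[] buffer = new char[blockSize];
        int reads = 0;
        // Read returns the number of characters read, or 0 at the end.
        while (reader.Read(buffer, 0, buffer.Length) > 0)
        {
            reads++;
        }
        return reads;
    }

    static void Main()
    {
        // Hypothetical input: 500,000 characters, as in the example above.
        string text = new string('x', 500000);

        Console.WriteLine(CountReads(new StringReader(text), 4096));
        Console.WriteLine(CountReads(new StringReader(text), 1));
    }
}
```

With the StringReader input this gives 123 calls for a 4,096-character buffer, against 500,000 calls for a 1-character buffer.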

The second version, although it takes fewer iterations, may still end up slower because, in order to find the line terminator, it needs to check for chr(13) + chr(10) every time the offset advances.

The performance becomes even worse if there are actually more 'lines' in the file.
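For a side-by-side look at the two approaches, here is a rough sketch. The names (LineVersusBlock, CopyByLine, CopyByBlock) are made up for illustration, and it does not measure speed. What it does show is a side effect worth knowing about: ReadLine strips the terminator and AppendLine adds Environment.NewLine back, so the line-based copy is not guaranteed to match the input character for character, while the block copy is.

```csharp
using System;
using System.IO;
using System.Text;

static class LineVersusBlock
{
    // Line-based copy, like the rewritten loop in the question.
    public static string CopyByLine(TextReader reader)
    {
        var sb = new StringBuilder();
        string line;
        // ReadLine returns null at the end of the input.
        while ((line = reader.ReadLine()) != null)
        {
            sb.AppendLine(line); // re-adds Environment.NewLine after every line
        }
        return sb.ToString();
    }

    // Block-based copy with a reusable char buffer.
    public static string CopyByBlock(TextReader reader, int blockSize)
    {
        var sb = new StringBuilder();
        char[] buffer = new char[blockSize];
        int n;
        while ((n = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            sb.Append(buffer, 0, n); // append only the characters actually read
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Hypothetical input: "\n" line endings and no terminator after the last line.
        string text = "<a>\n<b/>\n</a>";

        string byLine = CopyByLine(new StringReader(text));
        string byBlock = CopyByBlock(new StringReader(text), 4096);

        Console.WriteLine(byBlock == text); // block copy preserves the input
        Console.WriteLine(byLine == text);  // line copy gained a trailing terminator
    }
}
```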

Going back to the original code, I think you can improve its performance just by increasing the block size (and rewriting the code a bit).

Code:
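 char[] c = new char[4096];  // Read fills a char[] buffer, not a string[]
 while (!streamReader.EndOfStream)
 {
   // n is the number of characters actually read; it may be less than c.Length
   int n = streamReader.Read(c, 0, c.Length);
   sbXML.Append(c, 0, n);
 }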

But if you're just dumping the entire file into memory, why not read the whole thing in one go, as jmeckley pointed out? By the way, it's streamReader.ReadToEnd().
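A minimal sketch of the one-call approach follows. The wrapper names (ReadAllAtOnce, ReadAll) are made up for illustration, and a StringReader with sample XML stands in for the real StreamReader, which the thread never shows being created.

```csharp
using System;
using System.IO;
using System.Text;

static class ReadAllAtOnce
{
    // Reads everything that is left in the reader with a single call
    // and wraps the result in a StringBuilder.
    public static StringBuilder ReadAll(TextReader reader)
    {
        return new StringBuilder(reader.ReadToEnd());
    }

    static void Main()
    {
        // Stand-in for the real StreamReader; any TextReader works here.
        using (var reader = new StringReader("<root><item/></root>"))
        {
            StringBuilder sbXML = ReadAll(reader);
            Console.WriteLine(sbXML.ToString());
        }
    }
}
```

If nothing is appended to the text afterwards, the StringBuilder can be dropped and the string returned by ReadToEnd() used directly.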

[wink]
 