
StreamReader: why is reading until EndOfStream so slow?


DayLaborer (Programmer)
Jan 3, 2006
I was asked to review and rewrite some code that reads from a StreamReader into a StringBuilder. The original code was this:
Code:
for (int i = 1; i <= dataSize; i++)
{
    if (streamReader.Peek() > 0)
    {
        streamReader.Read(c, 0, c.Length);
        sbXML.Append(c);
    }
}
When I saw that it was processing each character separately, I thought I could make it far more efficient by changing it to this:
Code:
while (!streamReader.EndOfStream)
{
    sbXML.AppendLine(streamReader.ReadLine());
}
What I truly don't understand is that the time to update 1,000 records jumped from 17 seconds with the original code to over 2 minutes with my new, improved code!

Why?! Isn't the way I rewrote it better? Is there something I'm missing to make it faster?

Thanks,
Eliezer
 
Can you show us more code? I can't see where the variable 'c' is declared.
 
Also, what is the value of dataSize? Maybe dataSize is less than the total length of the file.

Reviewing the snippet above, there doesn't seem to be a need to loop through the reader at all. All you are doing is appending the text from the file to a StringBuilder. For that matter, you might as well append the entire file in one go.
Code:
sbXML.Append(streamReader.ReadEntireFile());
or whatever the name of the member is.

Jason Meckley
Programmer
Specialty Bakers, Inc.

faq855-7190
 
When I saw that it was processing each character separately, I thought I could make it far more efficient by changing it to this:
It seems that the variable c is just a 1-element array, which is why you needed the for{} loop to run up to dataSize (most likely the file length).

The original code is ideal for reading blocks of data. The second version reads each line, a line being just a series of characters terminated by the Environment.NewLine character(s). On Windows, this terminator is actually chr(13) + chr(10), i.e. "\r\n".
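In C# terms, a quick check (the printed values are what you'd see on Windows):
Code:
Console.WriteLine(Environment.NewLine == "\r\n");  // True on Windows
Console.WriteLine((int)'\r');                      // 13
Console.WriteLine((int)'\n');                      // 10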

Let's say your text file contains 500,000 characters spread over 100 lines, each line varying in length.

If the original code reads in 4096-character (4 KB) blocks, it will take about 122 iterations (500,000 / 4096) to read everything.

The second version, although it makes fewer iterations (one per line), may still end up slower, because in order to find each line terminator it has to check for chr(13) + chr(10) every time the offset advances, i.e. character by character.

The performance becomes even worse if there are actually more 'lines' in the file.
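To make the difference concrete, here's a minimal timing sketch (the file name test.xml and the surrounding harness are assumptions for illustration, not from the original code):
Code:
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

class ReadBenchmark
{
    static void Main()
    {
        const string path = "test.xml";  // hypothetical test file

        // Approach 1: read in 4 KB blocks -- a handful of large reads.
        var sw = Stopwatch.StartNew();
        var sbBlocks = new StringBuilder();
        using (var reader = new StreamReader(path))
        {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.Read(buffer, 0, buffer.Length)) > 0)
                sbBlocks.Append(buffer, 0, n);
        }
        Console.WriteLine("Block reads: {0} ms", sw.ElapsedMilliseconds);

        // Approach 2: ReadLine -- one iteration, and one terminator scan, per line.
        sw = Stopwatch.StartNew();
        var sbLines = new StringBuilder();
        using (var reader = new StreamReader(path))
        {
            while (!reader.EndOfStream)
                sbLines.AppendLine(reader.ReadLine());
        }
        Console.WriteLine("Line reads: {0} ms", sw.ElapsedMilliseconds);
    }
}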

Going back to the original code, I think you can improve its performance just by increasing the block size (and rewriting the code a bit).

Code:
char[] c = new char[4096];  // must be char[], not string[] -- StreamReader.Read fills a char buffer
while (!streamReader.EndOfStream)
{
    int n = streamReader.Read(c, 0, c.Length);  // n = number of characters actually read
    sbXML.Append(c, 0, n);                      // append only the characters read
}

But if you're just dumping the entire file into memory, why not just read the whole thing in one go, as jmeckley pointed out? By the way, the member is streamReader.ReadToEnd().
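Which collapses the whole loop down to a single call:
Code:
sbXML.Append(streamReader.ReadToEnd());  // reads everything remaining in one go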

[wink]
 
