One thing I've recently run into while studying algorithms and testing them against large files is that my standing assumption of a 64K buffer is probably inconsistent and outdated (it came from testing on an older system a few years ago).
Different devices are capable of different transfer rates, which complicates things. On top of that, I suspect today's hard drives (and other kinds of drives as well) can be driven a lot harder than 64K at a time, with a real performance advantage. Remember that reading too little of a file at once carries a performance penalty (reading a byte at a time, for example, is VERY slow).
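Just to make the question concrete, this is the kind of rough timing harness I have in mind. It's only a sketch under my own assumptions: plain Win32 sequential reads, buffer sizes doubling from 64K up to 8M, and a placeholder file name. It measures where the sweet spot actually is on a given drive instead of guessing. (Note that the file system cache will skew repeated runs unless you use a fresh file or FILE_FLAG_NO_BUFFERING, which in turn requires sector-aligned buffers.)

// Sketch: time one sequential pass over a file at several buffer sizes.
#include <windows.h>
#include <cstdio>
#include <vector>

double TimeSequentialRead(const wchar_t* path, DWORD bufSize)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING,
                           FILE_FLAG_SEQUENTIAL_SCAN,   // hint to the cache manager
                           nullptr);
    if (h == INVALID_HANDLE_VALUE) return -1.0;

    std::vector<char> buf(bufSize);
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    DWORD bytesRead = 0;
    unsigned long long total = 0;
    while (ReadFile(h, buf.data(), bufSize, &bytesRead, nullptr) && bytesRead > 0)
        total += bytesRead;

    QueryPerformanceCounter(&t1);
    CloseHandle(h);

    double seconds = double(t1.QuadPart - t0.QuadPart) / double(freq.QuadPart);
    printf("%7lu KB buffer: %llu bytes in %.3f s (%.1f MB/s)\n",
           bufSize / 1024, total, seconds,
           total / (1024.0 * 1024.0) / seconds);
    return seconds;
}

int main()
{
    // 64 KB up to 8 MB, doubling each time; the file name is just a placeholder.
    for (DWORD size = 64 * 1024; size <= 8 * 1024 * 1024; size *= 2)
        TimeSequentialRead(L"C:\\test\\bigfile.bin", size);
    return 0;
}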
Add to that the fact that Windows supports memory-mapped files as well as overlapped I/O (async I/O: read a block, process it in a separate thread, and leave the main thread free to read the next block), and it makes me wonder whether it's time to rethink the whole performance strategy.
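For the overlapped case, here's roughly the shape I'm picturing, though it's only a sketch of one variant: instead of handing the data to a worker thread, it keeps everything on one thread and lets the kernel run the next read in the background while the current chunk is processed (double buffering). The 1 MB chunk size and the Process() callback are placeholders of mine, not anything definitive.

// Sketch: double-buffered overlapped reads on a single thread.
#include <windows.h>
#include <vector>

const DWORD kChunk = 1024 * 1024;

void Process(const char* data, DWORD len)   // placeholder for the real work
{
    (void)data; (void)len;
}

bool StreamFile(const wchar_t* path)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING,
                           FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN,
                           nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;

    std::vector<char> buf[2] = { std::vector<char>(kChunk), std::vector<char>(kChunk) };
    OVERLAPPED ov[2] = {};
    ov[0].hEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr);
    ov[1].hEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr);

    unsigned long long offset = 0;
    int cur = 0;

    // Kick off the first read; with FILE_FLAG_OVERLAPPED, ReadFile normally
    // returns FALSE with GetLastError() == ERROR_IO_PENDING.
    ov[cur].Offset = (DWORD)(offset & 0xFFFFFFFF);
    ov[cur].OffsetHigh = (DWORD)(offset >> 32);
    ReadFile(h, buf[cur].data(), kChunk, nullptr, &ov[cur]);

    for (;;)
    {
        // Start the next read before touching the current buffer.
        int nxt = 1 - cur;
        unsigned long long nextOffset = offset + kChunk;
        ov[nxt].Offset = (DWORD)(nextOffset & 0xFFFFFFFF);
        ov[nxt].OffsetHigh = (DWORD)(nextOffset >> 32);
        ReadFile(h, buf[nxt].data(), kChunk, nullptr, &ov[nxt]);

        // Wait for the current chunk to finish, then process it.
        DWORD got = 0;
        if (!GetOverlappedResult(h, &ov[cur], &got, TRUE) || got == 0)
            break;                      // end of file or error
        Process(buf[cur].data(), got);

        offset = nextOffset;
        cur = nxt;
    }

    // Drain the read that is still in flight before tearing anything down.
    DWORD drain = 0;
    GetOverlappedResult(h, &ov[1 - cur], &drain, TRUE);

    CloseHandle(ov[0].hEvent);
    CloseHandle(ov[1].hEvent);
    CloseHandle(h);
    return true;
}

Whether that beats a memory-mapped view (CreateFileMapping/MapViewOfFile) or a plain big-buffer loop is exactly the kind of thing I'd want to measure rather than assume.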
Any thoughts on how to approach this so that optimum performance can be had from any device, no matter what it is?
It is not possible for anyone to acknowledge truth when their salary depends on them not doing it.