Downloading File Logic 5

Status: Not open for further replies.

Glenn9999 (Programmer)
This isn't so much a puzzle as it is a logic problem. Say I have a process downloading a file. I rigged it to run a callback function roughly once a second, which returns the amount of data downloaded.

Now, I'm a bit puzzled about how to come up with some of these (more for the sake of completeness than anything else):

1) Download speed. I could just report the amount of data downloaded since the last callback and call it done - in truth that is the data transfer rate. But is there something more proper, like taking an average over all the times the callback fired? Or is the simple approach already the most proper one?

2) Estimated Time Left: Now this might be more appropriate to use an average against. What I'm doing is taking the amount I have left to download and then dividing it by the download speed. Seem proper?

The main thing I notice is that the numbers jump around wildly, so to remove any variation in sampling, would I need to be averaging the download speed samples?

(Statistics was never my thing, so yeah I'm not 100% sure about this)
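For what it's worth, here is a minimal sketch of the kind of bookkeeping being described, assuming the callback fires roughly once a second and reports the cumulative bytes downloaded. The class and method names (DownloadStats, on_progress) and the 10-sample window are made up for illustration; it averages the recent per-second samples to smooth the speed and divides the remaining bytes by that speed for the ETA.

Code:
from collections import deque

class DownloadStats:
    """Track per-callback samples; report a smoothed speed and an ETA."""

    def __init__(self, total_bytes, window=10):
        self.total_bytes = total_bytes
        self.samples = deque(maxlen=window)   # recent per-second byte counts
        self.downloaded = 0

    def on_progress(self, bytes_so_far):
        # Called roughly once a second with the cumulative bytes downloaded.
        self.samples.append(bytes_so_far - self.downloaded)
        self.downloaded = bytes_so_far

    def speed(self):
        # Smoothed speed: average of the recent per-second samples.
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def eta_seconds(self):
        # Remaining bytes divided by the smoothed speed.
        s = self.speed()
        return (self.total_bytes - self.downloaded) / s if s > 0 else float("inf")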

It is not possible for anyone to acknowledge truth when their salary depends on them not doing it.
 
This is not a puzzle as such; you would be better off asking for help in the forum for the language you are using.

As a quick observation, you could track a number of variables:

Average download speed: the amount downloaded divided by the total time.

Instantaneous download speed: the data downloaded during the last download period.

For the ETA I would suggest using the average download speed, as it would tend to balance out the peaks and troughs.
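To make the distinction concrete, here is a rough Python sketch under the same once-per-second callback assumption as above; make_rate_tracker is a hypothetical helper, not anything from a real library. The average speed is total bytes over total elapsed time, the instantaneous speed is just the last period's bytes, and the ETA uses the average.

Code:
import time

def make_rate_tracker(total_bytes):
    start = time.monotonic()
    last = {"t": start, "bytes": 0}

    def update(bytes_so_far):
        now = time.monotonic()
        # Average speed: everything downloaded so far over the total elapsed time.
        avg_speed = bytes_so_far / max(now - start, 1e-9)
        # Instantaneous speed: only the data moved during the last period.
        inst_speed = (bytes_so_far - last["bytes"]) / max(now - last["t"], 1e-9)
        last["t"], last["bytes"] = now, bytes_so_far
        # ETA from the average speed, which balances out the peaks and troughs.
        eta = (total_bytes - bytes_so_far) / avg_speed if avg_speed > 0 else float("inf")
        return avg_speed, inst_speed, eta

    return update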


I do not Have A.D.D. im just easily, Hey look a Squirrel!
 
I agree with IP Guru,

but you can have a long period with a low download rate, and once the download continues at its normal higher speed, that low-speed period will still influence the ETA very much. You can also have the inverse situation of a short period of fast download. The ETA calculation turns out even harder if there are concurrent downloads: one of them will be the first to finish, and then its bandwidth becomes available for the remaining downloads. I haven't seen any downloader taking that into account. Then there are environments like a company having a very big bandwidth, but that bandwidth is used by a strongly varying number of users and is also divided among more or fewer downloads.

How about computing a best, medium and worst case ETA, taking respectively the best download rate, the average download rate and the worst download rate > 0?
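A rough sketch of that three-estimate idea, assuming a list of per-second byte counts has been collected (the function name eta_range is made up for the example):

Code:
def eta_range(remaining_bytes, samples):
    # Best, medium and worst case ETA from the best, average and worst (> 0) rates.
    positive = [s for s in samples if s > 0]
    if not positive:
        return None                                   # no usable rate yet
    best_rate = max(positive)
    avg_rate = sum(positive) / len(positive)
    worst_rate = min(positive)
    return {
        "best_case": remaining_bytes / best_rate,     # fastest observed rate
        "medium_case": remaining_bytes / avg_rate,    # average rate
        "worst_case": remaining_bytes / worst_rate,   # slowest nonzero rate
    }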

Bye, Olaf.
 
Some things to consider:
1) How important is this feature to the whole application, i.e. how much time/effort should be spent on it?

2) How accurate do you need to be? Will it matter if the calculations are off by 10%, 50%, 200%, etc.?

3) How long are the downloads going to be? It seems to me that more accurate calculations are needed if the download time is expected to be minutes or hours rather than seconds or a fraction of a second.

For fun, let's say that it is very important, needs to be highly accurate, and the download times are in minutes to hours. I will add to that the assumption that more recent download speed matters more than older download speed.

1) Determine the speed for the last unit of time (say a second) and put it into a data structure (an array).

2) Average the array elements in groups (say the first third, second third and last third; or maybe the first 60%, the next 30% and the last 10%). The more groups, the better, but I suspect there will be diminishing returns.

3) Add up the group averages with a weight factor, for example:
(Avg Group-1 * 1) + (Avg Group-2 * 3) + (Avg Group-3 * 6). In this case 60% of the weight comes from the most recent download speed information.

4) Final download speed = (weighted sum from step 3) / (total of the weights)

Play with the number of groups, the elements in the groups and the weights to find the best solution (a sketch of the whole scheme follows below). While debugging, I suggest saving the group values, projections and final actual times in a data structure to review.
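Something like the following would be one way to code the grouped, weighted scheme, assuming the per-second speeds are kept in a list with the oldest sample first. The three groups and the 1/3/6 weights mirror the example above; the function names are invented for the sketch.

Code:
def weighted_speed(samples, weights=(1, 3, 6)):
    # Split the speed history into len(weights) groups, average each group,
    # then combine the group averages with the given weights (latest heaviest).
    if not samples:
        return 0.0
    n_groups = len(weights)
    size = max(len(samples) // n_groups, 1)
    groups = [samples[i * size:(i + 1) * size] for i in range(n_groups - 1)]
    groups.append(samples[(n_groups - 1) * size:])    # last group takes the remainder
    weighted_sum = 0.0
    used_weight = 0
    for group, weight in zip(groups, weights):
        if group:                                     # skip groups that are still empty
            weighted_sum += (sum(group) / len(group)) * weight
            used_weight += weight
    return weighted_sum / used_weight if used_weight else 0.0

def eta_seconds(remaining_bytes, samples):
    speed = weighted_speed(samples)
    return remaining_bytes / speed if speed > 0 else float("inf")

Changing the weights tuple is enough to experiment with different group counts and different emphasis on recent samples.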

Lion Crest Software Services
Anthony L. Testi
President
 
Thanks all! I know this isn't the kind of thing that normally shows up here, but it was more a statistical logic problem (in my mind) than a direct programming issue relating to any particular language or software.

I ended up averaging the values, and the result was closer to what I was expecting. Thanks again for the ideas!

It is not possible for anyone to acknowledge truth when their salary depends on them not doing it.
 
In fact Anthony has laid it out nicely. In my approach the worst download rate would put the estimate way off the real time needed as soon as you have a period with a low download rate. The latest download speed is surely a better indicator, but it gets even more complex with concurrent downloads.

For example: if you have 3 downloads, each with approximately the same third of a rather stable download bandwidth, and the ETAs computed from their recent download rates are 1, 2 and 3 minutes because the downloads are of different sizes, then you already know that after 1 minute the first download stops and its bandwidth can be shared by the remaining downloads. So the third download will run at its current speed for the first minute, then speed up, and then speed up again after the 2nd download finishes. I know of no download manager taking this into account, even when the internet bandwidth is really stable and you could predict it.
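If the total bandwidth really were that stable and shared evenly, the speed-ups described here could be predicted with a small simulation like this sketch (shared_bandwidth_etas is a made-up name; it assumes perfectly even sharing and a constant total bandwidth in bytes per second):

Code:
def shared_bandwidth_etas(remaining_bytes, total_bandwidth):
    # remaining_bytes: bytes left per concurrent download.
    # Returns each download's predicted finish time in seconds, in the same order.
    remaining = list(remaining_bytes)
    finish_time = [None] * len(remaining)
    active = [i for i, r in enumerate(remaining) if r > 0]
    now = 0.0
    while active:
        share = total_bandwidth / len(active)          # each active download's rate
        nxt = min(active, key=lambda i: remaining[i])  # the next one to finish
        step = remaining[nxt] / share
        now += step
        still_active = []
        for i in active:
            remaining[i] -= share * step               # everyone progresses for 'step' seconds
            if remaining[i] <= 1e-9:                   # finished during this step
                finish_time[i] = now
                remaining[i] = 0
            else:
                still_active.append(i)
        active = still_active
    return finish_time

With the 1/2/3-minute example above, the naive per-download ETAs are 60, 120 and 180 seconds, but this simulation predicts 60, 100 and 120 seconds, because each finished download frees its bandwidth for the others.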

Bye, Olaf.
 
One other point to consider:

Be careful of illogical situations. Say 90% of the download has completed (as measured in file size, not in time) and, because of low transfer rates, it took 9 minutes to get that far. We could ‘project’ that it will take a total of ~10 minutes to complete, and no shorter than 9.001 minutes, because it has already taken 9 minutes. But what happens if all of a sudden the transfer rate increases 1000x? The formula/algorithm may then say that the total download time is < 9 seconds. In short, watch out for applying the current speed calculation backwards to time that has already elapsed.

The formula should be: (Time already taken) + ((Remaining data to send) / (Projected speed))
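As a tiny sketch of that guard, with the projected speed being whatever smoothed estimate is in use:

Code:
def projected_total_time(elapsed_seconds, remaining_bytes, projected_speed):
    # Total time = time already spent + time the rest should take at the
    # projected speed, so the estimate can never drop below the elapsed time.
    if projected_speed <= 0:
        return float("inf")
    return elapsed_seconds + remaining_bytes / projected_speed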


Lion Crest Software Services
Anthony L. Testi
President
 
Glenn, you've missed out a vital part of the process.

After you've established the true remaining download time based on average download speed and remaining data to download, you have to transform it by some appropriate function such that the completion rate appears at least exponential to the user.

The idea of this is that any progress bar should whizz along really fast up to about 85%. It should then slow down substantially, but still move, up to 95%. Then it should craaaawl. Finally, at 99%, it should freeze completely for at least a minute.

From a user's perspective, this seems to be the protocol followed by most of the world's leading software manufacturers, so I assume it's Current Best Practice.
 