
Metering Data Volume

Status
Not open for further replies.

stanlyn

Programmer
Sep 3, 2003
945
US
Hi,

Any ideas on the best approach for collecting data throughput and/or metering it as a user uses the system? The system is a document management system where the data and images are stored in SQL Server remotely. Our new billing model will be based on a bucket system, where the user purchases a bucket of X size to be consumed within a defined period. Therefore, I need to journal outgoing query sizes and their returned result sizes, as well as the sizes of the images that are actually returned.

Also wondering if there is some sort of control/software that I can place in the app, through which all data flowing in and out of the app would pass, and which the app could of course communicate with.

This topic is more about discussing the pros and cons of how to do this, so please, if you see an issue, let's talk about it...

Thanks,
Stanley
 
Update...
Using ImDisk with a 2 GB RAM disk, the speed was about 20-25% better. 1,000 records took 1.2 seconds and 100,000 took 9-10 seconds...

 
>I will be pulling down the images on demand, and not when the query results are returned

I don't know what you mean by this; at some point you will fetch an image, won't you? Then you meter that.

I understand it's hard to separate each individual customer's traffic from the overall total. A number like the record count would only allow a rough estimate of the share, and not in real time, rather as an end-of-month estimate: total traffic / total of all record counts = traffic per record, and traffic per customer = that base number times his record count.

Anyway, I see you already made use of the RAM disk idea. That sounds like awful performance, or are you talking about the data retrieval from remote into the cursor? I can't believe your figures, because when you fetch into a cursor there is a 2 GB limit, and you talk of 7,710,902 K, which would be about 7 GB, wouldn't it? That speed would actually be very nice, if you really got that volume of data from a remote MSSQL db in 2.5 seconds.

Any action on the RAM disk below the 2 GB maximum size a cursor can have should take only split seconds, considering that RAM today can process volumes on the order of 12 GB/s. And 7 GB in 2.5 seconds sounds like almost RAM speed.

If you write out a cursor with blobs to the RAM disk just to determine its size, then in case you have one field that contains more than 90% of the size you're interested in, you can simply use LEN(field) as an estimate. That means no copy needs to be made at all, which is fastest.
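A minimal sketch of that idea, assuming a fetched cursor crsDocs with a blob field doc_blob (both names are placeholders, not the actual application's names):

```foxpro
* Hypothetical sketch: sum the blob field sizes of a fetched
* cursor in memory instead of writing it out to disk.
LOCAL lnBytes
lnBytes = 0
SELECT crsDocs
SCAN
    * LEN() on a blob/memo field returns its content size in bytes
    lnBytes = lnBytes + LEN(doc_blob)
ENDSCAN
* lnBytes now approximates the payload transferred for this result
? lnBytes
```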

Bye, Olaf.

Olaf Doschke Software Engineering
 
Olaf,
I don't know what you mean by this; at some point you will fetch an image, won't you? Then you meter that.

Remember when I said the images are in a separate table and there is a parent/child relationship? The data returned into a cursor/table will not contain the image; it will, however, contain a key into the image table, and if the customer wants to view the image, we fetch that image using the key. This keeps the returned cursor small, and we view the images on demand. Each fetched image will be metered and saved locally so a re-fetch is not needed, in case they want to print it, or ???.

In my numbers I should have said B (bytes) instead of K, which dramatically changes the actual size, sorry... After you said what you said, I too was wondering why it did not complain about the 2 GB limit. Here is a look at their real sizes; also note this is not using a RAM or SSD disk, but an actual platter disk. Also, again, the SQL machine is on the same local gigabit LAN, and nothing is considered remote (non-LAN).

[screenshot: zzzz002_lqihqn.png, showing the actual file sizes]


determine a rough estimate of the share, but not in real time, rather as an end-of-month estimate, all traffic / total of all reccount

We need to do this as close to real time as possible; otherwise a customer could easily rack up a lot of usage (by sharing their credentials with co-workers, etc.). We also need to put up notifications (based on a user-defined threshold) when the bucket falls below that threshold, and offer an upgrade.
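For illustration, a rough sketch of that kind of near-real-time bucket accounting. The table and field names (bucket, cust_id, size_b, used_b, warn_b) and the goApp.nCustId global are all assumptions, not the actual schema:

```foxpro
* Hypothetical sketch: charge each fetch against the customer's
* bucket and warn when the remaining volume drops below their
* user-defined threshold.
PROCEDURE MeterFetch
    LPARAMETERS tnBytes  && bytes just fetched over the wire
    UPDATE bucket SET used_b = used_b + m.tnBytes ;
        WHERE cust_id = goApp.nCustId
    SELECT size_b - used_b AS remain_b, warn_b FROM bucket ;
        WHERE cust_id = goApp.nCustId INTO CURSOR crsRemain
    IF crsRemain.remain_b < crsRemain.warn_b
        * threshold reached: notify and offer an upgrade
        MESSAGEBOX("Your data bucket is running low.", 48, "Metering")
    ENDIF
    USE IN SELECT("crsRemain")
ENDPROC
```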

you have a field which contains >90% of the size you're interested
Question... Let's say a specific blob field in a table contains 2 MB of real data (not padding) for one record, and the same field is empty for all the other records. What would the resulting cursor size look like? Would each record be the same size as the largest? Or would the size reflect its true usage? How close would the table size be to what was transmitted over the wire? And I know that compression can play a big role here.

Thanks,
Stanley
 
>and if the customer wants to view the image, we fetch that image using the key.

Yes, so it's as I said:
at some point you will fetch an image, won't you? Then you meter that.

If images contribute almost 100% of the traffic, you could concentrate on metering them only, if the rest of the transfer is negligible. You're not working by the 80/20 rule if you meter everything precisely, when it would suffice to meter the major traffic part and estimate the total as a factor of that size. In your case the document data seems to split up with about 1/3 in the DBF and 2/3 in the FPT. If all FPT data is in one blob field, the measurement of all FPT data would simply be SUM(LEN(blobfield)), and the DBF size can be determined with RECCOUNT()*RECSIZE(). If you take that route, you don't need to write out the cursor to measure these sizes.
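Putting those two measurements together, a sketch (cursor name crsResult and field name blobfield are placeholders):

```foxpro
* Hypothetical sketch: estimate the size of a fetched result set
* without writing it out to disk.
LOCAL lnFixed, lnBlobs
SELECT crsResult
lnFixed = RECCOUNT() * RECSIZE()   && DBF part: fixed-width records
* FPT part: sum the actual blob contents
SELECT SUM(LEN(blobfield)) FROM crsResult INTO ARRAY laBlob
lnBlobs = IIF(TYPE("laBlob(1)") = "N", laBlob(1), 0)
? "Estimated transfer:", lnFixed + lnBlobs, "bytes"
```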

And stanlyn, don't confuse "remote" with "outside the LAN". If you make a connection to a database, that is remote access; any SQL passthrough is remote, no matter whether the database is in the local LAN or on the internet. One consequence is that any access through SQLEXEC comes in as a cursor, not as a filter, so you don't have that problem at all.

There were too many things that are just an unimportant side track, if you concentrate on the core problem. To meter the images, all you need is LEN() of the blob field you fetch. You can go without storing the cursor to HDD to measure it.

Bye, Olaf.

Olaf Doschke Software Engineering
 
