Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sum vs Sum Nooptmize

Status
Not open for further replies.

gkratnam

Programmer
Aug 9, 2007
37
US
I like to optimize some old code here. Not sure it will have any negative impact.

Code:
SELECT PrjTime
SEEK cur_projid &&PrjTime table has an index on project id
SUM NOOPTIMZE while (project_id = cur_projid) prj_hours to total_prjhours

Would removing the NOOPTIMZE cause any issues?

Please assist. Thanks!
 
In short summary, directly answering your question: No. While isn't optimized with or without NOOPTIMIZE clause.

This section of a Rushmore optimization help topic describes it:

VFP help said:
Operating Without Rushmore Query Optimization
Data retrieval operations proceed without Rushmore optimization in the following situations:

When Rushmore cannot optimize the FOR clause expressions in a potentially optimizable command.

When a command that might benefit from Rushmore contains a [highlight #FCE94F]WHILE[/highlight] clause.

When memory is low. Data retrieval continues, but is not optimized

So not using Rushmore has already been decided by using the WHILE scope.

So to twist the question a bit: Can this be speed up by Rushmore? Maybe, but surely not by just removing the NOOPTIMIZE clause.
Removing the SEEK and just doing SUM prj_hours to total_prjhours FOR project_id = cur_projid would use Rushmore.

The steps are (in general):
Find usable indexes for the FOR conditions
Create Rushmore bitmaps, if memory allows (setting SYS(3050))
Combine Rushmore bitmaps due to boolean algebra operations in the single terms of the FOR clause
Visit the record according to the bits of the result Rushmore bitmap.

The overall goal is least read access of the DBF by first knowing which recnos fulfill the for condition.

If you SEEK in proj_id order, the WHILE or REST scope are ideal and need no Rushmore. It will also depend on the number of records how you actually profit from knowing the recnos from the ruhmore optimization steps vs needing to traverse the index tree node by node intil there is no same valued sibling anymoe.

In your sepcial case, the conditions that Rushmore is just overhead are easily met. First, you only make use of one Rushmore bitmap as there only is one condition on the proj_id. Let me make the assumption one project_id makes less than 1% of all your data, then Rushmore still has to first create a bitmap with 1 bit for each record (also for those not matching), and while it only needs 1 bit per record and todays RAM memory not rarely allows you to take away 2GB for your VFP app, this still might be time better spent on traversing the index tree nodes more directly without even building up one Rushmore bitmap.

You could simply experiment and try out this instead:
Code:
Select Sum(prj_hours) from PrjTime Where project_id = cur_projid into array total_prjhours
? total_prjhours
That is SQL that will use Rushmore (Unless you SET OPTIMIZE OFF).

But you know (now) in the SUM command with a WHILE scope, NOOPTIMIZE is not doing anything, removing it also won't add Rushmore optimization usage, as the WHILE clause can't foresee which bits in a Rushmore bitmap mean consecutive records in the index order (that's what WHILE or also REST need). The bits in a Rushmore bitmap are always in physical (recno) order and when you add records for multiple employees and projects the records of one project are usually in chronological order but still records of other projects are scattered in between, so usually the bits of one proje_id are concentrated in a range relating to the project start and end, but scattered with records of other projects.

And with that in mind it's most timesaving to sum all projects or all active projects in one go, then you just build up multiple sum records and these can be built in parallel using every single record accessed in physical order. Then there also is no need to know which records to access, simply all or the tail of records starting from a known first record off all active projects.

Whatever your actual data situation is would determine what's best, but the reasoning to use NOOPTIMIZE wasn't bad, when the programmer adding it didn't know WHILE prevents Rushmore anyway. When you take it into your own hands to SEEK to the first record and then SUM or SCAN WHILE or REST you actually don't want Rushmore to need to build a bitmap, you want to traverse data by traversing the index.

There is one advantage of knowing recnos from Rushmore, not only when multiple Rushmore bitmaps are combined: You know the valid records in recno order and can visit them in that order. The advantage is lost with SSD drives, but the most modern platter drive controllers also use a technique to cache more than you actually read at first in case a later read needs data from the same sector and or they organize reads depending on head position and rotation speed and prediction of what sectors you may read. But that's also all "madness" of the past without random SSD access. Let alone the situation a file is cached in memory and you use the random access feature that gives RAM its name.

I don't say Rushmore is bad, but it only really shines, if you need to get a bigger portion of data or have several not very selective conditions that combine to a very select few result records to finally actually fetch only those. Ironically Rushmore will be fastest whenever the result or an intermediate bitmap is empty and no further boolean agebraic condition could add bits in the final bitmap (OR conditions, for example).

All the cdx access for getting there needs to be faster and less than without any index at all to break even with the simpler nooptimization strategy. Again this sounds worse for Rushmore than it is, because of a very simple reason: If Rushmore would only speed up the most complex cases, MS could have dropped it overall. There usually is a benefit.

In your special case I see a better fit in aggregating sums of multiple projects at once, that does the job better than repeating it for each project id. I think I also already said that in an earlier question you asked. And you can actually forget anything, when your data is so small, that keeping it cached is easily done and you won't notice much difference even without indexing at all.

Chriss
 
I'm thinking, optimized or otherwise, it will be nicely controlled by the OPs original seek and while clause.

Correct me if I am wrong, but the index will find the first valid entry and the while will stop when it ceases to be true.

Rushmore probably will not do better, that is just good design.


Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Yes, that could be very true, Griff.

But I think the question is motivated as this SUM takes long to compute. So if that's the case there should be some mismatch between data and index. The index is of course used by the SEEK, but also SUM or any other command processing records that support a WHILE or REST scope will take index sort order into account. SUM could even give different results if you are on a projectid, then SET ORDER TO employeeid, and then SUM WHILE projectid=x, because the next record in employeeid order from the current record could be for another project and then you don't cover all project records.

So in short, for SUM with WHILE SCOPE the data is processed in current index sort order. The index is used, and that also assures you only fetch records with the same projectid. Even withou Rushmore creating a bitmap of the records matching the while condition.

So if this still runs slow, then there is some defect in the data or index or it won't get faster anyway because maybe the network is the bottleneck. And when the data isn't on a network share but local, I would suspect Antivirus intercepting the dbf access. VFP is not good when antivirus scans every file it just touches, for example.

Chriss
 
Regardless of the NOOPTIMIZE issue, I wonder if doing it in SQL would be faster than native VFP. In other words, instead of this:

Code:
SELECT PrjTime
SEEK cur_projid 
SUM NOOPTIMZE while (project_id = cur_projid) prj_hours to total_prjhours

do this:

Code:
SELECT SUM(prj_hours) FROM PrjTime WHERE project_id = cur_projid INTO ARRAY laTotal
total_prjhours = laTotal(1)

It could be that it won't make a big difference, but if performance is important then it would be worth doing some tests.


EDIT:
Sorry, I just noticed Chriss already suggested this.

Mike



__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Is this one of those cases where the optimisation might be slower than the alternative?

I may well be wrong, but I think the optimisation process uses the indexes to extract a list of records to process in the form of
a bitmap (I am not 100% sure what that means, but it must take some processing time) then processes the records against that
list... so there's an overhead not included in the OPs design.

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Griff, you're right, but see what I wrote. Think of the bitmap as a compact format for a list of recnos. Now notice the file offset of a record simply is Header()+(recno-1)*Recsize(), so moving to each record is fast. Traversing an index in the form of a binary tree also is fast in finding the first and subsequent records. Indeed subsequent sibling nodes are available in the form of pointers to their node. So you can easily go from first to next up to the last index node for a condition. The index node does not store the prj_hours value but, again like the Rushmore bitmap, the record number, so each index node is an indirect pointer to the location in the DBF.

Creating the bitmap also must read the index nodes and first "plot" each recno in a bitmap to then finally go to each record. Concentrating the reads to be done in one wash could speed up the DBF reading and make up the time spent to create the bitmap. Again, why would the mechanism still survive after it's introduction in very early FP versions, when it fails to accelerate queries? And as I also already said Rushmore can only really shine when it doesn't need to care for record order and can read the DBF in recno order of valid recnos. The more separate conditions you have, the more intermediate bitmaps are created, their bitwise or/and/not combination yields one result bitmap and ideally only a sparsely dotted bitmap means very few DBF reads (and less important for the illustration als related FPT file reads, of course, if memo values are fetched by a query).

I think if you google for it, it has been timed that a very specific SQL that has a primarykey=x Where condition is almost as fast as SEEK, even though it builds up a bitmap that only has one bit set. Of course SEEK wins, both the Rushmore bitmap creation and the SEEK read the index tree from the root level node to the node for the searched primary key value. The same still is true for normal index tags. A foreign key, like in time recording records, has a set of nodes that have a main and sibling nodes. I think it's undocumented whether a primary key means VFP doesn't go after sibling nodes by the unique nature of the index, or there simply are no sibling node pointers in each primary key node, so there is no need to read further than to the target node.

Reading in the CDX is always necessary, unless at least the portion of interest is already cached. The cache, of course, also is an accelerator in all of this. And hardware and OS caching on top of all that.

Chriss
 
Chris

I'm thinking that the Rushmore give advantages, and thus survives from old versions of FP, because it saves the programmer
from even thinking about the Seek... While.. - they can just say 'SUM x WHILE condition' and if there are indexes that match,
and they are recognisable as such, VFP is pretty efficient at processing that subset...

Of course, no matching index, that VFP can identify, and you just have the two overheads of a) trying to find an index match for
the condition (must take a few cycles) and b) processing the whole table against the condition if none is identified.

Now, how good is VFP at identifying those indexes? I have no idea, I don't rely on the database to do that... I would imagine
that it's good for simple keys, i.e. field names, but not so good if your table uses something like an index on UPPER(fieldname)?
I honestly do not know - perhaps Tamar does?

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Well, the obvious condition for Rushmore to rush more is to need less time for CDX reading and bitmap build up and bitwise combinations than to scan the table for matching data.

Well, SEEK and other xBase commands still also can make use of the indexes, so it's not just done by Rushmore using indexes, it's also about the way it uses them.

Griff said:
how good is VFP at identifying those indexes?
Quite good. You have to steer it a bit by using the same expression as the key and VFP knows the index can be used.
It even doesn't need to be precisely exact. Maybe take a look into NORMALIZE(). Very shorthanded and inaccurately said it's for expressions what SOUNDEX() is for similar words. It's not exactly documented how Rushmore uses this function or more to find a suitable index for optimization, but we already found a case lately where an index that could have been used wasn't used. Therefore the rule, just write out the full key expression to point Rushmore to using its index.

And you missed the point I made as first: WHILE is a scope not Rushmore optimizable. The only value of an index for a WHILE scope is traversing its siblings till the end of the scope. And that's done also without Rushmore. To use Rushmore you don't need to remove NOOPTIMIZE here, but replace more code, you need to not even start with a SEEK. The SQL I and also Mike Lewis suggested could be faster. It all depends on amount and physical ordering of data, percent of result records and more factors, whether that or the "manual" solution of a SEEK plus WHILE scope works better.

And the other thing I'd like to point out once more: When the question arises, I think the given code runs slow for some reason, and it should run in a proper timing, as it's surely not slow to SEEK to the start of a range and then read it record by record following the trail of sibling records in the index.

So if that is slow, it actually pints out a index defect or wildly scattered records making reads jump forth and back in the DBF file simply because records are visited in index order and not in recno order after first determining the list of recnos of interest - which only Rushmore does.

If you assume Rushmore never can win because of its overhead, lets simply make an example where Rushmore is better than a developer can - in a separate thread. Because the steps Rushmore does are not all available to us as commands or functions we could use to reimplement Rushmore and optimize it for each individual case. There's a secret in that sauce.

Chriss
 
Griff,

It took me a long time to get my head round how Rushmore actually works, especially the thing about bitmaps. The thing that helped me the most was this long article by Christof Wollenhaupt.

If you feel sufficiently motivated, I recommend you study the article. But wait until you have a chunk of time available and you are able to concentrate on it, as there is a lot of information there to plough through.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Hi Mike thanks.

Indeed the article portion suggests what I said:

Foxpert said:
If you only have a single condition, you can use the old dBase style:

Code:
SET ORDER TO RequiredIndex
SEEK Value
SCAN WHILE Field=Values
  * do something
ENDSCAN

[highlight #FCE94F]In many cases[/highlight] this is even faster than a Rushmore optimized query because Visual FoxPro doesn't have to create a bitmap

And a SUM WHILE is quite like a SCAN WHILE put into a single command. More important he writes what I also said: In many cases, but not in all cases. And it gets better when you have more conditions (and enough memory to create all the Rushmore bitmaps necessary).

Chriss
 
Thank you Mike

That's an interesting article and seems to back-up my thinking that a programmer who sets up their indexes and then uses them will
probably out perform (or at worst equal) Rushmore, but one who does not will probably not do so well.



Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Griff,

one thing is for sure, you can't just put an index on all fields and then assume Rushmore will speed up any conditions in your queries. Using Rushmore also means balancing to not simply index all. That causes another flaw often only when a table grows larger: Insert time and Update time (or append/replace, doesn't matter) becomes longer.

It's rarely a discussion point as read operations are much more frequent, so its taken as the price to pay for the performance, even when you think you outsmart Rushmore, that'll be based on indexes.



Chriss
 
Thank you for all the feedback! I have been learning new things from your discussions and thought process, though some are too deep and could not get my head around.

Experimented with the SQL Rushmore suggestion and this is the outcome in seconds.

Test 1
opti1_skaxmh.png


opti2_p9akql.png



Test 2
opti3_nmgj8e.png


opti4_ujkm2j.png


The existing SEEK and WHILE method seems good. The test was done with 1.2 million records. It seems the timing is not bad for this record set, may need to look at other places for improvement. The loading time of the screen is slow which has this code, specially in this method but this method has other sub routines, need to check them. Antivirus can be ruled out since it is turned off for the directory, but network can be an issue. We noticed during peak times (then most users are accessing) it is slow, but at night it is not bad. I am planning to do more test in other places and come back to you with my questions.

Please note this code is to find the total time of just one project (one project id only), SQL may be a better solution if we are doing it for many projects. That's one of the reasons the results are stored in a variable instead of a cursor (let me know if this is not a good approach) then being used later to do some more calculations.

Also, need your help with archiving suggestions, Chriss already mentioned that in the other post. I need to do some analysis and educate myself first before asking for your assistance. The one thing comes to my mind right now is we need to access the archived data most of the time. This table is being used at 95% of the screens and how can we access the archived data with minimal code change.
 
Try the SQL solution again, after you set memory for Rushmore and other things sufficiently high.

Your 1.2 million records need ca. 150 KB for a single Rushmore Bitmap. And the memory is not only reserved for that. So give it a lot, start with 1GB:

Code:
SYS(3050,1,0x40000000)

You may go up to 0x80000000, maybe less as VFP will need some RAM for itself (runtime) and your EXE, too, and a 32bit process can "only" reserve a maximum of 2GB RAM. 1GB would allow lots of 150 kb Bitmaps bitmaps but also other things.

It's easily one of the things that you couldn't get your head around, but Rushmore needs memory to do its work. Of course it also depends on what RAM size you have. Rushmore will use assigned RAM and if that exceeds physical RAM it'll start swapping, which can also make it slow again. Find out what RAM the Windows system uses itself when it becomes idle ca. 5-10 minutes after Windows start. Resource monitor cn tell you. Then think about other applications RAM needs and if that still leaves you with 2GB it's also okay to get all of it. VFP will only use as much as it needs at a time, but it will take what it can to cache things and more. You set an upper bound to memory allocation of physical RAM.

The other question that came up to me as you posted the times: Did you even give it a try before asking, was your question only based on the assumption NOOPTIMIZE should never be used in any code? AS I said first in my first answer, removing the clause doesn't even change anything, as a WHILE scope isn't optimizable when you sort data by index.

Last not least, I'm glad to hear back you learned a lot. Doesn't matter if you got everything. I look forward to hearing back if SYS(3050) helped, if not, then let it be as it was, I wouldn't hunt down why Rushmore can't make this equally fast, you may be right about the bottleneck of LAN. Building up a Rushmore bitmap can read a lot from a CDX, that may be the factor here. SUM WHILE can't use Rushmore, but the index still is used, nothing else will tell what WHILE actually means in sort order. And this could read less from the CDX than Rushmore needs to read.

It's also nice to see you experimented twice to see if there was just a temporary issue slowing SQL down.


Chriss
 
Foxpert said:
How is this bitmap really created? The most important factor is that the index in Visual FoxPro is stored as a b-tree, a balanced tree. Every node is a 512 byte block that may contain up to 100 pointers to subordinate nodes. The nodes close to the root refer to key values, only leaf nodes refer to records. Because every node does not refer only to two other blocks, as one might have thought, the hierarchy in the index file can usually remain very flat. Additionally, all nodes are linked horizontally.

When Visual FoxPro searches a value, it navigates through the tree vertically top to down until it found the record. Then it continues to read horizontally as long as the search value matches the one in the index. This strategy of reading down and sideways reduces the number of index nodes it has to read, but increase the number of nodes to change when updating the index. While reading horizontally, Visual FoxPro sets the corresponding bit for each record found. This is possible because the index entry contains record number as well as key value. Data is stored compressed though. This is how the equal operator works.

By that description Rushmore does not have a million accesses to a CDX to set the bits, it does essentially the SEEK plus SCAN WHILE. Why would it read all 1 milion nodex to see if they mean 1 or 0 for the Rushmore bitmap? That would be dumb, essentially.



Chriss
 
I can read, thank you.

myearwood said:
Contrast that against 1 million accesses of the index to build a bitmap.

You may now edit it as you like, the original post can always be seen as proof.

Your trying to wind yourself out of quicksand if you stay with your claim 1 million accesses to the index are what's done.

Chriss
 
You argue besides the point again and with data that doesn't resemble the situation of aggregating data for a project. We all know SEEK is faster than Rushmore.

Did you vary SYS(3050)? If you'd do you'd actually get Rushmore to do its job properly and make sensible measurements.
Bbut if you compare SEEK/LOCATE in a scenario Rushmore can't even create one bitmap in physical RAM, of course it becomes slow.

Code:
SET NOTIFY CURSOR OFF
SET TALK OFF

SYS(1104) && purge cache
SYS(3050,1,0x80000) && minimum memory SYS(3050) accepts

SET COVERAGE TO timing.log
SELECT * FROM c_bigdata WHERE cField1 = '2346723' NOFILTER INTO CURSOR test
SET COVERAGE TO

SYS(1104) && purge cache
SYS(3050,1,0x40000000) && allow usage of up to 1GB

SET COVERAGE TO timing.log ADDITIVE 
SELECT * FROM c_bigdata WHERE cField1 = '2346723' NOFILTER INTO CURSOR test
SET COVERAGE TO

SYS(1104) && purge cache
SYS(3050,1,0x80000) && minimum memory SYS(3050) accepts

SET COVERAGE TO timing.log ADDITIVE 
SELECT * FROM c_bigdata WHERE cField1 = '2346723' NOFILTER INTO CURSOR test
SET COVERAGE TO

First time with too low SYS(3050) setting, the query takes about 10 seconds. After purging the cache and SYS(3050,1,0x40000000) the query time goes down to 0.004234 seconds.


Timings:
[pre] Duration,,..., Line Number
2.562971,,0000f3ex016t,8 && Select with low RAM
0.004486,,0000f3ex016t,15 && Select with high RAM
2.531512,,0000f3ex016t,22 && Select with low RAM, again
0.000198,,0000f3ex016t,31 && SEEK, fastest even with low RAM (of course)[/pre]

Besides that, the scenario is fetching a few records, all time attendance for one project. And even on that scenario I already said what Christoph said, Rushmore is usually not faster. But it's only extremely slow if it isn't allowed to make use of enough physical RAM. Too unfortunate that isn't that well known.

Chriss
 
Tried with 1GB memory, could not allocate to 2GB because of the system limitations.

First Run

opti5_unrbkj.png


opti7_smqsqn.png


Second Run

opti6_hcz463.png


opti9_twpogf.png


I think I can look for improvements in other places to speed up the loading time. However, I have learned a lot more about Rushmore from this post.

Chriss, the answer to your other question is I did not try it before posting. Simply assumed NOOPTIMIZE degrades performance. Thanks for educating me on that!

Thank you All for your tireless suggestions and knowledge sharing!
 
Hello, gkratnam

Thanks for trying. Yes, I think you're good with the code as is.
If 1GB is your upper limit, you could still try less, you might take away too much from the OS and other apps.

But indeed 0.02 seconds is good enough, isn't it?

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top