Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sum vs Sum Nooptmize

Status
Not open for further replies.

gkratnam

Programmer
Aug 9, 2007
37
US
I like to optimize some old code here. Not sure it will have any negative impact.

Code:
SELECT PrjTime
SEEK cur_projid &&PrjTime table has an index on project id
SUM NOOPTIMZE while (project_id = cur_projid) prj_hours to total_prjhours

Would removing the NOOPTIMZE cause any issues?

Please assist. Thanks!
 
I'm no expert on Rushmore, but I think I see something here. No NDA for me.

If FP uses a 'bitmap' to select 2-300 records from a set of whatever based on a single criteria there clearly is no
advantage to me as a programmer doing some kind of Seek and While. It's clear, you can't beat it.

But what if I have two limiting factors, not in a combined index - but either of them in another index.

As a programmer I can't take advantage of that, I can't Seek A... While A and B if B is not a compound of
the index that matches A (yeah, ok I might be miles off).

BUT if I have a 'bitmap' of A and one of B I could do a 'bitwise' comparison that only picks out A matching records and
B ones and give a result set in whizzo time - like we are used to subnet masking doing... you AND each item in the bitmap for
A with the one for B and unless they are BOTH true, the record is discounted.

Not that impressive with only the A, but good with A and B, and very good with A, B and C

Now I see why it's clever, sometimes.

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
Got it, Griff.

And Mike, deleted posts cn still be read. Thanks for the amusement.



Granted you know SYS(3050), also granted you know its effect on Rushmore, but not everything I write just addresses you.

Coverage profiling is measuring better than seconds(). Seconds() precision only is miliseconds.

Code:
CREATE CURSOR times (t1 B, t2 B, t3 B, t4 B)
SET FIXED ON
SET DECIMALS TO 6
starttime = datetime()
DO WHILE t4=t1
APPEND BLANK
replace t1 with SECONDS(), t2 WITH SECONDS(), t3 WITH SECONDS()
ENDDO
? t1, t2, t3, t4
On older systems it may even be worse precision in steps of 1/16th seconds, the precision the Bios clock usually only had, but we're now in the era of better hardware and EFI.
But doo whatever you like, I don't care.

Yes, High Performance counters are even better. I mentioned them already in a previous thread, doesn't matter. For a simple run, that's just over the top.

myearwood said:
You are dismissing that windows caches a lot.

No. Let me simply point out what I wrote.
myself said:
The cache, of course, also is an accelerator in all of this. And hardware and [highlight #FCE94F]OS caching[/highlight] on top of all that.

You're missing that my test does the SQL with low RAM twice and I purge the VFP cache to proof VFP here cannot profit from the OS cache that could have been used during the first time the query ran. Doesn't matter, you'll come up with what's wrong about that.

myearwood said:
SEEK followed by a SUM WHILE is your best bet.
I second this for this situation, but wait for it. You motivated me to do a test and it turns out Rushmore can even outperform SEEK followed by SUM WHILE.

myearwood said:
The speculative answers you get are just the blind leading the blind.
myself said:
While isn't optimized with or without NOOPTIMIZE clause.

What's speculative about that?

Maybe you refer to what I quoted from Christoph and second myself;
Christoph Wollenhaupt said:
[highlight #FCE94F]In many cases[/highlight] this is even faster than a Rushmore optimized query because Visual FoxPro doesn't have to create a bitmap
I take such things from him as granted truth.

Or you refer to
myself said:
The SQL I and also Mike Lewis suggested could be faster. It all depends on amount and physical ordering of data, percent of result records and more factors, whether that or the "manual" solution of a SEEK plus WHILE scope works better.

In any case you could be more precise with your criticism, I'm not even talking about formal points and the double standards you have about them. I get numb to that.

Lets see whether there is no break even point at all, because Rushmore has to do the same as SEEK+WHILE or REST to build to the bitmap, and so this time is always wasted. I made an assumption of perhaps 1% data relevant to aggregation for a project. So let me create a table for project hours which references a project by prj_id (int). That projects table isn't created nor needed for the performance test, but this will be used for filtering data of one project and for grouping in the scenario of just getting over the statistical aggregation in one go.

Code:
Cd Getenv("TEMP")
Set Talk Off
Set Notify Off
Close All
Erase vfpProjectHours.*

Create Table vfpProjectHours (Id Int Autoinc, Prj_id Int, Prj_Hours B)
Index On Id Tag Id Candidate
Index On Prj_id Tag Prj_id
Set Order To
Rand(-1)
For p = 1 To 90
   For ph = 1 To 111111
      * Floor(m.p + Rand()*11) varies from 1 to 100
      Insert Into vfpProjectHours (Prj_id, Prj_Hours) Values (Floor(m.p + Rand()*11), 1+Rand()*7)
   Endfor
ENDFOR
? RECCOUNT() && 9,999,990 records.

And now for the performance tests:

1. SQL
[pre]1 Hit 1st 0.000104 Avg 0.000104 starttime = Seconds()

1 Hit 1st 0.000074 Avg 0.000074 Clear
1 Hit 1st 0.000753 Avg 0.000753 Cd Getenv("TEMP")
1 Hit 1st 0.000019 Avg 0.000019 Set Exclusive Off
1 Hit 1st 0.000013 Avg 0.000013 Set Optimize On
1 Hit 1st 0.000015 Avg 0.000015 Set Talk Off
1 Hit 1st 0.000024 Avg 0.000024 Set Notify Off

1 Hit 1st 0.000016 Avg 0.000016 Sys(3050,1,0x40000000)
1 Hit 1st 0.000013 Avg 0.000013 Sys(1103)
1 Hit 1st 0.001997 Avg 0.001997 Use vfpProjectHours Shared

* SQL with Rushmore
* =================

* single project
* --------------
1 Hit 1st 0.000029 Avg 0.000029 p = Floor(Rand()*100)+1
1 Hit 1st 0.000015 Avg 0.000015 Sys(1103)

1 Hit 1st 0.467204 Avg 0.467204 Select Sum(Prj_Hours) As bTotal From vfpProjectHours Where Prj_id = m.p Into Cursor singleresult

* all projects
* ------------
1 Hit 1st 0.000028 Avg 0.000028 Sys(1103)
1 Hit 1st 10.782793 Avg 10.782793 Select Prj_id, Sum(Prj_Hours) As bTotal From vfpProjectHours Group By Prj_id Into Cursor allresults

1 Hit 1st 0.000107 Avg 0.000107 endtime = Seconds()

1 Hit 1st 0.010752 Avg 0.010752 ? endtime-starttime[/pre]

Total 11.254 s


2. xBASE
[pre]1 Hit 1st 0.000111 Avg 0.000111 starttime = Seconds()

1 Hit 1st 0.000073 Avg 0.000073 Clear
1 Hit 1st 0.001055 Avg 0.001055 Cd Getenv("TEMP")
1 Hit 1st 0.000023 Avg 0.000023 Set Exclusive Off
1 Hit 1st 0.000013 Avg 0.000013 Set Optimize On
1 Hit 1st 0.000015 Avg 0.000015 Set Talk Off
1 Hit 1st 0.000030 Avg 0.000030 Set Notify Off

1 Hit 1st 0.000021 Avg 0.000021 Sys(3050,1,0x40000000)
1 Hit 1st 0.000016 Avg 0.000016 Sys(1103)
1 Hit 1st 0.001727 Avg 0.001727 Use vfpProjectHours

* xBase with SEEK/SUM
* ===================

* single project
* --------------
1 Hit 1st 0.000026 Avg 0.000026 p = Floor(Rand()*100)+1
1 Hit 1st 0.000016 Avg 0.000016 Sys(1103)
1 Hit 1st 0.000026 Avg 0.000026 Set Order To Prj_id
1 Hit 1st 0.000116 Avg 0.000116 Seek m.p
1 Hit 1st 0.425917 Avg 0.425917 Sum Prj_Hours While Prj_id = m.p To bTotalProjectHours


* all projects
* ------------
1 Hit 1st 0.000029 Avg 0.000029 Dimension Totals[100]
1 Hit 1st 0.000015 Avg 0.000015 Store 0 To Totals

1 Hit 1st 0.000098 Avg 0.000098 Scan
100 Hits 1st 0.000028 Avg 0.000020 Scatter Fields Prj_id Memvar
100 Hits 1st 0.084450 Avg 0.750293 Sum Prj_Hours While Prj_id = m.Prj_id To Totals[m.prj_id]
100 Hits 1st 0.000027 Avg 0.000034 Endscan
1 Hit 1st 0.000016 Avg 0.000016 endtime = Seconds()

1 Hit 1st 0.010719 Avg 0.010719 ? endtime-starttime[/pre]

Total 75.468 s

Each run made on a freshly started VFP.

Looking into the detailed log I see usual SUM WHILE are below average, and once in a while they take 1, 10, even 20 seconds.

I'm sure you can explain that.

Chriss
 
I too learned more about Rushmore and speed considerations. Which reminds me...

Disclaimer: The following has absolutely nothing to do with the highly informative discussion.

Back in the dark ages (you remember them?) I wrote an application (or two) in dBASE II. When exposed to faster hardware, it screamed. Probably faster than dBASE 3, FP & VFP. Of course it was simpler, written in assembly and lacked a million advantages compared to VFP. I don't recall if it even had color available.

My point is ...
 
I don't recall if it even had color available.

It didn't. At least, not built into the language. You could output colour to the monitor by sending ESC sequences, but colour monitors were very rare (and very expensive).

In fact, I don't think even Foxbase+ had any built-in colour features, as far as I remember. I think it was Foxpro 1.0 that introduced colour codes in DEFINE BAR, etc.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Mike said:
colour monitors were very rare (and very expensive).

Yes. As I remember, I think Apple came out with color first. It was expensive. That's probably why I stuck with my Heathkit H89 All-In-One with its 5" floppy drive and H-DOS, then a Z-100 (Zenith) with 768K memory. IBM later came out with their 640K PC - setting the PC industry back about 2 years.

Steve
 
100 * 0.75 = 75, you fail to see SUM WHILE failing badly, here.

I think it's an issue of Grabage collection pressure, but Griff, I'll come up with an example that's simple and so surprising to work better with Rushmore. I don't just trust Christoph here, I have learned from other MVPs.

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top