Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

What am I doing wrong? 2

Status
Not open for further replies.

mcamardo

Programmer
Sep 27, 2005
12
US
This is a doublespace program I was creating and I have run into a lot of issues. I am new to the FoxPro language so explenations are great.

Set Talk Off

Store Fcount() To f

Dimension Afldcnt[f]

For i = 1 To f
If (Type('Field(i)') = 'C') == .T.
Store Field(i) To Afldcnt
Else
= Adel(Afldcnt, i)
Endif
Next i

Store Alen(Afldcnt) To a

For x = 1 To a
For recum = 1 To Reccount()
Store Afldcnt[x] To fld2chk
Do While ' '$Allt(&fld2chk)
Repl &fld2chk With Strt(Allt(&fld2chk), ' ', ' ')
Enddo
Next recum
Next x

Set Talk On

Data Processing Services
Advanced Marketing Direct
 
Please describe and explain what you are trying to do.
 
Hi,

To get the list of character fields into an array, try the following code, it is easier than your aproach:
Code:
USE yourTable
COPY STRUCTURE EXTENDED TO temp
SELECT field_name ;
  FROM temp ;
 WHERE field_type="C" ;
 INTO ARRAY aFldCnt
Then you can replace multiple spaces with single ones:
Code:
SELECT yourTable
howMany = alen(aFldCnt)
SCAN
   FOR i=1 to howMany
       fldVal = ALLTRIM(EVALUATE(aFldCnt(i)))
       IF "  "$fldVal
          DO WHILE '  '$fldVal
             fldVal = STRTRAN(fldVal, '  ', ' ')
          ENDDO
          REPLACE &aFldCnt(i) WITH fldVal
       ENDIF
   ENDFOR
ENDSCAN
It works, but surely there are better ways to do the same. Good luck.
 
The more simply could be to use
STRTRAN(your_field,"textwrong","textcorect"), e.g.

textwrong = "WINDOWS X"
textcorect= STRTRAN(txtuni, " "," ")

After that textcorect="WINDOWS XP" (because textwrong was 7 spaces and textcorect was 1 space)
****************************************
And If you wanted repair your_field e.g. in 30 records, you simply give:

replace your_field with STRTRAN(txtuni, " "," ") all

Stefan


 
I'm pretty sure AFIELDS() was in FoxPro 2.x. It's the easiest way to get a list of fields.

Also, check the FoxTools.FLL file. In later versions, it has a function called Reduce() for turning multiple spaces into a single space.

Tamar
 
This is how I did it and it works efficiantly...


Set Talk Off

Dimension Afld(Fcount())

Store 0 To cntr2

For cntr1 = 1 To Fcount()
If Type(Field(cntr1)) = 'C'
cntr2 = cntr2 + 1
Store Field(cntr1) To Afld(cntr2)
Endif
Next cntr1

Arycnt = cntr2

For cntr3 = 1 To Arycnt
Store Afld(cntr3) To fld2chk
Scan for ' '$allt(&fld2chk)
Do While ' '$allt(&fld2chk)
Repl &fld2chk With Strt(Allt(&fld2chk), ' ', ' ')
Enddo
Endscan
Next cntr3

Set Talk On


Data Processing Services
Advanced Marketing Direct
 

As Tamar noted, AFIELDS() function already existed in FoxPro 2.x, and in one function call, it will give you the number of fields and an array of field names (already trimmed) and types.

Also, you don't need another FOR..NEXT loop just to save names of character fields. You can do all you need to do in just one.

This code is shorter and does exactly the same. On a big table, I would expect it to be faster, too, but didn't have time to test.

Code:
SELECT MyTbl

nFld=AFIELDS(aFld)

FOR cntr1=1 TO nFld
   IF aFld(cntr1,2)="C"
      fld2chk=aFld(cntr1,1)	
      SCAN FOR '  '$&fld2chk
         DO WHILE '  '$&fld2chk
            REPLACE &fld2chk WITH STRTRAN(&fld2chk,'  ',' ')
         ENDDO
      ENDSCAN
   ENDIF
NEXT


You might also want to take advantage of EVALUATE() function, as shown by TheRambler. With less macro substitutions, it should get even faster.
 

myearwood,

You call it bad news? :) I call it a correction. And I agree with you. Thanks for finding time to test, because I haven't.

Moreover, I would guess, ALLTRIM() should probably be also added into SCAN FOR, for the same reason; or just SCAN without the FOR clause should be used. And yes, $ operator is non-ortimizable, but I don't remember whether AT() function existed in 2.x versions, and don't have anything in front of me right now (but I think it already existed). If so, of course it should be used instead, as it is optimizable.

As for row-wise operation, possibly it makes sense, but it should be tested on corrected versions.

But I still would use AFIELDS(), it's one of the best ways to get number, names, and type of all fields in one motion.

In case you want to test this, please post the results. I might try it myself, but probably in a few days.
 

OK, I tested it.

It seems that it's non-optimizable $ operator, macro substitutions, and also performing REPLACE within the loop and not after that taking up the time, not so much row-wise or column-wise operation.

Test table: 1,000,000 records, 5 fields, out of them 2 are 75 characters long and full of multiple spaces. I made several identical copies of it to have an uninterrupted testing, and calculated average time from 2 test runs.
The tables are unindexed and were open before the test began. Before the second run fresh copies of the original table were made.

Here is the test:

Code:
CLEAR
SET TALK OFF
SET SAFETY OFF

*===================================================
[b][COLOR=red]This is my corrected code, row-wise, 
with no $ operator and minimum possible macro substitutions:[/color][/b]
*===================================================

a=SECONDS()

SELECT Double_Space_1

nFld=AFIELDS(aFld)
SCAN
   FOR i=1 to nFld
       fld2chk=aFld(i,1)
       fldVal=EVALUATE(aFld(i,1))
       IF aFld(i,2)="C" AND AT('  ',ALLTRIM(fldVal))>0
          DO WHILE AT('  ',ALLTRIM(fldVal))>0
             fldVal=STRTRAN(fldVal,'  ',' ')
          ENDDO
          REPLACE &fld2chk WITH fldVal
       ENDIF
   ENDFOR
ENDSCAN

b=SECONDS()
?b-a	[COLOR=navy grey]&& Average running time [b]194.1095[/b] seconds[/color]


*===================================================
[b][COLOR=red]This my code, also row-wise but using $ operator
and macro substitutions, and also performing REPLACE within DO WHILE loop:[/color][/b]
*===================================================

a=SECONDS()

SELECT Double_Space_2

nFld=AFIELDS(aFld)
SCAN
   FOR i=1 to nFld
       fld2chk=aFld(i,1)
       IF aFld(i,2)="C" AND '  '$ALLTRIM(&fld2chk)
          DO WHILE '  '$ALLTRIM(&fld2chk)
          	REPLACE &fld2chk WITH STRTRAN(&fld2chk,'  ',' ')
          ENDDO
       ENDIF
   ENDFOR
ENDSCAN

b=SECONDS()
?b-a	[COLOR=navy grey]&& Average running time [b]443.1205[/b] seconds[/color]


*===================================================
[b][COLOR=red]This is [u]TheRambler[/u]’s unchanged code, also row-wise:[/color][/b]
*===================================================

a=SECONDS()

SELECT Double_Space_3
COPY STRUCTURE EXTENDED TO temp
SELECT field_name ;
  FROM temp ;
 WHERE field_type="C" ;
 INTO ARRAY aFldCnt

SELECT Double_Space_3
howMany = alen(aFldCnt)
SCAN
   FOR i=1 to howMany
       fldVal = ALLTRIM(EVALUATE(aFldCnt(i)))
       IF "  "$fldVal
          DO WHILE '  '$fldVal
             fldVal=STRTRAN(fldVal,'  ',' ')
          ENDDO
          REPLACE &aFldCnt(i) WITH fldVal
       ENDIF
   ENDFOR
ENDSCAN

b=SECONDS()
?b-a	[COLOR=navy grey]&& Average running time [b]202.4060[/b] seconds[/color]


*===================================================
[b][COLOR=red]This is [u]mcamardo[/u]’s own code, column-wise
and performing REPLACE within DO WHILE loop:[/color][/b]
*===================================================

a=SECONDS()

SELECT Double_space_4

Dimension Afld(Fcount())

Store 0 To cntr2

For cntr1 = 1 To Fcount()
 If Type(Field(cntr1)) = 'C'
  cntr2 = cntr2 + 1
  Store Field(cntr1) To Afld(cntr2)
 Endif
Next cntr1

Arycnt = cntr2

For cntr3 = 1 To Arycnt
 Store Afld(cntr3) To fld2chk
 Scan for '  '$allt(&fld2chk)
  Do While '  '$allt(&fld2chk)
   Repl &fld2chk With Strt(Allt(&fld2chk), '  ', ' ')
  Enddo
 Endscan
Next cntr3

b=SECONDS()
?b-a	[COLOR=navy grey]&& Average running time [b]335.3145[/b] seconds[/color]

*===================================================

SET TALK ON

Got to admit, for what it's worth, I tested it in VFP6. I don't have any other version.

 

Even if AT isn't optimizable (even though I haven't found this in Help yet), still it is much faster than $.

As for indexing on AT() and ALLTRIM(), or on $, I don't think it's a good idea. Besides that index order would depend on the actual number of character of the field, wouldn't it rebuild every single time we replace the record with its double-space-removed version? I think it has a potential to slow down the performance, not speed up, even if it would become optimizable.
 

Well, I'd never use such an index for this purpose, too, even though it's the only way to rushmore optimize those functions.

Besides that such an index may be dangerous, optimization doesn't always mean speed.

I've just ran 2 tests.

First, I created indexes on AT(" ",ALLTRIM(field)) for the 2 largest character fields in my test table, and ran on it my fastest code (see the first one from the post above). From around 184 seconds it used to be it took a few second less, 175.

Then I created indexes on " "$ALLTRIM(field) for the same 2 fields in an identical table, and ran TheRamblers code on it. From average of 202 seconds it took before now it took over 486 seconds.

That's one more argument against frequent usage of $ (especially in indexes and loops), and also proves that optimization isn't always the best choice (you know, several commands have NOOPTIMIZE clause).
 
Well, seems mcamardo already got his answer, but this topic is still a good excuse to learn some more...
I remember seeing this ( comparison in another thread:
Lauren Clarke said:
Since AT and ASCAN give you more information, namely the position of the matching candidate, they will take longer. Thus, $ wins. Q.E.D.
Of course, this is not the same situation, and she also says:
Lauren Clarke said:
Note that it seems tha AT() has been improved in VFP8 and seems to nominally beat $ in some of these tests where AT() and $ were equivalent in previous versions.
So that makes me wonder what kind of results I would get if I run the above tests in FoxPro 2.x, but I would prefer to leave it behind, I must say good bye to an old, good friend.
 
>Um, fwiw, Lauren Clarke is a "he," not a "she"
Thanks, how else could I have known?
According to the Name Voyager Lauren is a girl's name, although there are a few boys with that name (Lauren Clarke had to be one of them!).

As for "Tamar", I know you are a "she" (palm tree). I had seen your picture, read some of your articles and always keep an eye on your posts.

Since my handle is my signature as well, now I wonder if people think wether I am a "he" or a "she"... it shouldn't make any difference though. And while we are at it, I think "Stella" (star) is a beautiful name, and "Mike" should mean "willing to help", since there are so many here (Mike Lewis, Mike Gagnon, Mike Krausnick, Mike Pratt, ...).
 

TheRambler,
I will be glad if you can give me the link to that other thread. I remember not only seeing/reading the article and looking thoroughly on the charts, but also commenting on it. I don't seem to find it now.

As far as I understood it, the whole first paragraph contained well hidden irony, coupled with the fact that TANSTAAFL is not very well known as a scientific method. Because, after the first paragraph, when you actually see the charts, with the time on the vertical axis, you see right away that on big lists AT() consistently making better times than $. (Meaning, the lower the line is located, the lower the time, the better the performance.)

I don't know what version it was originally written for, but in my VFP6 AT() is also much faster than $. It's possible that the article is very old, and just the charts were updated for later versions. I don’t have old versions to check whether $ was actually faster several versions ago.

myearwood,
Building those indexes would not be a way to improve his code's performance.

I was not trying to improve the performance.

I told you right from the start that those indexes have potential to slow down, not improve the performance. I just think the test has shown that $ is much slower than AT(). Because without those indexes, my code based on AT() loop was just slightly faster than TheRamblers, based on $ loop. I build indexes for both, for mine - based on AT(), and for his (for some reason, I believe it's a "he") - based on $, just the way we created the loops.

Yes, I knew - and said upfront - that the indexes will be constantly rebuilding. But since the time difference for these indexed tests became very noticeable, we now see which index can be updated faster.
 
Sorry Stella740pl, I've looked for that thread before, but couldn't find it either, seems to me it is gone. And I must add that you and myearwood are both correct, although whether I am male or female is irrelevant, it is an agreement nevertheless.
 
That is the same link I posted above (16 Dec 05 15:38), I guess Stella740pl is looking for a thread with a similar discussion in some of the VFP forums, which we both remember (she posted, I just read), but now we can't find it.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top