Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to check duplicate record?

Status
Not open for further replies.

alisaif

ISP
Apr 6, 2013
418
AE
Hi,

I want to check LPO (Purchase Order #) in a table which contains around 900K records using the following:

Code:
      LOCATE FOR ("&mlpo")$lpo
      IF ! EOF()
         BROWSE 
      ENDIF

But it is very slow. Can I seek it rapidly?

Thanks

Saif
 
Fast SEEKs can be done with INDEXSEE() or SEEK(), either on a full value or a field starting with mlpo.

The $ operator is the least ideal way to search, do you really look for fields having some partial value inside them? Can you give an example value of mlpo and example values of the lpo field?

If you really make such searches and seek partial values, it most probably will be better to split lpo into parts each having a certain meaning, especially if that can turn your search into a normal equal or begins with SEEK.

Bye, Olaf.
 
Thanks for the reply

LPO No. can be like this so403330 here I want to seek with 403330 only.

Thanks

Saif
 
Well, then you should create two fields, one only having the prefix "so" and the other only having the numeric part.

If the prefix always is so you either don't store it at all or you prefix mlpo with "so" and then do a normal SEEK or INDEXSEEEK(). You can't optimize $ and you can only optimize LIKE clauses similar to "begins with searches, i.e. LIKE val+"%", but you can't optimize a search like "any two characters and then a specific number. The correct split storage then is the key to fast queries.

Bye, Olaf.
 
Searching for any string inside another string is always going to be slower than searching for an entire string (or for one string at the start of another). That's because VFP must retrieve every record, and look at each instance of the field in turn. In other words, it can't take advantage of any indexes.

It doesn't make any difference what commands or functions you use to do the search. The only solution that comes to mind is the one that Olaf suggested: split the target field into two, and create an index on the sub-field that you will be searching on. Once you've done that, your code should run very much faster.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Also you should'nt macro expand 900K times.
Do it ONE time into a variable and use that.
 
Well, Dr. Dolittle, the way macro expansion works it will only be done once. The macro is put into the code line, which then is compiled and finally executed. It is one LOCATE in the 900K records, but it only locates the first row, if there is a match.

In regard of macro substitution, it is not necessary here at all, you can do [tt]LOCATE FOR m.mlpo$lpo[/tt], but that won't accelerate this, too. The major bad choice is the $-operator is not optimizable with an index. The only thing to do really is reorganizing the data to go into separate fields.

Bye, Olaf.





 
Well, then you should create two fields, one only having the prefix "so" and the other only having the numeric part.

It is not necessary that the LPO starts with Alpha Characters, sometime it is like SO403303LM, SO-403303, SO/403303/2017-1 etc.

With due respect, how you guru deals with that situation. I have character field(LPO c(19)) of 19 in sales order.

Thanks

Saif
 
Well, in the first place I don't have so many different variants of order numbers or what ever that is. You might have several representations with slashes or dashes, but these all will have just one base of data of a prefix and the main number.

How would I handle things like SO403303LM, SO-403303, SO/403303/2017-1?

It's still obvious you have some prefix SO. I still assume this may vary. What does SO mean? If there are different types of records associated with different prefixes, they would perhaps even go into separate tables, not only separate fields. You may have bills, cashbox reports, credit payments, etc. All have a different nature and go into different tables.
.
Just in regard of the three sample numbers, I'd split this in prefix (order type or whatever it means), order (or whatever) main number, suffix (LM, again what does it mean, give it a name and you know where to store it as what separate field, finally a year, a month, maybe a date.

Putting this all together is the matter of a format. So you might have a format mask you store per record, i.e. a textmerge expression. For example the three formats <<prefix>><<mainnumber>>, <<prefix>>-<<mainnumber>>, and <<prefix>>/<<mainnumber>> would result in either SO403303, SO-403303, or SO/403303. Still, the mainnumber always is 403303, this could even be an integer, if you only need up to 8 digits, or 9, but only up to 2 billion.

With this sepration, you only gain more control over your data. The way you store it you just give up and say you have no choice to store this in a better manner. Common, this never is the case, this never is the fault of the data, if you don't get any idea to store your data in the necessary atomicity it's only your fault. What is so difficult? If you are not able to see patterns and their parts, you are not able to do programming.

Bye, Olaf.
 
This is the LPO no. which we received from various customers. And, in sales order we mentioned this LPO no. for their reference.

Thanks

Saif
 
I know the whole lpo is stored in the field called lpo. The question is rather rhethorical, too. Whatever you would name the parts of the whole lpo would become your field names of the parts of the lpo. If you don't have names fro them, then invent names. It doesn't matter. What matter is, that the lpo is split into the single parts you want to search for, you can't make an instring search, but you can search the single parts, when the lpo is split.There is no other solution, so if you still reject the idea, you have to live with slow searching.

Bye, Olaf.
 
Thanks for the reply!

How can I select the numeric from the strings.

SO403303LM, SO-403303, SO/403303/2017-1?

Thanks

Saif
 
Alisaif,
Assuming you JUST want the numeric digits, you can do something like the following:

Code:
lcValue = "SO403303LM, SO-403303, SO/403303/2017-1?"
lcNumOnly = ""
FOR lnLoopCount = 1 TO LEN(lcValue)
	IF BETWEEN(ASC(SUBSTR(lcValue,lnLoopCount,1)),48,57)
		lcNumOnly = lcNumOnly+SUBSTR(lcValue,lnLoopCount,1)
	ENDIF
ENDFOR

MESSAGEBOX(lcNumOnly)

But is that what you want, or some subset of the comma delimited value you proposed?

Best Regards,
Scott
MIET, MASHRAE, CDCP, CDCS, CDCE, CTDC, CTIA, ATS

"Everything should be made as simple as possible, and no simpler."[hammer]
 
We can give you ways to remove the Non-Numeric characters from the string, but doing so during your SEEK FOR DUPLICATES will make for a VERY SLOW search.

Yes, you get the LPO number from another source, but when you import it you could store it in its original form AND also store its component parts into different separate fields.
So the added time would be on the Import phase and not on the SEEK phase. (I think this is what Olaf was suggesting just above.)

To remove the Alpha Characters
Code:
cStartStr = < your starting string >
IF "/" $ cStartStr
[indent]cFinalStr = STREXTRACT(cStartStr, '/' ) && Get String Between '/' Characters[/indent]
ELSE
[indent]cKeepStr = "0123456789"  && Characters to Keep (Numeric Only)[/indent]
[indent]cRemoveStr = CHRTRAN(cStartStr, cKeepStr, "")  && Everything to Remove[/indent]
[indent]cFinalStr = CHRTRAN(cStartStr, cRemoveStr, "")  && Final String (Numeric String Only)[/indent]
ENDIF

Good Luck,
JRB-Bldr
 
Thanks both Scott and JRB-Bldr, very useful piece of code.

Saif
 

You can create an index on just the numbers, spaces and slashes for example:

index on chrtran(LPO, chrtran(LPO,' /01234567890',''),'') tag LPO
seek('4030')




Marco Plaza
@vfp2nofox
 
Thanks Marco for sharing this strange index technique.

Saif
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top