Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Integer or Character as Primary Keys

Status
Not open for further replies.

stanlyn

Programmer
Sep 3, 2003
945
US
Hi,

In my previous thread on comboboxes several contributors starting talking about the behavoir of comboboxes in relation to the data types, (char vs int). This has raised several question for me and would like your feedback...

My previous apps never really had a primary key designated as such, instead I used a field like "acct_id" that held user data and served as a primary key. I'm thinking about having a field designated just for the primary key named something like "key_primary" or "key_parent" and have another field serving as a foreign key named "key_foreign" or "key_child". This naming eliminates any confusion on what the key represents, parent or child. This way children never gets orphaned, no matter what the real world values are or gets changed to. Previously, if I changed the value of "acct_id", I'd first have to change the children's key and then change the parent. That need to update the child records would be eliminated if I used dedicated primary and foreign key fields.

Now, integers or characters is the question? I like integers because they are unique and easily human readable/useable. I like guids because they are guaranteed unique without consulting other resources. I hate guids because I cannot remember them and they are only copy/pasteable... Forget about writing them down...

And, is it better for us to generate the primary id at the client with guid() or at the server with antoinc?

So what are the pros and cons for each, whether we use vfp or sql from a vfper's perspective?

Like to hear your thoughts,
Stanley

 
>Is there a udf that could generate a unique number much the same was as guids are generated? Place it in the "default value" field.
Have you read all I wrote here? I have already mentioned faq184-7743. It shows how to populate ID fields with a separate singel autoinc in a single system table for id generation.

Bye, Olaf.
 
Besides your fears about pack and dropping autoinc are wrong. PACK doesn't renumber autoinc fields, they stay with their value. And I don't remember ever having recommended dropping columns, just modifying them temporarily to normal integers, which retains their value, then put back the auoinc feature and setting the next value correctly.

Anyway you have found a problem of autoinc fields. Appending data. Like in a single Insert sql statement you can't specify a value for the pk. Merging data is complex, but it's a complex topic anyway, also in case of guids. You have to match records to be updated and only append new data. It's a separate topic, purely appending is normally not a solution anyway.

Bye, Olaf.
 
>> Why would you name foreign key fields customer_pk or carrier_pk?
It tells me where they connect. If I had these 2 fields in the orders table, it would be clear that these 2 fields contains tablename_pk value (customer_pk in this case) used to create the relation. Notice the above mentioned customer table only has a primary key and regular fields. The value in the primary key is also stored in the orders table in its own field named customer_pk. This tells me this field corresponds with the customer's pk field. And if there is another relation to another table (such as carrier) then it would have its own field named carrier_pk and that allows me to visualize what its there for and where it gets its data. Th relations would be created between customer.pk_id and orders.customer_pk and carrier.pk_id and orders.carrier_pk.

If you don't agree, then how would you name the foreign key fields in th eorders table? And maybe I can see where you are coming from.

I'm still studying your post...

Thanks,
Stanley
 
Hi Mike,

I was just thinking: doesn't VFP store numbers as strings anyway? not as 'true' numeric? i.e. the
representation of an integer for an ID key is very likely to be a 10 or 14 (or larger) character
block, rather than a binary numeric (in a VFP table not a xxSQL database).




Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are not good for you.
 
Disregard that, I am a dinosaur.
The int type is stored in 4 bytes, not a big old string.
Same as the memo position

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are not good for you.
 
>> Your still confused about this.

No, not at all. We are saying the same thing 2 different ways. My understanding is the same as yours, however I've never used the primary key mechanism before now. For years I was doing essentially the same thing but without primary keys. Just regular fields customer.cust_id = orders.cust_id...

Thanks,
Stanley
 
>> Here's the catch: If you look into your data after 5 years and the customer has moved your 5 year old order points to the new address, but not the one you used 5 years ago for shipping. If you need to know this for reasons of some law, you have to copy over net address data somewhere to know the address valid at the time the order was made. But that's not a reason to avoid the key concept of a separate identifier, this is addressed (no pun intended) with audit trailing of data. In this case another easy solution is of course saving PDFs with the generated delivery slip having the address as it was current at order time.

I've always saved the invoice billing and shipping names and addresses as part of the order, therefore preserving that snapshot in time, as it was then and there.

Stanley

 
>> Besides your fears about pack and dropping autoinc are wrong. PACK doesn't renumber autoinc fields, they stay with their value. And I don't remember ever having recommended dropping columns, just modifying them temporarily to normal integers, which retains their value, then put back the auoinc feature and setting the next value correctly.

Good to hear this...


>> Anyway you have found a problem of autoinc fields. Appending data. Like in a single Insert sql statement you can't specify a value for the pk. Merging data is complex, but it's a complex topic anyway, also in case of guids. You have to match records to be updated and only append new data.

Can a scatter be issued? I use it a lot to quickly store all of a records values for use in a new or another record on a field by field basis with a shortcut command like F8 and whatever field you are in, the scattered value gets copied in. same for each and all fields.

Thanks,
Stanley
 
Stanley, I haven't read your posts after 27 Mar 15 03:10 in detail, nor Olaf's detailed replies. But I will just pick up one point. You talked about possibly using GUIDs instead of integers after you switch to SQL Server. There's no reason to do that. SQL Server supports the IDENTITY attribute, which is equivalent to VFP's Integer (AutoInc).

My argument against using GUIDs apply just as much to SQL Server (or any back end) as to VFP. Because of their size, it takes longer to compare two GUID values that to compare two integers. And since comparison operations are at the heart of indexing, index retrieval, etc., this can have a significant effect on performance.

Also, when you come to actually migrate your data from DBFs to SQL Server, you will find it much easier if the target data type is similar to that of the source.

To go back to your suggested table structure, you wrote this:

Code:
Customer table fields
ID(I) is the PK and autoinc,
name(c,30)
address(c,30)
other fields........

Orders table fields
ID(I) is the PK and autoinc,
order_num(c,12)
order_date(t,8)
order_amt(n,9.2)
customer_pk(I) is a foreign key that links to the customer's ID field
carrier_pk(I) is a foreign key that links to the carrier's ID field (ie. UPS, FedEx)

That looks reasonably good. The main change I would make is to user Customer_ID and Carrier_ID rather than Customer_PK and Carrier_PK respectively. But that's just my habit; feel free to ignore it. Also, I would probably use the Currency data type for Order_Amt.

One other suggestion: Always name your tables (and therefore your foreign keys) with either the singular or plural form, but not both. So: Customer, Order, Invoice OR Customers, Orders, Invoices. Choose the form you prefer, and then use it consistently. Just like standardising on field names, this will save you a lot of silly mistakes when referring to the tables in your code.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Thinking that no further action would happen tonight regarding some unanswered questions I had, I decided to create a udf that generated a numeric value. Here it is. I tested it on an I7 machine and did not get any duplicated over an iteration of 1000.

Code for duplicate testing:

Clear
dd=Set("Decimals To")
Set Decimals To 0
For x=1 To 1000
?Val(Strtran((Dtoc(Date(),1))+Alltrim(Str(Seconds(),12,3)),'.','))
Endfor
Set Decimals To 2


The UdF is put in the "default ??" in the table as getNumericGuid(). Set procedure to the proc file first.
**********************************************************************************************************
Procedure getNumericGuid
* Test Name - testname@testname.com
dd=Set("decimals to")
Set Decimals To 0

NumericGuid = Val(Strtran((Dtoc(Date(),1))+Alltrim(Str(Seconds(),12,3)),'.','))

Set Decimals To (dd)
Return NumericGuid
Endproc


Does anyone get duplicates?

Thanks,
Stanley

 
Hi Mike,

>> That looks reasonably good. The main change I would make is to user Customer_ID and Carrier_ID rather than Customer_PK and Carrier_PK respectively. But that's just my habit; feel free to ignore it. Also, I would probably use the Currency data type for Order_Amt.

The reasoning behind naming the foreign key customer_pk as opposed to customer_id is that I've seen a lot of other people's code where they use customer_id to hold the customer id like "stan-01" which has nothing to do with a primary key. That's why the pf is on the end as it clearly tells us coders that it is belongs to the customer's primary key. To non programmers, customer_pk means nothing while customer_id means "mike-01"

Thanks,
Stanley
 
It would be better if we can force the decimals to 0 within the command itself. This would eliminate the need to initialize a procedure file if we could put

Val(Strtran((Dtoc(Date(),1))+Alltrim(Str(Seconds(),12,3)),'.',''))

directly inside the table's "default value" property.

Stanley
 
To non programmers, customer_pk means nothing while customer_id means "mike-01"

Well, I would argue that non-programmers are unlikely to see your field names, which are internal to your database and application. But I don't feel strongly about it.

Mike



__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
No duplication problems outputting to screen when I use the ? (print} command. However when append blank for 3000 times vfp is able to create 12 new records before a different number comes up. I also had to use a 16 wide numeric field. What is the function to convert a numeric to an integer?

Olaf, I will explore your document tomorrow...

Thanks,
Stanley
 
By non-programmers I mean those people with MS Access and other odbc enabled devices... They get to see the whole structure at the table level.

Stanley
 
Hi Stanlyn,

>If you don't agree, then how would you name the foreign key fields in th eorders table? And maybe I can see where you are coming from.

Well, I already said that I would rather use a fk suffix for foreign key fields than a pk suffix. But using ID as suffix is even better, as its neutral and everybody knows that IDs are FKs coming from other tables, where they are PKs. The only strong feeling I have about this is you're making a confusing decision. If you want to have your own very individual naming convention then use whatever you like, the norm is a suffix of "ID", as you already used...

>customer.cust_id = orders.cust_id
So you could stay with the naming convention. Mainly index customer.cust_id with a primary index and orders.cust_id with a regular index to have keys. But actually the thing making the tables realte to each other is the REFERENCES clause of CREATE TABLE and the index tags are non optional preparations for this.

>Can a scatter be issued?
Totally wrong context for asking about SCATTER. I was thinking of merging current data with data coming from a CSV file eg 1000 current records merged with 500 coming from somewhere else with an overlap of updated and new data. As I said this is a totally different topic. I won't get into detail about this, but scatter surely only is about single records.

If you have a problem with GATHER writing to the readonly ID field, then not only because of the readonly nature of the autoinc field. You're forgetting you don't copy IDs, that would voilate the primary index. You also don't need scatter/gather for updating records, as you have buffering (since VFP3?). Even if you want to copy some data from one to another record you can use the EXCEPT clause of scatter or gather or both to avoid copying the ID value.

>I decided to create a udf that generated a numeric value
Why on earth would anyone do that, please reread.

Your generator might not create duplicates on one machine, but on multiple with slightly different clock settings. And it has another major problem, even if that wouldn't be one: It produces values far too large for an integer field, have you counted the decimals? A date alone has 8, seconds go up to 86000 each day, so anther 5, 13 digits. ints go up to 2 billion, 10 digits with the highest 2 at max.

Bye, Olaf.
 
>However when append blank for 3000 times vfp is able to create 12 new records before a different number comes up.
So you see your generator would fail anyway. Simple reason: Printing on screen needs to scroll the whole screen up one line, when it's full. That does take much more time than adding a record to a dbf.

Code:
Create Cursor curtest (id N(16) Default Val(Strtran((Dtoc(Date(),1))+Alltrim(Str(Seconds(),12,3)),'.','')), cTest C(1))
Append Blank 
Do While Reccount()<100000
   * only append cTest field, not ID
   Append From Dbf("curtest") FIELDS cTest
EndDo 
Select Max(rec.cnt) from (Select id, Count(*) from curtest group by id) rec
This result in 400 to 550 records with same ID for me.

Bye, Olaf.
 
If you are dead set against using Integer (Autoinc), an laternative might be the Windows [tt]CoCreateGuid()[/tt] function, which creates a 128-bit GUID:

Code:
DECLARE INTEGER CoCreateGuid IN ole32 STRING @ pguid
lcGUID = REPLICATE(CHR(0), 16)
CoCreateGuid(@lcGUID)

Unlike a value based solely on the time of day, this one is supposed to be globally unique - that is, it is different from any other such value generated by anybody at any time anywhere in the world.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Thanks Olaf for the link to "Generate integer IDs (unique, usable for primary key" I've spent most of the day with and I've already got it implemented and am very happy with it. I even learned alot from that code and its implementation, so again thanks.

And, Mike, I was actually wanting the numeric pk instead of guids. I would have went with guids if an numeric equivalent was not be found, but once again, Olaf pulled it off, and ever so brilliantly.

Thanks, Stanley
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top