Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Table's ROWID as a PK/unique key 4

Status
Not open for further replies.

Andrzejek

Programmer
Jan 10, 2006
8,502
US
Just don’t shoot the messenger…

Could some of you Oracle experts tell me what are the problems/issues/’things to watch for’ with using table’s ROWID as a Primary Key / unique key?

Let’s say a table (or tables) does not have a PK established and there is no field or combination of fields that could be used to uniquely identify the record, and programmers are (pretty much) left with using ROWID as the only field to program against.


Have fun.

---- Andy
 
as long as the database NEVER moves a row, you never rebuild the table, you never have to restore the table or have any activity against it you should be fine. Add a column and assign a sequence number for a unique key.

Bill
Lead Application Developer
New York State, USA
 
Andrzejek,

Your questions are good questions...ones that every data(base) designer should have good answers for.

Andrzejek said:
...there is no field or combination of fields that could be used to uniquely identify the record.

First, relational theory suggests that there should be no "application/business duplicates" in a relational table (else you probably have not yet fully/adequately defined the "business characteristics" of the table). Simply adding in something that artificially "makes" the rows unique is not and adequate solution. Therefore, you should be able to identify at least a combination of attributes (i.e., columns) that form a "business-unique" instance (i.e., row) of data. But, as you will see in the rest of my posting, below, there are multiple good reasons for not using "business-unique" keys (especially multi-column unique keys) as a Primary Key.

Primary keys have really only one purpose for existing: To be an efficient, unique match target for any foreign key that is a child to a parent record.

With the above singular reason for PKs to live, I have derived, over my 25 years in the Oracle world, the following "Characteristics for High-Quality Primary Keys":
Characteristics for High-Quality Primary Keys said:
[ul][li]Unique, no duplicates, ever. No two subjects can have the same identity; no one subject can have multiple identities.[/li]

[li]Non-Null. Primary keys are Universal…Everyone has one.[/li]

[li]Never changes. If a Primary Key changed, then all of the (possibly millions) of "dependent children" would need to change, as well, to provide parent-child data integrity. Therefore, to reduce the need for changes to Primary Keys, a high-quality Primary Key should be:[/li]

[ul][li]Stupid…Non-intelligent…has no business meaning. If business rules/meaning changes, there is no effect upon (no need to change) Primary- or Foreign-Key values.[/li]
[li]Ours…not subject to anyone else’s business rules.[/li][/ul]

[li]Short as possible while still preserving uniqueness[/li]
[ul][li]Number, not alphanumeric; internally, numbers take less space than alphanumerics.[/li][/ul]
[/ul]

Your Case #1: "Using ROWID as Primary Key"...can disobey:
[ul][li]Never Changes rule. If the ROWID changed because of a table re-build or MOVE TABLE, the effect upon child Foreign Keys would be disasterous.[/li]
[li]Stupid...Non-intelligent. ROWIDs are intelligent. If the intelligence/meaning changes, then all of the foreign key must change, as well.[/li]
[li]Ours. ROWIDs are Oracle's, not ours. If Oracle decided to change the ROWID model (as they have done on recent releases), then our foreign keys would need to change, as well.[/li]
[li]Short as possible. Oracle ROWIDs occupy 10 bytes of storage. Most numeric Primary Keys do not require 10 bytes of storage to accommodate uniqueness (e.g. the number 123456789 occupies only 6 bytes of storage). You might say that a 4-byte savings is not much, but multiplied of millions of rows of foreign keys, a few bytes of savings can add up.[/li]
[li]Number. ROWIDs are a hexidecimal value. Not only are ROWIDs longer than they need to be, we humans tend to have problems dealing with base-16 versus base-10.[/li][/ul]
So, in Case #1, ROWIDs break 5 of the 7 rules for high-quality primary keys.

Your Case #2: "Using (multiple-column) Business-unique columns"..can disobey:
[ul][li]Unique, no duplicates, ever and Non-Null. Depending upon the nature of the components of a business-unique key, we might not be able to enforce the “no duplicates” and/or “not null” requirements of one or more columns, thus potentially breaking this vital rule.[/li]
[li]Never Changes rule . If the components of a business-unique key change because of a business-rule change, the effect upon child Foreign Keys would, again, be an avoidable, needless update to possibly millions of rows.[/li]
[li]Stupid...Non-intelligent. Components of business-unique keys are intelligent, by definition. If the business intelligence/rule(s) changes, then all of the dependent foreign keys must change, as well.[/li]
[li]Ours. Components of a business-unique keys are not necessarily ours. If a column originates from an organization that is not part of our organization, and if they change their business rule(s) for our business-unique column(s), then our foreign keys would need to change, as well.[/li]
[li]Short as possible. Components of business-unique keys typically occupy more storage space than most numeric Primary Keys...thus our business-unique primary key would be longer than necessary.[/li]
[li]Number. Depending upon the nature of the data types of the components of a business-unique key, they might not be the shorter numeric storage type.[/li][/ul]
So, in Case #2, using a business-unique combination as a Primary Key potentially breaks all of the Characteristics for High-Quality Primary Keys.

My suggestion for Primary Keys in all cases/situations is simply to CREATE SEQUENCE <name>... and populate a Primary Key with values from the sequence...that should obey all of the above "rules".

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
SantaMufasa,

I totally agree with you (and others here), 100%, and if I could I would give you ‘give me five!’ over the Net.

But what this post does, and a few others, is giving me the ammunition (rock reasons) to talk to the person who sets up the data base I have to work with. It is not my idea to ignore all the rules of relational data base design. Yes, I am sorry to say, no Primary Keys, no Foreign Keys, no referential data integrity, repeating the data all over the tables, using repeating fields (name1, name2, name3, …, name10, etc.), tables with 200+ fields, and many, name other issues. And that’s with 200+ tables, and growing…

Why is that, you ask? I don’t know, I never got the answer.

What my post does is pretty much saying: “Please, please, prove me wrong and tell me I have nothing to worry about”. But, that’s unfortunately not the case.


Have fun.

---- Andy
 
Andrzejek said:
...no Primary Keys, no Foreign Keys, no referential data integrity, repeating the data all over the tables, using repeating fields (name1, name2, name3, …, name10, etc.), tables with 200+ fields, and many, name other issues. And that’s with 200+ tables, and growing…

I'll make this comparison: You expect a professional building contractor to come and build your house, following all building codes. Instead, it sounds like you got a teenager whose building experience is limited to building a couple of tree houses with left-over wood scraps.

What you have described indicates that the database designer has not had formal training in the rules and expectations of a high-quality database design. There are plenty of good course, books, et cetera that one can take to become a database-design professional. If your colleague does not develop proper db design skills and continues to exhibit his/her current design behaviors, then your application/company/users will be losing 10s/100s of thousands of dollars/euros in value for the application.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
Santa,

just to kick things around a bit, I disagree with your 'never changes' item to do with PK's.

For all the reasons you mention (and more) I always use integer surrogate PK's. However, I advise developers that primary keys are subject to change without notice, and that they should code accordingly.
A pk is there to provide uniqueness and a reference for an FK. If you took every PK in a database and added one to it, and all the matchine FK's, then it would still be fine and no data corruption would ensue.

I once had to split a database into four parts, and then later join them. I did it by multiplying each pk by 10, so they all ended in zero. Then I added 1-4 for each part.
I subsequently rejoined them, with a guarantee of no PK collisions, because I knew the parts all ended with different numbers.

what think ye?

Regards

T
 
Thargy said:
I once had to split a database into four parts, and then later join them.

Tell us more about the driving business reason for this activity.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
Trust you to pose a good question.

The reason was that one database fulfilled four (in fact more) separate business functions.
Due to bad design, there was no functional isolation in the database, and so I had to artificially create some, by splitting the data in to four sets.
This enabled testing and verification to take place by four separate groups in parallel, instead of having to do it sequentially (thereby saving weeks of elapsed time).
At the end, the whole lot had to be bolted back together.

Be advised that this took me a man month of effort to script and automate - it's not for the faint-hearted!

Regards

T
 
Thargy,

First, across all big-table rows, the current PKs are unique. By splitting the unique rows into four tables, the PKs are still unique across the four tables. There should be no need to change the PK or FK values. By adding a "TABLE_TYPE_CODE" column that indicates which of the four tables "owns" each row. That technique requires no modification of either the PK or the FK values, for either bifurcation or for re-consolidation.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
Santa,

the problem wasn't splitting a table or tables, it was splitting the entire database!
I didn't split any table, I produced four separate databases with identical structures.
Then the data from each functional part was moved out into its own database and worked on.
At the end, four databases were merged back in to one, relying on the difference in PK/FK values to avoid collisions.

Is this making any kind of sense?

Regards

T
 
Correct me if I am wrong, but isn’t also a side* benefit of having Primary Keys / Foreign Keys established to have also Indexes on those fields in the tables as well? I think all PK and FK fields are indexed by the data base, right?

*In my mind is not really a ‘side’ benefit, but more like one of the main benefit. Is that the true statement that without the Indexes on a table, data base has to do the full table scan? (Which causes performance issues?)


Have fun.

---- Andy
 
For everyone else's edification, Thargy and I discussed this issue at length over the phone. And please correct me, Thargy, if I misspeak here, but here is a summary of our conclusions:[ul][li]Modifying a PK value (for whatever reason) is not a good idea (because of the implied massive effect [by comparison] on the FK values).[/li]
[li]Having a Business or its IT department build meaning into a PK (for whatever purpose) is never a good thing. If you want to build new meaning into a table's row, add a new column to the table...don't modify the PK.[/li][/ul]

Is this a fair summary of our conclusions, Thargy?

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
Sorry for the delay in responding.

Andy said:
I think all PK and FK fields are indexed by the data base, right?

Oracle automatically creates unique indexes on PKs, but Oracle does not automatically create indexes on FKs.

Andy said:
...without the Indexes on a table, data base has to do the full table scan? (Which causes performance issues?)

Correct. (NB: But if you are using Oracle's Exadata, then indexes become nearly irrelevant.)




[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
SantaMufasa said:
But if you are using Oracle's Exadata, then indexes become nearly irrelevant.

Obviously, I was speaking too passionately about Exadata...I should have said, "...then indexes become less of an issue in terms of full-table scans." Sorry for my intemperance.


[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
“People may forget what you say, but they will never forget how you made them feel.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top