Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Indexes on a nullable column 1

Status
Not open for further replies.

BJCooperIT

Programmer
May 30, 2002
1,210
US
I am writing queries on production tables. My problem is a recursive table that has multiple rows for the logical primary key. It contains over 20 million rows. Simplified it would be something like:
Code:
Table=CARS
[b][COLOR=blue]
KEY[/color]  PERSON  MAKE     YEAR [COLOR=red]FK_CARS[/color][/b]
001  BJC     CHEVY    2001 
002  BJC     TOYOTA   1986 001
003  BJC     PEUGEOT  1976 001
There are indexes on the highlighted columns. I need to select rows where FK_CARS is null and join to a second table, but the query is timing out. The index on FK_CARS does not help when nulls are involved. Does anyone have tip that could help speed up the queries? I am fast running out of ideas to try. Oh, did I mention that I only have SELECT privileges?

Beware of false knowledge; it is more dangerous than ignorance. ~George Bernard Shaw
Consultant Developer/Analyst Oracle, Forms, Reports & PL/SQL (Windows)
My website: www.EmuProductsPlus.com
 
Barb,

Here are some follow-up questions I have for you:

1) How many rows in CARS have NULL FK_CARS?
2) If FK_CARS is null, then what column(s) in the CARS table are you using to join with what column(s) in the "second table"?

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
Hi Dave!

1. 20,410,053
2. KEY is used to join to the second table (which has over 22 million rows)
 

I solved once a similar problem by creating a 'FUNCTION' index, kinda like:

INDEX(KEY, NVL(FK_CARS,'?'))...

HTH.
[3eyes]

----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
So, functionally, are you saying:

"Join CARS.KEY to TABLE2.<some_column>, WHERE FK_CARS IS NULL"?

If so, since an index does not contain rows where its indexed column(s) contain NULLs, you are, by design, doing a full-table scan of CARS. But, if you have an index on the column CARS.KEY and an index on TABLE2.<some_column>, then the joins should be (per join), miraculously fast.

Now, pragmatically, if "20,410,053" of your "over 20 million rows" are NULL, then you absolutely, positively WANT to do a full-table scan to turn up the NULL rows anyway: when you plan to return 99.44% of your rows, you do not want to be accessing your index...It actually slows you down.

But for any one of those 20,410,053 rows, you want only 1 or 2 matching rows from another table, you absolutely, positively WANT to do the matching via an INDEX.

So, the next questions are, "How frequently must you do your massive query...(once a minute, hour, day, week, month)?" and "How timely must your data be (accurate as of this instant, one hour, since midnight, since the beginning of the month, et cetera). If the answer is toward the end of that list, then you can use other methods (e.g., temporary tables, materialized views, et cetera) to give you virtually instantaneous results.

Let us know,

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
LKBrwnDBA,
I would try that if I had those kind of privileges. The state is a bit paranoid about giving consultants access to production databases. You don't know the hassle it took to have the DBAs create a plan table for me in production so that I could run explain plans!

Originally I wanted to create a temporary table for the keys to the rows where FK_CARS is not null (only 3,491,635 rows) but that is not possible without another drawnout hassle.
 
...and your "state people" do not consider "fatal 'timeout' errors" as "another drawnout hassle?"[smile]

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
Every CAR row "should" have a match in the second table. The explain plan says it will use the correct index as a INDEX UNIQUE SCAN. Unfortunately the query timesout after an hour or so.

This is a once-and-done script to validate data. This data crosses applications and departments. My manager wants to see if the other application has corrupted any of the relationships between four tables.
 
I think since I am the only one having the timeouts, it isn't a priority.
 
Barb,

Under those circumstances, could you not accomplish what you want with the much-more-efficient MINUS operator to turn up the exception data?...Just a thought.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
Tried that early on, but that code got trashed in the wake of "floating specs". Thanks for the reminder. I will go back and see if that works any better. Will let you know.
 
BJ,

the only index that can cope with nulls is (to my humble knowledge) a bitmap index.

It would be superlatively powerful at finding the non-null columns, but not much use with the nulls.

Regards

Tharg

Grinding away at things Oracular
 

BJCooperIT said:
I would try that if I had those kind of privileges. The state is a bit paranoid about giving consultants access to production databases

If you have no prvileges to create indexes (or whatever) and the DBA's won't help, then any proposal discussed is futile -- right?

We wish you luck!
[sadeyes]




----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
Dave,
The MINUS does not help in this instance, but it was worth a try. It actually increased the cost by about 25%.

LKBrwnDBA,
Sometimes working a state contract does seem futile and can be very frustrating. That said, I do respect the fact that the database here is the most secure that I have ever encountered - a very good thing for the public. The privileges are doled out one-by-one when a need is proven and documented. I do not envy the DBAs - their job must be very tedious and being the gatekeeper can't be much fun.

All,
I have started the requisite email chain documenting the need and the problems I have encountered. They cannot expect the task to be completed without granting the access/tools to achieve it. I have asked my manager if there is a DBA I could work with directly - the best way to build up trust in my opinion. Really am anxious to discuss that function index idea for which I do not have privileges.

I left the queries running overnight and got a "snapshot too old" error. This morning I tried to create a database link so that I could copy some production data to the development database. But... you may have already guessed - ORA-01031 insufficient privileges!
 

BJCooperIT said:
I have started the requisite email chain documenting the need and the problems I have encountered. They cannot expect the task to be completed without granting the access/tools to achieve it

That is the best course of action: No tools/help = No delivery of results. [thumbsup2]



----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
Is there any way to whittle the data down by doing a series of reports by individual departments? While cumbersome, it might allow you to pull the pieces you need and then reassemble them in another venue.
 
Yes. Since my objective is to gather counts and the primary key is a sequence number, I can do a series of ranges and add them up in the script. That will be faster than waiting for privileges. I thought of that technique but put it on hold since after the total counts are gathered they will want detail reports and I was not sure how they would want it sorted (certainly not by a one-up sequence number). But it is a good idea and my best shot at this point. Thanks!

Beware of false knowledge; it is more dangerous than ignorance. ~George Bernard Shaw
Consultant Developer/Analyst Oracle, Forms, Reports & PL/SQL (Windows)
My website: www.EmuProductsPlus.com
 
How is your memory situation?
If you can do a series of queries, would it then be possible to build a query around this series, and have the cursor populate a PL/SQL table in memory? If this is feasible, once your table was fully populated you could have it write out the information to a file. Ugly? You bet! But a kill's a kill, and if the elegant solutions are impossible, all you're left with are the ugly ones!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top