Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Very Slow Updates 2

Status
Not open for further replies.

pabowen

Programmer
Nov 6, 2002
95
US
I would like to thank everyone far in advance for any help they can give me on this quesiton.

I have several statements that perform what should be a simple update on a single table. However, when placed against a full production load it has never completed (I probably don't have the patience to wait several days)

In any case here is the scenario:

staging table = 2.2 million rows
person table = 700K rows

Staging table has an index on uniquekey and a unique index on uniquekey, class, class_system

person has indexes on uniquekey, plus several irrelevant ones.

The simplest of the queries I am running is
Code:
UPDATE staging SET mig_status = 80 WHERE EXISTS 
(SELECT * FROM person WHERE person.XUNIQUEKEY = staging.xuniquekey
AND xempstatus IN ('A', 'P', 'L', 'Q'))
AND (mig_status IN (0, 3, 6, 7)
OR mig_status IS NULL)

As I said this is the simplest of the queries and it is choking. Is this because of how I am using the EXISTS statement?

The person table is a fairly wide table., staging is not very wide.

Any help would be greatly appreciated!
Patrick
 
pabowen,

would a CTAS be unacceptable to your DBA's?

I know that the dropping and renaming of the tables is a nuisance, but it seems the obvious 'quick kill' here. The other thing is, why select *, why not select 1 (or null). I guess that this has got to be less I/O work, if nothing else.

Regards

Tharg

Grinding away at things Oracular
 
2.2 million rows is not much by modern standards, so the query shouldn't take that long. You don't say whether you have run DBMS_STATS against the two tables or not. Obviously, you would need to do this or Oracle may not use the unique indexes. Also, get an explain plan of the queries to check that it is using the indexes you expect. It may for example be choosing to use an index on xempstatus
rather than uniquekey.
 
I am not sure how to run DBMS_STATS or how to run a query plan? (Sorry orginally a Microsoft Guy). Can anyone point me in the right direction and I will do those to see what I end up with.

Thanks,
Patrick
 
These are big subjects, so you'll have to do a fair amount of research yourself. But, broadly, for plans these are your options:

a) Use the "explain plan for.." query command. This writes the plan to a set of tables, which you can query. The are some pre-canned queries for this which you should be able to find on google without too much difficulty. You will also need to create the plan tables: the script for this comes as part of your Oracle installation or you can find it on google.

b) A lot of tools have the explain plan built in. If you're not familiar with Oracle, a good idea might be to download a trial version of TOAD from Quest software. That has a button with a picture of an ambulance on it that will print out the plan for the query. Again, you will have to build the plan tables, but I think TOAD can do this for you.

c) Use SQL Trace. To do so, you need to issue the following command immediately before running the query:

alter session set sql_trace true;

This creates a trace file on your server. You can then reformat that using a command called tkprof.
The resulting text file will contain the plan as part of the output.

Getting stats:

begin
dbms_stats.gather_table_stats(ownname=>'XXX', tabname=>'XXXXX');
end;

or

analyze table xxx estimate statistics;

There are lots of options, but you'll need to look at the manual to get full details.
 
Dagon, I think we are starting to get somewhere.

I was able to execute the plan and this is what I recieved.

Code:
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

-----------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes |TempSpc| Cost  |
-----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT     |            |   113K|  3113K|       |  7955 |
|   1 |  UPDATE              |REG_STAGING |       |       |       |       |
|*  2 |   HASH JOIN SEMI     |            |   113K|  3113K|  3560K|  7955 |
|*  3 |    TABLE ACCESS FULL |REG_STAGING |   113K|  2224K|       |  2255 |
|*  4 |    TABLE ACCESS FULL |PERSON      |   355K|  2776K|       |  5264 |
--------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PERSON"."XUNIQUEKEY"="REG_STAGING"."XUNIQUEKEY")
   3 - filter("REG_STAGING"."MIG_STATUS"=0 OR "REG_STAGING"."MIG_STATUS"=3
               OR "REG_STAGING"."MIG_STATUS"=6 OR "REG_STAGING"."MIG_STATU
              S"=7 OR "REG_STAGING"."MIG_STATUS" IS NULL)
   4 - filter("PERSON"."XEMPSTATUS"='A' OR "PERSON"."XEMPSTATUS"='L' OR "P
              ERSON"."XEMPSTATUS"='P' OR "PERSON"."XEMPSTATUS"='Q')

Note: cpu costing is off

22 rows selected.

I don't really understand what it is telling me, but I believe the 'Table Access Full' statement is indicating that it is performing a full table scan and utilizing no indexes. Am I correct?

I executed the dbms_stats.gather_table_stats command as you mentioned, and then reran the plan, but it did not change the results.

Am I using it wrong. I did look in Enterprise Manager and the statistics tab for each of the indexes on the reg_staging table are empty.

Thanks,
Patrick
 
Try the dbms_stats again, but including the indexes. This should populate the stats for the indexes.

Code:
begin
dbms_stats.gather_table_stats(ownname=>'XXX', tabname=>'XXXXX', cascade=>true,
method_opt=>'FOR ALL COLUMNS SIZE auto');
end;

If that doesn't work, run dbms_stats.gather_index_stats on each index separately.

 
I performed the statistics update, as you suggested, including the indexes. I reran the plan and am receiving this output.

Code:
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------

-----------------------------------------------------------------------------
| Id  | Operation            |  Name        | Rows  | Bytes |TempSpc| Cost  |
-----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT     |            |   827K|    17M|       |  8187 |
|   1 |  UPDATE              |REG_STAGING |       |       |       |       |
|*  2 |   HASH JOIN SEMI     |            |   827K|    17M|    18M|  8187 |
|*  3 |    TABLE ACCESS FULL |REG_STAGING |   827K|  8889K|       |  2255 |
|*  4 |    TABLE ACCESS FULL |PERSON      |   357K|  3836K|       |  5264 |
-----------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PERSON"."XUNIQUEKEY"="REG_STAGING"."XUNIQUEKEY")
   3 - filter("REG_STAGING"."MIG_STATUS"=0 OR "REG_STAGING"."MIG_STATUS"=3
               OR "REG_STAGING"."MIG_STATUS"=6 OR "REG_STAGING"."MIG_STATU
              S"=7 OR "REG_STAGING"."MIG_STATUS" IS NULL)
   4 - filter("PERSON"."XEMPSTATUS"='A' OR "PERSON"."XEMPSTATUS"='L' OR "P
              ERSON"."XEMPSTATUS"='P' OR "PERSON"."XEMPSTATUS"='Q')

Note: cpu costing is off

The statistics that show up in Enterprise Manager are:
Tree Height: 2
Distinct Keys: 215556
Clustering Factor: 1554901
Leaf Blocks: 5706
Average Leaf Blocks/key: 1
Average Data Blocks/Key: 7
Number of Rows: 2156755
Sample Size: 2156755

I am still seeing that the plan lists it as a "TABLE ACCESS FULL" Does that indicate it is performing a full table scan?

Thanks,
Patrick
 
Yes, it is but it's not a nested loop full table join i.e. it's not scanning person for each row of staging. The method is a "hash join semi", which is a special technique used for evaluating "exists" statements. It's usually not a bad access plan, but it does have to create a hash table for person, so that could be an overhead. It would certainly be a problem if you are calling the update repeatedly, because it will have to build the hash table each time.

You could try explicitly hinting it to use an index join:

Code:
UPDATE staging SET mig_status = 80 WHERE EXISTS 
(SELECT /*+ INDEX(person index_name) */ 1 FROM person WHERE person.XUNIQUEKEY = staging.xuniquekey
AND xempstatus IN ('A', 'P', 'L', 'Q'))
AND (mig_status IN (0, 3, 6, 7)
OR mig_status IS NULL)

Check the plan to make sure it has worked.

An alternative approach, which will need unique indexes on both tables, would be to recast the update as a join.

Code:
UPDATE (SELECT staging.mig_status 
FROM person 
WHERE person.XUNIQUEKEY = staging.xuniquekey
AND xempstatus IN ('A', 'P', 'L', 'Q')
AND (mig_status IN (0, 3, 6, 7)
OR mig_status IS NULL))
set staging.mig_status = 80

If there isn't a unique index on person, then you could copy the table using "create table as .. select distinct xuniquekey.." first and create the unique index on the uniquekey.

Also, how many rows are you updating each time ? If there are relatively few rows with mig_status IN (0, 3, 6, 7), then adding an index on this column will speed things up.
 
Dagon, Thank you very much for your help. I think we are on the right track, but I am not able to get this peice to work.

I used your suggestion to recast the update as a join.

I am able to get the select to run in 17 seconds (used to take 20 minutes), but when I try it as part of the update statement it tells me "ORA-00904 B1U_SYS.REG_STAGING.MIG_STATUS: INVALID IDENTIFIER.

Here is the exact code I am trying to execute.

Code:
UPDATE
(SELECT b1u_sys.reg_STAGING.MIG_STATUS
FROM PATHLORE_DATA.PERSON PERSON, B1U_SYS.REG_STAGING 
WHERE PERSON.XUNIQUEKEY = B1U_SYS.REG_STAGING.XUNIQUEKEY
AND xempstatus IN ('A', 'P', 'L', 'Q')
AND (b1u_sys.reg_STAGING.mig_status IN (0, 3, 6, 7)
OR b1u_sys.reg_STAGING.mig_status IS NULL))
SET b1u_sys.reg_STAGING.mig_status = 80;

Again, I appreciate your help.
 
You can't qualify the column you are updating. It should be:

Code:
UPDATE
(SELECT b1u_sys.reg_STAGING.MIG_STATUS
FROM PATHLORE_DATA.PERSON PERSON, B1U_SYS.REG_STAGING 
WHERE PERSON.XUNIQUEKEY = B1U_SYS.REG_STAGING.XUNIQUEKEY
AND xempstatus IN ('A', 'P', 'L', 'Q')
AND (b1u_sys.reg_STAGING.mig_status IN (0, 3, 6, 7)
OR b1u_sys.reg_STAGING.mig_status IS NULL))
SET mig_status = 80;

Remember also that they must both have unique keys on XUNIQUEKEY or you will get:

ORA-01779: cannot modify a column which maps to a non key-preserved table
 
Dagon, Thank you very much!!!! I think I have this, or at least a beginning to it.

The join ended up solving my problem. It is still taking 18 minutes to run (much better than 2 hours) on my laptop. But a simple update of the whole table also is taking a considerable amount of time. So there is some other issue. Now I have to resolve the other statements.

You have educated me in several aspects of Oracle, thanks again.

Patrick
 
Dagon,

You've just demonstrated Santa-level capability by going from tkprof and explain plans, through to update join statements, demonstrating an awareness of the 'non key-preserved' cock up.

Excellent work!

I herewith award you the willif grand order of the purple pointy thingy.

Regards

T

Grinding away at things Oracular
 
Other things to look at:

a) are there any indexes, constraints or triggers on the column you are updating ? These could slow the update down.

b) is the update causing any row chaining or migration. Check the folloiwng in dba_tables:

pctfree value - should as big or slightly bigger than the percentage increase in column width that changing the mig_status column from NULL to a value would produce

chain_cnt following dbms_stats of the table after the update.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top