Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query help

Status
Not open for further replies.

AlStl

MIS
Oct 2, 2006
83
0
0
US
Guys,

I am trying to accomplish follow task. We are using Oracle 10g.


I have a table called TABLE_A consist of 5 columns:


HGSN, YEAR, KEY, VERT, YEAR


HGSN, YEAR, KEY, VERT - make up the primary key.


Here is how records are in the table:

HGSN YEAR KEY VERT TOTAL

XX1234 2006 P 1 400
XX1234 2006 D 1 300
XX1234 2006 P 2 150
XX1234 2006 D 2 100
XX1234 2007 D 1 188


Total number of records in the table approx: 5 million

Here is what I am trying to accomplish:

Delete every records with key='D' if there is no corresponding record with key='P'

so, if you look at above dataset - last record should be deleted since there no cooresponding 'P' record.

This is what I wrote:

DELETE from TABLE_A
WHERE KEY='D'
AND HGSN||YEAR||KEY||VERT NOT IN
(SELECT HGSN||YEAR||KEY||VERT FROM TABLE_A WHERE KEY='P');

This SQL is taking a very long time...I mean hrs. I also created a unique index on all four primary key columns,
but result remain the same. How can I reduce the execution time. Any suggestions will be highly appreciated.

Thx,

Al
 
First, AlStl, I believe that you have a logic error in your long-running code...Every 'D'-key row will be deleted since a row with a 'D'-key value will never be IN a result set where you allow only 'P'-key values, right?

Here's the proof:
Code:
SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 D          1        300
XX1234       2006 P          2        150
XX1234       2006 D          2        100
XX1234       2007 D          1        188

5 rows selected.

DELETE from TABLE_A
WHERE KEY='D'
AND HGSN||YEAR||KEY||VERT NOT IN
(SELECT HGSN||YEAR||KEY||VERT FROM TABLE_A WHERE KEY='P');

3 rows deleted.

SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 P          2        150

2 rows selected.

SQL> rollback;

Rollback complete.

SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 D          1        300
XX1234       2006 P          2        150
XX1234       2006 D          2        100
XX1234       2007 D          1        188

5 rows selected.
Instead, I believe that the following code is both correct and much faster:
Code:
DELETE from TABLE_A x
WHERE KEY='D'
AND not exists
   (SELECT null from TABLE_A y
     WHERE x.HGSN=y.HGSN
       and x.YEAR=y.YEAR
       and x.VERT=y.VERT
       and y.KEY ='P');

1 row deleted.

SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 D          1        300
XX1234       2006 P          2        150
XX1234       2006 D          2        100

4 rows selected.
Let us know about this code's correctness and speed.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 
[deleted]...I also created a unique index on all four primary key columns, but result remain the same. How can I reduce the execution time. Any suggestions will be highly appreciated.... [deleted]

Hi Al,
When you have a PRIMARY KEY in oracle you already have a Unique index on pirmary key's columns :)

DELETE from TABLE_A x
WHERE KEY='D'
AND not exists
(SELECT null from TABLE_A y
WHERE x.HGSN=y.HGSN
and x.YEAR=y.YEAR
and x.VERT=y.VERT
and y.KEY ='P');
This what you need as Musafa stated. And I am sure it will be more and more faster than your query which already incorrect!

Some notes:
If you have any blob area in table you should delete in smaller parts... by 1000 record and than another 1000
If your performance problem continues..Then, You can create an index on KEY ('D', 'P') and you can force Oracle to use 'D' by using query hint's. However plz remember distribution of 'D' must be smaller than 'P'... For example %5 of your records must be 'D' then INDEX on 'D' maybe meaning full.
 
Mufasa,
RedInTheCorner,

This was a fantastic piece of advise. This code worked liked charm. Execution time was reduce to 1/10 or even less.

One last thing. I also would like to insert records if this condition is true:

If there is record with Key='P', which does not have a coorespoding records witk key='D' then insert 'D' record.

For example:

HGSN YEAR KEY VERT TOTAL

XX1234 2006 P 1 400
XX1234 2006 D 1 300
XX1234 2006 P 2 150
XX1234 2006 D 2 100
XX1234 2007 D 1 188
XX1111 2008 P 0 199

Last record has no coorespoding 'D' record. A new record should be inserted, so records will look like this:

HGSN YEAR KEY VERT TOTAL

XX1234 2006 P 1 400
XX1234 2006 D 1 300
XX1234 2006 P 2 150
XX1234 2006 D 2 100
XX1234 2007 D 1 188
XX1111 2008 P 0 199
XX1111 2008 D 0 199

Keep in mind I am dealing with large number of records in this table. So a select statement may return more than one record. Do I need to write a cursor to deal with this kind of situtaion?

Thx.

Al


 
Al,

I suggest a slight variation of the previous code to achieve what you want:
Code:
SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 D          1        300
XX1234       2006 P          2        150
XX1234       2006 D          2        100
XX1234       2007 D          1        188
XX1111       2008 P          0        199

6 rows selected.

insert into table_a
select hgsn, year, 'D', vert, total
  from TABLE_A x
 WHERE KEY='P'
   AND not exists
       (SELECT null from TABLE_A y
         WHERE x.HGSN=y.HGSN
           and x.YEAR=y.YEAR
           and x.VERT=y.VERT
           and y.KEY ='D');

1 row created.

SQL> select * from table_a;

HGSN         YEAR K       VERT      TOTAL
------ ---------- - ---------- ----------
XX1234       2006 P          1        400
XX1234       2006 D          1        300
XX1234       2006 P          2        150
XX1234       2006 D          2        100
XX1234       2007 D          1        188
XX1111       2008 P          0        199
XX1111       2008 D          0        199

7 rows selected.
Let us know if this does what you want.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 

Just curious is the following is better:

delete from table_a t0 inner join
( select HGSN, YEAR, VERT from table_a group by HGSN, YEAR, VERT
having count(*) = 1 ) t1
where t0.HGSN = t1.HGSN and t0.YEAR = t1.YEAR and t0.VERT = t1.VERT and t0.key = 'D'
 


That is SQL Server syntax, I have to modify it a little bit to make ORACLE happy with it ; -)

delete from table_a t0
where exists
( select HGSN, YEAR, VERT from table_a t1
where t0.HGSN = t1.HGSN and t0.YEAR = t1.YEAR and t0.VERT = t1.VERT
group by HGSN, YEAR, VERT
having count(*) = 1
) and t0.key = 'D';
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top