Cursors in Oracle

mike777

Programmer
Hello.
We are FoxPro programmers trying to learn Oracle.
That's not necessarily relevant, but if you know FoxPro, it will help you know where we're coming from.
We have an Oracle procedure that looks something like:
Code:
create or replace procedure SomeProcName as
     cursor SomeCurName is
          select awhole, bunchof, stuff
          from   AReallyBigTable;
     vcAWhole    AReallyBigTable.AWhole%type;
     vcBunchOf   AReallyBigTable.BunchOf%type;
     vcStuff     AReallyBigTable.Stuff%type;
begin
     open SomeCurName;
     loop
          fetch SomeCurName into vcAWhole, vcBunchOf, vcStuff;
          exit when SomeCurName%notfound;  -- stop once every row has been fetched
          -- we do some stuff here...
          -- more stuff here...
          -- ok this is enough...
     end loop;
     close SomeCurName;
end SomeProcName;

Our question is this:
When we run the cursor's query:
Code:
select awhole,bunchof,stuff
          from   AReallyBigTable;
from a SQL editor, it executes immediately (because it only returns the first 100, 500, or however many rows). HOWEVER, within the procedure it appears to us that the entire cursor is materialized before the rows are fetched one by one.
(In FoxPro, we have Scan...EndScan.) We are desperate to find a way to process a cursor row by row, so that pulling this huge cursor into memory (20M, 70M, 100M records) doesn't eat up our server's resources.
Any help you can provide is greatly appreciated!
Thank you.
-Mike Kemp
 
I have no background in FoxPro, but I do have a background in Oracle Tuning.

What you should consider is not using a cursor at all. Instead of opening a cursor and doing the work row by row inside it, use set-based update, delete, or insert statements. Performance-wise this is much better.

The code may not look as sophisticated or elegant, but you will be hard-pressed to do better in terms of performance.

Sometimes I use a table on the side to segregate those records that have to be manipulated, so that I don't have to read irrelevant data from large tables.
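For example (the table and column names here are just illustrative, not from your schema), the side table plus a single set-based update might look like:
Code:
-- collect the keys of only the rows that need to change
insert into ToFix (id)
select id
from   AReallyBigTable
where  SomeField = 'oldval';
commit;

-- one set-based update driven by the side table
update AReallyBigTable t
set    t.SomeField = 'someval'
where  t.id in (select id from ToFix);
commit;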

I hope this gives you some ideas.

Aryeh Keefe
 
aryeh1010,
Thanks for the response.
The issue we're dealing with (which is why we went the cursor route to begin with) is that we need to update so much data that it's killing our rollback segments. The amount of data we're updating is as much as 6G.
So we thought that if we could use a cursor and step through the dataset, we could update 10k records, commit, update the next 10k, commit, and so on. We'd use a counter variable to keep track of how many records have been updated.
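Roughly what we had in mind was something like this (just a sketch; the table and column names are made up):
Code:
declare
     cursor c is
          select rowid as rid
          from   AReallyBigTable
          where  SomeField <> 'someval';   -- rows still needing the update
     nCount  pls_integer := 0;
begin
     for r in c loop
          update AReallyBigTable
          set    SomeField = 'someval'
          where  rowid = r.rid;

          nCount := nCount + 1;
          if mod(nCount, 10000) = 0 then
               commit;      -- commit every 10k rows to keep the rollback small
          end if;
     end loop;
     commit;                -- commit the last partial batch
end;
/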
Is there a way to do this using the update statement? I suppose rownum would work? We thought we might be able to say:
Code:
update AReallyBigTable set SomeField = 'someval' where rownum between nBeg and nEnd
where nBeg and nEnd are variables that hold incrementing ranges like 1-10000, 10001-20000, etc.
This is OK for updates; however, we have a problem with this method when we try to do big deletes: every time we delete a range of rows and commit, the rownums are recalculated over the entire table, so our next delete statement will not capture the correct records.
Any help you can provide again is very much appreciated.
Thanks.
-Mike
 
If you have the resources available, you shouldn't let a 6 GB rollback segment intimidate you.

However, it is possible to do things your way simply by looping through the cursor and issuing a commit every time the counter reaches a multiple of 1,000, 10,000, or whatever. This cuts your transactions up into smaller chunks.

I can tell you, based on my experience, that it might be a good idea to hire a DB consultant to give you some pointers on how to deal with your particular situation, as what you are thinking of doing will definitely come at the expense of performance. I've seen software projects nearly get cancelled because of performance issues.

With some planning ahead of time, you can get a good set of principles for DB design and programming so that your job will be easier over time. It might be harder at first, but it's a lot better to take some time now than to rewrite code later, under a lot of pressure, while also changing the DB structure.

Good luck,
Aryeh Keefe
 
Mike, your SQL editor works the same way: it opens a cursor and fetches the first 100, 500, etc. rows in a loop. More robust querying tools use array binding, but in any case they open cursors and fetch data; the procedure does not materialize the whole result set before it starts fetching.

As for rownum: it is assigned to rows as they are returned, so rownum between nBeg and nEnd returns nothing when nBeg <> 1. The first candidate row gets rownum = 1, fails the condition, and is discarded, so no row ever reaches rownum = 2.

Is 6G an estimate of the RBS size or a number of rows? Though in any case, if you work with such a big amount of data you must have adequate hardware.
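To illustrate the rownum point (any table will do):
Code:
-- returns no rows: the first candidate row is assigned rownum = 1,
-- fails the BETWEEN condition, and is discarded, so rownum never reaches 2
select * from AReallyBigTable where rownum between 2 and 10;

-- this works, because the condition lets rownum = 1 through
select * from AReallyBigTable where rownum <= 10;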

Regards, Dima
 
Hi Mike,

Can I suggest you create an extra column on your driving table, say upd_flag, and use this column to drive your cursor, like this:

create or replace procedure SomeProcName as
     cursor SomeCurName is
          select awhole, bunchof, stuff
          from   AReallyBigTable
          where  upd_flag is NULL
          and    rownum < 10000        -- commit frequency
          for update of upd_flag;

     vcAWhole    AReallyBigTable.AWhole%type;
     vcBunchOf   AReallyBigTable.BunchOf%type;
     vcStuff     AReallyBigTable.Stuff%type;
     FLAG        BOOLEAN := TRUE;
begin
     LOOP -- an outer loop
          FLAG := TRUE;   -- reset each batch so we can detect "no rows left"
          open SomeCurName;
          loop
               fetch SomeCurName into vcAWhole, vcBunchOf, vcStuff;
               EXIT WHEN SomeCurName%NOTFOUND;

               -- set flag: this batch processed at least one row
               FLAG := FALSE;
               -- we do some stuff here...
               -- more stuff here...
               -- ok this is enough...

               -- update the current row at the end
               update AReallyBigTable
               set    upd_flag = 'Y'
               where  current of SomeCurName;

          end loop;

          CLOSE SomeCurName;
          COMMIT;

          if FLAG then exit; end if;  -- nothing fetched this batch: we're done

     END LOOP; -- outer loop

end SomeProcName;


 
Hi Dima,

I've quickly written something, using a FLAG variable to exit from the outer loop; it might be wrong, but I'm sure something of that sort can be done.

Regards
Ajay.
 
In fact I see that it may take a lot of time to explain why the whole idea is wrong, so I'll just carp at the most explicit errors :)

Regards, Dima
 
Instead of using an inner loop, you may issue something like:
Code:
update AReallyBigTable
set    upd_flag = 'Y'
where  <main condition>
and    upd_flag is null
and    rownum < 10000;

But in any case this is much slower and less clear than updating the whole table, or creating a new one with CREATE TABLE ... AS SELECT and then renaming it.

Regards, Dima
 
I'm sorry, Dima. Are you suggesting something new, or is this an explanation to prove that the above idea is COMPLETELY wrong?
 
Also, can I ask you to explain your solution further: how can we replace the inner loop with an update statement?
 
This is an explanation that your implementation of the wrong idea may be improved by performing a batch update instead of updating record by record. Obviously, the other fields may be updated by the same statement.
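For example (the field name is just a placeholder, and <main condition> stands for whatever identifies your rows):
Code:
update AReallyBigTable
set    SomeField = 'someval',   -- whatever the inner loop was doing to each row
       upd_flag  = 'Y'
where  <main condition>
and    upd_flag is null
and    rownum < 10000;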

Regards, Dima
 
I appreciate your comments to enhance the performance of my wrong(?) idea; I'm sure there are several other ways to achieve this. But you still haven't explained how my idea is wrong. I guess it will do the job (if not efficiently).
 
Hello all.
Thanks for all the help.
We took all of your suggestions seriously.
Here's what ended up working for us. My boss came up with this, and it worked great. I think it's very interesting, and I hope it's of some value to you.
We tried the mass update statement and the loop. Neither of these worked. In the case of the mass update, it crashed the server. In the case of the loop as it was suggested here, it was way too slow (ran for 4-5 hours, and was only 10-15% done).
So we created two procedures, both with loop routines. The first procedure uses a cursor to select all the records that need to be updated, iterates through the cursor, and writes the ROWIDs to a temporary table. The second procedure sorts the temporary table on ROWID (that's the only column in the table), loops through the sorted rows, and updates the records in the master table whose ROWIDs match the current row in the temporary table.
This took about 1.5 hours for both routines to run completely. The theory is that this works because once we sort the temporary table on ROWID and then iterate through it updating the matching record, we are stepping through contiguous disk space. Since the disk arm doesn't have to seek around for scattered blocks, the writes are essentially sequential. Anyway, that's the theory, and it worked for us.
The other benefit of using a procedure over a straight-SQL update is that we could put markers in another table we named "timing". We were updating 10M records, so every 100k we did an insert that recorded the time, record number, etc. This way we could watch the progress and extrapolate how long the entire routine would take from how fast each 100k iteration completed. That is much better, in our opinion, than running an update statement for an hour or longer, only to have it crash after 45 minutes, with no idea in the meantime how long it's going to take or what progress it's making. Obviously, also, a transaction covering 100k records (we only committed at those points) is a lot less resource-intensive than one covering all 10M.
So, in the final analysis, the secret for us was sticking with a procedure, running update statements in a loop and using the ROWID to gain access to contiguous disk space.
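In outline, the two procedures looked something like this (the table and column names below are simplified stand-ins, not our real schema):
Code:
-- Procedure 1: collect the ROWIDs of every record that needs updating into a work table
create or replace procedure CollectRowids as
     cursor c is
          select rowid as rid
          from   AReallyBigTable
          where  SomeField <> 'someval';     -- the "needs updating" condition
begin
     for r in c loop
          insert into RowidWorkTable (rid) values (r.rid);
     end loop;
     commit;
end CollectRowids;
/

-- Procedure 2: walk the ROWIDs in sorted order and update the master table
create or replace procedure UpdateByRowid as
     cursor c is
          select rid
          from   RowidWorkTable
          order by rid;                      -- sorted, so the updates hit contiguous blocks
     nCount  pls_integer := 0;
begin
     for r in c loop
          update AReallyBigTable
          set    SomeField = 'someval'
          where  rowid = r.rid;

          nCount := nCount + 1;
          if mod(nCount, 100000) = 0 then
               -- progress marker every 100k rows, plus a commit to keep the transaction small
               insert into timing (marked_at, rec_no) values (sysdate, nCount);
               commit;
          end if;
     end loop;
     commit;
end UpdateByRowid;
/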
Hope this is of some value to you.
-Mike
 