Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Job queues locking up

Status
Not open for further replies.

rtefft

Programmer
Aug 23, 2001
70
US
I have a report manager package that executes via DBMS_JOB queue every 5 minutes. It simply looks for report submissions, and launches a new DBMS_JOB for each one as needed. It has been running fine for months, but lately the job will simply hang in the queue for no apparent reason. If other report jobs were running, they will be frozen as well. There are no error messages, no entries (at all) in its log file, nothing. It's as if Oracle just decided to stop running all the job queues. Deleting and resubmitting the jobs will correct the problem... until the next time. The problem occurs at (seemingly) random every few days, and does not seem to correspond to any other database issues/events. JOB_QUEUE_PROCESSES is set to 20, and the number of jobs has never exceeded 6. We are running 9.0.1.1 with an upgrade to 9.2 in our future. Any ideas would be appreciated.

Thansk in advance,
Rich

____________________________
Rich Tefft
PL/SQL Programmer
 
Dima,

JOB_QUEUE_INTERVAL was deprecated in Oracle 9i and is no longer set in our INIT.ORA files. Since the problem happens erattically (anywhere from 1-5 days between problems with 50-100 jobs/day), this would seem to indicate the Oracle defined interval is working fine.

At the time of the lockups, running jobs appear to stop and use no resources, and generate no errors or output; it's like their process was suspended. For jobs that were not actually running when the lockup occurs (but are scheduled to run later), they do not run when their time comes around. It hasn't happened for a few days now, and I'm waiting for the next occurrence to gather more info.

Thanks for yor help - stay tuned...
Rich

____________________________
Rich Tefft
PL/SQL Programmer
 
Rich, is it possible that you have some locking issues?
It's also possible that their processes were suspended on OS level (some sysadmins may do this with greedy processes without notification)

Regards, Dima
 
BTW, could you clarify the notion of 'job' term? Do you mean records (tasks) in user_jobs view or number of executions of that tasks per day or a number of concurrent tasks?

Does "decided to stop running all the job queues" means that already launched processes are frozen or just that none of scheduled process is started? JOB_QUEUE_PROCESSES sets the maximum number of processes, but how many of them (Jxx) are running? And what's with your CQJ0 process, is it up?

Regards, Dima
 
Sem & Dima:

When the lockup/freezing occurs, all jobs currently executing freeze and all pending jobs never start. I've been working with the DBA and sysadmin and they are puzzled as well.

Let me clarify more of the background. We have a single entry in the job queue called RPTMGR that executes every 5 minutes. On each execution, it queries a table containing pending report requests, and launches a separate 1-time job request for each report to be run. There is no inter-job communication. When the individual report jobs complete, they are done and do not repeat. Our RPTMGR processes (ie spawns 1-time jobs for) about 100 reports/day. We have JOB_QUEUE_PROCESSES set at 20, and never have more than 5 reports (plus RPTMGR) running at a time. No other jobs exist in the DB.

As for locking, the only locks involved should be normal row-level locks from (1) RPTMGR updating each record in the table to show the report was submitted, and (2) each report's update of the same record later on to note completion. Reports take an average of 1-5 mins to execute, and the RPTMGR itself completes an execution cycle in a few seconds (repeated every 5 minutes).

The problem has not occured since I increased the RPTMGR interval from 1 min to 5 mins; perhaps there was some conflict even though the run times are so fast.

I'm setting this aside until it recurs, so I'll post info if I have it. Thank you both for your suggestions.

Rich



____________________________
Rich Tefft
PL/SQL Programmer
 
Try to make updates more "concurrent friendly", e.g. by using FOR UPDATE NOWAIT and logging an error outside the main transaction (either in autonomous transaction or into a file or pipe). I think that either RPTMGR locks itself or some reports may mark their completion in the same table thus multiple instances of the same report lock each other.

Regards, Dima
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top