Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Subquery Question re:Percentile 1

Status
Not open for further replies.

Kearns

Programmer
Nov 28, 2002
10
CA
I am using SQL Server 2000 and I have a table that contains a column of job_name and cpu_time. Every time a job completes the job_name and cpu_time used is interted to the table. I have been asked to find the cpu_time of the 75th percentile for each job_name. There are over 50,000 rows with about 200 unique job_names

I have used the following code to get the result I need, but I have to recode the job_name every time I want look at a different job_name.

SQL Code
--------
Code:
 select top 1 b.job_name, b.cpu_time
from (
select top 75 percent *
from job_hist_tbl
where name = 'ARRPMT'
order by cpu_time ASC
)b
order by b.cpu_time DESC

Is there a way to repeat this code for each distinct job_name in table job_hist_tbl?

The job_hist_tbl contains:
job_name cpu_time
-------- --------
ARRPMT 6
ARRPMT 8
ARRPMT 10
ARRPMT 12
GLRACWLD 1
GLRACWLD 2
GLRACWLD 4
GLRACWLD 9
GLRINQAC 3
GLRINQAC 5
GLRINQAC 7
GLRINQAC 11


The results I expect to get are:

job_name cpu_time
-------- --------
ARRPMT 10
GLRACWLD 4
GLRINQAC 7


 
create a udf, like this:
Code:
CREATE FUNCTION [dbo].[Percentile75]
(
--The string to be converted to proper case
@job_name varchar(50)
)
--This function returns the proper case string of varchar type
RETURNS int
AS
BEGIN

DECLARE @returnvalue int

select top 1 @returnvalue = b.cpu_time
from (
select top 75 percent *
from job_hist_tbl
where job_name = @job_name
order by cpu_time ASC
) b
order by b.cpu_time DESC

RETURN @returnvalue

END

and then use it in a select, like this:

Code:
select distinct job_name, dbo.Percentile75(job_name)
from job_hist_tbl

Cheyney
 
Great, it works! Thanks for the help cheyney.
 
This solution works and runs quickly with a small table, but I am having a problem with a large table. The table I am using has about 120,000 rows and 237 distinct names.

When I executed the query it ran for over 16 hours before I killed it. Why does it take so long?

While experimenting with the code I tried running the code from the function as a stand alone query:

Code:
select top 1 b.name, DATEDIFF(s, b.start_time, b.end_time)AS exec_time
from (
select top 95 percent *
from dbo.IPK_JOB_HISTORY
where name = 'APR15'
order by DATEDIFF(s, start_time, end_time) ASC
)b
order by DATEDIFF(s, b.start_time, b.end_time) DESC

I got a result back in 1 second. Then I ran this code:

Code:
select distinct name
from dbo.IPK_JOB_HISTORY
where name = 'APR15'

I got a result back in 1 second. But if I run:

Code:
select distinct name, dbo.Percentile95(name)
from dbo.IPK_JOB_HISTORY
where name = 'APR15'

where udf dbo.Percentile95 is:

Code:
CREATE    FUNCTION [dbo].[Percentile95]
(
--The string to be converted to proper case
@name varchar(50)
)
--This function returns the proper case string of varchar type
RETURNS int
AS
BEGIN

DECLARE @returnvalue int

select top 1 @returnvalue = DATEDIFF(s, b.start_time, b.end_time)
from (
select top 95 percent *
from dbo.IPK_JOB_HISTORY
where name = @name
order by DATEDIFF(s, start_time, end_time) ASC
) b
order by DATEDIFF(s, b.start_time, b.end_time) DESC

RETURN @returnvalue

END

it takes over 5 minutes to get a result.

Granted I am a novice at SQL, but I would have expected a result in a few seconds. Is there something else happening that I don't know about?
 
Probably not as I did not create one. What index would you recommend?
 
What are all the columns on IPK_JOB_HISTORY ?

Do you have a primary key?

Cheyney
 
The table definition is:

Code:
job_key     (int, Not Null)
job_number  (int, Null)
name        (varchar(32), Not Null)
queue       (varchar(32), Null)
submit_time (datetime, Null)
start_time  (datetime, Null)
end_time    (datetime, Null)
exitcode    (int, Null)
cpu_time    (int, Null)
job_params  (text, Null)

the primary key is job_key.

FYI: We are using a product called The Argent Job Scheduler for running batch jobs. I created this table to maintain runtime history for performance analysis. I use a trigger on an Argent table to populate IPK_JOB_HISTORY.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top