Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

tuning sql for counting no of rows with x no of consecutive numbers

Status
Not open for further replies.
Nov 19, 2003
42
GB
Hi

i need to count the number of people who are logged in over and including 3 minutes

eg user a logs in at 9:00 and logs out at 9:05
user b logs in at 9:00 and logs out at 9:01
user c user a logs in at 9:00 and logs out at 9:03

from example above that would be users a and c!

the row data exists in my database as
accountid,timestamp,event

user a, 9:00, logged in
user a, 9:01, logged in
user a, 9:02, logged in
user a, 9:03, logged in
user a, 9:04, logged in
user a, 9:05, logged out

etc

the sql below works great for what I want but it takes about an hour to run when I look for data in between 2 dates

Code:
SELECT X.ACCOUNTID, B.TITLE, X.NO_OF_MINS_WATCHED, X.START_WATCHING, trim(to_char(extract(month from trunc(add_months(sysdate, -1), 'MM')),'09')||extract(year from trunc(add_months(sysdate, -1), 'MM'))) 
FROM 
( SELECT accountid,channelid,
 MIN (usagetimestamp )KEEP (DENSE_RANK FIRST  ORDER BY usagetimestamp) start_watching,
 COUNT (*) NO_OF_mins_watched
 FROM 
 	(SELECT B.*, SUM (FLAG) OVER (PARTITION BY accountid, channelid ORDER BY usagetimestamp) GRP
 	FROM (SELECT accountid,showid,usagetimestamp,channelid,
	 DECODE
	 ( (usagetimestamp - LAG (usagetimestamp) OVER (PARTITION BY accountid , channelid 	 ORDER BY usagetimestamp)) *24*60 ,0, 0, 1, 0, 2, 0, 1) FLAG
	 FROM Y i) 
	 B
)
GROUP BY accountid,channelid, GRP ) X,  B WHERE ( B.EXTERNALNO(+)=X.CHANNELID ) AND ( ( X.START_WATCHING ) between trunc(add_months(sysdate, -1), 'MM') and last_day(trunc(add_months(sysdate, -1),'MM')) )

Can anyone think of any way to speed up the SQL here at all- admittedly table Y is huge and the database if underspec'd and very slow but an hour is still too long for it to take!

Thanks in advance for any help!
 
is the table itself partitioned?

Can you limit the number of rows by partitioning by months, quarters weeks or something like that?

Do not confuse table partitioning with the analytics PARTITION BY statement.

Regards

T
 
The problem is that you are summing up across the entire table every time and, given that it writes out a message for every connected user every minute, the table must be huge.

Short of re-designing the system (which I think it desperately needs), I think the best bet would be to pull a smaller interval of data from the main table into a temporary table and work on that.

 
No the table isn't partitioned- but should be!!The problem is I don't know how to do it and neither does our DBA (long story dont ask!! - small company not willing to spend money - my database not a priority etc!)

Is partitioning something I could read up on and do myself or is it a very specialist subject?
And would you have any websites/books you could point me in the direction please? I have a good interest in learning DBA stuff anyway so I'm hoping I can give partitioning a go myself!

Dagon I'll try the temp table route in the meantime- thanks for the suggestion!


thanks for any help you can offer :) Its very much appreciated!
 
Assuming that you're implementing range partitions by date, which appears to be a likely choice, it should be fairly easy.

Always read the oracle documentation first - no really - do read the documentation first.

The major issue may well turn out to be financial, as i believe that partitioning is at extra cost to the basic oracle licence - you should check first.

Dagon's idea of a redesign is the best though. Using powerful features to make dreadful data look good is always a bad move. You need huge effort when simply sorting out the problem at source would be much chepaer and easier.

If partitioning is expensive, you might be able to use its cost to persuade the powers that be to redesign the table to something sensible, as the cost of a redesign might be less than that of partitioning.

Regards

T
 
ahh I didn't realise it would be anything to do with extra licensing costs- that will be a definite no then as I've been asking for a new server to host the db for 2 years with no hope of getting one!

When you mean a redesign of the database- what roughly do you mean please?
 
I'm putting words in Dagon's mouth here, so don't be surprised if he chips in with some flames. Anyway, as he mentioned, you're summing all users for every minute, which is generating huge amounts of data, when really, you only want to know those users who were logged in for 3 or more minutes. A user who was logged in for 10 minutes appears to have 10 entries, one for each minute. This is a huge overhead, as you only need to determine the boolean condition of logged in >= 3 minutes, which is a datum. One extra datum per record is much easier to handle than tens or hundreds of data.

So, on that basis, Dagon suggested (I believe) that you do something along the lines of:-

Code:
DROP TABLE USER_ACTIVITY CASCADE CONSTRAINTS PURGE;

CREATE TABLE USER_ACTIVITY
(
 USER_ID INTEGER NOT NULL,
 LOG_ON_TIME_STAMP DATE NOT NULL,
 LOG_OFF_TIME_STAMP DATE
);

--An activity takes > zero time
ALTER TABLE USER_ACTIVITY
ADD CONSTRAINT CONSISTENT_ACTIVITY_DATES
CHECK (LOG_OFF_TIME_STAMP > LOG_ON_TIME_STAMP);

--A user can only log on one session at a time
ALTER TABLE USER_ACTIVITY
ADD CONSTRAINT UQ_ID_AND_ACTIVITY_TIMES
UNIQUE(USER_ID,LOG_ON_TIME_STAMP,LOG_OFF_TIME_STAMP);

INSERT INTO USER_ACTIVITY (USER_ID,LOG_ON_TIME_STAMP,LOG_OFF_TIME_STAMP)
VALUES (1,SYSDATE,SYSDATE+1/24);
INSERT INTO USER_ACTIVITY (USER_ID,LOG_ON_TIME_STAMP,LOG_OFF_TIME_STAMP)
VALUES (2,SYSDATE,SYSDATE+2/24);
INSERT INTO USER_ACTIVITY (USER_ID,LOG_ON_TIME_STAMP,LOG_OFF_TIME_STAMP)
VALUES (1,SYSDATE,SYSDATE+3/24);
COMMIT;

The query below
Code:
SELECT USER_ID,LOG_ON_TIME_STAMP ,LOG_OFF_TIME_STAMP, (LOG_OFF_TIME_STAMP-LOG_ON_TIME_STAMP)*1440 DURATION 
  FROM USER_ACTIVITY 
 WHERE LOG_OFF_TIME_STAMP >= LOG_ON_TIME_STAMP + 3/1440;

returns
Code:
USER_ID                                LOG_ON_TIME_STAMP   LOG_OFF_TIME_STAMP  DURATION                              
1                                      09/03/2009 13:45:45 09/03/2009 14:45:45 60                                    
1                                      09/03/2009 13:45:45 09/03/2009 16:45:45 180                                   
2                                      09/03/2009 13:45:45 09/03/2009 15:45:45 120

Note that from the query above storing start and end times is sufficient. An entry for each intervening minute is pointless. Also, by storing log on and log off in separate columns, date arithmetic is easy, because all of oracle's standard functions work. You don't have to grind through every row in the table to get what you want.



Regards

T
 
In terms of any good books, I have found the following to be a good introductory text.

Title: "Oracle Database 10g Data Warehousing"
Authors: Hobbs, Hillson, Lawande, and Smith"

The text covers all available partitions. Plus, you will also be exposed to some more free bees (e.g. Oracle Discoverer, etc)

I hope this helps.

Tx

rogers42
 
Maybe I'm missing something, but surely a simple analytic will do the job:

Code:
WITH T AS (
SELECT 'user a' usr, TO_DATE('9:00','HH:MI') tm, 'logged in' stat FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:01','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:02','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:03','HH:MI'), 'logged in' FROM DUAL UNION ALL 
SELECT 'user a', TO_DATE('9:04','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:05','HH:MI'), 'logged out' FROM DUAL UNION ALL
SELECT 'user b', TO_DATE('9:00','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user b', TO_DATE('9:01','HH:MI'), 'logged out' FROM DUAL UNION ALL
SELECT 'user c', TO_DATE('9:00','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user c', TO_DATE('9:01','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user c', TO_DATE('9:02','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user c', TO_DATE('9:03','HH:MI'), 'logged out' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:07','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:01','HH:MI'), 'logged in' FROM DUAL UNION ALL
SELECT 'user a', TO_DATE('9:02','HH:MI'), 'logged out' FROM DUAL)
SELECT COUNT(DISTINCT USR)
FROM (
SELECT usr, tm, stat, rank () over (partition by usr, stat order by tm) rnk
from t)
WHERE rnk >=3
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top