Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query with cursor and loops (calculating # of days)

Status
Not open for further replies.

chmdguy2

Technical User
Nov 21, 2003
10
0
0
US
I'm having some difficulty with a query and hoping someone out there can provide me with assistance. What I am attempting to do is create one row, per id, and calculate the number of days (excluding gaps)

table_data:
id start_dt end_dt
1 1/23/2004 2/22/2004
1 2/17/2004 3/19/2004
1 4/27/2004 5/27/2004
2 1/23/2004 2/22/2004
2 2/17/2004 3/19/2004
2 4/27/2004 5/27/2004

Assuming that these are continuous, if I calculated the difference (end_dt)-(start_dt) for each row, the summed the column, I would get a result of 91 days. However the result should be 47. (Duplicate days between row 1 & 2, and a gap in days between row 3 and 4)

If anyone could help with me with this I would greatly appreciate it!

 
how large are the ranges going to be, i.e. difference between min(startdate) and max(enddate)?

rudy
SQL Consulting
 
The ranges will between start_dt and end_dt will probably be no more than 90 days, the typical range is 30 days. The total range (from the first start_dt to the last end_dt will be no greater than 1 year-30 days)

Thank you for looking into this...
 
more questions --

you said "if I calculated the difference (end_dt)-(start_dt) for each row, the summed the column, I would get a result of 91 days"

yes, that's true, but it means you are not counting the first day of the range (or the last)

and this matters if we want to disregard overlaps

consider the two ranges

1/23/2004 2/22/2004
2/17/2004 3/19/2004

these ranges have 30 and 31 "elapsed" days if you don't count one of the end points

however, as far as overlap goes, should you count 2/17?

do the ranges overlap on 2/17 through 2/22 for a total of 6 overlapping days?




rudy
SQL Consulting
 
Yes, I need to count 2/17 in the range. The ranges will overlap, but I need to count them only once.
 
so, what's your count for the number of days in those two ranges?

30 + 31 - 6 = 55

however, to_days('2004-03-19')- to_days('2004-01-23') = 56


rudy
SQL Consulting
 
hi there, i needed something to do in my lunch break that didn't involve going out in the rain, so have a look at the following - it gets the result 47 for each id. I've not got any more time, unfortunately, so it will need a lot of tweaking (for example for this 30+31-6=55 issue), but i hope it gets you started.
Code:
--drop table #tmp
create table #tmp(xid int, startdate datetime, enddate datetime, diff int, diff2 int, diff3 int, diff4 int)

insert into #tmp(xid, startdate, enddate) values(1,cast('20040123' as datetime),cast('20040222'as datetime))
insert into #tmp(xid, startdate, enddate) values(1,cast('20040217' as datetime),cast('20040319'as datetime))
insert into #tmp(xid, startdate, enddate) values(1,cast('20040427' as datetime),cast('20040527'as datetime))
insert into #tmp(xid, startdate, enddate) values(2,cast('20040123' as datetime),cast('20040222'as datetime))
insert into #tmp(xid, startdate, enddate) values(2,cast('20040217' as datetime),cast('20040319'as datetime))
insert into #tmp(xid, startdate, enddate) values(2,cast('20040427' as datetime),cast('20040527'as datetime))

update #tmp 
set
    diff = datediff(dd,startdate, enddate)

update #tmp
set
    diff2 = case when (select min(t.startdate) from #tmp as t 
                       where t.startdate < #tmp.enddate and t.startdate > #tmp.startdate and t.xid = #tmp.xid) is not null
                 then datediff(dd,enddate, (select min(t.startdate) from #tmp as t 
                                            where t.startdate < #tmp.enddate and t.startdate > #tmp.startdate 
                                            and t.xid = #tmp.xid)) 
                 else 0 end

update #tmp
set
    diff3 = case when (select min(t.startdate) from #tmp as t 
                       where t.startdate > #tmp.enddate 
                       and t.enddate = (select min(t2.enddate) from #tmp as t2 
                                        where t2.enddate > t.enddate and t.xid = t2.xid)
                       and t.xid = #tmp.xid) is not null
                 then datediff(dd, enddate, (select min(t.startdate) from #tmp as t where t.startdate > #tmp.enddate 
                                             and t.enddate = (select min(t2.enddate) from #tmp as t2 
                                                              where t2.enddate > t.enddate and t.xid = t2.xid)
                                             and t.xid = #tmp.xid))
                 else 0
                 end
update #tmp
set
    diff4 = case when (select min(t.startdate) from #tmp as t where t.enddate> #tmp.enddate and t.xid = #tmp.xid)> enddate 
                 then datediff(dd, (select min(t.startdate) from #tmp as t 
                                    where t.enddate> #tmp.enddate and t.xid = #tmp.xid), enddate)
                 else 0 end


select xid, sum(diff + diff2 + diff3 + diff4) from #tmp
group by xid

select * from #tmp
 
The count I would be looking to (in this case) would be 55 days. Is that helpful?
 
yes, 55 is quite helpful, it means you are not counting the start date in each range

my solution involves starting with an integers table that has the integers 0 through 9 in it --

[tt]create table integers (i integer);
insert into integers (i) values (0);
insert into integers (i) values (1);
insert into integers (i) values (2);
insert into integers (i) values (3);
insert into integers (i) values (4);
insert into integers (i) values (5);
insert into integers (i) values (6);
insert into integers (i) values (7);
insert into integers (i) values (8);
insert into integers (i) values (9); [/tt]

now you can cross-join this table with itself three times to generate all thousand integers from 0 to 999

[tt]select 100*h.i + 10*t.i + u.i as num
from integers h
cross
join integers t
cross
join integers u [/tt]

h=hundreds, t=tens, u=units, get it?

okay, now let's start with a base date that is as early or earlier than the earliest date in the data table

by inspection, '2004-01-01' will do nicely

this date has to be hardcoded into the query

now we create an inner join between the thousand numbers and your date ranges, using a rather fanciful join condition which merely adds the number to the base date and determines whether it falls within the range

the trick here, the &quot;55 solution&quot; as it were, is not to count the first date in any range

then all we have to count the number of distinct dates in the ranges

simple, eh?

[tt]select id
, count(distinct
date_add('2004-01-01'
, interval 100*h.i+10*t.i+u.i day)
) as uniquedays
from integers h
cross
join integers t
cross
join integers u
inner
join dateranges
on date_add('2004-01-01'
, interval 100*h.i+10*t.i+u.i day)
between date_add(start_dt
, interval 1 day)
and end_dt
group
by id[/tt]

results:

id uniquedays
1 86
2 86



rudy
SQL Consulting
 
oh wait a sec, that's not right

it should be 85

that darned overlap!!!

oh well, the idea is there, perhaps you can run with it

rudy
SQL Consulting
 
rudy

i plugged in what you suggested, however I'm getting an error message stating i'm missing a closing paren...
***it's stating that it's missing in line2, after the date--- did I miss something in the syntax???
***I also added a &quot;from&quot; statement to identify the table I am pulling the data from at the bottom of the statement.. not sure if it's in the correct spot. Can you take a look?

Thank you again for all of your help!!

btw.. I'm thinking if I add a clause to -1 for each result, it will remove the first starting date?? Sound like a viable option?

select id,
count(distinct date_add('2002-01-01', interval 100*h.i+10*t.i+u.iday)) as unique days
from integers h
cross
join integers t
cross
join integers u
inner
join dateranges
on date_add('2002-01-01', interval 100*h.i + 10*t.i + u.i day)
between date_add(start_dt, interval 1 day) and end_dt
from particp_testing
group by id;
 
sorry, dateranges was the name of the table i was using for testing

take out your &quot;from particp_testing&quot; just before the GROUP BY and change dateranges to particp_testing

as for subtracting -1, no, it isn't as simple as that, sorry

think very carefully about why you are counting 2/17 twice but 1/23 not at all

:)

rudy
SQL Consulting
 
thanks again, I made the changes to the reflect the source table. any idea why i am getting the error message regarding the close paren?
 
no dice... :(

anything else I should try? Here is the entire code I've entered in from your suggestion..

create table integers (i integer);
insert into integers (i) values (0);
insert into integers (i) values (1);
insert into integers (i) values (2);
insert into integers (i) values (3);
insert into integers (i) values (4);
insert into integers (i) values (5);
insert into integers (i) values (6);
insert into integers (i) values (7);
insert into integers (i) values (8);
insert into integers (i) values (9);

--- cross join this table with itself three times to generate all thousand integers from 0 to 999----

select 100*h.i + 10*t.i + u.i as num
from integers h
cross
join integers t
cross
join integers u

------ h=hundreds t=tens u=units ---------

select id,
count(distinct date_add('2002-01-01', interval 100*h.i+10*t.i+u.i day)) as unique days
from integers h
cross
join integers t
cross
join integers u
inner
join marlene_testing
on date_add('2002-01-01', interval 100*h.i + 10*t.i + u.i day)
between date_add(start_dt, interval 1 day) and end_dt
group by id;
 
perhaps remove the space from unique days, that was supposed to be a column alias name

rudy
SQL Consulting
 
still couldn't get past that error...

Thanks for all your help though, I appreciate it!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top