Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Backfill Census Data 2

Status
Not open for further replies.

briangriffin

Programmer
Nov 6, 2008
878
US
I'm collecting an hourly census of hospital patients by department; an ETL package is inserting that data into a table:

Dept Census CensusDate CensusTime InsertDateTime
5W 25 7/1/2011 1600 7/1/2011 16:05:05
5E 15 7/1/2011 1600 7/1/2011 16:05:05
6W 20 7/1/2011 1600 7/1/2011 16:05:05
OBS 5 7/1/2011 1600 7/1/2011 16:05:05

However, if the department has no patients a record is not created:

Dept Census CensusDate CensusTime InsertDateTime
5W 25 7/1/2011 2000 7/1/2011 20:05:05
5E 15 7/1/2011 2000 7/1/2011 20:05:05
6W 20 7/1/2011 2000 7/1/2011 20:05:05


After each load, I need to backfill with records for each department that had no patients. In this example I would need to insert:

Dept Census CensusDate CensusTime InsertDateTime
OBS 0 7/1/2011 2000 7/1/2011 20:05:05

CensusDate and CensusTime are varchar, InsertDateTime is smalldatetime.

The examples I've found online are more complex than this one needs to be. I have a script that works using a crossjoin and a couple of variable tables, but it's a little clumsy. Is there a standard, clever way of doing this?

TIA






 
A cross join is the standard way (or at least... the way I would do it). You may not need the table variables though.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
If you have a table with the departments on it, then you can just left join to that, or do a left join to a derived table that does a select distinct over your data.


are you doing a standard INSERT INTO select?
If so, you can just tack the left join/derived table there and do some basic isNull's to get the zeros for the departments with no patients.




[blue]The doc walks in.[/blue] The good news:[green]"It's just Grumpy Old Man Syndrome."[/green] The bad news:[red]"You're not even 30."[/red]
 
I took the left join approach and ended up with this:

Code:
declare 
@v_censusdate varchar(12),
@v_censustime varchar(12),
@v_insertdatetime smalldatetime


set @v_censusdate = (select max(censusdate) from census where insertdatetime = (select max(insertdatetime) from census))
set @v_censustime = (select max(censustime) from census where insertdatetime = (select max(insertdatetime) from census))
set @v_insertdatetime = (select max(insertdatetime) from census)


insert into census

select a.unit, 0, @v_censusdate, @v_censustime, @v_insertdatetime

from 

(select distinct unit from census) a left outer join census b
	on a.unit = b.unit
	and b.censusdate = @v_censusdate
	and b.censustime = @v_censustime
	
	where a.unit + @v_censusdate + @v_censustime not in
		(select unit + censusdate + censustime from census)

It seems to work, and I guess the reason I can get away with it is that I only need to identify missing records from the latest load. Feel free to point out how this might not be as efficient as it could be.

Thanks.
 
BrianGriffin,

Um, I don't know if that's exactly what you were after.
I'm assuming you get some sort of poll data, like such:
Block A, 3, 20110712, 13:00
Block B, 4, 20110712, 13:01
Block C, 5, 20110712, 13:02

Block D, 0, 20110712, 13:03 <-- INFERRED


Your script is inserting 1 row at a time, filling in 0 block for all the others each time.

This would "appear" to work as long as the time is being incremented each load, such as if you were doing a GetDate().

Looking at the data received (shown above) and what you would wind up in the database, are drastically different

Block A, 3, 20110712, 13:00
Block B, 0, 20110712, 13:00
Block C, 0, 20110712, 13:00
Block D, 0, 20110712, 13:00

Block A, 0, 20110712, 13:01
Block B, 4, 20110712, 13:01
Block C, 0, 20110712, 13:01
Block D, 0, 20110712, 13:01

Block A, 0, 20110712, 13:02
Block B, 0, 20110712, 13:02
Block C, 5, 20110712, 13:02
Block D, 0, 20110712, 13:02


I'm also assuming that you have "check in" periods, such as every 15 mins or an hr. Your data load should probabaly "snap" to these time blocks for reporting purposes. That way time entered in at 13:01 records as 13:00, but time entered as 13:13 snaps to 13:15.


There's a "jenky" way to do this, leaving whatever you are using as an interface with a single row passed in.

You would have a check if the current "block" of time existed in the database. If it didn't you would do a blank bull load of all the units. If it does, then this block doesn't run.

Then you would call an update on the single row that was passed in. This way, each single update will "correct" it's value, with the first update for the block ensuring those that don't get filled in have a zero value.


LODLAIDEN!


[blue]The doc walks in.[/blue] The good news:[green]"It's just Grumpy Old Man Syndrome."[/green] The bad news:[red]"You're not even 30."[/red]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top