Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Add Sequential numbering of groups of records by date 1

Status
Not open for further replies.

cgarmas

Technical User
Jun 16, 2005
37
US
Hi all,
I am using MS SQL and I need help with creating a column were I can add a sequantial numbering of groups of records based on sequantial dates. The table structure is as follows
MemberID EnrollDate TermDate
001 01/01/2005 05/30/2005
001 07/01/2005 09/30/2005
001 10/15/2005 12/31/2005
002 01/12/2005 06/30/2005
002 07/15/2005 12/30/2005
003 02/01/2005 08/30/2005
003 09/01/2005 12/25/2005

I need this table to look like this

MemberID EnrollDate TermDate Seq_Number
001 01/01/2005 05/30/2005 1
001 07/01/2005 09/30/2005 2
001 10/15/2005 12/31/2005 3
002 01/12/2005 06/30/2005 1
002 07/15/2005 12/30/2005 2
003 02/01/2005 08/30/2005 1
003 09/01/2005 12/25/2005 2

I'll appreciate any advice to help me resolve this problem.
Cgarmas

 
SQL Server 2005 has ranking functions built in. If that's the version you are using, then look the commands up in books on line.

For sql 2000, the simplest way is to insert the data in to a table variable first (with an identity column), then use a subquery to return the min value from the identity column, joining the table back to itself, etc... (confused yet?). The point is... it's not at all intuitive or straight forward how to do this.

First, create a table variable with an identity column (and all the other columns you care about).
Code:
Declare @Data 
Table   (RowId Integer Identity(1,1), 
        MemberId Int, 
        EnrollDate DateTime, 
        TermDate DateTime)

Then, insert the data in to this table variable, making sure that the data is ordered properly while inserting it.
Code:
Insert Into @Data (MemberId, EnrollDate, TermDate)
Select MemberId, EnrollDate, TermDate
From   [!]TableName[/!]
Order By MemberId, EnrollDate

Then, with a little knowledge and some SQL Magic, you can get your data, like so...

Code:
Select RowId - MinRowId + 1 As RankColumn, *
From   @Data D
       Inner Join(
         Select Min(RowId) As MinRowId, 
                MemberId 
         From   @Data 
         Group By MemberID
         ) As A On D.MemberId = A.MemberId
Order By RowId



-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Thank you George, I'll try this first thing in the morning.
 
I tried this approach, but due to the large number of records in the data set: over 500,000 records; to process this is taking several hours and is not done yet, and I am causing a problem to other people with the server. Is there any other way I can do this that can be more efficient in terms of time it takes to process? Thank you.
 
First off... If the query runs for more than a minute, then you should stop the query from running.

Second... You could try adding an index to the table variable.

Code:
Declare @Data 
Table   (RowId Integer Identity(1,1), 
        MemberId Int, 
        EnrollDate DateTime, 
        TermDate DateTime
        [!]Primary Key Clustered (MemberId, EnrollDate)[/!])

Also... try commenting out the last select part. See how long it takes to put the data in to the table variable.

I suspect that it may take a little longer to load the data in to the table variable when you add the primary key part, but will probably cause the last query to run a lot faster.

Please keep us posted on your progress.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
I got the following error message when I tried getting the data into the table variable.
"Cannot insert the value NULL into column 'UniqueId', table '@Data'; column does not allow nulls. INSERT fails.
 
Hmmmmmm....

Are there any records where MemberId and/or EnrollDate are NULL?

[tt][blue]Select * From Table Where EnrollDate Is NULL Or MemberId Is NULL[/blue][/tt]

Since we added a primary key to the column, it won't allow nulls in that field (or duplicates). I suggest you check your data.



-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
You are right, there were some records that have a NULL value for MemberID
 
So, I presume that you added a where clause on the query that loads the data.

[tt][blue]Where MemberId Is not NULL[/blue][/tt]

Has this solved your problem?

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
I excluded the records with NULL MemberID, but I got this message: "Violation of PRIMARY KEY constraint 'PK__@Data__571B2BD2'. Cannot insert duplicate key in object '#56270799'."
 
Now you have the duplicate problem to deal with.

This should locate your duplicates.

Code:
Select MemberId, EnrollDate, Count(1) As DupCount
From   TableName
Group By MemberId, EnrollDate
Having Count(1) > 1

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Yes, I identified the duplicate records. This data is a little tricky, In this database we have two fields that have MemberIDs, one is the MemberID, and MemberID2, the first one stores the unique ID for each member, but the same member may have multiple MemberID2.
The problem is that some times members start a new enrollment period and on the same date of that enrollment they receive a new MemberID2 and therefore the new MemberID2 would have the same EnrollDate, creating a duplicate record for the unique MemberID. Sorry about all this lengthy explanation, but may be this will help to get all the necessary pieces in place. Thanks again for all your help.
 
Hmmmmm... Complicated data like this is usually a call for some sample data. Please include data with MemberId and MemberId2. DON'T include anything sensitive like names, phone numbers, addresses, etc...

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
this is a good sample of duplicate situations.
-----------------------------------------------
MemberID MembID2 EnrollDate TermDate
8187 1010602-00 19960109
8187 1010602-10 19960109 19970317
8187 1003533-11 19980128 19990505
8187 1003533-11 19990506 20000731
16978 1006330-10 19960118 19960916
16978 1010345-10 19960918 19960918
16978 1006330-10 19960918 19990131
16978 1006330-10 19990201 20001121
16978 1006330-10 20001122 20030303
21304 1007777-10 19960618 19961130
21304 1007777-10 19961201 19980310
21304 1007777-10 20030922 20031128
21304 1068908-00 20040301 20051130
21304 1068908-00 20051201 20051201
114477 1019297-16 20010307 20010307
114477 1043265-10 20010307 20020131
114477 1043265-10 20020402 20020701
114477 1043265-10 20020715 20030131
117933 1043731-00 20021205 20030415
117933 1043731-00 20030416 20030831
117933 1043731-00 20030901 20051121
117933 1083842-00 20051208
117933 1043731-00 20051208 20051208
125106 1047226-00 20040616 20040802
125106 1047226-00 20050614 20060514
125106 1078899-00 20050803
125106 1078931-00 20050803 20050803
125106 1047226-00 20060515
207871 1085160-00 20060614 20060630
207871 1085648-00 20060701 20060701
207871 1085648-10 20060701 20060711
207871 1085648-10 20060712 20060712
 
And the expected results....

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
The expected results would be as follows>
rows with empty TermDate cells are because these members are still enroll.
As I am interested in the enrollment segments based on the unique MemberID, those segments with enrollment and terminartion date on the same day are to be ignored and excluded.
-------------------------------------------------
MemberID MembID2 EnrollDate TermDate Seq_Number
8187 1010602-00 19960109
8187 1010602-10 19960109 19970317 1
8187 1003533-11 19980128 19990505 2
8187 1003533-11 19990506 20000731 3
16978 1006330-10 19960118 19960916 1
16978 1010345-10 19960918 19960918
16978 1006330-10 19960918 19990131 2
16978 1006330-10 19990201 20001121 3
16978 1006330-10 20001122 20030303 4
21304 1007777-10 19960618 19961130 1
21304 1007777-10 19961201 19980310 2
21304 1007777-10 20030922 20031128 3
21304 1068908-00 20040301 20051130 4
21304 1068908-00 20051201 20051201 5
114477 1019297-16 20010307 20010307
114477 1043265-10 20010307 20020131 1
114477 1043265-10 20020402 20020701 2
114477 1043265-10 20020715 20030131 3
117933 1043731-00 20021205 20030415 1
117933 1043731-00 20030416 20030831 2
117933 1043731-00 20030901 20051121 3
117933 1083842-00 20051208 4
117933 1043731-00 20051208 20051208
125106 1047226-00 20040616 20040802 1
125106 1047226-00 20050614 20060514 2
125106 1078899-00 20050803 3
125106 1078931-00 20050803 20050803
125106 1047226-00 20060515 4
207871 1085160-00 20060614 20060630 1
207871 1085648-00 20060701 20060701
207871 1085648-10 20060701 20060711 2
207871 1085648-10 20060712 20060712 3
 
It's hard to tell if this is right. I made a table variable and hardcoded your sample data. You can copy/paste this entire query in to Query Analyzer and run it. If it appears to work properly, you can remove the @Temp lines and substitute your actual table name (where I highlight the code in red).

Code:
[green]--Test data[/green]
Declare @Temp Table(MemberId Int, MembId2 VarChar(20), EnrollDate Int, TermDate Int)

Insert Into @Temp Values(8187,      '1010602-00',    19960109,    NULL     )   
Insert Into @Temp Values(8187,      '1010602-10',    19960109,    19970317)
Insert Into @Temp Values(8187,      '1003533-11',    19980128,    19990505)
Insert Into @Temp Values(8187,      '1003533-11',    19990506,    20000731)
Insert Into @Temp Values(16978,     '1006330-10',    19960118,    19960916)
Insert Into @Temp Values(16978,     '1010345-10',    19960918,    19960918)
Insert Into @Temp Values(16978,     '1006330-10',    19960918,    19990131)
Insert Into @Temp Values(16978,     '1006330-10',    19990201,    20001121)
Insert Into @Temp Values(16978,     '1006330-10',    20001122,    20030303)
Insert Into @Temp Values(21304,     '1007777-10',    19960618,    19961130)
Insert Into @Temp Values(21304,     '1007777-10',    19961201,    19980310)
Insert Into @Temp Values(21304,     '1007777-10',    20030922,    20031128)
Insert Into @Temp Values(21304,     '1068908-00',    20040301,    20051130)
Insert Into @Temp Values(21304,     '1068908-00',    20051201,    20051201)
Insert Into @Temp Values(114477,    '1019297-16',    20010307,    20010307)
Insert Into @Temp Values(114477,    '1043265-10',    20010307,    20020131)
Insert Into @Temp Values(114477,    '1043265-10',    20020402,    20020701)
Insert Into @Temp Values(114477,    '1043265-10',    20020715,    20030131)
Insert Into @Temp Values(117933,    '1043731-00',    20021205,    20030415)
Insert Into @Temp Values(117933,    '1043731-00',    20030416,    20030831)
Insert Into @Temp Values(117933,    '1043731-00',    20030901,    20051121)
Insert Into @Temp Values(117933,    '1083842-00',    20051208,    NULL     )   
Insert Into @Temp Values(117933,    '1043731-00',    20051208,    20051208)
Insert Into @Temp Values(125106,    '1047226-00',    20040616,    20040802)
Insert Into @Temp Values(125106,    '1047226-00',    20050614,    20060514)
Insert Into @Temp Values(125106,    '1078899-00',    20050803,    NULL     )   
Insert Into @Temp Values(125106,    '1078931-00',    20050803,    20050803)
Insert Into @Temp Values(125106,    '1047226-00',    20060515,    NULL     )   
Insert Into @Temp Values(207871,    '1085160-00',    20060614,    20060630)
Insert Into @Temp Values(207871,    '1085648-00',    20060701,    20060701)
Insert Into @Temp Values(207871,    '1085648-10',    20060701,    20060711)
Insert Into @Temp Values(207871,    '1085648-10',    20060712,    20060712)

[green]-- Query starts here[/green]

Declare @Data 
Table	(RowId Integer Identity(1,1),
		MemberId Int,
		MembId2 VarChar(20),
		EnrollDate Int,
		TermDate Int
		Primary Key Clustered (MemberId, EnrollDate)
		)

Insert Into @Data(MemberId, MembId2, EnrollDate, TermDate)
Select MemberId, Min(MembId2), EnrollDate, Min(TermDate)
From   [!]@Temp[/!] [green]-- Your table name here[/green]
Where	MemberId Is Not NULL
		And IsNull(EnrollDate, -1) <> IsNull(TermDate, -1)
Group By MemberId, EnrollDate
Order By MemberId, EnrollDate


Select RowId - MinRowId + 1 As RankColumn, D.MemberId, D.MembId2, D.EnrollDate, TermDate
From   @Data D
       Inner Join(
         Select Min(RowId) As MinRowId, 
                MemberId 
         From   @Data 
         Group By MemberID
         ) As A On D.MemberId = A.MemberId
Order By RowId

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
George,
I wanted to thank you for your help and all the work you put on this, I had to abandon this project for a while but I finally got back to it and got the results I needed from the code you developed. I only have one question though, just to try to understand how the code works, how does the code takes care of the first row of the data?
MemberID MembID2 EnrollDate TermDate Seq_Number
8187 1010602-00 19960109

Thank you
Cgarmas
 
Hmmmm.... I'm sure there is a magical combination of words that would clearly explain how this works. Unfortunately, I can't think of them at the moment. So, allow me to demonstrate the process with an example.

Suppose you were tracking customers and the date on which they purchase from you. Now, suppose you wanted a report that numbers their purchases. The order table could look something like...

[tt][blue]Customer OrderDate
-------- ---------
1 Dec 2, 2006
1 Dec 4, 2006
2 Dec 5, 2006
1 Dec 8, 2006
1 Dec 10, 2006[/blue][/tt]

The first thing we do is to create a table variable with an identity column. It's important to insert the data in to the table variable in the proper order. After the insert, the table variable would look like...

[tt][blue]
RowId Customer OrderDate
----- -------- ---------
1 1 Dec 2, 2006
2 1 Dec 4, 2006
3 1 Dec 8, 2006
4 1 Dec 10, 2006
5 2 Dec 5, 2006
[/blue][/tt]

Then, we get the minimum row id for each customer. So, Customer 1 would have MinRowId = 1 and Customer 2 would have a MinRowId = 5.

Now, to get the rank, we simply need to take the RowId, subtract the MinRowId, and then add 1.

[tt][blue]
RowId MinRowId RankNumber Customer OrderDate
----- -------- ------------- -------- ---------
1 1 [green](1-1) + 1 =[/green] 1 1 Dec 2, 2006
2 1 [green](2-1) + 1 =[/green] 2 1 Dec 4, 2006
3 1 [green](3-1) + 1 =[/green] 3 1 Dec 8, 2006
4 1 [green](4-1) + 1 =[/green] 4 1 Dec 10, 2006
5 5 [green](5-5) + 1 =[/green] 1 2 Dec 5, 2006
[/blue][/tt]

Hopefully this helps you to understand how the query works. If not, then let me know and I will try to explain further.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Thank you, this is very helpful. I don't want to abuse of your kindness, but I would like to ask you if you can explain also how the Min(MembId2) and Min(TermDate) work in this case.
Cgarmas
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top