Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL 2005: Find longest sequence of events within a table

Status
Not open for further replies.

jondow

Programmer
Oct 19, 2006
45
GB
Hi all,

I have a table of football results which in its most basic form contains columns for team, date and result (Win,Lose). Can anyone suggest a way to find the longest run of wins/losses for any team in the table without using a cursor? The longest run should be limited to a single team but I want the longest run of wins/losses from all teams.

Thanks for your help!

Rick

Team Date WiLose
A 01/01/2010 Win
A 07/01/2010 Lose
A 13/01/2010 Lose
A 19/01/2010 Lose
A 25/01/2010 Lose
A 31/01/2010 Lose
A 06/02/2010 Win
A 12/02/2010 Lose
A 18/02/2010 Win
B 01/01/2010 Lose
B 07/01/2010 Win
B 13/01/2010 Lose
B 19/01/2010 Win
B 25/01/2010 Win
B 31/01/2010 Win
B 06/02/2010 Win
B 12/02/2010 Lose
B 18/02/2010 Lose
C 01/01/2010 Lose
C 07/01/2010 Lose
C 13/01/2010 Lose
C 19/01/2010 Lose
C 25/01/2010 Lose
C 31/01/2010 Win
C 06/02/2010 Win
C 12/02/2010 Lose
C 18/02/2010 Win
 
Not doing something right myself but getting close... but this will make it easier to test.

create table team (
Team char(1),
Date datetime,
WiLose varchar(10))

SET DATEFORMAT dmy

insert into team values ('A', '01/01/2010', 'Win')
insert into team values ('A', '07/01/2010', 'Lose')
insert into team values ('A', '13/01/2010', 'Lose')
insert into team values ('A', '19/01/2010', 'Lose')
insert into team values ('A', '25/01/2010', 'Lose')
insert into team values ('A', '31/01/2010', 'Lose')
insert into team values ('A', '06/02/2010', 'Win')
insert into team values ('A', '12/02/2010', 'Lose')
insert into team values ('A', '18/02/2010', 'Win')
insert into team values ('B', '01/01/2010', 'Lose')
insert into team values ('B', '07/01/2010', 'Win')
insert into team values ('B', '13/01/2010', 'Lose')
insert into team values ('B', '19/01/2010', 'Win')
insert into team values ('B', '25/01/2010', 'Win')
insert into team values ('B', '31/01/2010', 'Win')
insert into team values ('B', '06/02/2010', 'Win')
insert into team values ('B', '12/02/2010', 'Lose')
insert into team values ('B', '18/02/2010', 'Lose')
insert into team values ('C', '01/01/2010', 'Lose')
insert into team values ('C', '07/01/2010', 'Lose')
insert into team values ('C', '13/01/2010', 'Lose')
insert into team values ('C', '19/01/2010', 'Lose')
insert into team values ('C', '25/01/2010', 'Lose')
insert into team values ('C', '31/01/2010', 'Win')
insert into team values ('C', '06/02/2010', 'Win')
insert into team values ('C', '12/02/2010', 'Lose')
insert into team values ('C', '18/02/2010', 'Win')

SET DATEFORMAT mdy

select *
from team

--Row_number should do it for you but I am not getting the correct result. But someone will find my flaw..

select *,
row_number() over
(
partition by wilose
order by team, date, wilose
) as wlrank
from team
order by team, date, wilose

Then you just take the highest value for each team and win/loss.

Simi
 
This post is based on simians work.

Simian,
Couldn't directly use the row num function. there was nothing to reset the ticker once it found another streak.

Jondow,
You're not getting away from some sort of looping mechanism.
This is not a cursor, and should perform alright.
My queries are based on Simians table structure, but i had to add more data rows to get multiple streaks for the same team.

Code:
Declare @Teams as table(Team char(1),Date datetime,WiLose varchar(10), GameOrder int)
INSERT INTO @Teams
select *, row_number() over (partition by team order by team, [date]) as GameOrder
from team

DECLARE @TeamData as Table(GameEntry int identity(1,1), Team char(1),GameDate datetime, GameOrder int, WinLose varchar(10), Streak int, GamesInStreak int)

INSERT INTO @TeamData
Select td_1.Team, td_1.[Date], td_1.GameOrder, td_1.WiLose,
CASE WHEN td_1.WiLose = td_3.WiLose OR td_1.WiLose = td_2.WiLose THEN 1 ELSE 0 END Streak, 0 GamesInStreak
FROM @Teams td_1
LEFT JOIN  @Teams td_2 on
	td_1.Team = td_2.Team
	AND td_1.GameOrder + 1 = td_2.GameOrder
LEFT JOIN @Teams td_3 on
	td_1.Team = td_3.Team
	AND td_1.GameOrder - 1 = td_3.GameOrder
ORDER BY td_1.Team, td_1.Date

DECLARE @GameEntry int, @MaxGameEntry int, @Team varchar(255), @WinLose varchar(25), @LastWinLose varchar(25), @LastStreak int, @Streak int, @Ticker int
SELECT @MaxGameEntry = max(GameEntry), @GameEntry = 0 FROM @TeamData
WHILE(@GameEntry < @MaxGameEntry)
 BEGIN
	SELECT @GameEntry = GameEntry, @Team = Team, 
		@LastStreak = @Streak, @Streak= Streak, 
		@LastWinLose = @WinLose, @WinLose = WinLose,
		@Ticker = CASE WHEN @LastWinLose = @WinLose AND @LastStreak = @Streak AND @Streak = 1 THEN @Ticker+1 ELSE 1 END 
	FROM @TeamData where GameEntry = @GameEntry+1
	UPDATE @TeamData
	SET GamesInStreak = @Ticker
	WHERE GameEntry = @GameEntry	
 END
 
--Select * from @TeamData
SELECT Team, Max(CASE WHEN WinLose = 'Win' THEN Tick ELSE 0 END) MaxWins, Max(CASE WHEN WinLose = 'Lose' THEN Tick ELSE 0 END) MaxLosses
FROM (Select Team, WinLose, Max(GamesInStreak) Tick FROM @TeamData
	GROUP BY Team, WinLose) as t
GROUP BY Team
ORDER BY Team

Lodlaiden

[blue]The doc walks in.[/blue] The good news:[green]"It's just Grumpy Old Man Syndrome."[/green] The bad news:[red]"You're not even 30."[/red]
 
I don't use SQL Server, so can't test this. Seems like this, or a variation, might work.

Starting with your record set, use row_number and partition by Team and WiLose. I would expect you to get something like this:

Code:
Team  Date            WiLose  Seq
A     01/01/2010      Win     1
A     07/01/2010      Lose    2
A     13/01/2010      Lose    2
A     19/01/2010      Lose    2
A     25/01/2010      Lose    2
A     31/01/2010      Lose    2
A     06/02/2010      Win     3
A     12/02/2010      Lose    4
A     18/02/2010      Win     5
B     01/01/2010      Lose    6
B     07/01/2010      Win     7
B     13/01/2010      Lose    8
B     19/01/2010      Win     9
B     25/01/2010      Win     9
B     31/01/2010      Win     9
B     06/02/2010      Win     9
B     12/02/2010      Lose    10
B     18/02/2010      Lose    10
C     01/01/2010      Lose    11
C     07/01/2010      Lose    11
C     13/01/2010      Lose    11
C     19/01/2010      Lose    11
C     25/01/2010      Lose    11
C     31/01/2010      Win     12
C     06/02/2010      Win     12
C     12/02/2010      Lose    13
C     18/02/2010      Win     14

From that, do a count, grouped by Team/WiLose/Seq and you should get something like this (hopefully I don't have copy/paste/counting errors since this was done by hand [pc2]):
Code:
Team  WinLose Seq   Count(1)
A     Win     1     1
A     Lose    2     5
A     Win     3     1
A     Lose    4     1
A     Win     5     1
B     Lose    6     1
B     Win     7     1
B     Lose    8     1
B     Win     9     4
B     Lose    10    2
B     Lose    10    2
C     Lose    11    5
C     Win     12    2
C     Lose    13    1
C     Win     14    1

Obviously not tested, but seems like the concept would give you what you're looking for - length of the longest win/loss streaks overall and/or by team.

Hope this helps,
Larry
 
Larry,
This is what you get (only Team A, for berevity):
A 2010-01-01 00:00:00.000 Win 1
A 2010-01-07 00:00:00.000 Lose 1
A 2010-01-13 00:00:00.000 Lose 2
A 2010-01-19 00:00:00.000 Lose 3
A 2010-01-25 00:00:00.000 Lose 4
A 2010-01-31 00:00:00.000 Lose 5
A 2010-02-06 00:00:00.000 Win 2
A 2010-02-12 00:00:00.000 Lose 6
A 2010-02-18 00:00:00.000 Win 3

The server uses a ticket says this the Xth row with the same TEAM-WiLose value, so that's the number I'm going to use.
The streaks themselves aren't numbered, so there's no way to reset the row_number ticker.

[blue]The doc walks in.[/blue] The good news:[green]"It's just Grumpy Old Man Syndrome."[/green] The bad news:[red]"You're not even 30."[/red]
 

I am still stuck on why row_number does not work.

Anybody with a detailed explaination would be appreciated.

Simi
 
I got 10 mins...let's see if I can beat gmm on this.

So the purpose of the ROW_Number function is to let you number a series of rows. You can define the "reset point" (PARTION BY) and the order in which the rows are numbered. This order by clause does not have to be the same as your query, allow the rows to return with a non sequential row_number.

You can only segregate the data by defined points.

SQL doesn't visually group data, like we do automatically in our head. If we see 7 apples and 2 oranges on the table in piles of 2+1, 2+1, and 3 we know inherently which is the largest pile. The problem is that SQL only has this data:

Location, FruitType

Select Location, FruitType, ROW_NUMBER() OVER(Partion BY Location, FruitType, Order by Location) row_num

Table Apple (1)
Table Apple (2)
Table Orange(1)

Table Apple (3)
Table Apple (4)
Table Orange(2)

Table Apple (5)
Table Apple (6)
Table Apple (7)


The problem with jondow's data is that there isn't enough data to have a clean "reset point" If the data had come with streak number, then it would have been easy enough to ROW_NUMBER over the Team/Streak# and then to a Max().
There is not pre-defined grouping of a set of wins/losses, so the only fields you have available are the Team and the WIN LOSS value.
You can use this to number the wins and losses to get a clean total count of each, but there is still nothing "grouping" these into "Streaks" of wins or losses.


This is why I added a streak column to identify rows that were part of a Streak. I tried every variant I could come up with the ROW_Number over the Streak and the Team, but I still didn't have any way of saying "This is streak 1" "this is streak 2".

Lodlaiden



[blue]The doc walks in.[/blue] The good news:[green]"It's just Grumpy Old Man Syndrome."[/green] The bad news:[red]"You're not even 30."[/red]
 
I got 10 mins...let's see if I can beat gmm on this.

Nice explanation. Couldn't have said it better myself.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
See this solution

Code:
create table team (
Team char(1),
Date datetime,
WiLose varchar(10))

SET DATEFORMAT dmy

insert into team values ('A', '01/01/2010', 'Win')
insert into team values ('A', '07/01/2010', 'Lose')
insert into team values ('A', '13/01/2010', 'Lose')
insert into team values ('A', '19/01/2010', 'Lose')
insert into team values ('A', '25/01/2010', 'Lose')
insert into team values ('A', '31/01/2010', 'Lose')
insert into team values ('A', '06/02/2010', 'Win')
insert into team values ('A', '12/02/2010', 'Lose')
insert into team values ('A', '18/02/2010', 'Win')
insert into team values ('B', '01/01/2010', 'Lose')
insert into team values ('B', '07/01/2010', 'Win')
insert into team values ('B', '13/01/2010', 'Lose')
insert into team values ('B', '19/01/2010', 'Win')
insert into team values ('B', '25/01/2010', 'Win')
insert into team values ('B', '31/01/2010', 'Win')
insert into team values ('B', '06/02/2010', 'Win')
insert into team values ('B', '12/02/2010', 'Lose')
insert into team values ('B', '18/02/2010', 'Lose')
insert into team values ('C', '01/01/2010', 'Lose')
insert into team values ('C', '07/01/2010', 'Lose')
insert into team values ('C', '13/01/2010', 'Lose')
insert into team values ('C', '19/01/2010', 'Lose')
insert into team values ('C', '25/01/2010', 'Lose')
insert into team values ('C', '31/01/2010', 'Win')
insert into team values ('C', '06/02/2010', 'Win')
insert into team values ('C', '12/02/2010', 'Lose')
insert into team values ('C', '18/02/2010', 'Win')  

SET DATEFORMAT mdy

select * 
from team

--Row_number should do it for you but I am not getting the correct result. But someone will find my flaw..

;with cte as (select *, 
row_number() over 
(
partition by team order by date
) - ROW_NUMBER() over (partition by team, wilose order by date) as [GroupID]
from team),
cte1 as (select team, 
SUM(case when WiLose = 'Win' then 1 else 0 end) as [Wins],
SUM(case when WiLose = 'Lose' then 1 else 0 end) as [Loses]
from cte
group by Team, GroupID),
cteLongestWins as (select top 1 with ties * from cte1 order by Wins DESC),
cteLongestLooses as (select top 1 with ties * from cte1 order by Loses DESC)

-- Top number of consecutive wins and loses
select * from cteLongestWins
UNION ALL
SELECT * from cteLongestLooses

Idea is based on the blog post by Plamen Ratchev

PluralSight Learning Library
 
Wow, lots of responses, thank you all!

I'll have a play with the solutions and see where I end up.

Thanks again.

Rick
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top