Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Group Consecutive numbers in a range 1

Status
Not open for further replies.

bmacbmac

IS-IT--Management
Jan 26, 2006
392
US
Hi,

Have submitted this before, but having and issue and want to re-evaluate my procedure.

I have nearly 200,000 bigint numbers in a table:

2005041330
2005041331
2005041332
2005041333
2005041334
2005041337
2005041338
2005041339
2005041341
2005041342
2005041343
2005041347
2005041350
2005041351
2005041352
2005041353

I am looking for a way I can group this list into ranges... like this:
BegNumber EndNumber
2005041330 2005041334
2005041337 2005041339
2005041341 2005041343
2005041347 2005041347
2005041350 2005041353

In my previous post I gathered the following query:
Code:
insert into #temp2
SELECT MIN(docno) AS range_start, 
       MAX(docno) AS range_end
FROM (
SELECT docno,
      (SELECT MIN(B.docno)
       FROM #temp AS B
       WHERE B.docno >= A.docno
         AND NOT EXISTS
                (SELECT *
                 FROM #temp AS C
                 WHERE C.docno = B.docno + 1)) AS grp
FROM #temp AS A) AS T

GROUP BY grp

It's a very slow process. It seems to work with small lists < 10,000. But if I try running it on my entire 200,000 rows it will never complete even after running overnight.

I had this in SQL 2000 before, but have since upgraded to 2008 R2. Is there a better way to go about this now that I have 2008 R2?


Thanks!

Brian
 
This type of "problem" is known as an "island".

I suggest a google search on "SQL Server gaps and islands". You will get a lot of hits with a lot of useful information.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks gmmastros! I ran the search and came up with the following:

Code:
select convert(varchar(10),imin.docno) + ' - ' +  convert(varchar(10),
(select min(docno) from (select docno = docno from #temp where not exists (select * from #temp a2 where #temp.docno-1 = 
a2.docno) and #temp.docno <> (select min(docno) from #temp)) as imax where imax.docno > imin.docno))
from
(select docno = docno from #temp where not exists (select * from #temp a2 where #temp.docno+1 = a2.docno) and #temp.docno <> 
(select max(docno) from #temp)) as imin

This runs MUCH faster. It ran on my test db of 22,000 rows in 13 seconds. My original query took several minutes.
 
That's great.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Nevermind.. that didn't work at all.. upon further review of the data it is not giving the correct ranges... back to the drawing board!
 
Can you try this?

Code:
Select	MinData.DocNo As BegNumber, MaxData.DocNo As EndNumber
From    (
        Select A.DocNo, Row_Number() Over(Order By A.DocNo) As RowId
        From   #temp As A
               Left Join #temp As B
                 On A.DocNo = B.DocNo - 1
        Where  B.DocNo Is NULL
        ) As MaxData
        Inner Join 
        (
        Select A.DocNo, Row_Number() Over (Order BY A.DocNo) As RowId
        From   #temp As A
               Left Join #temp As B
                 On A.DocNo = B.DocNo + 1
        Where  B.DocNo Is NULL
        ) As MinData
          On MaxData.RowId = MinData.RowId

I'm curious to know if this returns the correct data, and also how the performance compares to your existing method.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Try
Code:
DECLARE @t TABLE (num INT)
INSERT INTO @t 
SELECT
2005041330
UNION ALL SELECT
2005041331
UNION ALL SELECT
2005041332
UNION ALL SELECT
2005041333
UNION ALL SELECT
2005041334
UNION ALL SELECT
2005041337
UNION ALL SELECT
2005041338
UNION ALL SELECT
2005041339
UNION ALL SELECT
2005041341
UNION ALL SELECT
2005041342
UNION ALL SELECT
2005041343
UNION ALL SELECT
2005041347
UNION ALL SELECT
2005041350
UNION ALL SELECT
2005041351
UNION ALL SELECT
2005041352
UNION ALL SELECT
2005041353

;WITH cte AS (SELECT Num, ROW_NUMBER() OVER (ORDER BY num) AS Rn
FROM @t),
cte2 AS (SELECT Num, ROW_NUMBER() OVER (ORDER BY num) - DENSE_RANK() OVER (PARTITION BY Num-Rn ORDER BY Num) AS grp
FROM cte)

SELECT MIN(num) AS StartNum, MAX(num) AS EndNum FROM cte2 
GROUP BY grp

PluralSight Learning Library
 
Same idea, but simpler solution (I almost went to bed, but had to go back to post it):

Code:
;WITH cte AS (SELECT Num, ROW_NUMBER() OVER (ORDER BY num) AS Rn
FROM @t)
SELECT MIN(num) AS StartNum, MAX(num) AS EndNum FROM cte
GROUP BY Num - Rn

PluralSight Learning Library
 
Thanks for the replies everyone. The two options by markros and the one by gmmastros both worked perfectly.

My original script took about 7 minutes to run against 25,000 rows. Both of your scripts ran in milliseconds.

I ran all three scripts against my 200,000 rows and they all came up with the same results in less than a second!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top