Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

All possible combinations with a twist 1

Status
Not open for further replies.

ZenRaven

Programmer
Mar 13, 2007
84
US
I've been working on a problem that I can't quite figure out. I've tried different combinations of cross joins, CTEs, windowing functions, etc but could never quite get there. I'm also not wanting to go the dynamic SQL route. Can someone please help?

Given a variable set of grouped values produce all possible combinations vertically (derived group, value)

Additional criteria:
1 - No 2 combinations should have the same set of values, regardless of order. Example: If you already have (1,2) then don't produce (2,1), if (1,2,3) then no (1,3,2),(2,1,3),(2,3,1),(3,1,2),(3,2,1)
2 - Values of the same group should not combine
3 - all values are unique, regardless of group. The only reason for the initial grouping is to apply rule #2

Given the starting groups and values of

1 8
2 7
2 9
3 1
3 6
3 3

produce

1 8
2 7
3 9
4 1
5 6
6 3
7 8
7 7
8 8
8 9
9 8
9 1
10 8
10 6
11 8
11 3
12 7
12 1
13 7
13 6
14 7
14 3
15 9
15 1
16 9
16 6
17 9
17 3
18 8
18 7
18 1
19 8
19 7
19 6
20 8
20 7
20 3
21 8
21 9
21 1
22 8
22 9
22 6
23 8
23 9
23 3

Another way to think about it without the derived grouping and vertical output is this

8
7
9
1
6
3
8 7
8 9
8 1
8 6
8 3
7 1
7 6
7 3
9 1
9 6
9 3
8 7 1
8 7 6
8 7 3
8 9 1
8 9 6
8 9 3
 
I'm sorry, but I don't understand the question. It usually helps when there is sample data and expected results, but this one has me baffled. Can you try explaining this a little more, particularly the inputs and how the outputs are generated.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks, that makes me feel better that it's not only me that has a hard time understanding it.

Here's the manual, non-vertical method for producing the output

SQL:
CREATE TABLE #temp1 (GroupID INT, MyValue INT)

INSERT INTO #temp1 (GroupID, MyValue)
VALUES  (1,8),(2,7),(2,9),(3,1),(3,6),(3,3)

--1st set of possibilities
SELECT MyValue
FROM #temp1

--2nd set of possibilities
SELECT a.MyValue, b.MyValue
FROM #temp1 a
JOIN #temp1 b
ON a.GroupID < b.GroupID

--3rd set
SELECT a.MyValue, b.MyValue, c.MyValue
FROM #temp1 a
JOIN #temp1 b
ON a.GroupID < b.GroupID
JOIN #temp1 c
ON b.GroupID < c.GroupID

DROP TABLE #temp1

My problem is that there can be a variable number of starting values
With this in mind, my output needs to be in grouped vertical sets so I'm only returning 2 columns. 1 that groups the numbers together and the number itself
See OP for what the final output should look like for this example (23 groups of 1,2 or 3 values)
 
I'm so close. The DENSE_RANK in the recursive member of the CTE is what's incorrect. I was hoping I could partition by a number of rows which is why I included the level. I'm using SS2012 and saw a ROWS clause but it kept throwing error about being used in a windowing function and I'm not sure it would do what I want anyway.

SQL:
WITH    MyCTE
          AS (SELECT    1 AS Level, DENSE_RANK() OVER (ORDER BY GroupID, MyValue) AS DgroupID, GroupID, MyValue
              FROM      #temp1
              UNION ALL
              SELECT    a.Level + 1, [COLOR=#EF2929]DENSE_RANK() OVER (ORDER BY b.GroupID, b.MyValue)[/color], b.GroupID, b.MyValue
              FROM      MyCTE a
              JOIN      #temp1 b
                        ON a.GroupID < b.GroupID)

SELECT  DENSE_RANK() OVER (ORDER BY Level, DgroupID), MyValue
FROM    MyCTE


--Adam
--"He who knows best knows how little he knows." - Thomas Jefferson
 
Nevermind, I got too excited and didn't realize I was only returning 23 rows, not 23 groups. Should be 46 rows with 23 distinct groups. Argh! Feeling a bit recursive... how appropriate
 
So there's 2 problems I need to overcome to get the recursive CTE to work.

1) The windowing function I used to give an incremental value to each row didn't work as expected. This is probably due to the way CTEs work. Good for performance, bad for me. The ROW_NUMBER windowing function does the same thing. All I'm trying to do there is to autoincrement the rows within each iteration so I can identify the group when the table gets "unpivoted". I believe the reason CTEs are so fast is because they're actually set-based operations so even though there's recursion I can't rely on the loop/iteration mode of thinking to produce the intended result. Feel free to correct me in all of my assumptions

2) Unpivoting. I need to take a set of rows and unpivot the columns into rows, with each keeping the identifier of the original row to show they are grouped together. SQL Server has a wonderful command called UNPIVOT which doesn't help me at all because you need to know how many columns you're unpivoting at design time. The whole point of this is to be able to provide a variable number of inputs and produce a predictable output

 
Looking at this I have one question...

will there be common numbers between different groups?
e.g. is it possible to have
group value

1 2
1 5
2 5
2 7

e.g. value 5 is common to group 1 and group 2

I think that from your explanation this is not possible but just confirming

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
You are correct, values will not show up in multiple groups.
 
Is this limited to 3 groups or can you have a dynamic number of groups?


You've got questions and source code. We want both!
Here at tek tips, we provide a hand up, not a hand out.
 
Ok... got something that works (or seems to work).

acknowledgement is due to who did the most difficult part of the code. - part of the code below was taken from That is the good stuff.

The bad stuff e.g. combos, sub_comb and sub2 tables are all mine.

Code:
drop table #set
CREATE TABLE #Set (GroupID int, Value int);

INSERT INTO #Set 
VALUES  (1,8),(2,7),(2,9),(3,1),(3,6),(3,3)
GO


DECLARE @N int = (SELECT Count(*) FROM #Set);
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L3 B),
L5 AS (SELECT 1 N FROM L4, L4 B),
Nums AS (SELECT Row_Number() OVER(ORDER BY (SELECT 1)) Num FROM L5),
Nums1 AS (
   SELECT Num, P1 = (Num & 0x55555555) + ((Num / 2) & 0x55555555)
   FROM Nums
   WHERE Num BETWEEN 0 AND Power(2, @N) - 1
), Nums2 AS (
   SELECT Num, P2 = (P1 & 0x33333333) + ((P1 / 4) & 0x33333333)
   FROM Nums1
), Nums3 AS (
   SELECT Num, P3 = (P2 & 0x0F0F0F0F) + ((P2 / 16) & 0x0F0F0F0F)
   FROM Nums2
), BaseSet AS (
   SELECT Ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), *
   FROM #Set
)
,combos as (select n.num
                  ,s.value
                  ,s.groupid
                  ,count(*) over (partition by num) as x
   
             FROM
                Nums3 N
                INNER JOIN BaseSet S
                   ON N.Num & S.Ind <> 0
           )
,sub_comb as ( select num
                     ,(select count(*)
                        from combos c2
                       where c2.num = c1.num
                         and c2.groupid = c1.groupid
                         and c2.value <> c1.value) as cnt
                        from combos c1
             )
,sub2 as (select num
                ,max(cnt) as cnt
            from sub_comb
            group by num
            having max(cnt) = 0
)
select dense_rank() over (order by c1.num) as rnk
      ,value
from combos c1
inner join sub2 c2
on c1.num = c2.num

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
I had checked out that same post on SO and was trying to wrap my head around it while waiting for other possible responses on here. I got sidetracked and still haven't had the time to sit down and go through what it's actually doing. I'm going to come up with several test cases to run through your code. Do you know if this will work with a variable number of input groups? I see the L1,L2,L3,Combo,Sub,Sub2,etc references and immediately think that I'm going to be stuck with an input group limit without even looking into it. Nonetheless, thank you for putting together that response
 
Sorry.. I was wrong.. the way that sql is done, it will have a quite short limit due to

WHERE Num BETWEEN 0 AND Power(2, @N) - 1
and
SELECT Ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1)

and that is only 30 rows max in your set


i'll look and see what else can be done

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
ok...

you will need to try it with a big dataset.. may be very very slow though.

on the sql I gave you , replace power(2 ...
with
Power(convert(float,2)

apart from the above, what may kill you is the time it takes when number of entries goes above 25 - this is because of the recursive cte nums

on a vm I7 with 2 threads allocated to it at 90% cpu usage, it takes 3 minutes just to generate one of the powers for 30 entries on the set (generates 1 073 741 824 entries) - multiply this by 2
if you have hundreds or even thousands of entries it will take ages


Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top