Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query that matches subgroups of records 1

Status
Not open for further replies.

VickyC

Technical User
Sep 25, 2010
206
CA
hi to all

I'm having trouble writing a query. Consider the table shown below The PK is formed on columns A and B:

Code:
[b]
 A      B          C [/b]
 1      840        3
 1      912        2
 1     1010        3

 2      840        3
 2      912        2
 2     1010        [b]2[/b]

 3      614        5
 3     1260        1

 4      840        3
 4      912        2
 4     1010        3

 5      614        5
 5     1260        1
...    .....      ..
800

Here's what I need to do. For each distinct value of A, I need to assign a value to column GroupNum in the output. These GroupNum values start at 1, and increase by 1 whenever a new (different) set of B and C values is encountered.

In the table above, (A = 1) and (A = 4) have exactly the same set of ordered B and C values, so they have the same GroupNum (namely, 1).
When (A = 2), the set of B and C values is slightly different, so a new value of GroupNum is assigned (namely, 2).
Finally, (A = 3) and (A = 5) share exactly the same sets of B and C values, so they are assigned a common GroupNum (namely, 3).

The desired output for the table shown above is...

Code:
 [b]
 A      GroupNum [/b]
 1        1
 2        2
 3        3
 4        1
 5        3
....
800

Many thanks for any clues. I just can't seem to get this query to work.
Vicky C
 
I waited patiently but nobody posted a proper solution, so I will post this complete and utter hack on the theory it will generate some other responses.

We have data like:

table patterns
Code:
col_a	col_b	col_c
1	840	3
1	912	2
1	1010	3
2	840	3
2	912	2
2	1010	2
3	614	5
3	1260	1
4	840	3
4	912	2
4	1010	3
5	614	5
5	1260	1

To me this is is the wrong format, and I don't like that it doesn't have record id's.

First step: Create new table with the three fields in the old table and a new number field. I made it an autonumber for ease.
Code:
INSERT INTO patterns2 ( cola, colb, colc )
SELECT col_a, col_b, col_c
FROM patterns;

Result:
Code:
ID	cola	colb	colc
1	1	840	3
2	1	912	2
3	1	1010	3
4	2	840	3
5	2	912	2
6	2	1010	2
7	3	614	5
8	3	1260	1
9	4	840	3
10	4	912	2
11	4	1010	3
12	5	614	5
13	5	1260	1

Second step: Make some long records.

If we do this:
3_record_patterns
Code:
SELECT a.cola, a.colb, a.colc, b.colb, b.colc, c.colb, c.colc
FROM (patterns2 AS a INNER JOIN patterns2 AS b ON a.cola=b.cola) INNER JOIN patterns2 AS c ON a.cola=c.cola
WHERE a.id < b.id 
and b.id < c.id
and a.id <c.id
ORDER BY 1, 2, 4;

we get:
Code:
cola	a.colb	a.colc	b.colb	b.colc	c.colb	c.colc
1	840	3	912	2	1010	3
2	840	3	912	2	1010	2
4	840	3	912	2	1010	3

but that only finds the patterns with three components because the join drops out the two pattern records.

So let's get the two pattern records:
2_record_patterns
Code:
SELECT *
FROM (SELECT
a.cola,
a.colb,
a.colc,
b.colb,
b.colc

from patterns2 a
inner join patterns2 b on a.cola=b.cola

where a.id < b.id 
)  AS q
WHERE a.cola in 
(
SELECT cola
from patterns2
group by cola
having count(cola) <3
);

That gets us:
Code:
cola	a.colb	a.colc	b.colb	b.colc
3	614	5	1260	1
5	614	5	1260	1

Now, how about we put them together?
long_records
Code:
SELECT  1 as [patnum], *
from 3_record_patterns
UNION select  1, cola, a.colb, a.colc, b.colb, b.colc, 'null', 'null'
from 2_record_patterns;
I added a number to play with and some place holder nulls.

That gets us:
Code:
patnum	cola	a.colb	a.colc	b.colb	b.colc	c.colb	c.colc
1	1	840	3	912	2	1010	3
1	2	840	3	912	2	1010	2
1	3	614	5	1260	1	null	null
1	4	840	3	912	2	1010	3
1	5	614	5	1260	1	null	null
which looks better to me.

Step three: What we need to do now is find the unique long patterns.

If we do
list_of_unique_patterns
Code:
SELECT patnum, a.colb, a.colc, b.colb, b.colc, c.colb, c.colc
FROM long_records
GROUP BY patnum, a.colb, a.colc, b.colb, b.colc, c.colb, c.colc;
we get:
Code:
patnum	a.colb	a.colc	b.colb	b.colc	c.colb	c.colc
1	614	5	1260	1	null	null
1	840	3	912	2	1010	2
1	840	3	912	2	1010	3

Hmm. Okay. Let's join that back to long_records and pick up the first group id (min) for each group.
numbered_unique_records
Code:
SELECT c.patnum, c.a.colb AS one, c.a.colc AS two, c.b.colb AS three, c.b.colc AS four, c.c.colb AS five, c.c.colc AS six, min(d.cola) AS sorter
FROM list_of_unique_patterns AS c INNER JOIN long_records AS d ON (c.c.colb= d.c.colb) AND (c.c.colc= d.c.colc) AND (c.b.colb= d.b.colb) AND (c.a.colc= d.a.colc) AND (c.a.colb= d.a.colb)
GROUP BY c.patnum, c.a.colb, c.a.colc, c.b.colb, c.b.colc, c.c.colb, c.c.colc
ORDER BY min(d.cola);

which yields:
Code:
patnum	one	two	three	four	five	six	sorter
1	840	3	912	2	1010	3	1
1	840	3	912	2	1010	2	2
1	614	5	1260	1	null	null	3

That sorter just happens to be 1, 2, 3 here but we can't count on that so we need a reliable numbering (do it as a MAX and not a MIN and you'll see it change). With more records it will skip. We'll use patnum.
unique_record_groups
Code:
SELECT (SELECT Sum(patnum) AS Total
FROM numbered_unique_records
WHERE numbered_unique_records.sorter <= T1.sorter) AS Total, one, two, three, four, five, six
FROM numbered_unique_records AS T1;
That is a running total of the 1's in patnum creating a nice 1,2,3... as far as we need.
Gives us:
Code:
Total	one	two	three	four	five	six
1	840	3	912	2	1010	3
2	840	3	912	2	1010	2
3	614	5	1260	1	null	null

Step four: Join back
Now with a numbered list of unique groups, we join back to the long records and id the group for each record:
desired_result
Code:
SELECT x.cola, y.total
FROM long_records AS x INNER JOIN unique_record_groups AS y ON (x.c.colc=y.six) AND (x.c.colb=y.five) AND (x.b.colc=y.four) AND (x.b.colb=y.three) AND (x.a.colc=y.two) AND (x.a.colb=y.one)
ORDER BY 1;
which gives us:
Code:
cola	total
1	1
2	2
3	3
4	1
5	3

Q.E.D.

Like I say, a hack from a hack. But it gets there.
 
Thanks so much, BigRed1212. I think that's about the most detailed explanation I've seen! I'm no expert, so it will take me a little while before I fully understand all of your steps, but on the surface, they seem very clear and well organized. Thanks for taking the time!
Vicky C.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top