Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

fill the gap of hierarchy data

Status
Not open for further replies.

peac3

Technical User
Jan 17, 2009
226
AU
Hi,

we have a task to validate our hierarchy data and whether there are any gaps in our table and compare it with our source table.
The hierarchy that we check is only within our client.

our data structure is below we need to identify the gap

below is the source table which we need to align to.
Code:
childID parentID HierarchyId ClientFlag
012      123         1            Y
123      234         2 
234      456         3
456      789         4            
789      101         5            Y
101      102         6            Y

First, we must connect the gap within our client (ClientFlag = 'Y') by generating a new record.
Hence the desired output would be:

Code:
childID parentID HierarchyId ClientFlag
012      123         1            Y
[b]123      789         2            Y[/b]
789      101         3            Y
101      102         4            Y

We need to generate new record (e.g. second record in bold) to fill the gap before we can compare link children and parent?
Any help would be greatly appreciated.

Thanks.


 
The way I take it is - you want to:
1. keep the BLUE records (where [tt]ClientFlag = 'Y'[/tt]) and
2. out of RED records (where [tt]ClientFlag IS NULL[/tt]) create 1 record where [tt]childID[/tt] is first value out of RED records and [tt]parentID[/tt] is last value out of RED records, and then
3. re-number [tt]HierarchyId[/tt] column:

[pre]
childID parentID HierarchyId ClientFlag[blue]
012 123 1 Y[/blue][red]
[highlight #FCE94F]123[/highlight] 234 2
234 456 3
456 [highlight #FCE94F]789[/highlight] 4 [/red] [blue]
789 101 5 Y
101 102 6 Y [/blue]
[/pre]

[ponder]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Andy,

Of course I don't know what peac3's answer to that will be, but I also understand it this way. So the data is a singly linked list, which is complete and intact in itself, some list elements are just not marked with the ClientFlag "Y" and the task is to skip over these elements with a shortcut. That is established with a new record combining what you highlighted and then renumbering the HierarchyId.

Not impossible to do, but in case there are multiple such linked lists in the data you can easily create shortcuts linking two lists that should not be linked together.

So, in short, more information is needed to handle this well.

For now, I can offer you a way to determine the new record you need with the help of window functions, see this SQLfiddle:

Code:
-- INIT database
CREATE TABLE List (
childID integer,
parentID integer,
HierarchyId integer,
ClientFlag bit)

INSERT INTO List Values (12,123,1,1)
, (123,234,2,0)
, (234,456,3,0)
, (456,789,4,0)
, (789,101,5,1)
, (101,102,6,1);

-- QUERY database
--SELECT parentId, HierarchyId-Lead(HierarchyId) over(order by HierarchyId) as step FROM List Where ClientFlag=1;
--SELECT childId, HierarchyId-Lag(HierarchyId) over(order by HierarchyId) as step FROM List Where ClientFlag=1;

with sublist as (SELECT childId, parentId, HierarchyId,
                Abs(HierarchyId-(Lead(HierarchyId) over(order by HierarchyId))) as step
                FROM List Where ClientFlag=1),
fill as (Select parentId as childId, Lead(childId) over(order by HierarchyId) as parentId,
        HierarchyId+1 as HierarchyID, step from sublist)

Select * from fill where step>1

Disclaimer: I intentionally didn't provide an INSERT statement to add this gap filling record, the step to also renumber the HierarchyId is missing, but I'd do this with a ROW_NUMBER() query as aftermath instead.
And all of this only works with 1 gap, not multiple gaps, that would require grouping by a gap number to be determined, I think. It surely is somethinhg to learn about the LEAD() and LAG() window functions.

Chriss
 
Hi Andrzejek,

Yes, that is correct because if it is the gap we would like to connect who the next parents would be.

is it possible technically?
 
If you can't provide more information, I guess this will get stuck at where it's at.

Could you query this on the table involved and post the result here?
Code:
; with counts as (Select HierarchyID, count(*) as hCount from "thetable")
Select MIN(HierarchyID) as minhid, MAX(HierarchyID) as maxhid, MIN(hCount) as mincount, MAX(hcount) as maxcount from counts

And if there are any more columns in the table, can you list them, too, please?

Chriss
 
peac3 said:
is it possible technically?

Anything is possible if you state the requirements fully and clearly.

Otherwise, we have to do a lot of guessing, assuming and filling the gaps in the logic – which you do not want to happen because (chances are) they will be wrong.


---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Hi Chris,

Here is the output of your query (after I fixed the bug) with one client

Code:
; with counts as (Select HierarchyID, count(*) as hCount from "thetable" group by 1)
Select MIN(HierarchyID) as minhid, MAX(HierarchyID) as maxhid, MIN(hCount) as mincount, MAX(hcount) as maxcount from counts


Code:
minid maxhid miscount maxcount 
1      9       19221     19221

ClientID childID parentID HierarchyId ClientFlag
 1      012      123         1            Y
 1      123      234         2 
 1      234      456         3
 1      456      789         4            
 1      789      101         5            Y
 1      101      102         6            Y 
 1      102      103         7             
 1      103      104         8             
 1      104      105         9            Y
 
Thanks for that, but why do you only run this query on a subset of the data?

It's good you also indirectly answered that there is another column ClientID Any more? The table contains multiple lists for all or many clientIDs, I assume. The counts would have shown that, too. This is important informatoin you should have given earlier. As Andy already said, otherwise we need to gues/assume and likely produce a solution that will be wrong, i.e. in this case could shortcut one loose end of a list for customer X with another loose end of a list of customer Y. So of course it's important to stay within the list for the same ClientID.

With the new hard won knowledge about the ClientID whats this query giving you?
Code:
; with counts1 as (Select ClientId, HierarchyId, count(*) as hCount from "thetable" group by 1,2),
counts2 as (Select ClientId, MIN(HierarchyID) as minhid, MAX(HierarchyID) as maxhid, MIN(hCount) as mincount, MAX(hcount) as maxcount from counts1 group by 1)
Select MIN(minhid) as minminhid, MAX(minhid) as maxminhid, 
MIN(maxhid) as minmaxhid, MAX(maxhid) as maxmaxhid, 
MIN(mincount) as minmincount, MAX(mincount) as maxmincount,
MIN(maxcount) as minmaxcount, MAX(maxcount) as maxmaxcount from counts2

Chriss
 
On the other front I see my sqlfiddle indeed also works fine with multple gaps in one list, see
Now, if you provide the result of the new counts query, I'll see if this can easily be extended to give the full data for all clients by using PARTITION BY in the OVER clauses.

Chriss
 
Another guess/assumption (hope) that you may (should) have a Primary Key field (ID?) in your table:

[pre]
ID ClientID childID parentID HierarchyId ClientFlag
345 1 123 123 1 Y
346 1 123 234 2
347 1 234 456 3
348 1 456 789 4
349 1 789 101 5 Y
350 1 101 102 6 Y
351 1 102 103 7
352 1 103 104 8
353 1 104 105 9 Y[blue]
354 2 101 102 6 Y
355 2 102 103 7
356 2 103 104 8 [/blue]
357 3 456 789 4
358 3 789 101 5 Y

[/pre]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Thanks Chris,

There are millions of records for the entire table.
I didn't want to put the whole list into the thread, there are multiple clients in the table - yes as per Andy's example.

I just would like to get the logic from you since the data structure may change depending on the solution.

Thanks for all your help though.

 
Heallo peac,

you don't need to list all data, that's why I wrote a counts query, which in the end only has one record.

The solution is already there in the sqlfiddle - for a single clientid. If you add the reduction to data of one clientid to the where clauses, it's already usable. If you need the fully gap reduced data, we just need to know more about it and the counts query would perhaps give us - at least me - the answer.

But since you don't adress Andys and my questions, I have to assume you don't get them - then why don't you ask about what you don't understand about the questions?

You already - again - gave some indirect answer, that what xyou initally posted is not all data. I would assume so, as 6 records are much easier fixed manually than with some code. You don't even get that we're on the track of a solution, as you make wrong assumptions. Sad, that it peters out this way, once more.

Chriss
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top