Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Nested Sets: When an element has more than one parent 1

Status
Not open for further replies.

ookete

Programmer
Oct 5, 2004
180
US
Hello,

I have a SQL Server table uses the "nested sets" approach (ala Joe Celko) to storing data as a hierarchy in a static table. This has worked wonderfully for us for two years, but now I am faced with the challenge of an element having multiple parents... and I could use some advice.

From what I have read, most hierarchical schemes assume that a child resides under one tree. In the Nested Sets, a node cannot span multiple trees due to the sequential nature of the "lft and rgt" columns. A very simple example:

PersonID, NAME, lft, rgt, Parent (un-needed, but a handy visual aid)
1, Roger, 1, 6,
2, Dolly, 2, 3, 1
3, Marie, 4, 5, 1
4, Paul, 7, 10,
5, Ed, 8, 9, 4

But what if Marie reports to both Roger AND Paul (or even more people)? Her "lfg and rgt" colums cannot represent duplicate membership. The idea I have is to remove the hierarchy colums (lft, rgt, Parent) from the above table and move them to another table which and joins to the above table as a one-to-many relationship as follows:

TreeID, PersonID, lft, rgt, Parent
1, 1, 1, 6,
2, 2, 2, 3, 1
3, 3, 4, 5, 1
4, 4, 7, 12,
5, 5, 8, 9, 4
6, 3, 10, 11, 4 <-- Second entry for Marie (ID 3)

Marie still only has one "node", but it is located under two different trees. Does this sound like a valid solution? Am I missing something here that will give me stress further down the road?

Thank you
 
I suppose it is, but it might make things less efficent. And if you have to do calculations for each group (say toal bonuses or salaries or days of vacation taken) then she could easily get double counted or you will need to rewrite all your code to handle the exceptions. And since having more than one supervisor is always a bad idea, I might tell HR that they need to assign her officially to only one person both for her sake and to keep the app from being made less efficent.

[rant]I don't mind changing the structure for a good reason, but for one exception that could be handled differently, I would be sure to let someone know the cost to change the system and the likelihood of new bugs slipping in and what the actual cost to the organization of this convenience of being able to assign someone to two people would be. You may end up having to make this change in the end, but I'd at least give a shot to letting them know there is a cost associated with this. (Of course perhaps, I'm so strong on this because I have been assigned to work for two people simultaneously and it was a horrible nightmare I hope to never revisit again.) Anyway organizationally no one should report directly to more than one person even if he or she supports two or more people. One should be the person who determines priorities and signs leave requests, etc. [/rant]

Questions about posting. See faq183-874
 
SQLSister,

Thank you for your response. Good point about being double-counted, etc.

The example I used was simplified. Most hierarchy articles I have read seem to use the "Employee" table because most people understand it. In my case, the actual table represents different machines in a production plant. A "tree" in the table represents one production line and all the machines in that line are the elements.

I have a case where two production lines share the same freezer. Each line sees the freezer as "belonging" to itself, like a node inside it's tree structure... but in reality, the freezer node actually belongs to two different lines. Since this is a rather expensive setup, you can imagine it's much cheaper for the database to conform to the machinery.

One idea we had was to "mimic" two freezers so that each production line tree in the table had it's own unique freezer, but this will cause other problems with both user clarity and code headaches. I'm not sure which way to go.
 
OK you can ignore my rant.

In that case, yes, I would probably go with the structure you specified, just keeping in mind that you will need to account for the possiblity of double counting when you do anything that involves math. In some cases it might not be a problem except in the grand totals.

Questions about posting. See faq183-874
 
It's too bad that you weren't aware of the multiple-parent requirement when the database was developed. You might have had some luck with The Universal Data Model. This would allow you to have as many children or parents for each relationship as you wanted.

-------------------------------------
It is better to have honor than a good reputation.
(Reputation is what other people think about you. Honor is what you know about yourself.)
 
I was thinking on this quite a bit and I think the answer.You are trying to find descendants of a given node. Or, in other words, you are trying to find whether a path exists between X and Y in a DAG. If there is a path X->A->B->C->Y then you can ignore the edge X->Y because you already found a path.

So when calculating (left, right) from (treeID, parentID) you must be sure that if that there is a path from X to Y then X is calculated before Y. Using BFS to calculate weights of nodes:

CurrentSet := [ <Root> ]
while (not empty(CurrentSet))
ChildSet := Children of CurrentSet
foreach (ChildSet as Child)
weight of Child += Sum(weight of Parents) + Sum(Number of Parents)
CurrentSet := ChildSet

and when calculating (left, right) just order the nodes by weight.
 
bean1975,

Would you elaborate on that? It sounds very interesting. I udnerstand the pseudocode calculating node weight, but I'd like to hear more about DAG, path, edge, and so on. Or could you direct me to some good articles?

Erik
 
I could see a payroll scheme that is more complex. We have a system in our new databse for financial aid where each aid code says if it will pay for books,tuition, or release funds to the student. Then each award code has a priority number. The awards relinquish funds in their order of priority lowest to highest. Plus the funds are split up in an attendance pattern like 9 months for Spring and Summer or 12 months, etc. Then the award is per year and each semester a certain weight or percentage is awarded for that semester.

So each supervisor could actually be assigned a weight for each payroll period and even have a priority of where to draw the funds from first. That way you could enorce a rule that a person can not be paid more than 100% of a normal pay. Then you can bill the department for the time as well. This way you can keep track of which department or supervisors are using the most time in work hours.

If you do not like my post feel free to point out your opinion or my errors.
 
It turns out that I was looking for a topological sort of said DAG. I'll post more as I find it out.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top