Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Avoid double counting 1

Status
Not open for further replies.

peac3

Technical User
Jan 17, 2009
226
AU
Hi,

We have a table containing a hierarchy of data which is account parent and children relationship so 1 parent account would have 1 to many children account.

Then we have a flag for whether account balance should be included as part of accounting calculation and the flag called INCL_COUNT.
So any account with INCL_COUNT flag should be included as part of the calculation total amount.

The problem is we have 1 parent account which has a total balance for all children on the same table so to dedup it we usually use the balance from the parent and ignore the children.
However, we found the data for the parent account has a different INCL_COUNT flag from the children (e.g. parent account has INCL_COUNT flag = 'Y' but the children account INCL_COUNT flag = 'N')

Hence now when the parent account has the flag = 'N', we need to check whether the child account have the flag 'Y' as per below example:

Code:
APP_ID 	PARENT_APP_ID 	BALN_A INCL_COUNT
123                   	3000	N         - because the parent account is flagged N we would have to search the children's flag
124     123          	1000	Y         - we take this child because it is flagged Y and the parent is N
125			2000	Y	  - we take this parent because it is flagged Y
126     123		1000	Y         - we take this child because it is flagged Y and the parent is N
127     125		1000	Y
128     123		1000	Y         - we take this child because it is flagged Y and the parent is N
129			1000 	Y
130	125	        1000 	Y

So the logic is
if the parent account is flagged Y we take the balance amount, if the parent account is flagged N we need to go through the children's flag before taking the amount.


So for the above example, the total balance amount should be 6000 (app_id children: 124+126+128 and parent: 125+129)
Any help would be greatly appreciated.


 
Can you show the data of both tables, not just one list? As far as you describe the INCL_Count is in both tables, is that also valid for the balance column? If so where is what data?


Chriss
 
Hi Chris,

All the data on parents' and children's accounts information are in the same table, not split into two tables which is challenging.
The differences are the children's accounts have value on PARENT_APP_ID and the parent's account does not.

Incl_count is the flag whether the balance should be included in the total or not... and it is on the same table.
 
peac3 said:
All the data on parents' and children's accounts information are in the same table
(app_id children: [highlight #FCE94F]124+126+128[/highlight] and parent: [highlight #E9B96E]125+129[/highlight])

Code:
APP_ID 	PARENT_APP_ID 	BALN_A INCL_COUNT
123                   	3000	N         - because the parent account is flagged N we would have to search the children's flag
[highlight #FCE94F]124     123          	1000	Y[/highlight]         - we take this child because it is flagged Y and the parent is N
[highlight #E9B96E]125			2000	Y[/highlight]	  - we take this parent because it is flagged Y
[highlight #FCE94F]126     123		1000	Y[/highlight]         - we take this child because it is flagged Y and the parent is N
127     125		1000	Y
[highlight #FCE94F]128     123		1000	Y[/highlight]         - we take this child because it is flagged Y and the parent is N
[highlight #E9B96E]129			1000 	Y[/highlight]
130	125	        1000 	Y

So, APP_ID = 124 is a child record to APP_ID = 125 (parent)?
And APP_ID = 126 and 128 are children records to APP_ID = 129 (parent)?
If so, how do you know that? Is there any relation of APP_ID and PARENT_APP_ID data?

peac3 said:
the children's accounts have value on PARENT_APP_ID and the parent's account does not.

APP_ID = 127 and 130 do have values in PARENT_APP_ID. Are they children to any parent record? If so, which one?

Or is it:

Code:
APP_ID 	PARENT_APP_ID 	BALN_A INCL_COUNT
[highlight #FCE94F]123[/highlight]                   	3000	N
124     [highlight #FCE94F]123[/highlight]          	1000	Y
[highlight #FCAF3E]125[/highlight]			2000	Y
126     [highlight #FCE94F]123[/highlight]		1000	Y
127     [highlight #E9B96E]125[/highlight]		1000	Y
128     [highlight #FCE94F]123[/highlight]		1000	Y
129			1000 	Y
130	[highlight #E9B96E]125[/highlight]	        1000 	Y

PARENT_APP_ID = 123 (children) belong to APP_ID = 123 (parent)
PARENT_APP_ID = 125 (children) belong to APP_ID = 125 (parent)
[ponder]
---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Hello Andrzejek,

Apologise if my explanation is still confusing.
All the data on parents' and children's accounts are in the same table, with no segregation of the parent table and child table.

The way we know whether they are parent or children is from the parent_app_id column.
if app_id with blank parent_app_id means that app_id is the parent otherwise, it is the child of the value of the parent_app_id.

for the data above in summary is:

[ul]
[li]The APP_ID 123 is the parent has 3 children which are 124,126,128[/li]
[li]The APP_ID 125 is the parent has 2 children which are 127,130[/li]
[li]The APP_ID 129 is the parent has no children[/li]
[/ul]
 
So, in short if a parent record has Y, all it's children should have N in INCL_COUNT, but in case of ID 123 that's not the case, for example.

I would say fix that data problem, not with a specialized query but by fixing the flags to simply sum all data with INCL_COUNT='Y', which should be the goal.

The normal solution in data design is to not aggregate data in parent records, you'd always just keep the partial totals in child data and only need to sum child data in a separate child table.

That said, I wonder if the INCL_COUNT is reliable overall. The way youdescribe it is only reliable for parent records, but what if not even that? If you have misflagged records not only as in the 123 example but have a wrong parent total or orphaned child data the summing would never work out fine.

I think the sample is not covering all special casses you need to take into account, because as far as givcen, you could also get the 6000 total by completely ignorig the INCL_COUNT flag and just sum the parent record balances of record 123 (despite INCL_Count being 'N' for it!), record 125, and 129. 3000+2000+1000=6000, too.

Overall, this data, as it is, violates the principle of not storing redudant data and that is the problem to be fixed, not coming up with special queries navigating through unideal stored data.

Chriss
 
Thanks Chris,

Fixing the data problem is not an option for now as it will involve the data architect, etc.
we need to deliver the solution with the current dataset.
 
Weel, a more complete example is necessary to understand why you don't simple SELECT SUM(BALN_A) WHERE PARENT_APP_ID is NULL.

Chriss
 
Chriss' SELECT will add BALN_A for all 'children' 'parent' records, and looks like that's all that you want (need?).

But, if you do keep the redundant/calculated values (adding all 'children' BALN_A values and keeping it in BALN_A in a 'parent' record), then where do you get the value of 1000 for APP_ID = 129 from [ponder]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Andrzejek,

my query adds all parents record balances as PARENT_APP_ID is NULL mens the record is no child as it doesn't point to a parent.

So I disregard the flag and still get the wanted result, which just points out the problem isn't described in full.

I could make it much more complicated and only take the partial sum from the child records in, if the parent record has a 'N' flag, but there would need to be a case proving that to be necessary.

Indeed you poiint out 129 is a parent record with no chilren, so indeed it would be a bad idea to only sum children, because sometimes it seems the parent balance isn't split and the history of data could likely always spanw with a major balance in a parent record that is likely but not always split.

In case of a simpler data structure you#d store that with balances in child records only and in the special case there is no split balance, you would start with a parent with one child record and that's how this should be structured by database normalization rules.

If you have something like invoives and partial payments like by installments, that should be accounted within separate payments data and the partial sums may then form the split up overall amount, but it still would always be unimportant about the overall total, which just needs to sum parent balances.

If you're interested in open payments then the flag should get the importance to flag the child records to sum and you'd do the query you thought I was doing, just summing child partial balances flagged 'Y' for "not yet paid" when thinking of it in the meaning of invoicing.

So I knid of refuse to deliver what could be done, as it makes no sense, and having that feeling about a solution is not providing help. I serve you better pointing out what actually needs to be fixed. It's not much and could be done in a flurry, you're making it seem like an impossible act to address and fix the actual real problem and that's where I refuse to service you.

Chriss
 
Oops! Sorry, 'parent' records. Edited.

It would be easy to divide this 1 table into 2:
'Parent'
[tt]Select * From MyTable
Where PARENT_APP_ID IS NULL[/tt]

'Children'
[tt]Select * From MyTable
Where PARENT_APP_ID IS NOT NULL[/tt]

This way you could SUM the BALN_A values and find where the discrepancies are to fix them.


---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
That's the way to split data, correct. You'd see a discrepancy in the case you already pointed out, 129 has no child balance, so it can#t be compared or verified agains something else.

Doesn't necessarily point out a problem in the balance.

The problem I see in all of this is with the incoimplete and ill definition of the INCL_COUNT flag. Alone the meaning rather is incl_sum, not just count.

If you can only rely on it to tell you what to sum in parent records, it swhould be null or empty in child records, but as it's all 'Y' except for the one parent record of 123. I showcased I can disregard it fully and still get to the desired 6000 result. So either this flag is completely unnecessary, unhelpful or you give a better sample case where it actually becomes important to react to at all. In short, the sample is not providing a complet enough geneeralization of the actual problem.

Overall, I have to say, peac3, no single post you had so far has a meaningful complete problem description to which one could answer confidently what you need as a solution.

Don't misunderstand that as refusing to help, I just have no good feeling giving you something you think is proper and next thing is you encounter a case it doesn't work anyway. So defining a problem is also a hard task, but unavoidable, sorry.

I can point it out so you can give feedback to whoever this all stems from, too, but that's the best I can offer.

Chriss
 
I tried this query on your table (named it PEAC3)
Code:
select
  t1.*
from
  PEAC3 t1
where
  t1.INCL_COUNT = 'Y' and t1.PARENT_APP_ID is null

union all

select 
  t2.* 
from 
  PEAC3 t1 inner join
  PEAC3 t2 on t1.INCL_COUNT = 'N' and 
              t2.INCL_COUNT = 'Y' and
              t1.APP_ID = t2.PARENT_APP_ID
;
and got this result
Code:
....+....1....+....2....+....3....+....4....+.
APP_ID  PARENT_APP_ID      BALN_A   INCL_COUNT
  125           -        2,000.00       Y     
  129           -        1,000.00       Y     
  124        123         1,000.00       Y     
  126        123         1,000.00       Y     
  128        123         1,000.00       Y     
********  End of data  ********

Then summing the column BALN_A gives the result you expected:
Code:
with my_query as (
  select
    t1.*
  from
    PEAC3 t1
  where
    t1.INCL_COUNT = 'Y' and t1.PARENT_APP_ID is null

  union all

  select 
    t2.* 
  from 
    PEAC3 t1 inner join
    PEAC3 t2 on t1.INCL_COUNT = 'N' and 
                t2.INCL_COUNT = 'Y' and
                t1.APP_ID = t2.PARENT_APP_ID
)
select sum(BALN_A) as total_balance_amount from my_query
;

Code:
....+....1....+....2....+....3....+....4..
                     TOTAL_BALANCE_AMOUNT 
                                 6,000.00 
********  End of data  ********
 
The only case where that would differ is if there ever is a child record with INCL_COUNT = 'N'. And that's the single fact that's not clearly stated in the problem. No matter how obvious it seems to be.

Now, there you have the exact point you're missing to tell, peac. Stating the obvious is often enough mandatory in IT problems. Otherwise, things are based on assumptions and that's not well defined.
And on the other side, this violates the "keep it simple" rule, because that flag has a convoluted meaning depending on whether it's a parent or has a parent with opposite flag or not, this is just bad design and this has to be addressed.

Besides that, every child record with INCL_COUNT='Y' is a slap in the face of whoever should handle this data, when the parent has 'Y', too, as it undermines the meaning of the flag. As sson as a parent record is flagged 'Y' that should trigger all its children to be flagged 'N'. And all in all that flag has no bite, as I said it earlier. Unless it can be used to exclude some child balances, but then it only needs to become 'Y' at all to flag child records with parent records flagged 'N'. And even in that case it's a field violating simple design rules of convoluted meanings and unnecessary dependencies.

Unless you can come up with a use case of this INCL_COUNT to be 'Y' in child records where it also is 'Y' in parent records, this is just bad design.

Chriss
 
You could simply use

Code:
UPDATE peac SET INCL_COUNT = 'N' FROM peac INNER JOIN peac t2 on t2.APP_ID = peac.PARENT_APP_ID and p2.INCL_COUNT='Y';

SELECT SUM(BALN_A) FROM peac WHERE INCL_COUNT='Y'

Which demonstrates, good data structure and correct data only need simple queries. I assume you'll not dare to change the data this way, even though it actually would just improve the flag and give it the simpler meaning. Instead of working around the problem that child data has INCL_COUNT='Y' when you don't want to include it in the summing, simply set it 'N' where it should be 'N'.´

And unless you can't explain how this has a meaning which the update undermines, no matter if you're the developer and designer of this datbase or have to take this data structure as given by some application which doesn't maintain the flag as it should, caution is not the price to pay to only solve problems with complex workaround queries.

You could also put it into a transaction you finally roll back and still get your result from that without a final change, with read isolation no other client would even temporarily see the uncommitted changes the update does only in your connection and transaction to determine the total, so you get the total and have no data change.

Chriss
 
Hi Mikrom,

Sorry for the delay I am living in Asia and the timezone is different than you guys :)
That is exactly the query I need, Thank you so much...


 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top