Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sumproduct or Countif - determine record(s) driving variance 3

Status
Not open for further replies.

BxWill

MIS
Mar 30, 2009
367
US
Have a 60,000+ records on 2010 MS Excel worksheet containing transactions for all of 2012 in the format displayed below.

Of the five accounts listed, accounts 123456, 123458, and 123459 will net out to zero for the Level2 category
for January 2012. I am interested in creating a formula that will display a "Y" in the column titled "DriverAccount"
if the account does not net to zero for the category by month. Data is sorted by Month (Oldest to Newest), then by Category (Asc), then by Amount (Smallest to largest) and then by DataSource (Asc).

Note, the total variance for the Level2 category for January 2012 is 30. Basically, I am trying to determine what account(s) are driving the
variances by month by category.

Is this possible? I am currently trying to formulate a sumproduct formula within a IF statement to perform.

Appreciate any insight as to how this can be accomplished!

Month-------DataSource---AccountNm-----AccountNbr-------Category-----Amount---Difference--DriverAccount---VarianceByCategoryByMonth
1/1/2012----Delta--------PointX----------123456------------Level2------------150------------------N---------------30
1/1/2012----Sigma--------PointX----------123456------------Level2------------150------0-----------N---------------30
1/1/2012----Delta--------PointY----------123457------------Level2------------120------------------Y---------------30
1/1/2012----Sigma--------PointY----------123457------------Level2------------110------10----------Y---------------30
1/1/2012----Delta--------PointZ----------123458------------Level2------------130------------------N---------------30
1/1/2012----Sigma--------PointZ----------123458------------Level2------------100------(30)--------N---------------30
.
.
.
1/1/2012----Delta--------PointT----------123459------------Level2------------125------------------N----------------30
1/1/2012----Sigma--------PointT----------123459------------Level2------------155-------30---------N----------------30
1/1/2012----Delta--------PointS----------123450------------Level2------------145------------------Y----------------30
1/1/2012----Sigma--------PointS----------123450------------Level2------------165-------20---------Y----------------30
2/1/2012----Delta...
 
Still trying to resolve...

Just to ensure that my logic is correct, I am attempting the following;

Step 1: Need a sumproduct to sum by month and category that in essence, is the total variance for the category by month

Step 2: Determine if difference column is netted to zero against any other record for the month for the category. Note, positive and negative values are displayed in the difference column. Therefore, I probably need to consider the absolute amounts are netted out...

Step 3: If value in difference column is not netted to zero for the category for the month, then I can display in another column the ratio of the difference relative to the total variance for the category by month. Further, I would like to display in another column that is maybe labeled "Rank" where 1,2,3, etc. is displayed that indicate the order of the ratio percentage.

Therefore, the two additional columns would be "Ratio" and "Rank."


Currently experimenting with something along the lines of "amount in the difference column divided by a sumproduct formula similar to =SUMPRODUCT((MONTH(A2:A60000)=A1)*(E2:E60000=E1)...

For the rank column, I am experimenting with something like =SUMPRODUCT (--(A2>$A$2:$A$60000),1/COUNTIF($A$2:$A$60000&""))+1

Appreciate any additional insight!

 
If you need only account numbers, use pivot table.
Create pivot table with AccountNbr as row field and Difference ad data field, with sum as domain aggregation function. If additional level of aggregation is required (as Category), add it as row field too. You should get all account numbers (+optional additional aggregation) with sums of differences.
The second step is to filter by data field. The easiest way is to add calculation field to pivot table, name it, say TEST, and create formula =(sum(Difference)=0). You should get an additional data field with 0s if sum of Difference for a given account is not 0 (that is numeric representation of FALSE) and 1s otherwise. Now in advanced settings of field AccountNbr set auto show to one bottom item using "sum of TEST" field.

combo
 
Thanks for the insight.

I currently have the data in a pivot table and then went back to source data - resorting it and considered the use of sumproduct along with countif functionality because pivot tables requires more memory to process.

I will re-visit the use of pivot tables but I believe that there is a less memory-intensive method involving the use of sumproduct to sum by category by month and use this to rank the difference column while allowing for the netting of absolute numbers.
 
FYI, you may want to align your data like this:

[pre]
Month DataSource AccountNm ...
1/1/2012 Delta PointX ...
1/1/2012 Sigma PointX ...
1/1/2012 Delta PointY ...
1/1/2012 Sigma PointY ...
1/1/2012 Delta PointZ ...
1/1/2012 Sigma PointZ ...
[/pre]

Just use [tt] [ignore] [pre] [/pre] [/ignore][/tt]tags

Have fun.

---- Andy

A bus station is where a bus stops. A train station is where a train stops. On my desk, I have a work station.
 
Pivot tables are much faster. May not significantly increase file size if you choose not to save data with pivot table. May consume less memory, as you do not fill cells with hundred thousand of formulas.

combo
 
Thanks for the insight

Reverted back to the use of the pivot table and will just manually go down the list of several hundred records and maybe create a cumulative amount column to assist in solving this issue.
 
Is it possible to create a cumulative amount column in a pivot table?

Currently attempting to create a cumulative amount column in the pivot table to provide a running total of the Variance column based on the total
variance for the month, not the variance at the point that I happen to review at any given moment.

Pivot table structure is:

Report Filter - Month

Column Labels - Category, Source of Data

Row Labels - AccountNbr, AccountNm

Values - Amount, Variance (calculated column)

 
Core problem is not resolved.

In essence, the issue is if MS Excel can distinguish among a list of records that have positive and negative amounts and select
the records that are not netted out and are driving the total variance.

For example, if I just have the following;

[pre]
Account# AccountNm Amount
123452 PointX -100,000
123453 PointY -95,000
123454 PointZ 95,000
123455 PointT 100,000
123444 PointS 75,000[/pre]

In this example, the total variance is 75,000 and the account that should be designated as the driver is "123444."

Are you aware of MS Excel functionality that can handle this problem?


 
I was not successful in creating a formula to flag the account(s) that are the drivers of the variance.

I reverted back to a pivot table and the use of a calculated column titled "variance" that displayed the difference between the
two Data Sources by account number and category.

On the worksheet that is the data source for the pivot table, I sorted by month, then category then by account number in ascending order.

Then, I created a difference column in the data source worksheet similar to =if(and(a7=a6,b7<>b6,d7=d6,e7=e6),f7-f6,"") and copied down to display the variance on every other row.

Finally, I used vlookup to populate every row of a second column titled "Difference2" so I could filter the records in the pivot table by the differences.

Manual process, but it worked!

However, I will continue experimenting with the combination of a sumproduct function, countif and the use of the absolute feature to generate a automated approach going forward.
 
Another attempt with pivot table: with Amount as data field use population variance as domain aggregation. This can be sorted automatically in descending order. No way to hide 0s, but accounts with non-zero variance, so with different entries, will be shown first.

combo
 
Yes, I wanted insight regarding the construction of a sumproduct formula to accomplish objective and still working on it.

My gut feeling is that a combination of a sumproduct along with a Countif function will be the starting point. At one point, I considered a nested if statement with a nested sumproduct/countif formula.

For example, entering a countif formula such as the following will allow me to determine the number of occurences of a specific number within a entire column.

=COUNTIF($J$8:$J$2254,J8)

Maybe, this formula can be tweaked somewhat to allow for the counting of absolute numbers in a column. For example, if I entered -120 and 120, I would receive "2." Then, dividing the number of items by the result....

In conclusion, I would like three additional columns, as previously mentioned, as displayed below;

DriverAccount - Where a "y" would display if the account is not netted out

Ratio - displays the percentage of the difference for the account relative to the total variance for the category for the respective month

Rank - displays the percentage ranking, e.g. 1,2,3, etc. - based on the ratio in descending order

Maybe, this is not even possible with MS Excel!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top