Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Sumproduct formula - determine if records consitent 8

Status
Not open for further replies.

IngDep

MIS
Feb 7, 2008
89
US
Have a 35,000+ row Excel worksheet with the data arranged similar to the following;

ProcCd---CostPerSvc-----CostPerSvcConsistent?
12345----10-------------Yes
12345----10-------------Yes
12345----10-------------Yes
24680----8--------------No
24680----8--------------No
24680----6--------------No

If I recall, the use of a sumproduct formula will allow me to input "Yes" or "No" in column C if the Cost Per Svc figure in column b is not consistent.

The use of the following IF statement is not working properly; =If(A5<>A6,"Yes",If(AND(A5=A6,B5=B6),"Yes","No"))

Any insight as to how I need to use a sumproduct to accomplish this?

 
If you notice issues with the recalc time when trying to determine the offending record, you can shorten the formula (albeit at the cost of a few #N/A errors) like this:
=IF(J2="","",IF(G2<>MODE(IF(B$2:B$40000=B2,G$2:G$40000,"")),"Exception",""))

You can further speed up recalc by shortening the range to something that covers the ProductCode with the most members. Neglecting problems at the beginning of the range, you might try this array formula in row 20. It looks forward and backward 20 rows for matching ProductCodes:
=IF(J20="","",IF(G20<>MODE(IF(B2:B40=B2,G2:G40,"")),"Exception",""))

Brad
 
Two alternate approaches...

Use advanced filter, Copy to new location, unique values only

Create a pivottable based on your extracted data. Rowfields for ProcCD, CostPerSvc and count of ProcCD as datafield.

In the field settings Sort ProcCD field based on the Count of ProcCD, descending and do not total this field. Any rows with a count greater than 1 indicates an inconsistency.

Could then copy these to form the criteria range of an advanced filter on your original data...and only rows where thee is an inconsistency would be visible.

....
Or simply
Create a pivottable based on your extracted data. Rowfields for ProcCD, CostPerSvc and count of ProcCD as datafield. Any blanks in the ProcCD column indicate inconsistent data.

Gavin
 
byundt,

Did try your formula -

=IF(J2="","",IF(ISNA(MODE(IF(B$2:B$40000=B2,G$2:G$40000,""))),"Exception", IF(G2<>MODE(IF(B$2:B$40000=B2,G$2:G$40000,"")),"Exception","")))

Specifically, I used the following;

{=IF(H22="","",IF(ISNA(MODE(IF(D$6:D$13376=D22,H$6:H$13376,""))),"Exception", IF(H22<>MODE(IF(D$6:D$13376=D22,H$6:H$13376,"")),"Exception","")))}

Note, column D contains the procedure code and column H contains the Cost/service figure.

"Exception is displayed down the column for each record.

Upon reviewing a sample of records within the worksheet, I noticed that "FALSE" does appear in the column where I have the original formula for five records that appear to have the same cost per service figure.

Then, when I drilled down, it appears that 26.85 is displayed on the worksheet and in the formula bar for four records but one record have 26.845 displayed in the formula bar.

Questions - 1) Shouldn't the exception formula display "Exception" for the record where 26.845 is displayed in the formula bar?

2) What is the most expedient method to not only display 26.85 on the worksheet but also have 26.85 displayed in the formula bar for the five records?

 
IngDep,
Are you saying that the array formula returns "Exception" for every single row of data? This can happen if column H contains text that looks like numbers. The MODE function needs real numbers.

To convert text into real numbers:
1) Copy a blank cell
2) Select the data to be converted
3) Open the Edit...Paste Special menu item, and choose the option for "Add"

Are you saying that the array formula is not detecting the odd value? Could you post a workbook that reproduces the problem? I don't need to see more than 10 or 20 values.

Brad
 
For alternative approach:
In Gavin's suggestion concerning using pivot table: ProcCd as row field, CostPerSvc as data field with population variance or standard deviation aggregation. Row field can be set (in advanced settings) sorted in descending order using CostPerSvc aggregated.
Top non zero rows will be inconsistent. They can be found with autofilter in raw data.

combo
 
The pivot table approach is a really quick way of identifying if there is inconsistent data (or with minor adaptations duplicate data) and doesn't fill your table with formulae.

Great idea Combo (have a star). Resultant list can be used as criteria for advanced filter as per my first suggestion (to hide all consistent records in the original list) or you can autofilter to look at one at a time.

My second suggestion has merit if you just want to view the inconsistent values and don't need to go back to the original data to resolve.

Even if you use the helper field approach it can be useful to summarise in a pivot table - just add CostPerSvcConsistent? as a page field to enable you to just display the inconsistent items.

Gavin
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top