Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Eliminating Duplicates in Excel 1

Status
Not open for further replies.

zidol

Technical User
Feb 9, 2016
24
US
We are in need of help with an Excel formula.

Below are two examples, the first one of the raw data and the second one shows the data how we need it to display after the formulas are applied.

Column A – Pt ID (patient id) - This column in the spreadsheet can have multiple rows with the same PT ID
Column B - Impression - There is only one choice for this field. The formula =IF(Sheet1!A3=Sheet1!A2," ",Sheet1!B2) creates a blank cell if there is multiple Impressions for the same PT ID as in Example 2
Column C – Treatment – There can be more than one choice for this field. We need a formula to eliminate duplicates and only shows one of each type of choice for the same PT ID as in Example 2.

Example 1 – Sheet 1
A ``````````````` ``````B ``````````````` ``````````C
PT ID ``````````````` Impression ``````````````` Treatment

111 ``````````````` Cardiac Arrest ``````````````` Epinephrine 1:10
111 ``````````````` Cardiac Arrest ``````````````` Sodium Bicarb 8.4%
111 ``````````````` Cardiac Arrest ``````````````` Epinephrine 1:10
111 ``````````````` Cardiac Arrest ``````````````` Sodium Bicarb 8.4%
112 ``````````````` Fever/Infection ``````````````` Albuterol
112 ``````````````` Fever/Infection ``````````````` Atrovent

Example 2 - Sheet 2 with formulas (the expected outcome after the missing formula is applied)
A ``````````````` ``````B ``````````````` ``````````C
PT ID ``````````````` Impression ``````````````` Treatment

111 ``````````````` Cardiac Arrest ``````````````` Epinephrine 1:10
111 ``````````````` Cardiac Arrest ``````````````` Sodium Bicarb 8.4%
111 ``````````````` Cardiac Arrest ``````````````` Epinephrine 1:10
111 ``````````````` Cardiac Arrest ``````````````` Sodium Bicarb 8.4%
112 ``````````````` Fever/Infection ``````````````` Albuterol
112 ``````````````` Fever/Infection ``````````````` Atrovent

 
HI,

There are no "blanks" in sheet 2.

Having a hard time reconciling your explanation with your example.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Sorry - this is a formatting problem. I had changed the type to white, thinking it would display with a white background, therefore looking like blank fields. The white type is suppose to be blank fields in example 2. Hope this explains it.
 
Why would you keep the first two rows, for instance? They are TOTALLY IRRELEVANT it seems.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
There are 2 different patients, 111 and 112. There are several other columns (not shown on this example)the user is able to select more than one option. Because of that, it is creating multiple rows for the same patient. I'm not sure why the options chosen are showing up multiple time as in example 1, but I need for the duplicates to be blank cells.

For example - patient 111 - I only want to see displayed in column B "Cardiac Arrest" only once. In column C, I only want to see "Epinephrine 1:10" once and "Sodium Bicarb 8.4%" once. It is ok that there a multiple rows for the same patient, just need the duplicates to be blank. Once that is accomplished I do a total of each type of impression or treatment. Without the blanks, I would not get an accurate count.
 
This is how duplicates are usually eliminated, via Data > Data Tools > Remove Duplicates.

[pre]
PT ID Impression Treatment
111 Cardiac Arrest Epinephrine 1:10
111 Cardiac Arrest Sodium Bicarb 8.4%
112 Fever/Infection Albuterol
112 Fever/Infection Atrovent
[/pre]

Then use Conditional Formatting to shade the FONT white or same color as cell interior for the repeated values to make it APPEAR like this...

[pre]
PT ID Impression Treatment
111 Cardiac Arrest Epinephrine 1:10
111 Sodium Bicarb 8.4%
112 Fever/Infection Albuterol
112 Atrovent

[/pre]

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Here's a few suggestions, take them or leave them:
1. I think things would be better using a macro (or macros, see below) instead of using formulae in cells
2. I think your idea of having the Impression only once and with the only record could cause confusion. If you want the Impression only once, I would have it with the beginning Treatment, not the end.
3. I understand that you want to leave blanks for the duplicates in order to get an accurate count (I'm guessing so both the spreadsheets have the same row count). The problem I see with this is that depending upon the number of duplicate records, you could have a lot of gaps in your data. I would suggest that you have 2 macro's. The first is to highlight or delete the duplicates, and the 2nd is to trim the spreadsheet to eliminate duplicates/the blank spaces.
4. In order to prevent/minimize duplicates in the first place, I would make a conditional format using a =COUNTIFS(A:A,A2,B:B,B2,C:C,C2)>1 function to highlight cells if there are duplicates when they are inputting data.

If you need any help, just let me know how to do any of the above.
 
That will not work because I need to eliminate duplicates within the same PT Id, not the entire database. There are multiple PT Id's with the same impressions, treatment, etc.
 
I'm not certain by your statement
zidol said:
That will not work because I need to eliminate duplicates within the same PT Id, not the entire database. There are multiple PT Id's with the same impressions, treatment, etc.

Whether you use a macro to copy information from your source sheet or using formulae in cells, you can get data from your source spreadsheet without corrupting it. Using a macro gives you greater flexibility with what you want done with the information (e.g., you can copy the source sheet, make changes to the data (e.g., highlight, strikethrough or delete information) and will probably save you space on your hard drive that using formulae in cells. I realize you'll need to run the macro whenever you want to update the data, but you'll need to do the same thing if you're using formulae in the cells (e.g., copy & paste).

If item 4 above about using conditional formatting is the item that wouldn't work, I just thought that it might be a suggestion to minimize the amount of data that is duplicated so that whomever is suggesting the treatment would notice that it was already prescribed.
 
Zidol, if your comment was in response to Skip's suggestion, why don't you just copy the information from the source spreadsheet onto a new spreadsheet and then do what he suggested? Nobody is saying that you must be doing everything to your source datasheet.
 
ABSOLUTELY! Work with the appropriate subset in a separate table.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Okay, here's an approach I've used many times. I have uploaded my sample workbook.

Sheet1 is the source data. You can have as many columns/rows as you need. Just need these three columns headings:
PT ID, Impression, Treatment

PT_ID List is a list of unique PT ID values. When you ACTIVATE this sheet, the list will be refreshed.

Sheet2 is the unique values in these three columns, based on TWO values you select in A1, A2, which are Data > Validation Drop Down selections, based on the PT_ID List. Hit the Run Query button to refresh the query based on your selections.

Make sure your Enable Macros when you open the workbook.

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
Here's a solution using no macros. It simply uses an incomplete Pivot Table, where you can select the PT IDs to display.

Pivot looks like this:
[pre]
PT ID Impression Treatment
111 Cardiac Arrest Epinephrine 1:10
Sodium Bicarb 8.4%
112 Fever/Infection Albuterol
Atrovent
[/pre]

Skip,
[sub]
[glasses]Just traded in my OLD subtlety...
for a NUance![tongue][/sub]
 
 http://files.engineering.com/getfile.aspx?folder=ff6c8210-115a-4a7a-a9eb-fcf421c13641&file=tt-selectQuery.xlsm
Sorry for the delay in responding - I have been in marathon meetings!

A big thank you to both of you! I appreciate how much time you are spending to help me.

Thanks for sending me the test files. I think the pivot table is the way to go. I'm going to test it out on my database. I'm not that good at pivot tables, but I think I can follow your example.

I'll let you know how it turns out. Hopefully you will be done with me LOL

 
I can not thank you enough. This works perfectly with the pivot table!!
Again - thanks for your time and expertise.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top