Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Complex Lookup - Extracting instance of text in cell

Status
Not open for further replies.

JimLes

IS-IT--Management
Feb 27, 2006
119
US
Okay, I thought I could almost figure out anything in excel but this has me stumped. I am trying to extract the number corresponding to each alleged in a cell. Now, the instance of "Alleged" can be anywhere in the cell and multiple times. Also, if "Alleged" is by itself in the cell, there is no preceding number.
Below is a sample of what I am dealing with. The goal is to pull out the "alleged" value in Cell B is equal to Cell A. In this sample, Cell B number 4. did not have anything so I need to return "".

Cell A:
1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant

Cell B:
1. Non-Management2. Supervisor3. Non-Management4.

Any assistance is greatly appreciated.
 
Hi,

This is not a solution but may help a little towards one.

Firstly VLOOKUP handles wildcards, so "*Alleged*" will be found if "Alleged" is anywhere in the cell.

Secondly, is there anyway you can break up the text with each "Alleged" in its own cell. It seems to me this would make things much simpler.

Thirdly, VLOOKUP will only find the first instance in a normal lookup. I have code which will find any nominated subsequent instance in a normal table but not sure if multiple data in the one cell will work.

I can supply the code if you wish.

HTH

Peter Moran
 
I am not opposed to pulling alleged into multiple columns. The problem is that there is not a space between the numbers. My thoughts: I need to plug a formula in Cell C that pulls the number of the first alleged which would be 1. Then I need to pull the value of the corresponding cell B which would be Non-Management. I need to do this for each alleged so the next alleged is 3, I need to pull Supervisor out of cell B. I am pretty close to a solution by using an example at the link at the bottom.


Cell A: 1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant

Cell B: 1. Non-Management2. Supervisor3. Non-Management4.

Cell C: =MID(A2,FIND("A",A2)-3,1) This gets me the first number of the alleged = 1



 
Hi,

Excel is an er, ummm.... a SPREADSHEET application and as such does its best and most effective work on tables, using some very standard analysis and formula tools.

Mining multiple occurrence of data buried within a cell, is not one of those standard tool tasks.

IMNSHO, you ought to code a solution. This might help. It is a general function named GetNthItem, using a delimiter and occurrence value as arguments
Code:
Function GetNthItem(sString As String, iItem As Integer, sDelimiter As String) As String
'SkipVought/2007 Feb 23/
'--------------------------------------------------
':returns the Nth Item in a delimited string
'--------------------------------------------------
    If iItem > UBound(Split(sString, sDelimiter)) + 1 Then
        GetNthItem = ""
    Else
        If iItem < LBound(Split(sString, sDelimiter)) + 1 Then
            GetNthItem = ""
        Else
            GetNthItem = Split(sString, sDelimiter)(iItem - 1)
        End If
    End If
End Function
So here's a way to use it. Set up your spreadsheet like this with the displayed results.
[tt]
1 2 3 4 5 6
1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant [highlight]1. [/highlight] 3. 4. ant
[/tt]
where the formula in [highlight]THIS[/highlight] cell is
[tt]
[highlight]=RIGHT(GetNthItem($A2,B$1,"Alleged"),3)[/highlight]
[/tt]
and it gets copied across for as many cells as is required (max number of occurrences of Alleged) with corresponding occurrence numbers in row 1.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Thanks, but this data is coming out of a database so I am trying to extract information from the columns.

So now I am able to pull the position of each alleged in the cell but I cannot pull the number just before the instance of the word Alleged:

Cell A: 1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant

Cell B: 1. Non-Management2. Supervisor3. Non-Management4.

Helper Column Cell C: =FIND("Alleged",A2,1) This gets me the position of the first alleged = 4

Helper Column Cell D: =FIND("Alleged",A2,D2+1) This gets me the position of the second alleged = 34

Now what I need to do is to add 2 more helper columns to pull the number before the positions of 4 and 34. The number will always be 3 characters behind.

Thx,
 
Then simply use the FIND() function using the PREVIOUS find value+1 as the [START NUM] reference.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Using that approch, I get
[tt]
4 34 44
1 3 4
[/tt]

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
BTW, the first row is POSITION, while the second row is the Number Value.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Perfect skip!! Can you give me a sample of the Find/Start Num formula as a reference?
 
You tried to do TOO MUCH in your formula that contains the FIND() function.

Check Excel HELP for the FIND() function.

If your data is in A2, then in C2 you can fut the FIND() function, just like you did, but there's a THIRD argument [Start_num], and THAT reference would be the PREVIOUS cell, in this case B2+1. So B2 should be EMPTY, plus 1 EQUALS 1; so start your fist search with position 1.

Your FIND() would return 4.

If you copy the formula to the right, being sure that the text references are made absolute to COLUMN, then the column D formula will reference c2 in the [Start_Num] and will return 34.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Great, so I was able to add a formula to find the position of the Alleged. Now, I need to extract the data between the numbers:

Cell A: 1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant

Cell B: 1. Non-Management2. Supervisor3. Non-Management4.

Helper Column Cell C: =FIND("Alleged",A2,1) This gets me the position of the first alleged = 4
Helper Column Cell D: =FIND("Alleged",A2,D2+1) This gets me the position of the second alleged = 34
Helper Column Cell E: =MID($A2,FIND("Alleged",$A2,C2)-3,1) This returns the number 1 for the first alleged
Helper Column Cell F: =MID($A2,FIND("Alleged",$A2,D2)-3,1) This returns the number 3 for the second alleged

I am using the following formula to try and get the value after the number of the alleged but it is not working so good.
=MID(A2,FIND(E2,A2,1)+2,FIND(F2,A2,1)-FIND(E2,A2,1)-3)

This works well if two alleged are together, but in the case above, they are 1 and 3 and my formula pulls all the data between them.
What I want is top pull in Column G is the first alleged value of Non-Management and in Column H, I need to pull Non-Management to match 1 and 3 values in column b.

Any ideas on how to extract the word after the number? I am so close!!!
 
Well your approach may have a glaring fault. What if there are more than one 'steps' between Alleges?

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Yes, I agree. Somehow I need to maybe use "ISNumber" to extract the word between numbers. For example, in my example, if I can get all the text after the 1 but before the 2, Ishould be good in all scenarios within the data set.
 
Well you have the START of each Alleged and the LENGTH of Alleged, so the In betweens should be pretty simple.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Well, this formula works pretty well:

=MID(A2,FIND(E2,A2,1)+2,FIND(F2,A2,1)-FIND(E2,A2,1)-3)

But it returns something like this: Non-Management2. Supervisor

Cell B: 1. Non-Management2. Supervisor3. Non-Management4.

I can't seem to get it right when there is another value number in between. In the example I gave, the value in column B for 1 is Non Management and for 3 it is Non Management. When the values in column B that I am trying to pull are in sequence with a number, this works, but the values have something in between, it throws it off.

You are correct, I have the start but I think I need to find the value between the next number for this to work.
 
Frankly, if it were me, I'd do this as a user defined function in VBA.

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Okay. BOTTOM LINE: What is your parsing objective here? EVERY SINGLE one of them, for the entire project? Big picture. How do you intend to use this data? I'm getting the picture that 'Alleged' is not only what you are looking for.

Your data is being extracted from a database. Do you have ANY control over the data in this database of the query that extracts the data. It is a severe crime to store data in this way. And it is a sore transgression to join data results like this in a query. How has this data come to be???

Skip,

[glasses]Just traded in my old subtlety...
for a NUANCE![tongue]
 
Trust me, I am pulling my hair out. I have never see such a mess come out of a database before. It is a third party vendor and I have already complained about this. From what I know, other clients have complained as well. With that said, my goal has always been to pull the alleged value from the corresponding cell.

I managed to solve it on my own........here is the logic:

Column AN: 1. Alleged2. 3rd Party Witness3. Alleged4. Alleged5. Complainant

Column AP: 1. Non-Management2. Supervisor3. Non-Management4.

Helper Column E: =LEN(A2)-LEN(SUBSTITUTE(UPPER(A2),".","")) This counts the periods in the first column cell value. I used this to count the number of values in the cell. The one thing about this data set output is that if there is a number, it is always followed by a period.

Helper Column F: =IF(ISERROR(FIND("Alleged",$AN2,1)),"",FIND("Alleged",$AN2,1)) This gets me the position of the first alleged = 4
Helper Column G: =IF(ISERROR(FIND("Alleged",$AN2,F2+1)),"",FIND("Alleged",$AN2,F2+1)) This gets me the position of the second alleged = 34
Helper Column H: =IF(ISERROR(FIND("Alleged",$AN2,G2+1)),"",FIND("Alleged",$AN2,G2+1)) This gets me the position of the third alleged
Helper Column I: =IF(ISERROR(FIND("Alleged",$AN2,H2+1)),"",FIND("Alleged",$AN2,H2+1)) This gets me the position of the fourth alleged
Helper Column J: =IF(ISERROR(FIND("Alleged",$AN2,I2+1)),"",FIND("Alleged",$AN2,I2+1)) This gets me the position of the fifth alleged
** I stopped at 5 columns because I did not ever see more than 5 alleged.

Helper Column K: =IF(ISERROR(IF(F2="","",MID($AN2,FIND("Alleged",$AN2,F2)-3,1))),"",IF(F2="","",MID($AN2,FIND("Alleged",$AN2,F2)-3,1))) Returns the number 1 for the first alleged
Helper Column L: =IF(ISERROR(IF(G2="","",MID($AN2,FIND("Alleged",$AN2,G2)-3,1))),"",IF(G2="","",MID($AN2,FIND("Alleged",$AN2,G2)-3,1))) Returns the number 3 for the second alleged
Helper Column M: =IF(ISERROR(IF(H2="","",MID($AN2,FIND("Alleged",$AN2,H2)-3,1))),"",IF(H2="","",MID($AN2,FIND("Alleged",$AN2,H2)-3,1))) Returns third number if there is one
Helper Column N: =IF(ISERROR(IF(I2="","",MID($AN2,FIND("Alleged",$AN2,I2)-3,1))),"",IF(I2="","",MID($AN2,FIND("Alleged",$AN2,I2)-3,1))) Returns fourth number if there is one
Helper Column O: =IF(ISERROR(IF(J2="","",MID($AN2,FIND("Alleged",$AN2,J2)-3,1))),"",IF(J2="","",MID($AN2,FIND("Alleged",$AN2,J2)-3,1))) Returns fifth number if there is one

Output Column P: This works for all contingencies for finding the corresponding value for alleged in another cell.
=IF(A2="",IF(K2="","",IF(AND(E2+0>K2+0,K2<>"",L2=""),MID(AP2,FIND(K2,AP2,1)+3,FIND(K2+1,AP2,1)-FIND(K2,AP2,1)-3),IF(AND(E2+0=K2+0,K2<>"",L2=""),TRIM(RIGHT(AP2,LEN(AP2)-FIND(K2,AP2)-2)),IF(K2="1",MID(AP2,FIND(K2,AP2,1)+3,FIND("2",AP2,1)-FIND(K2,AP2,1)-3),IF(K2="2",MID(AP2,FIND(K2,AP2,1)+3,FIND(L2,AP2,1)-FIND(K2,AP2,1)-3))))))

Output Column Q: This pulls the value of the second alleged.
=IF(L2="","",IF(AND($E2+0>L2+0,L2<>"",M2=""),MID($AP2,FIND(L2,$AP2,1)+3,FIND(L2+1,$AP2,1)-FIND(L2,$AP2,1)-3),IF(AND($E2+0=L2+0,L2<>"",M2=""),TRIM(RIGHT($AP2,LEN($AP2)-FIND(L2,$AP2)-2)),IF(AND($E2+0>L2+0,L2<>"",M2<>""),MID($AP2,FIND(L2,$AP2,1)+3,FIND(L2+1,$AP2,1)-FIND(L2,$AP2,1)-3)))))

I won't go on but I think I have provided enough examples of how to solve for this if anyone is reading. I have other values to pull for each alleged in other columns so I can re-use columns K thru O again to pull the values.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top