Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations biv343 on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Excel - Extract Part of a string 1

Status
Not open for further replies.

EliseFreedman

Programmer
Dec 6, 2002
470
GB
Hi There

I am trying to extract the email address from a string. The email address is in the format
Freedman, Elise (UK) (first.last@domain.com).

The only part i am interested in is the part which occurs between the 2nd two lots of brackets. How do I extract this.
 
To find the position of the 2nd "("

=FIND("(",A1,FIND("(",A1)+1)

To find the 2nd ")"

=FIND(")",A1,FIND(")",A1)+1)

so:

=MID(A1,FIND("(",A1,FIND("(",A1)+1)+1,FIND(")",A1,FIND(")",A1)+1)-FIND("(",A1,FIND("(",A1)+1)-1)

Rgds, Geoff

We could learn a lot from crayons. Some are sharp, some are pretty and some are dull. Some have weird names and all are different colours but they all live in the same box.

Please read FAQ222-2244 before you ask a question
 
Thanks for this

It works like a dream.

Unfortunately since posting, I have discovered that in some cases there are more than 2 sets of brackets e.g

Last1, First1 (UK) (Scotstoun) (first1.last1@domain.com)
Last2, Last2 (Secretary to First3 Last3) (UK) (First2.Last2@domain.com)

How would I alter the formula to extract the text between the last set of brackets regardless of how many sets of brackets there are.

 
For an unknown number of brackets you're pretty much stuffed with standard formulae. a VBA based udf will probably be needed

Code:
Function Get_Email(rng As Range)
Dim stPos As Integer

stPos = InStrRev(rng.Text, "(")

Get_Email = Mid(rng.Text, stPos + 1, Len(rng.Text) - stPos - 1)

End Function

Rgds, Geoff

We could learn a lot from crayons. Some are sharp, some are pretty and some are dull. Some have weird names and all are different colours but they all live in the same box.

Please read FAQ222-2244 before you ask a question
 
This can be done with formulas.

Take this in two parts -
[tab]A) How many times does "(" appear?
[tab]B) How many characters to the right of the final "("?

[tab][green]=LEN(A1)-LEN(SUBSTITUTE(A1, "(", ""))[/green]
will tell you how many "("s you have.

Now replace the final "(" with a character that will never appear in your dataset. Double pipes should be safe. Notice that we use the above formula to determine which "(" is replaced:
[tab]=SUBSTITUTE(A1, "(","||", [green]=LEN(A1)-LEN(SUBSTITUTE(A1, "(", ""))[/green])
will return:
[tab]Last1, First1 (UK) (Scotstoun) [red]||[/red]first1.last1@domain.com)
[tab]Last2, Last2 (Secretary to First3 Last3) (UK) [red]||[/red]First2.Last2@domain.com)
[tab]Freedman, Elise (UK) [red]||[/red]first.last@domain.com)

Now throw all of that into a MID statement to extract everything to the right of the last parenthesis except for one character (the close paren.) and you get:
[tab][COLOR=blue white]=MID(A1, FIND("||", SUBSTITUTE(A1, "(", "||", LEN(A1) - LEN(SUBSTITUTE(A1, "(", "")))) + 1, LEN(A1) - FIND("||", SUBSTITUTE(A1, "(", "||", LEN(A1) - LEN(SUBSTITUTE(A1, "(", "")))) - 1)[/color]

I'm sure there's an array formula that is shorter - I'll tinker with it and post back if no one else beats me to it....

[tt]_____
[blue]-John[/blue][/tt]
[tab][red]The plural of anecdote is not data[/red]

Help us help you. Please read FAQ 181-2886 before posting.
 
Copy'n'Paste is a dangerous tool.... The second formula should (obviously) be:
[tab]=SUBSTITUTE(A1, "(","||", LEN(A1)-LEN(SUBSTITUTE(A1, "(", "")))
without the extra "=" in there

[tt]_____
[blue]-John[/blue][/tt]
[tab][red]The plural of anecdote is not data[/red]

Help us help you. Please read FAQ 181-2886 before posting.
 
Hi AnotherHiggens

I have tried the formula you suggested

MID(A1, FIND("||", SUBSTITUTE(A1, "(", "||", LEN(A1) - LEN(SUBSTITUTE(A1, "(", "")))) + 1,  LEN(A1) - FIND("||", SUBSTITUTE(A1, "(", "||", LEN(A1) - LEN(SUBSTITUTE(A1, "(", "")))) - 1)

However I am getting a pop up saying that there is an error in my formula. It is highlighting the second LEN. Not sure where I am going wrong
 
Well, this isn't much prettier. I only managed to shave a few characters off of my first formula.

But here it is anyway, just as an example of a different method:
[tab][COLOR=blue white]=MID(A1, LEN(A1) - MATCH("(", MID(A1, LEN(A1) - ROW(INDIRECT("1:" & LEN(A1))) + 1, 1), 0) + 2, MATCH("(", MID(A1, LEN(A1) - ROW(INDIRECT("1:" & LEN(A1))) + 1, 1), 0) - 2)[/color]
Again, this is an array formula and as such must be entered using [Ctrl] + [Shift] + [Enter].

This is adapted from something I saw here on Tek-Tips a while back - posted by yogia, I think. Or possibly KenWright - he's amazing with array formulas.

Anyway, the idea is that it looks at the string in A1 in reverse order and returns the placement of that character. We then use that to populate the MID function.

[tt]_____
[blue]-John[/blue][/tt]
[tab][red]The plural of anecdote is not data[/red]

Help us help you. Please read FAQ 181-2886 before posting.
 
Just saw your post, Elise.

I copied the formula from your post and pasted it into Excel with the three examples you've provided. It works fine for me.

What exactly does the error say? Is it too many/too few arguments?

[tt]_____
[blue]-John[/blue][/tt]
[tab][red]The plural of anecdote is not data[/red]

Help us help you. Please read FAQ 181-2886 before posting.
 
Its OK.

I just tried the formula in your 2nd post and it works like a dream. Thanks for all your help on this.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top