Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Extracting numbers from alphanumeric strings

Status
Not open for further replies.

SubsMah

Programmer
Jan 15, 2012
3
US
Hi,

I need some help to write a formula in excel to extract some numberic fileds from a particular string.

I will have strings like:
ABC.T816911.IBB.C423800.TC.CD
QBR.R816411.IBB.CB532401.TC06.RAK.NOR

I would like to extract the numeric field of length 6 and starting with a 4 or 5.

In the above example I would like to get 423800 and 532401 respectively.

Please ignore my unawareness of these silly things.

Thanks
 
You need to specify some more information. Do the numbers always appear at the same place in the string? Are they always terminated by "."?
Can there only ever be one valid number in each string? If there are more than one valid numbers in a string, do you want only the first one, or the last, or all of them? If there is more than one found, how do you want the answers returned? As an array, as a comma-separated string, what?

Meanwhile, look up the MID function in help and you can probably figure out the rest for yourself. If you do, let us know what your solution was. If not, come back with the answers to the above questions.

Tony

 
Hi Tony,

Thanks for your quick response.
the number will be terminated by a "." always and there will not be more than one such numbers staring with 4/5 and of lenght 6.

So in a string I will have only one 4NNNNN. or 5NNNNN.

As of now I have tried with something like the below and have been able to do it for numbers starting with 4.

=ISNUMBER(1*1*(MID(A1,IF(ISNUMBER(1*MID(A1,FIND(4,A1,1),7))=FALSE,FIND(4,A1,B1+1),0),7)))

Thanks
 
I don't understand what you are trying to do here. I thought you wanted the number itself extracted. Your function returns a boolean.

Are you saying this function works with your data? What is in cell A1? What is in cell B1?

Tony
 

I'd use a UDF...
Code:
Function TheNumber(s As String) As String
    Dim i As Integer, x As String
    
    For i = 1 To Len(s) - 6
        Select Case Mid(s, i, 1)
            Case "4", "5"
                x = Mid(s, i, 6)
                If IsNumeric(x) Then
                    TheNumber = x
                    Exit Function
                End If
        End Select
    Next
End Function
Paste this code into a MODULE.

Use this function just as you would any other spreadsheet function in another cell.

Skip,
[sub]
[glasses]Just traded in my old subtlety...
for a NUANCE![tongue][/sub]
 
To be honest so would I, but, given that this was posted in the office forum, I bet Skip could do it via a formula if he really wanted to ;)

If you are always sure that there will ne a certain number of characters before the first possible location of the number then you could modify Skip's code to include a start point, to avoid searching through impossible locations. Also, if you think the length might ever change, or be variable, then you could include the number length as an input too.

Perhaps something like this:
Code:
Function TheNumber(s As String, Optional start As Long = 1, Optional length As Long = 6) As String
    Dim i As Integer, x As String
    
    For i = start To Len(s) - length
        Select Case Mid(s, i, 1)
            Case "4", "5"
                x = Mid(s, i, 6)
                If IsNumeric(x) Then
                    TheNumber = x
                    Exit Function
                End If
        End Select
    Next
End Function

The default values mean that if you want to start at 1 and the length is 6, you don't need to enter anything but the location of the cell where the string is stored.

Do you know how to use the above suggestion?

If not, it is very straightforward. Look up user functions or post back here.

Tony
 
Thanks a lot for the help. It worked with the UDF properly.

And sorry if I was not clear with my requirements.

 


and you can include an argument for the values that you might look for, rather than assuming ONLY 4 & 5...
Code:
Function TheNumber(s As String, LookFor As String, Optional start As Long = 1, Optional length As Long = 6) As String
    Dim i As Integer, x As String, a, idx As Integer
    
    a = Split(LookFor, ",")
    
    For i = start To Len(s) - length
        For idx = 0 To UBound(a)
            If a(idx) = Mid(s, i, 1) Then
                x = Mid(s, i, 6)
                If IsNumeric(x) Then
                    TheNumber = x
                    Exit Function
                End If
            End If
        Next
    Next
End Function
where the argument LookFor would have a Text value like [highlight]"4,5"[/highlight] ie the numeric digit(s) of interest, separated by a COMMA.

Skip,
[sub]
[glasses]Just traded in my old subtlety...
for a NUANCE![tongue][/sub]
 


oops, Tony. I just noticed that the argument length ALSO needs to be applied to the Mid function...
Code:
Function TheNumber(s As String, LookFor As String, Optional start As Long = 1, Optional length As Long = 6) As String
    Dim i As Integer, x As String, a, idx As Integer
    
    a = Split(LookFor, ",")
    
    For i = start To Len(s) - length
        For idx = 0 To UBound(a)
            If a(idx) = Mid(s, i, 1) Then
                x = Mid(s, i, [highlight]length[/highlight])
                If IsNumeric(x) Then
                    TheNumber = x
                    Exit Function
                End If
            End If
        Next
    Next
End Function


Skip,
[sub]
[glasses]Just traded in my old subtlety...
for a NUANCE![tongue][/sub]
 
Taking the above ideas and sticking them in a regex solution...
Code:
Public Function TheNumber(s As String, LookFor As String, Optional length As Long = 6) As String

Dim m As Variant

With CreateObject("VBScript.Regexp")

    .MultiLine = False
    .Global = False
    .IgnoreCase = True
    .Pattern = "[" & LookFor & "][0-9]{" & length - 1 & "}"
    For Each m In .Execute(s)
        TheNumber = m.Value
    Next m

End With

End Function
The only difference to Skip's use of the LookFor parameter is that it shouldn't be separated with a comma in this variant. So if you were looking for numbers starting 4 or 5, then you would use "45" as the input.

I know this post isn't needed, but I've not had the chance to use Regex in VBA (or anything in VBA for that matter) for a long while and wanted to keep my skills up [tongue]

Andy
---------------------------------
Zebracorn: 50% Zebra, 50% Unicorn = 100% Real.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top