Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Check if any text follows a numeric string and truncate the text

Status
Not open for further replies.

waubain

Technical User
Dec 13, 2011
200
US
I am screen scraping data from our legacy system. In our system a provider enters a score into a template freetext textbox (which I cannot change). The scraped data can take any of the below formats:

2
2.5
2-2.5
2.0-2.5
na

The problem arises when a provider enters some free text after the score:

2-2.5 no headaches
2-2.5 (no headaches)
2-2.5 **no headaches**
etc

I have used InStr and IfNumeric in the past to validate and truncate the string, but I am at a lost when the beginning of the string could be any letter or character and the string itself could contain a hyphen. The possibility is that there may be thousands of rows of data to manually fix one at a time.

Eventually I will do a calculation on all the scores. If the string is a range then take the average of two values first. This part of the code is working.

Any suggestions. Thank you in advance.

You don't know what you don't know...
 
Duane, thanks for the suggestion. I probably was not clear but if the value is a range such as 1-1.5, then I eventually have to average those two numbers to get 1.25. Thus the Val() function would not work, but was not familiar with that function so learned something new. On the way home I thought about using the InStr(MyString, " "). It is not bullet proof, but for this project should work.
Thanks again.

You don't know what you don't know...
 
If your text starts with required data (the first "word") and can be followed by space + any other string, then you can extract the first part using Split(YourInputString," ")(0)

combo
 
I would even go a step further: Split the test by the space, and then Split it again by the ‘-‘, and then evaluate each piece.

Code:
Dim ary() 
Dim aryDetl()
Dim i As integer
Dim x As Integer

ary = Split(YourInputString, “ “)
For i = LBoud(ary) To UBound(ary)
    aryDetl = Splir(ary(i), “-“)
    For x = LBoud(aryDetl) To UBound(aryDetl)[green]
        'Evaluate every element of aryDetl here[/green]
    Next x
Next i

Have fun.

---- Andy
 
Why?
If you have this: [tt]2-2.5 no headaches[/tt]
and Split it by a space (Split(YourInputString," ")(0)), you get "[tt]2-2.5[/tt]"

"if the value is a range such as 1-1.5, then I eventually have to average those two numbers to get 1.25"

How do you take [tt]2-2.5[/tt] and calculate the average?

If you Split it again by -, you get 2 element: 2 and 2.5, both are numeric, easy to add them and divide by 2, you get the average (2.25)

Unless there is another way to have [tt]2-2.5[/tt] and get 2.25 out of it.... :)

Have fun.

---- Andy
 
Sure, however the OP has already dealt with this:

>If the string is a range then take the average of two values first. This part of the code is working
 
It seems risky to count on there being a space between the last number and the trailing text.

This stuff was manually typed apparently without any input controls or validation.

So if you can have "2-2.5 no headaches" then you might also have "2-2.5no headaches", and if you do there is no space to split on.

Safer to parse through character-by-character and test each to check if it's a number or a letter or a dot or a dash and build numbers from that.
 
I ended up splitting on the space, although, as mintjulep noted this is not bullet proof. Luckily in this case there was always a space, but in other portions of the text I am obtaining, the free text with no data validation has been a major nuisance.

As always, thank you everyone for your suggestions.

You don't know what you don't know...
 
>this is not bullet proof

Quite true, but given that this is free text entry, I'd have thought that simply missing a space is often going to be the least of your worries concerning getting valid data ...
 
Analyzing free text in order to extract data is most often a iterative manual process of analysis, logical categorization, and extract technique.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top