Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

macro to find&replace variable length number from alphanumeric string 1

Status
Not open for further replies.

rkirbygbr

Programmer
Feb 15, 2003
4
GB
Hi all

Please see the attached WORD 2003 file. If the file doesn't attach here's sample data:

Dix, NM,Total, 4Dike, CJ,Total, 23Eddy, RM,Total, 21Eustacio, JR,Total, 5Ewan, TB,Total, 111Fairy, HK,Total, 3333Fields, AS,Total, 33487Fields, T,Total, 1Fleming, OJ,Total, 2Forster, DM,Total, 1Freud, U,Total, 813Garban, CK,Total, 1Montford, SD,Total, 5Goodfellows, RD,Total, 1

This data is a dummy extract of corrupted data whereby numbers (of variable length) have appended themselves to the first part of each surname.

What I need to do is to separate the alphanumeric surname string into it's component number and text parts and separate them with a comma i.e.

Dix, NM,Total, 4Dyke, CJ,Total, 23 etc

becomes

Dix, NM,Total, 4,Dyke, CJ,Total, 23, etc

You'd think it would be easy using Find and Replace, but I'm stumped.

If anyone knows how to handle this I'd be eternally grateful as I have a load of files where I have to do just this.

Cheers
 
Can you dump them into Excel columns? It would be a lot easier to manipulate the data, and then port them back to Word.
Doing the conversion there would be a snap.


Best Regards,
Scott

"Everything should be made as simple as possible, and no simpler."[hammer]
 
Hi Scott - I realise I can use the Excel route, but I'm doing the job for someone else and they will receive a whole bundle of Word files soon and I want them to have a one touch solution within that application. i.e. I won't be intervening in the future.

Cheers

-Richard
 
So, let me ask couple questions. (I can't get your file to download.)

Is the data always 4 fields? And is the 4th field always "appended" to the 1st field of the next record set? And is it continuous, or are their line feeds at the end of a line somewhere, and are they always a fixed number of "records"?


Best Regards,
Scott

"Everything should be made as simple as possible, and no simpler."[hammer]
 
If you are not going through the Excel route there is a rather brute-force aspect to achieving this in Word. It would be difficult even using the Words collection if the data is as you posted.

Dix, NM,Total, 4Dike, CJ,Total, 23Eddy, RM,Total,

In most cases the comma is followed by a space, BUT NH, CJ and RM ar enot followed by a space.

So I have to reiterate the request for details mentioned by Scott regarding the structure of the strings. Is every third item two characters, comma, NO SPACE?

Gerry

Och ammmmm, I think I need a shave.
- hirsute Scot, trying to decide
 
What's the problem with Find and Replace? It seems pretty straightforward ...

Find: ([0-9]{1,})([A-z])
Replace with: \1,\2

Check "Use wildcards" and hit "Replace All"

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website
 
Dear all

Thanks for the replies. Scott and Gerry - two characters, comma, NO SPACE is not quite the shape of things as there can be 1 or more initials (4 is the most I've noticed).

I can't check the NO SPACE query at the moment as I can't access the complete set of files from home (where I am now), but I'll eliminate all the spaces if there are any. Therefore, it's safe to assume NO SPACE is the way I'll go.

Tony, thanks - I'll try your Find and Replace when I get into work again on Monday. As you can tell, I'm not too hot on the subject, although I did try something similar, but it wouldn't select the whole number and the first letter. I did try your Find and Replace in OpenOffice Writer though and the data looked like this after:

Dix, NM,Total, \1,\\1,\2ke, CJ,Total, \1,\\1,\2dy, RM,Total, \1,\\1,\2stacio, JR,Total, \1,\\1,\2an, TB,Total, \1,\2airy, HK,Total, \1,\2ields, AS,Total, \1,\2ields, T,Total, \1,\2leming, OJ,Total, \1,\2orster, DM,Total, \1,\2reud, U,Total, \1,\2arban, CK,Total, \1,\2ontford, SD,Total, \1,\2oodfellows, RD,Total, 1

I guess it will behave differently in Word.

Will report back on Monday.

Cheers

-Richard

 
Hi Tony

Just to let you know - the find and replace works perfectly, so thanks again.

Best wishes

-Richard
 
Glad it worked. On re-reading it, my post seems a bit brusque - my apologies. Word's wildcard pattern matching is one of a kind and can be enormously frustrating sometimes.

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website
 
The syntax Tony used is part of 'regular expressions' and is indeed a very powerful search/replace technique. I have found this website to be very helpful when creating a regular expression:
I wasn't aware you could use regular expressions in Word, thanks for pointing that out, Tony.
 
Actually, the syntax Tony used is Word wildcards, not regex.


unknown
 
Oh yes, it does. However, the major difference is that to use RegEx in VBA you either need a set reference to Microsoft VBScript Regular Expression 5.5 (to use as early-binding) or use explicit reference to the RegEx library.
Code:
Function RegExpTest(patrn, strng)
   Dim regEx, Match, Matches   ' Create variable.
   Set regEx = New RegExp   ' Create a regular expression.
   regEx.Pattern = patrn   ' Set pattern.
' yadda yadda yadda

Word wildcards are available to use with no extra effort.


unknown
 
jges and Tony both mention regular expressions rather than regex. Word's 'wild cards' are indeed regular expressions, just not derived from the same sources as those used in the VBScript Regular Expressions (RegEx) library - Word wild cards are basically based on regular expressions initially used by old Unix text parsing applications such as vi, sed etc; RegEx triess to emulate Perl's regular expressions)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top