macro to find&replace variable length number from alphanumeric string 1

rkirbygbr · Sep 10, 2010

Hi all

Please see the attached WORD 2003 file. If the file doesn't attach here's sample data:

Dix, NM,Total, 4Dike, CJ,Total, 23Eddy, RM,Total, 21Eustacio, JR,Total, 5Ewan, TB,Total, 111Fairy, HK,Total, 3333Fields, AS,Total, 33487Fields, T,Total, 1Fleming, OJ,Total, 2Forster, DM,Total, 1Freud, U,Total, 813Garban, CK,Total, 1Montford, SD,Total, 5Goodfellows, RD,Total, 1

This data is a dummy extract of corrupted data whereby numbers (of variable length) have appended themselves to the first part of each surname.

What I need to do is to separate the alphanumeric surname string into it's component number and text parts and separate them with a comma i.e.

Dix, NM,Total, 4Dyke, CJ,Total, 23 etc

becomes

Dix, NM,Total, 4,Dyke, CJ,Total, 23, etc

You'd think it would be easy using Find and Replace, but I'm stumped.

If anyone knows how to handle this I'd be eternally grateful as I have a load of files where I have to do just this.

Cheers

Scott24x7 · Sep 10, 2010

Can you dump them into Excel columns? It would be a lot easier to manipulate the data, and then port them back to Word.
Doing the conversion there would be a snap.

Best Regards,
Scott

"Everything should be made as simple as possible, and no simpler." [hammer]

rkirbygbr · Sep 10, 2010

Hi Scott - I realise I can use the Excel route, but I'm doing the job for someone else and they will receive a whole bundle of Word files soon and I want them to have a one touch solution within that application. i.e. I won't be intervening in the future.

Cheers

-Richard

Scott24x7 · Sep 10, 2010

So, let me ask couple questions. (I can't get your file to download.)

Is the data always 4 fields? And is the 4th field always "appended" to the 1st field of the next record set? And is it continuous, or are their line feeds at the end of a line somewhere, and are they always a fixed number of "records"?

Best Regards,
Scott

"Everything should be made as simple as possible, and no simpler." [hammer]

fumei · Sep 10, 2010

If you are not going through the Excel route there is a rather brute-force aspect to achieving this in Word. It would be difficult even using the Words collection if the data is as you posted.

Dix, NM,Total, 4Dike, CJ,Total, 23Eddy, RM,Total,

In most cases the comma is followed by a space, BUT NH, CJ and RM ar enot followed by a space.

So I have to reiterate the request for details mentioned by Scott regarding the structure of the strings. Is every third item two characters, comma, NO SPACE?

Gerry

Och ammmmm, I think I need a shave.
- hirsute Scot, trying to decide

TonyJollans · Sep 10, 2010

What's the problem with Find and Replace? It seems pretty straightforward ...

Find: ([0-9]{1,})([A-z])
Replace with: \1,\2

Check "Use wildcards" and hit "Replace All"

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website

rkirbygbr · Sep 10, 2010

Dear all

Thanks for the replies. Scott and Gerry - two characters, comma, NO SPACE is not quite the shape of things as there can be 1 or more initials (4 is the most I've noticed).

I can't check the NO SPACE query at the moment as I can't access the complete set of files from home (where I am now), but I'll eliminate all the spaces if there are any. Therefore, it's safe to assume NO SPACE is the way I'll go.

Tony, thanks - I'll try your Find and Replace when I get into work again on Monday. As you can tell, I'm not too hot on the subject, although I did try something similar, but it wouldn't select the whole number and the first letter. I did try your Find and Replace in OpenOffice Writer though and the data looked like this after:

Dix, NM,Total, \1,\\1,\2ke, CJ,Total, \1,\\1,\2dy, RM,Total, \1,\\1,\2stacio, JR,Total, \1,\\1,\2an, TB,Total, \1,\2airy, HK,Total, \1,\2ields, AS,Total, \1,\2ields, T,Total, \1,\2leming, OJ,Total, \1,\2orster, DM,Total, \1,\2reud, U,Total, \1,\2arban, CK,Total, \1,\2ontford, SD,Total, \1,\2oodfellows, RD,Total, 1

I guess it will behave differently in Word.

Will report back on Monday.

Cheers

-Richard

rkirbygbr · Sep 13, 2010

Hi Tony

Just to let you know - the find and replace works perfectly, so thanks again.

Best wishes

-Richard

TonyJollans · Sep 13, 2010

Glad it worked. On re-reading it, my post seems a bit brusque - my apologies. Word's wildcard pattern matching is one of a kind and can be enormously frustrating sometimes.

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website

jges · Sep 13, 2010

The syntax Tony used is part of 'regular expressions' and is indeed a very powerful search/replace technique. I have found this website to be very helpful when creating a regular expression:

http://www.regular-expressions.info/index.html

I wasn't aware you could use regular expressions in Word, thanks for pointing that out, Tony.

TonyJollans · Sep 13, 2010

Word's version of regular expressions is not very regular, but it is better than nothing.

Enjoy,
Tony

------------------------------------------------------------------------------------
We want to help you; help us to do it by reading this: Before you ask a question.

I'm working (slowly) on my own website

fumei · Sep 13, 2010

Actually, the syntax Tony used is Word wildcards, not regex.

unknown

jges · Sep 13, 2010

It appears to borrow heavily from regex.

fumei · Sep 15, 2010

Oh yes, it does. However, the major difference is that to use RegEx in VBA you either need a set reference to Microsoft VBScript Regular Expression 5.5 (to use as early-binding) or use explicit reference to the RegEx library.

Code:

Function RegExpTest(patrn, strng)
   Dim regEx, Match, Matches   ' Create variable.
   Set regEx = New RegExp   ' Create a regular expression.
   regEx.Pattern = patrn   ' Set pattern.
' yadda yadda yadda

Word wildcards are available to use with no extra effort.

unknown

strongm · Sep 15, 2010

jges and Tony both mention regular expressions rather than regex. Word's 'wild cards' are indeed regular expressions, just not derived from the same sources as those used in the VBScript Regular Expressions (RegEx) library - Word wild cards are basically based on regular expressions initially used by old Unix text parsing applications such as vi, sed etc; RegEx triess to emulate Perl's regular expressions)

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

macro to find&replace variable length number from alphanumeric string 1

rkirbygbr

Programmer

Scott24x7

Programmer

rkirbygbr

Programmer

Scott24x7

Programmer

fumei

Technical User

TonyJollans

Programmer

rkirbygbr

Programmer

rkirbygbr

Programmer

TonyJollans

Programmer

jges

Technical User

TonyJollans

Programmer

fumei

Technical User

jges

Technical User

fumei

Technical User

strongm

MIS

Similar threads

Part and Inventory Search

Sponsor

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

macro to find&amp;replace variable length number from alphanumeric string 1

Programmer

Programmer

Programmer

Programmer

Technical User

Programmer

Programmer

Programmer

Programmer

Technical User

Programmer

Technical User

Technical User

Technical User

MIS

Similar threads

Log in

Part and Inventory Search

Sponsor

macro to find&replace variable length number from alphanumeric string 1