Remove many carriage returns

Quintios · Jul 23, 2003

I searched but the other two threads on the same subject didn't really fit.

I have various bits of text that are encapsulated in parens, like this (xxx). I want to get them each on their own lines, like this:

(xxx)
(xxx)
(xxx)

The problem is, the file looks like this:

(xxx)(xxx)
(xxx)
(xxx)(xxx)

I've tried this code:

Code:

Dim RegX, SearchPattern, ReplacedText, ReplaceString
Set RegX = New RegExp

SearchPattern = "\)"
ReplaceString = ")" & vbCrLf
RegX.Pattern = SearchPattern
RegX.Global = True
RegX.IgnoreCase = True
CurrentLine = RegX.Replace(CurrentLine, ReplaceString)

But that gives me:

(xxx)

(xxx)
(xxx)

(xxx)... and so on...

The bytes are 0D 0A 0D 0A (CR LF CR LF) where the extra blank line shows up, but so far I have been unable to replace the two line breaks with just one. I've tried using ASCII codes and I've tried using the "\n" character, but to no avail. Truthfully, I am not very familiar with RegExp.Replace and I'm probably using the wrong syntax, so if you know the correct syntax I'd love to get another opinion.
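A minimal sketch of why the blank lines appear, assuming the text is held in one string that already uses CR LF line endings. The sample string and the `HexDump` helper are made up for illustration; the script is meant to run under cscript. A ")" at the end of a line already has a CR LF after it, so adding another one leaves two line breaks in a row.

```vbscript
' Hypothetical helper: show each character of s as a two-digit hex code.
Function HexDump(s)
    Dim i, out
    out = ""
    For i = 1 To Len(s)
        out = out & Right("0" & Hex(Asc(Mid(s, i, 1))), 2) & " "
    Next
    HexDump = Trim(out)
End Function

Dim RegX, Sample, Doubled
Sample = "(a)(b)" & vbCrLf & "(c)"

Set RegX = New RegExp
RegX.Pattern = "\)"
RegX.Global = True

' Every ")" gets a CR LF, including the one that already had one.
Doubled = RegX.Replace(Sample, ")" & vbCrLf)

' Prints: 28 61 29 0D 0A 28 62 29 0D 0A 0D 0A 28 63 29 0D 0A
WScript.Echo HexDump(Doubled)
```

The run 0D 0A 0D 0A after the second element is the unwanted blank line.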

Yes, one solution is to close the file, open it back up, and copy every non-blank line into a new file, but, well, yuk. I should be able to accomplish this without doing that using the RegExp object.

Any ideas?

Onwards,

Q-

onpnt · Jul 23, 2003

Have you tried using Match instead of Replace?


Quintios · Jul 23, 2003

In the code I listed, there is a match pattern. When a match is found, then the replace should take place. So far I've been unable to get a match to work, or it turns it into an odd file with the black block characters (visible in Notepad).

Do you know what code to use to take

(xxx)(xxx)
(xxx)
(xxx)(xxx)

and turn it into

(xxx)
(xxx)
(xxx)
(xxx)
(xxx)

?????

Onwards,

Q-

spazman · Jul 23, 2003

Have you tried just

Replace(TextFromFile, ")", ")" & vbCrLf)

clarkin · Jul 23, 2003

The problem with spazman's Replace(TextFromFile, ")", ")" & vbCrLf) is that it will match ")" even when there is already a new line right after it, and insert another one.

how about this:

Code:

strFixedText = Replace(TextFromFile, ")(", ")" & vbCrLf & "(")

that should work for you

Posting code? Wrap it with code tags: [code]CodeHere[/code].

Quintios · Jul 23, 2003

Well, I guess I should have been more specific. I'm sorry.

The elements are actually like this:

AI(XXX)DI(XXX)
AC(XXX)
DC(XXX)

But actually, the first character will always be an A or a D, so I could run it through twice (it runs pretty quick even on a PentiumII 400) with the pattern set to:

)A and
)D
to be
) vbcrlf A
) vbCrLf D

I'll check and see if it works. Good suggestions! Stars for y'all!

Onwards,

Q-

clarkin · Jul 23, 2003

If you wanted to generalize it, i.e. not hardcode )A and )D, you will probably have to use regular expression matching as mentioned above.

Just use a reg exp pattern to find [close bracket NOT followed by carriage return] and replace each one with [close bracket carriage return]. I know it's possible with reg exps, but I'm afraid the correct syntax escapes me, hence my [] natural language notation.
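The idea described in words above can be sketched with a negative lookahead. This assumes the VBScript 5.5 regular expression engine, which accepts `(?!...)`; the function name `BreakAfterParens` and the sample text are made up for illustration. One caveat: a ")" at the very end of the text is also "not followed by a line break", so it gets a trailing vbCrLf too.

```vbscript
' Hypothetical helper: put a line break after every ")" that does not
' already have one. (?![\r\n]) only looks at the next character without
' consuming it, so nothing after the ")" is lost.
Function BreakAfterParens(s)
    Dim re
    Set re = New RegExp
    re.Pattern = "\)(?![\r\n])"
    re.Global = True
    BreakAfterParens = re.Replace(s, ")" & vbCrLf)
End Function

Dim Sample
Sample = "AI(XXX)DI(XXX)" & vbCrLf & "AC(XXX)" & vbCrLf & "DC(XXX)"

' Each element ends up on its own line; existing line breaks are untouched.
WScript.Echo BreakAfterParens(Sample)
```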

good luck


Quintios · Jul 23, 2003

I may have to do that. However, I got an error when attempting to use:

ReplaceString = ")" & vbCrLf & "A"

Can't see a syntax error there. The error is:

Microsoft VBScript runtime error: Syntax error in regular expression

and it errors on this line:

CurrentLine = RegX.Replace(CurrentLine, ReplaceString)

Oh well. I've been copying the entire file from one tmp file to another and removing the empty lines. Completely inefficient, but nonetheless it works. I'm getting a bit better with the RegExp pattern stuff, but the "\n" character still escapes me...

Onwards,

Q-

Quintios · Jul 23, 2003

Dang it! Forgot to escape the parens:

")A"

needs to be

"\)A"

Now it works!

Onwards,

Q-

clarkin · Jul 23, 2003

If you are only matching )A and )D you can just use Replace twice instead of a reg exp, which is much simpler:

Code:

strFixedText = Replace(Replace(TextFromFile, ")A", ")" & vbCrLf & "A"), ")D", ")" & vbCrLf & "D")

In an effort to get up to speed with reg exps I'm trying to get my earlier natural language thing to work with a reg exp. I can get it to match every ) that does NOT have a new line after it, but when I replace them to include the newline it's eating the following character.

Let you know if I succeed.


clarkin · Jul 23, 2003

ok, I think I have it working with reg exps now...

(this was done in an ASP page, but you should be able to see what it's doing):

Code:

Dim RegX, mtch, SearchPattern, strReplaceWith, strFileContents

strFileContents = "AB(xxxx)DA(xxx)" & vbCrLf & "ADDA(xx)" & vbCrLf & "D(x)"
Set RegX = New RegExp
SearchPattern = "\)([^\r\n])"
strReplaceWith = ")" & vbCrLf & "$1"
RegX.Pattern = SearchPattern
RegX.Global = True
RegX.IgnoreCase = True

Response.Write(strFileContents & "->")
strFileContents = RegX.Replace(strFileContents, strReplaceWith)
Response.Write(strFileContents)

the extra character that was being 'eaten' is being put back in with that $1, as I understand it

let me know how it works for you..


Quintios · Jul 24, 2003

Man, that works like a charm! That took out about 60 lines of repetitive code; nice.

How did you come up with that? What do all the characters mean? I have the reference paper here from the WSH documentation.

\) -> escapes the parens
( -> starts a pattern sequence
[ -> no idea why that is there
^ -> Matches the beginning of input?
\r -> Carriage Return
\n -> linefeed
] -> well, you had one at the beginning, guess you gotta close
) -> closing parens

I just don't understand...

Now the only problem I'm having is that sometimes there are two or three carriage returns in a row. I would imagine that I could replace all of them like this:

SearchPattern = "([^\r\n]+)"
ReplaceString = vbCrLf

I'll check and see if that'll work...

Nope. All that did was erase everything..

I wish I could give you more than one star. I'm almost there! Just one more little trick and I've got it.

Any ideas how to remove more than one CRLF in a row?

Onwards,

Q-

PHV · Jul 24, 2003

Try this:

Code:

SearchPattern = "([\r\n]+)"
ReplaceString = vbCrLf

The ^ just after a [ means: NOT any of the following characters (up to the ]).

Hope this helps
PH.
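A small sketch of the pattern above on a made-up string. It assumes the whole text is in one variable and that Global is set to True; with Global left at its default of False, only the first run of line breaks would be replaced. The name `CollapseBreaks` is hypothetical.

```vbscript
' Hypothetical helper: collapse any run of CR and/or LF characters
' into a single CR LF.
Function CollapseBreaks(s)
    Dim re
    Set re = New RegExp
    re.Pattern = "[\r\n]+"
    re.Global = True
    CollapseBreaks = re.Replace(s, vbCrLf)
End Function

Dim Sample
' Three CR LFs in a row, then two bare LFs.
Sample = "AI(x)" & vbCrLf & vbCrLf & vbCrLf & "DI(y)" & vbLf & vbLf & "AC(z)"

' Prints the three elements on three consecutive lines.
WScript.Echo CollapseBreaks(Sample)
```

If the text is being processed one line at a time instead, each line never contains more than its own line break, so a pattern like this has nothing to collapse.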

Quintios · Jul 24, 2003

Your explanation is helpful! But for some reason I'm still getting blank lines. I'm not too sure what's going on, but I'm sure that I'll eventually figure it out.

Thank you!

Onwards,

Q-

spazman · Jul 24, 2003

Try this then:

TextFromFile = Replace(Replace(TextFromFile, vbCrLf, ""), ")", ")" & vbCrLf)

clarkin · Jul 24, 2003

Code:

TextFromFile = Replace(Replace(TextFromFile, vbCrLf, ""), ")", ")" & vbCrLf)

Just shows it's always worth getting a new pair of eyes to look at your problem.

Nice solution


clarkin · Jul 24, 2003

And to explain my Reg Exp search pattern (as I understand it!):

SearchPattern = "\)([^\r\n])"

\) : matches a close parenthesis )
( ) : saves whatever matches inside these two parentheses for later reference ($1 in my replace); this was what was being eaten in my earlier attempts
[ ] : this means match one of the characters inside these square brackets, in my case \r or \n (CR or LF)
^ : as the first character inside the square brackets, this changes the above to NOT match either of the two chars

I get very confused by Reg Exp pattern syntax, so the above was found by .. quite a lot of trial & error.
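To see the capture at work, this sketch lists each match of the pattern together with what the parentheses saved, using the RegExp object's Execute method and the SubMatches collection. The sample string is made up, and the script is meant to run under cscript.

```vbscript
Dim RegX, Matches, M, Report
Set RegX = New RegExp
RegX.Pattern = "\)([^\r\n])"
RegX.Global = True

' The ")" before the CR LF and the ")" at the very end are not matched:
' the first is followed by \r, the second by nothing at all.
Set Matches = RegX.Execute("AB(x)DA(y)" & vbCrLf & "D(z)")

Report = ""
For Each M In Matches
    ' M.Value is the whole match; SubMatches(0) is the saved character ($1).
    Report = Report & "matched [" & M.Value & "], $1 = [" & M.SubMatches(0) & "]" & vbCrLf
Next

' Prints: matched [)D], $1 = [D]
WScript.Echo Report
```

Because the match includes the character after the ")", the replacement has to write it back with $1, otherwise it disappears.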

Enjoy


Quintios · Jul 24, 2003

Actually, I got a nice solution from *another* website.

Read the entire file into one string variable. Remove *all* vbCrLf's. Then for every ), replace it with ) & vbCrLf.

Works like a champ. Of course, if it's a large file there could be issues.
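A minimal sketch of those steps using Scripting.FileSystemObject, assuming the file fits comfortably in memory and is not empty (ReadAll raises an error on an empty file). The names `OnePerLine` and `FixFile` and the file name in the usage line are placeholders, not taken from the original script.

```vbscript
' Pure string step: drop every CR LF, then break after every ")".
Function OnePerLine(s)
    OnePerLine = Replace(Replace(s, vbCrLf, ""), ")", ")" & vbCrLf)
End Function

' Read the whole file, fix it, and write it back in place.
Sub FixFile(path)
    Dim fso, ts, contents
    Set fso = CreateObject("Scripting.FileSystemObject")

    Set ts = fso.OpenTextFile(path, 1)          ' 1 = ForReading
    contents = ts.ReadAll
    ts.Close

    Set ts = fso.OpenTextFile(path, 2, True)    ' 2 = ForWriting
    ts.Write OnePerLine(contents)
    ts.Close
End Sub

' Usage (hypothetical file name):
' FixFile "C:\temp\elements.txt"
```

Since every line break is thrown away first, any number of blank lines between elements disappears along the way.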

Lots of ways to do this. Thank you all!

Onwards,

Q-

clarkin · Jul 24, 2003

Just to clarify, Quintios: that is exactly what spazman's final post does.

TextFromFile = Replace(Replace(TextFromFile, vbCrLf, ""), ")", ")" & vbCrLf)

The inner Replace removes all vbCrLf's, then the outer one adds a vbCrLf after every ).

Actually I'm surprised we didn't think of it earlier.


Quintios · Jul 24, 2003

All you guys rule. Thanks for your help!

Onwards,

Q-
