newbie - multiline substitution

preferawk · Aug 28, 2004

I am a perl newbie.

I am trying to write a simple preprocessor to strip out some lines from a file. Basically if "somestring" is found, it should delete all content from that point until ";" is found.

Right now it works only when "somestring" and the ";" are on the same line. Can anyone please help?

Code:

open(INPUT, "test.in") or die $!;

my $contents = do { local $/; <INPUT> };

$contents =~ s/.*somestring.*;//mg;

print $contents;

close (INPUT);

Thanks in advance, and please forgive my ignorance.

PaulTEG · Aug 28, 2004

Is "somestring" itself likely to wrap across a newline? If so, it may require more thought, because the line break could fall where a space would otherwise be.

something like this;

Code:

$contents=~/(.?)somestring(.?);(.?)/mg;
$contents=$1." ".$3;
$deletion="somestring".$2.";";

I haven't tested it, and I'm not sure how the /m and /g options will behave; you may have to loop through the file and check that $3 is defined.

HTH
--Paul


preferawk · Aug 28, 2004

Thanks for responding, Paul.

I see what you are trying to do. Unfortunately, it just returns a space.

Also, I think you are right: with something like that I would have to loop somehow, because even if it did match, it might only handle the first occurrence in the file and not all of them. But I am trying to do it across multiple lines, for all matches, so I am not sure how to go about looping. I guess I could do it manually by brute force.

Thanks again.

M
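For what it's worth, here is a minimal sketch of one way to avoid a manual loop. It assumes the /g, /s and /e modifiers together: /g repeats the substitution for every match, /s lets "." match newlines, and /e runs the replacement as code so each removed chunk can be recorded (roughly Paul's $deletion idea). The sample text and the @deleted array are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample input; "somestring" and ";" are the markers from the question.
my $contents = "a somestring one;\nb somestring two\nlines; c\n";
my @deleted;

# /g: every occurrence, /s: "." also matches newlines,
# /e: the replacement is Perl code; it saves the match and returns ''.
$contents =~ s/(somestring.*?;)/push @deleted, $1; ''/gse;

print $contents;
print "removed: [$_]\n" for @deleted;
```

If the deleted text is not needed, the /e part can be dropped and the replacement left empty.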

naq2 · Aug 28, 2004

OK, here is an answer to both your questions.

To match across more than one line, you have to use the /s modifier, which lets "." match newline characters as well. It looks odd, but that's how it works!

Code:

$contents =~ s/.*somestring.* ;//sg;

For your second question, you have to use a NON-GREEDY quantifier (.*?) so it matches only as much as it needs to succeed and doesn't try to match more.

Code:

$contents =~ s/.*somestring.*? ;//sg;

Hope it works... coz I didn't test it.

But I have a question: what is the first .* doing in your regex? I don't understand its purpose.

naq2 · Aug 28, 2004

OOPS! NO SPACE BEFORE THE ';' IN THE REGEX! SORRY!
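To make the greedy versus non-greedy difference concrete, here is a small sketch on made-up input. It compares the pattern with a leading .* and a greedy second .* against a version with no leading .* and a non-greedy .*?, both with /s and /g. Dropping the leading .* is an assumption about what is wanted, since under /s it swallows everything from the start of the file.

```perl
use strict;
use warnings;

# Made-up sample: two "somestring ... ;" regions, the first spanning two lines.
my $text = "keep1\nfoo somestring a\nb; keep2\nbar somestring c; keep3\n";

# Greedy with a leading .* and /s: the first .* runs from the start of the
# text up to the LAST "somestring", and the second .* runs to the LAST ";".
(my $greedy = $text) =~ s/.*somestring.*;//sg;

# Non-greedy, no leading .*: each match starts at "somestring" and stops
# at the nearest following ";".
(my $lazy = $text) =~ s/somestring.*?;//sg;

print "greedy: [$greedy]\n";   # only " keep3\n" survives
print "lazy:   [$lazy]\n";     # keep1, foo, keep2, bar and keep3 all survive
```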

rharsh · Aug 28, 2004

I have a feeling that the code naq2 posted won't have the intended effect. It will probably remove a lot more text than expected.

Give this a try - it should point you in the right direction anyway:

Code:

my ($begin_del, $end_del) = ('somestring', ';');
while ($contents =~ s/(.+?)$begin_del(.*?)$end_del/$1/is) {}
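A possible single-pass variant of the same idea, offered as a sketch rather than a tested drop-in: it assumes the delimiters should be treated as literal text (hence \Q...\E) and that matching is case-sensitive, unlike the /i above. The helper name strip_between is made up for illustration. Because it has no leading (.+?), it also matches when the begin delimiter is the very first thing in the text.

```perl
use strict;
use warnings;

# strip_between is a hypothetical helper name, not something from this thread.
sub strip_between {
    my ($text, $begin_del, $end_del) = @_;
    # \Q...\E quotes any regex metacharacters in the delimiters.
    # /s lets "." cross newlines, /g handles every occurrence,
    # and .*? stops at the nearest end delimiter.
    $text =~ s/\Q$begin_del\E.*?\Q$end_del\E//gs;
    return $text;
}

my $sample = "somestring x;\nint a;\nint b somestring\nmore\n; int c;\n";
print strip_between($sample, 'somestring', ';');
```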

naq2 · Aug 28, 2004

rharsh, when you talk about my regex removing a lot of inappropriate text, you must mean the first .* (the one before somestring).
I thought the same as you, but preferawk had it in his first regex, which was working, so I kept it.

If it is not this that you're talking about... I don't understand coz your regex is exactly like mine!
