Regular Expression Syntax

kthureen · Jun 13, 2003

Given that this string is a line from a file "test.txt":

"Two words more stuff here til the end of the line"

Can anyone help me with the correct regular expression syntax for the following code, which should
1) identify the line in the file by its first two words
2) allow me to print just the contents of the line that come AFTER the first two words and the large space (i.e. "more stuff...")?

What I've got does not seem to work. Thanks!

infile = open('test.txt', 'r')
linesin = infile.readlines()

for line in linesin:
    pattobj = re.compile("Two\swords\s+\.+")
    matchobj = pattobj.search(line)
    if matchobj:
        print matchobj.group()

sebsauvage · Jun 16, 2003

Hello !

First remark:
You'd better move the re.compile() out of your loop so that the regular expression is not recompiled for each line of the input file.
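
A minimal sketch of this first remark, assuming Python's standard re module. The pattern is compiled once before the loop, and a capture group picks out the remainder of the line. Note that `\.+` in the posted pattern matches one or more literal dots, so `(.+)` is used here instead. The sample lines are made up for illustration and stand in for the contents of test.txt.

```python
import re

# Compile once, outside the loop. A raw string keeps the backslashes intact.
# "\s+" spans the run of whitespace; "(.+)" captures the rest of the line
# (the dot does not match the trailing newline).
pattobj = re.compile(r"^Two\s+words\s+(.+)")

# Hypothetical sample data standing in for infile.readlines().
linesin = [
    "Some other line\n",
    "Two words      more stuff here til the end of the line\n",
]

rests = []
for line in linesin:
    matchobj = pattobj.search(line)
    if matchobj:
        # group(1) is only the captured tail, not the whole match.
        rests.append(matchobj.group(1))

for rest in rests:
    print(rest)
```

Using group(1) rather than group() is what drops the two leading words from the printed text.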

Second remark:
I think regular expressions are overkill here.
The string methods find() or startswith() would be better suited to what you want (because they're much faster).
(Besides, regular expressions can raise exceptions, and you did not enclose your search() call in a try/except block.)

String methods like find() or startswith() are extremely fast.

Here's how I would write it:

[tt]for line in linesin:
    if line.startswith("Two words "):
        print line[10:].lstrip()[/tt]
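
The same idea as a small self-contained sketch. The helper name rest_after_prefix and the sample line are hypothetical, introduced only for illustration. It slices by len(prefix) instead of the hard-coded 10, so the offset stays correct if the prefix changes.

```python
def rest_after_prefix(line, prefix="Two words "):
    """Return the text after `prefix` and any extra whitespace, or None."""
    if not line.startswith(prefix):
        return None
    # Slice past the prefix, then trim the remaining gap and the newline.
    return line[len(prefix):].strip()


# Hypothetical sample line standing in for a line read from test.txt.
sample = "Two words      more stuff here til the end of the line\n"
result = rest_after_prefix(sample)
print(result)
```

Unlike the regular expression, this requires a literal space after "words", so a line that separates the two words from the rest with a tab would not match.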

kthureen · Jun 16, 2003

Thanks for the help, sebsauvage!
