Regular Expression Syntax

Status
Not open for further replies.

kthureen

Programmer
May 28, 2003
6
US
Given that this string is a line from a file "test.txt":

"Two words more stuff here til the end of the line"

Can anyone help me with the correct regular expression syntax for the following code that will
1) identify the line in the file by its first two words
2) allow me to print off just the contents of the line that come AFTER the first two words and the large space (i.e. "more stuff...")?

What I've got does not seem to work. Thanks!


import re

infile = open('test.txt', 'r')
linesin = infile.readlines()

for line in linesin:
    pattobj = re.compile("Two\swords\s+\.+")
    matchobj = pattobj.search(line)
    if matchobj:
        print matchobj.group()

 
Hello!

First remark:
You'd better move the re.compile() out of your loop so that the regular expression is not recompiled for each line of the input file.
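
A minimal sketch of this first remark, with the pattern compiled once before the loop. It is written for Python 3 (the code in this thread uses the Python 2 print statement). The function name rest_after_two_words and the sample data are hypothetical. The capture group (.+) is also an assumption about what was intended: the "\.+" in the posted pattern matches literal dots only, which is probably why nothing was found.

```python
import re

# Compiled once, outside the loop, so it is not rebuilt for every line.
# Assumption: "(.+)" replaces the original "\.+" (which matches literal dots)
# and captures whatever follows the two words and the run of whitespace.
PATTERN = re.compile(r"Two\swords\s+(.+)")


def rest_after_two_words(lines):
    """Return the text that follows 'Two words' on each matching line."""
    results = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # group(1) is the captured remainder; rstrip() drops trailing spaces.
            results.append(match.group(1).rstrip())
    return results


# Hypothetical sample standing in for the contents of test.txt.
sample = [
    "Two words      more stuff here til the end of the line\n",
    "Some other line\n",
]
print(rest_after_two_words(sample))
```

Like the posted code, this uses search(), which finds the two words anywhere in the line. If they must be the first two words, PATTERN.match(line) anchors the match at the start of the line instead.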

Second remark:
I think regular expressions are overkill here.
The string methods find() and startswith() would be better suited to what you want, because for a fixed prefix they are much faster than a regular expression.
(Besides, regular expressions can raise exceptions, for example when re.compile() is given an invalid pattern, and you did not enclose your code in a try/except block.)


Here's how I would write it:

[tt]for line in linesin:
    if line.startswith("Two words "):
        print line[10:].lstrip()[/tt]
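
The same idea can be sketched as a small Python 3 function that avoids the hard-coded 10 by using the length of the prefix. The names PREFIX and strip_prefix are hypothetical, and using strip() instead of lstrip() is an assumption that the trailing newline is not wanted either.

```python
# Hypothetical constant: the two leading words plus one following space.
PREFIX = "Two words "


def strip_prefix(line, prefix=PREFIX):
    """Return the text after `prefix`, or None if the line does not start with it."""
    if line.startswith(prefix):
        # len(prefix) replaces the literal 10; strip() removes the remaining
        # leading whitespace and the trailing newline.
        return line[len(prefix):].strip()
    return None


# Hypothetical sample line standing in for a line of test.txt.
print(strip_prefix("Two words      more stuff here til the end of the line\n"))
```

Computing the slice from len(prefix) means the offset stays correct if the two words being searched for ever change.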
 