Regular expression in files 2

Status
Not open for further replies.

bkowlagi

Programmer
Apr 8, 2003
28
US
There has probably been a post on something similar to this, but I am unable to find it. Anyway, here's my problem.

I have to search a text file for lines containing a string like "0?????*" and then copy each such line into a new file. The ? marks are wildcard characters, each of which has to be a digit (0-9). The problem is that there can be any number of ? characters.

Is there any code or an existing post that covers this?


Thanks.
 
Could you say exactly what result you expect? The regular expression above matches anything, including both empty and non-empty strings. Let me explain.
Read as a regular expression, "0?????*" is treated like this:

0 - the character '0'
? - the preceding expression (i.e. 0) may occur once or be missing
? - the preceding expression (i.e. 0?) may occur once or be missing
? - the preceding expression (i.e. 0??) may occur once or be missing
.....
All of the above amounts to the same thing as a simple 0?
* - the preceding expression (i.e. 0?????) may occur zero or more times, so 0?????* is the same as 0?*, which in turn is the same as 0*.
Also, no position is specified, so the expression may match anywhere in the text. In other words, you are searching for something that contains 0 zero or more times, and that matches any string, including the empty string.
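
A quick way to see the point above is to run an unanchored search with the simplified form 0*. This is only a sketch: it assumes a C++11 compiler with std::regex, which the thread itself does not mention, and found_in is a made-up helper name. The original "0?????*" is deliberately not fed to the engine, since real engines may reject stacked quantifiers or read ?? as a lazy quantifier.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches somewhere inside `text`.
// std::regex_search is unanchored, so a pattern that can match the empty
// string succeeds on every input, including "".
bool found_in(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    return std::regex_search(text, re);
}
```

With this helper, found_in("", "0*") and found_in("abc", "0*") are both true, because 0* is satisfied by zero occurrences of '0'.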

Ion Filipski
 
>The ? marks are wildcard characters which have to be a number(0-9). problem is there can be any number of ?characters.

Your regular expression should be something like
"0[0-9]+", i.e. a 0 followed by at least one digit.

Boost provides a library for using regular expressions in C++.
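
As a rough illustration of "0[0-9]+", here is a minimal sketch using std::regex from the C++11 standard library, whose interface is modelled on Boost.Regex. Using std::regex rather than Boost is an assumption made here to keep the example dependency-free, and first_zero_run is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `line` that is a '0'
// followed by one or more digits, or an empty string if there is none.
std::string first_zero_run(const std::string& line) {
    static const std::regex re("0[0-9]+");
    std::smatch m;
    return std::regex_search(line, m, re) ? m.str() : std::string();
}
```

Because + is greedy, the match extends over every digit that follows the 0, so the length is not limited to six characters.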

/Per
"It was a work of art, flawless, sublime. A triumph equaled only by its monumental failure."
 
If you mean "a 0 followed by any five other digits" anywhere in the text, your regular expression should look like this:
"0[0-9]{5}"
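
A small sketch of how that pattern behaves in an unanchored search, again assuming std::regex from C++11 rather than any library named in the thread (has_zero_plus_five is a hypothetical name):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` contains a '0' followed by five digits
// at some position. Nothing anchors the end of the match, so longer digit
// runs also qualify.
bool has_zero_plus_five(const std::string& text) {
    static const std::regex re("0[0-9]{5}");
    return std::regex_search(text, re);
}
```

Note that "0198334567" also counts as a match, since its first six characters satisfy the pattern. If longer runs must be rejected, the pattern needs something after the digits, such as a required terminator.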

Ion Filipski
 
Examples of the kind of strings I need to match are

000123*
019833*
021375*

and so on, each on a different line.

I tried a regular expression like
0 + [0-9] + [0-9] + [0-9] + [0-9] + [0-9] + [*]
Note: I need to match the asterisk character itself, not use it as a wildcard.

This gave me very strange results: sometimes strings with fewer characters were matched, and some of the matched strings had no ending '*'.

Does the MS VC++ compiler treat regex in a different way?

A colleague suggested using Perl, which is the best for regex and file manipulation.
 
Remember, + means the preceding expression must occur at least once, so + = 1 or more occurrences. * means 0 or more occurrences.
So let's go through your regular expression:
1. 0+ will match 0, 00, 000, 00000 or any longer run of 0s
2. [0-9]+ will match any run of at least one digit, for example 341234314, so your number could be, for example, 00000342342303434238
....
...

You need an expression like this:
"0[0-9]{5}\\*"
in other words, {0 ; any 5 digits ; asterisk}

Remember, * is treated as any number of occurrences, from 0 to infinity. So if you want the character '*' itself, you should escape it with a backslash, \*, which in a C++ string is written "\\*".
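
Putting the pattern together with the original task (copy matching lines into a new file), a minimal sketch might look like the following. It assumes std::regex from C++11, which is not something the thread uses, and copy_matching_lines is a hypothetical name. It takes streams rather than file names so it can be tried on in-memory data.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <regex>
#include <string>

// Hypothetical helper: copies every line of `in` that contains a '0', five
// digits and a literal '*' to `out`. Returns the number of lines copied.
std::size_t copy_matching_lines(std::istream& in, std::ostream& out) {
    // "\\*" in the C++ literal reaches the regex engine as \* (a literal asterisk).
    static const std::regex re("0[0-9]{5}\\*");
    std::size_t copied = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (std::regex_search(line, re)) {
            out << line << '\n';
            ++copied;
        }
    }
    return copied;
}
```

For real files, a std::ifstream and a std::ofstream can be passed in place of the generic streams.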

Ion Filipski
 
Yep, the reg exp worked.

I was just looking at some Perl code for regular expressions, since I intend to use a lot of them over the next few days to build some useful tools. I was wondering whether Perl is better, or whether I should stick to C/C++ using the regular expression libraries at boost.org.
 
Yes, I'd say Perl is a regular-expression-oriented language. There is another forum for Perl programmers, forum219.

Ion Filipski
 
Microsoft C++ has at least two regexp implementations:
the first is in ATL and the second is for managed C++, the same one used in C#.

Ion Filipski
 