
parsing strings

Status
Not open for further replies.

keano6

Programmer
Feb 25, 2003
4
CN
Hi,

I'm new to c++, and am trying to parse a string and find the number of lines...

I am having problems finding all the newline characters with the following code:

char* tmp;

//skip to the end of the line
while (*tmp != '\n')
tmp++;

Is there a better way of finding this character?

Thanks.
 
for(i = 0; tmp[i] != '\n'; i++);

Ion Filipski
 
You could also use the strchr() function to get a pointer to the newline character.
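A minimal sketch of the strchr() idea, assuming the text is a NUL-terminated C string. The function name count_newlines is hypothetical and not from the thread; it simply hops from one '\n' to the next until strchr() finds no more.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: counts '\n' characters in a NUL-terminated string
// by jumping from one newline to the next with strchr().
int count_newlines(const char* text)
{
    assert(text != nullptr);
    int count = 0;
    const char* p = std::strchr(text, '\n');
    while (p != nullptr) {
        ++count;
        p = std::strchr(p + 1, '\n');  // resume the search just past the hit
    }
    return count;
}
```

strchr() stops at the terminating '\0' on its own, so no separate end-of-string test is needed here.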

Skute

"There are 10 types of people in this World, those that understand binary, and those that don't!"
 
With the STL you can also use the find algorithm on the string class. ATL also offers regular expressions for more complicated situations.
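A rough sketch of the STL route, assuming std::string and <algorithm> are available in your build. The names first_newline and count_newlines_stl are made up for illustration; std::find and std::count are the standard algorithms.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: index of the first '\n', or std::string::npos if none.
std::string::size_type first_newline(const std::string& text)
{
    std::string::const_iterator it =
        std::find(text.begin(), text.end(), '\n');
    if (it == text.end()) {
        return std::string::npos;
    }
    return static_cast<std::string::size_type>(it - text.begin());
}

// Hypothetical helper: number of '\n' characters in the string.
long count_newlines_stl(const std::string& text)
{
    return static_cast<long>(std::count(text.begin(), text.end(), '\n'));
}
```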

Ion Filipski
 
>while (*tmp != '\n')
> tmp++;

>Is there a better way of finding this character?

Depends on what you mean by better. You can't be more efficient than this. (Perhaps also check for *tmp != 0, in case there are no more line breaks.)
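As a sketch of that extra check (the function name skip_to_line_end is hypothetical), the same loop can stop at either a newline or the end of the string, so it cannot run past the terminator when the last line has no line break:

```cpp
#include <cassert>

// Hypothetical helper: returns a pointer to the first '\n' at or after p,
// or to the terminating '\0' if the string ends first.
const char* skip_to_line_end(const char* p)
{
    assert(p != nullptr);
    while (*p != '\0' && *p != '\n') {
        ++p;
    }
    return p;
}
```

The caller can then test *result to see whether it landed on a line break or on the end of the text.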

/Per
[sub]
"It was a work of art, flawless, sublime. A triumph equaled only by its monumental failure."[/sub]
 
you can also use this:
Code:
while( *(tmp++) != '\n' );
If you want to count the number of lines, you could use this:
Code:
int count = 0;
while( *(tmp++) != 0 ) if( *tmp == '\n' ) count++;
 
Compacting code like that has its disadvantages.

1. It is harder to read:
* Is it obvious that the *tmp compared to 0 and the *tmp compared to '\n' aren't the same *tmp due to the *(tmp++)?

I guess not, since it holds a bug...

Code that's hard to read makes it easy for bugs to be introduced. For example, what would the count be on a string like "\n123\n"? There are 2 '\n', so count should be 2, right? Is it?

2. You might be tricked into thinking that compact C++ code == efficient machine code, which is not nescerc...necsces...always the case.
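To make the "\n123\n" question concrete, here is the compact loop exactly as posted, wrapped in a function so it can be called. The wrapper name count_compact is hypothetical. Because tmp is incremented before the '\n' test, the first character of the string is never examined.

```cpp
// The compact loop as posted, wrapped for testing (count_compact is a
// made-up name). The condition reads the current character and advances
// tmp; the body then tests the NEXT character, so index 0 is skipped.
int count_compact(const char* tmp)
{
    int count = 0;
    while( *(tmp++) != 0 ) if( *tmp == '\n' ) count++;
    return count;
}
```

Tracing it by hand, count_compact("\n123\n") returns 1 rather than 2: only the trailing newline is seen.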



/Per
 
Well, this isn't a big problem; you just need to add one more line:
Code:
int CountLines( char *str )
{
    int count = 0;
    if( *str == '\n' ) count = 1;
    while( *(str++) ) if( *str == '\n') count++;
	return count;
}
 
thanks for all the help guys - i got it working ok now.

One more question though -

Is it a bad idea to append values to a char*?

I have the following:

char* tmp;

while(isalnum(report[index])==TRUE)
{
*tmp += report[index];
index++;
}

"report" is a character array that I must read values from.

The code seems to work for the first iteration, but then I get a segmentation fault.

I'm not able to access the STL string classes from this application, so my parsing options are fairly limited.

Thanks again,

Keano.
 
1: Where do you allocate the memory for tmp? Do you use new, or point it at a char[] declared on the stack?
2: That code just changes the first char; it doesn't append.
3: A \0 never gets put in, so any string handling functions are going to run off the end.
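One way to address all three points without the STL, sketched under the assumption that the caller can supply a fixed-size buffer. The name copy_alnum_run and its parameters are hypothetical. Note also that isalnum() only promises a nonzero result for a match, not TRUE (1), so the sketch compares against 0.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: copies the run of alphanumeric characters starting at
// report[index] into out, which has room for out_size chars including the
// terminator. Returns the number of characters copied.
std::size_t copy_alnum_run(const char* report, std::size_t index,
                           char* out, std::size_t out_size)
{
    assert(report != nullptr && out != nullptr && out_size > 0);
    std::size_t n = 0;
    // Stop when the buffer is full (leaving room for '\0') or when the run
    // ends. The unsigned char cast avoids undefined behaviour for negative
    // char values passed to isalnum().
    while (n + 1 < out_size &&
           std::isalnum(static_cast<unsigned char>(report[index + n])) != 0) {
        out[n] = report[index + n];  // write to a new slot each time
        ++n;
    }
    out[n] = '\0';                   // always terminate the result
    return n;
}
```

The trade-off is that an over-long run is truncated to fit the buffer instead of overflowing it.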

WR
 
>Well, this isn't a big problem

Sure, it's a trivial example

>you just need to add one more line:

Ah, yes. Insert a special beginning-with-newline hack to compensate for an erroneous algorithm, i.e. fix the symptom, not the cause. Not my cup o' tea though...

>Is it a bad idea to append values to a char*?

Usually yes. I think you actually want to do a strcat (see doc) rather than a *tmp+=
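A small sketch of what a strcat-based append might look like. strcat() expects two NUL-terminated strings and a destination with enough room, so a single character has to be wrapped in a tiny string first. The helper name append_char and its capacity parameter are assumptions for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: appends one character to the NUL-terminated string in
// dest, whose total size is capacity. Returns false if there is no room.
bool append_char(char* dest, std::size_t capacity, char ch)
{
    assert(dest != nullptr);
    if (std::strlen(dest) + 2 > capacity) {  // need space for ch and '\0'
        return false;
    }
    const char piece[2] = { ch, '\0' };      // strcat wants a string, not a char
    std::strcat(dest, piece);
    return true;
}
```

The destination must already hold a valid (possibly empty) string, for example char buf[64] = "";, before the first call.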




/Per
 
Per, Tokra

thanks for the replies.

so I could have something like this...

char* tmp;

while(isalnum(report[index])==TRUE)

{
strcat(tmp, (const char*)report[index]);
index++;
}

I guess though that my problem is that I don't know how big tmp is going to be at the outset so I can't allocate memory for it.

Some sort of vector would be ideal, but this isn't catered for outside of the STL classes.

Thanks,

Keano.
 
>my problem is that I don't know how big tmp is going to be

std::string solves that problem magnificently.
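For comparison, a sketch of the same token-reading loop with std::string, assuming the standard library is usable after all. The function name read_alnum_token is hypothetical; the string grows as needed, so no size has to be known up front.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reads alphanumeric characters from report starting at
// index, advances index past them, and returns them as a std::string.
std::string read_alnum_token(const char* report, std::size_t& index)
{
    std::string token;
    while (std::isalnum(static_cast<unsigned char>(report[index])) != 0) {
        token += report[index];  // std::string reallocates as required
        ++index;
    }
    return token;
}
```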

>Some sort of vector would be ideal, but this isn't catered for outside of the STL classes.

So, what's wrong with the STL? It IS a part of the C++ standard.



/Per
 
>an erroneous algorithm

>I think you are exaggerating a little bit. Actually, there is no problem at all with this algorithm; maybe you would like to explain what is wrong with it:
Code:
int CountLines( char *str )
{
    int count = 0;
    if( *str == '\n' ) count = 1;
    while( *(str++) ) if( *str == '\n') count++;
    return count;
}


 
"erroneous" refers to the original algorithm, which was corrected by making a special-case fix.

The point I was trying to make is that obfuscated code attracts bugs.

As for your last question, "what's wrong with it": I'd say it's unnecessarily complex (= another level of obfuscating the code). It's an immensely trivial function, but you still have to think twice before you figure out what it's actually doing. Compare it to something like
Code:
int CountLines(const char *str )
{
    int count = 0;
    while(*str != 0)
    {
        if(*str=='\n')
            count++;
        str++;  
    }
    return count;
}
which is pretty straightforward, and just as efficient.
Perhaps it's just me that thinks that (since I just wrote it) *shrug*

I'm not saying that everything else is bad/ugly/wrong. I just want to point out that code has more properties than just twiddling bits and bytes: properties such as readability, flexibility, and maintainability. If you have a choice between two technically equivalent functions, choose the one that will be the most help to you or your colleagues.

Perhaps it's a bit off topic, but code design is something I think is immensely important. Note that the code example is an illustration of a concept, not an in-depth analysis of how to count linefeeds.

/Per
 
>readability, flexibility, maintainability

>I think that my previous code can be easily understood by anyone who has a good knowledge of C and pointer manipulation, though not by a total beginner. As for flexibility and maintainability, I don't see any problem here either.


> "everything else is bad/ugly/wrong"

>Well, I think that each programmer has their own style; it would be a bit risky to actually go and say which one is the best.
 
>I don't see any problem here either

Ok, I guess I failed in making my point understood. Sorry about that.

>I think that each programmer has their own style

When coding at home for fun, or maybe in a small one-man project, that may be. But I can assure you that in the real world of projects, professional environments and complex code there are no lone-ranger programmers. Not for long anyway...

>"everything else is bad/ugly/wrong"
Sub-quote from:
>>"I'm not saying that everything else is bad/ugly/wrong..."

>it would be a bit risky to actually go and say which one is the best

I'm glad I didn't say that then.


/Per
 
You should always prefer to write code as PerFnurt has shown you. His code reads like a book; yours does not. You don't even have to think to understand his code, whereas yours, even though it's perfectly legal, is obfuscated by its dependence on operator precedence. Any good compiler will generate the same object code for both examples, so obviously the more readable version should be preferred.
 
*************************************************************************
What's the big deal with this readability thing??
Is it the first time that you've seen an expression such as: while( *(str++) ) ??
I think it all has to do with pointer manipulation: if you know pointers, there is no problem at all; if you don't, sure, it can be difficult to read.
Anyway, I have downloaded a lot of big projects from the internet, and I can't say that readability is the first preoccupation of the programmers who wrote the code for such projects.
 
>what's the big deal with this readability thing??

I'm actually shocked to see that question. May I ask if you're working as a professional developer/software engineer? If not, I can understand your question, because the answer would be apparent should you ever fix bugs in other people's code. Combined with stuff such as "deadline" and "code may not break", it'd be even more apparent.

>if you know pointers,there is no problem at all

1) Since two people have noted it's a bit obfuscated, that should be proof enough (no, I'm not a total C++ novice).

2) There is a problem which YOU proved by posting a bug.
I'm only trying to say that proper code design is a tool for eliminating bugs.

I'll try to clarify by recapping the previous posts.

Original algorithm:
while( *(tmp++) != 0 ) if( *tmp == '\n' ) count++;

This has as I see it 2 bugs:
A) One technical bug - doesn't check the first char.
B) A "bug" in readability, which is actually the root cause of the technical bug.

You fixed it by putting in a special treatment, which only corrected bug A).

I have a feeling I'm not getting through...

/Per
 