Fast way to trim trailing whitespace

Status
Not open for further replies.

mbaranski

Programmer
Nov 3, 2000
421
0
0
US
I'm writing a program that does a huge database import, and after profiling my code, I've found that my program spends over 25% of its execution time trimming trailing whitespace from each field it imports. I need a clever and fast way to remove trailing spaces and insert a null where the first trailing space begins.

If I don't remove them, it takes even longer to run. All of the strings are of type:

char str[256];

Right now, I'm starting at str[255] and walking back until I hit a char other than whitespace (isspace() returns false), then putting the '\0' in the position just after it. Most fields are fewer than 20 chars long, so this is very time consuming.


Any suggestions...

Thanks
MWB.
Disclaimer:
Beware: Studies have shown that research causes cancer in lab rats.
 
Does the database support rtrim? If so, instead of
select x from xxx
you can write
select rtrim(x) as x from xxx.
The latter is supported by Access and SQL Server. I did not try it in MySQL or Oracle.
John Fill

ivfmd@mail.md
 
I'm working on something for that right now, but in the meantime let me get some more information from you.

Are you getting the string from user input, where there would automatically be a '\0' appended or from elsewhere that a '\0' would not be appended?

Is it possible for you to append a delimiter to the end of the data, for instance a '~' or ';', and then start at the beginning of the string, so that when you hit one of those characters you can replace it with '\0'?

What is the exact code you are using to remove the whitespace? I know the concept you are talking about, but you'd be amazed how different an explanation and the actual code can be.

Thanks for the info in advance and I will start working on this for ya.
 
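#include <ctype.h>

/* Remove trailing whitespace and put a null at the end */
void rmTrail(char *str, int size)
{
    /* cerr << "Before " << str << "_END" << endl; */
    /* Start at size - 2, as before: str[size - 1] is left for the terminator. */
    int i = size - 2;
    /* Walk back over whitespace; i >= 0 stops the scan if the whole field is blank. */
    while (i >= 0 && isspace((unsigned char)str[i])) {
        --i;
    }
    /* i is now the last non-space char (or -1), so the null goes right after it. */
    str[i + 1] = '\0';
    /* cerr << "After " << str << "_END" << endl; */
}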

This is the code, and there's no way to put anything marking the end of the string, and there may be spaces in the string.

I'm using Informix esql/c, and there is no way to zero the memory or anything before I use it. I guess I just need a faster way to scan from the back of the string. I didn't find an rtrim function; that would be ideal.

Thanks Folks!
MWB

 
Also, any ideas on whether the STL would be faster with this? Such as using:

find_last_not_of("\t ");
and putting a null after that?

MWB.
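For reference, a minimal sketch of what the find_last_not_of idea could look like on the fixed char buffer. It assumes a C++17 compiler (std::string_view, so nothing is copied) and that the last byte of the buffer is free for the terminator; rmTrailView is a made-up name, not something from the thread. Whether it beats the hand-written loop would have to be profiled.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: trim trailing spaces and tabs in place.
// Assumes buf holds `size` bytes and buf[size - 1] is free for the terminator.
void rmTrailView(char *buf, std::size_t size)
{
    assert(buf != nullptr && size > 0);

    // View over the data bytes only; no copy is made.
    std::string_view field(buf, size - 1);

    // Index of the last char that is neither space nor tab, or npos if none.
    std::size_t last = field.find_last_not_of(" \t");

    // Terminate right after that char, or at the start for a blank field.
    buf[last == std::string_view::npos ? 0 : last + 1] = '\0';
}
```

Called as rmTrailView(str, sizeof str) on a char str[256] field, this leaves interior spaces alone and only cuts the trailing run.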

 
You could also skip the function call (isspace). I'm not very familiar with C++, but I guess this is a function call; if it is a macro, it should be OK.

Imagine the number of times that function is called in the while loop... Why not use
Code:
while (str[i--] == 32)
?

Another idea would be to use some kind of binary search, scanning parts of the string for 0's, e.g. the last half of the string... I think it could be faster.
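One possible reading of these two ideas, as a rough sketch: compare against ' ' directly instead of calling isspace, and step back through the padding eight bytes at a time instead of one. This is not a true binary search (that would misfire on fields with interior spaces), just a chunked backward scan. The name rmTrailWords is made up, the code trims spaces only (no tabs or newlines), and it assumes the last byte of the buffer is free for the terminator.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: trims trailing ' ' (code 32) only.
// Assumes buf holds `size` bytes and buf[size - 1] is free for the terminator.
void rmTrailWords(char *buf, int size)
{
    assert(buf != nullptr && size > 0);

    const std::uint64_t kSpaces = 0x2020202020202020ULL; // eight ' ' bytes
    int end = size - 1;                                   // one past the last data byte

    // Step back eight bytes at a time while the whole chunk is spaces.
    while (end >= 8) {
        std::uint64_t chunk;
        std::memcpy(&chunk, buf + end - 8, sizeof chunk); // alignment-safe load
        if (chunk != kSpaces) {
            break;
        }
        end -= 8;
    }

    // Finish the last few bytes one at a time, with no function call per char.
    while (end > 0 && buf[end - 1] == ' ') {
        --end;
    }
    buf[end] = '\0';
}
```

Since all eight bytes of the comparison constant are identical, byte order does not matter here. For a 256-byte field holding about 20 characters, the padding is crossed in roughly 30 chunk comparisons instead of some 235 single-character tests.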
 
You can try to customize your check.
For example, suppose you have found that 99% of records take fewer than 20 chars. Allocate a blank string of 256 characters.

static char strZero[256]; /* init by 256 space characters*/
void rmTrail(char *str, int size)
{
/* cerr << "Before " << str << "_END" << endl; */

memcmp(str+20,strZero+20,size-20);
.......
}
I hope memcmp is quick.
If memcmp returns zero, everything past the 20th character is blank, so the trade-off is not so big: either copy only the first 20 characters or run your algorithm from the 20th char only. If it does not return zero, start from 256 as before, or change 20 to 100 and try again. Either way, this will require testing and tuning. Definitely do not call a function for each character.
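To make the memcmp idea concrete, here is a minimal sketch under a few assumptions: fields are at most 256 bytes, only spaces count as padding, and the last byte of the buffer is free for the terminator. The names kBlanks and rmTrailGuess and the default guess of 20 are illustrative only; the right cutoff has to come from profiling real data.

```cpp
#include <cassert>
#include <cstring>

namespace {
// A block of 256 spaces, filled once at startup (plays the role of strZero).
struct Blanks {
    char data[256];
    Blanks() { std::memset(data, ' ', sizeof data); }
};
const Blanks kBlanks;
}

// Hypothetical helper: `guess` is the length most fields are expected to stay under.
void rmTrailGuess(char *buf, int size, int guess = 20)
{
    assert(buf != nullptr && size > 0 && size <= 256);
    assert(guess >= 0 && guess < size);

    int end = size - 1; // one past the last data byte

    // One bulk compare: is everything from `guess` up to the terminator slot blank?
    if (std::memcmp(buf + guess, kBlanks.data, end - guess) == 0) {
        end = guess; // yes, so the backward scan can start at the guess
    }

    // Short backward scan with a plain comparison, no function call per char.
    while (end > 0 && buf[end - 1] == ' ') {
        --end;
    }
    buf[end] = '\0';
}
```

When the guess is right, the per-character loop runs at most `guess` times; when it is wrong, the cost is one wasted memcmp plus the original full scan.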


 