Insert text at beg of line & remove leading zeroes from a field

Status
Not open for further replies.

gumbie

Technical User
Jun 18, 2002
AU
Hi All,

Sorry if this is a really dumb question, however I'm not the best at shell scripting (let alone awk!) and I've exhausted all other avenues. I would appreciate any assistance!

Environment is NCR MP-RAS 3.02

I need to be able to do a couple of things to some ascii files that contain >= 1 line/s prior to inserting said file into a database table:-

1. insert the filename & delimiter at the beginning of each line
2. strip leading zeroes from a 4 char field in each line that may have 1 or 2 leading zeroes

Thanks in advance for any light that may be shed on this.

Cheers,
-gumbie.
 
Could you post some sample data please?
 
Hi,

Ooops, here t'is:-

23/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
24/04/2002,0101,0259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278

Above are two sample lines. The third field is where I need to strip off the leading zero/es... this field is a store code that can be two or three characters.

Thanks again.

Cheers,
-gumbie.
 
I am getting some bizarre stuff back from this.
Maybe vgersh or cakiwi can tell me why.

awk ' BEGIN {
FS = ","
}
{
if ($3 ~ /0[1-9]+/) {
x = index($3,0) ; $3 = substr($3,(x + 1),length($3) - x)
}
$0 = FILENAME FS $0 ; print $0
}' filename

result:
scrap.txt,23/04/2002 0101 259 0 0 0 0 0 0 0 0 0 0 0 11 276 248 146 159 109 8 0 0 0 0 0 0 957
scrap.txt,24/04/2002 0101 259 0 0 0 0 0 0 0 3 39 38 160 141 172 262 145 159 108 51 0 0 0 0 0 0 1278

The fs is gone bye-bye.

A possible kludge would be:
function replacefs(src,sep) {
sub(" ",sep,src)
return src
}

and $0 = replacefs($0,FS)
The solution is probably buggy too.
Hopefully somebody else proposed one.
 
FS=OFS=","

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
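
For anyone wondering why that one-liner fixes it: assigning to any field makes awk rebuild $0 by joining the fields with OFS, and OFS defaults to a single space. marsd's script only set FS, so the commas were lost as soon as $3 was changed. A minimal sketch of the effect (the function names below are made up for illustration and are not from the thread):

```shell
# Assigning to a field forces awk to rebuild $0 using OFS.
# With only FS set, OFS is still a single space.
rebuild_default_ofs() {
    awk 'BEGIN { FS = "," } { $3 = $3; print }'
}

# Setting OFS as well keeps the comma when $0 is rebuilt.
rebuild_comma_ofs() {
    awk 'BEGIN { FS = OFS = "," } { $3 = $3; print }'
}

printf 'a,b,c\n' | rebuild_default_ofs   # prints: a b c
printf 'a,b,c\n' | rebuild_comma_ofs     # prints: a,b,c
```

Even the no-op assignment $3 = $3 is enough to trigger the rebuild, which is why the separator seemed to disappear on its own.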
 
That's funny! Haven't seen that one before.
 
Howdy marsd & vgersh99,

Thanks so much! marsd's solution coupled with vgersh99's 2 cents' worth works a treat.

[2thumbsup]

I'm indebted to you both!

Cheers,
-gumbie.
 
Well, I would prefer:
awk 'BEGIN {
OFS = FS = ","
}
{
$3 = $3 + 0
print FILENAME, $0
}' filename

This will convert field $3 to Number
and get rid of ALL leading Zeroes.
Furthermore, IMHO your solution may
mess with Zeroes INSIDE $3

Jayr Magave
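
A quick way to see the numeric coercion at work on the sample rows is sketched below. It assumes the store code is always made of digits only; a code containing letters would be converted to its leading numeric prefix, so something like 0A7 would come out as 0. The function name is hypothetical.

```shell
# Adding 0 forces awk to treat $3 as a number, so it is printed
# back without any leading zeroes.
strip_zeros_numeric() {
    awk 'BEGIN { FS = OFS = "," } { $3 = $3 + 0; print }'
}

printf '23/04/2002,0101,0259,0\n' | strip_zeros_numeric   # 23/04/2002,0101,259,0
printf '23/06/09,1234,0012,1\n'   | strip_zeros_numeric   # 23/06/09,1234,12,1
printf '21/54/2001,0987,2306,0\n' | strip_zeros_numeric   # 21/54/2001,0987,2306,0
```

Zeroes inside or at the end of the field (2306, 0250) are untouched because only the numeric value matters.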
 
No, the regexp and index() will not allow this to happen.
BTW: very nice, economical solution.
 
The requirement said:

" ...each line that *may* have 1 or 2 leading zeroes.."

To me *may* means may or may not.

So your regex /0[1-9]+/ will match positive to "1023", and
the index() will miss the "1" before the "0".
I guess /^0[1-9]+/ would correct this. Regexes can be trickier than we expect.

BTW, $3 = $3 + 0 is easier to understand but to be
more economical $3+=0 would be sufficient.

Jayr Magave
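
The difference the anchor makes can be checked directly. The sketch below only tests whether each pattern matches a value; the helper names are invented for the example. It also shows that the anchored form still does not match a value with two leading zeroes such as 0012, because the character after the first 0 has to be 1-9.

```shell
# Unanchored: matches a 0 followed by 1-9 anywhere in the value.
match_unanchored() {
    awk '{ print (($0 ~ /0[1-9]+/) ? "match" : "no match") }'
}

# Anchored: the 0 must be the first character of the value.
match_anchored() {
    awk '{ print (($0 ~ /^0[1-9]+/) ? "match" : "no match") }'
}

printf '1023\n' | match_unanchored   # match (the 023 in the middle)
printf '1023\n' | match_anchored     # no match
printf '0259\n' | match_anchored     # match
printf '0012\n' | match_anchored     # no match (second character is 0)
```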

 
No, we have had this discussion before and I have run through rep samples.
Index() will only pick up the first match.
So it is fairly safe.
A file like this:
23/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
24/04/2002,0101,0259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/06/09,1234,0012,1,0,0,0,,2,3,4,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/04/2002,0101,0250,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,95723/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
24/04/2002,0101,0259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/78/2002,0150,0403,0,0,0,0,0,0,0,2,45,66,123,123,123,322,768,123,321,32,0,0,0,0,0,0,1234

Comes out like this:
scrap.txt,23/04/2002,0101,259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
scrap.txt,24/04/2002,0101,259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/06/09,1234,012,1,0,0,0,,2,3,4,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/04/2002,0101,250,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,95723/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
scrap.txt,24/04/2002,0101,259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/78/2002,0150,403,0,0,0,0,0,0,0,2,45,66,123,123,123,322,768,123,321,32,0,0,0,0,0,0,1234

So like the aussies say: No worries mate!
 
I agree with jmagave.
marsd: your examples are right but try
Code:
23/04/2002,0101,2059,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
and you will get
Code:
scrap.txt,23/04/2002,0101,59,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
with 2059 changed to 59 (!?!?).

In fact you're both right:
index only finds the first '0' but, from gumbie's description of the problem, the first '0' is not always first in the string.
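
To make the point concrete, here is the index()/substr() step from marsd's script applied to a bare value instead of $3 (a sketch; the wrapper name is made up). Everything up to and including the first 0 is thrown away, wherever that 0 sits:

```shell
# Same arithmetic as the original script, run on the whole input line.
strip_with_index() {
    awk '{ x = index($0, 0); print substr($0, x + 1, length($0) - x) }'
}

printf '0259\n' | strip_with_index   # 259  (first 0 is leading: fine)
printf '2059\n' | strip_with_index   # 59   (first 0 is in the middle: the 2 is lost)
printf '0012\n' | strip_with_index   # 012  (only one of two leading zeroes removed)
```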

 
Hi again,

Just to fill you all in on the progress with the above.... marsd's solution worked for me whilst I was getting data files for locations with a 3 character code (meaning there is one 0 to strip out prior to insert into a table). However, once I started getting files for locations with a 2 character code, padded with two 0's, I ran into only 1 zero being removed, as Jayr suggested in his first post.

So, to cap it all off, marsd started me off in a working direction (as opposed to where I was going on my own!), Vlad came up with the missing delimiter solution and Jayr has put the icing on the cake with the "$3 = $3 + 0" contribution. My sincere thanks again to you all.

Cheers,
-gumbie.
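
If the store code could ever contain something other than digits, a string-based variant may be safer than the numeric conversion. The sketch below is an alternative that nobody in the thread posted: it uses sub() with an anchored pattern to remove all leading zeroes from $3 as text, and then prepends the file name. The function name and the all-zero guard are assumptions made for the example.

```shell
# Prefix each line with the file name and strip leading zeroes
# from field 3 without converting it to a number.
prefix_and_strip() {
    awk 'BEGIN { FS = OFS = "," }
    {
        sub(/^0+/, "", $3)        # remove every leading zero, keep the rest as text
        if ($3 == "") $3 = 0      # an all-zero code would otherwise become empty
        print FILENAME, $0
    }' "$@"
}

# Usage (scrap.txt is the sample file name used in this thread):
#   prefix_and_strip scrap.txt
```

Pass the file as an argument rather than on standard input, otherwise FILENAME has nothing useful to print.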
 
I feel impelled to demonstrate a solution now using the original codebase, even though it is ridiculous: jmagave's solution is better.
Thanks for the criticisms.


BEGIN {
FS = OFS = ","
}
{
if ($3 ~ /^0[1-9]/) {
i = 1
while (i < length($3)) {
if (substr($3,i,1) == 0) {
$3 = substr($3,(i + 1),length($3) - 1)
} else {
break
}
}
}
$0 = FILENAME FS $0
print $0
}
Output against:
23/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
24/04/2002,0101,0259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/06/09,1234,0012,1,0,0,0,,2,3,4,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/04/2002,0101,0250,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,95723/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
24/04/2002,0101,0259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
23/78/2002,0150,0403,0,0,0,0,0,0,0,2,45,66,123,123,123,322,768,123,321,32,0,0,0,0,0,0,1234
21/54/2001,0987,2306,0,0,0,0,0,0,0,0,9,8,7123,345,11,23,544,0,0,0,0,0

output:
scrap.txt,23/04/2002,0101,259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
scrap.txt,24/04/2002,0101,259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/06/09,1234,12,1,0,0,0,,2,3,4,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/04/2002,0101,250,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,95723/04/2002,0101,0259,0,0,0,0,0,0,0,0,0,0,0,11,276,248,146,159,109,8,0,0,0,0,0,0,957
scrap.txt,24/04/2002,0101,259,0,0,0,0,0,0,0,3,39,38,160,141,172,262,145,159,108,51,0,0,0,0,0,0,1278
scrap.txt,23/78/2002,0150,403,0,0,0,0,0,0,0,2,45,66,123,123,123,322,768,123,321,32,0,0,0,0,0,0,1234
scrap.txt,21/54/2001,0987,2306,0,0,0,0,0,0,0,0,9,8,7123,345,11,23,544,0,0,0,0,0


Also does anyone know why a GP function like this would fail here? (CH = 0, in this case.)

function truncstr(str,CH) {
if (substr(str,1,1) == CH) {
str = substr(str,2,length(str) - 1)
print str
truncstr(str)
} else {
print "Returning: ", str
return str
}
}

main() portion cannot retrieve the value.
$3 = truncstr($3) , reads an empty string??
 
> Also does anyone know why a GP function like this would fail here? (CH = 0, in this case.)

> function truncstr(str,CH) {
> if (substr(str,1,1) == CH) {
> str = substr(str,2,length(str) - 1)
> print str
> truncstr(str)
> } else {
> print "Returning: ", str
> return str
> }
> }

> main() portion cannot retrieve the value.
> $3 = truncstr($3) , reads an empty string??

By including "CH" in the function's definition, you're making the scope of this parameter local to the function. But at the same time you're NOT passing the second parameter at the point of the call; I think you're assuming 'CH' is defined somewhere else in the code. This is a bit confusing. When the function gets called, 'CH' is the local one, which is never initialized or modified, so the 'if' condition is never true.

Either:
$3 = truncstr($3, 0)

OR
function truncstr(str)

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
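
The scoping point can be checked in isolation. In the sketch below (function names invented for the example), a parameter that the caller omits is empty inside the function, so comparing the first character "0" against it fails even if a global with a similar purpose exists elsewhere:

```shell
demo_missing_param() {
    awk '
    function first_is(str, ch) {
        # ch is local to the function; if the caller omits it, it is empty here
        return (substr(str, 1, 1) == ch) ? "yes" : "no"
    }
    BEGIN {
        print first_is("0403")      # ch omitted: prints no
        print first_is("0403", 0)   # ch passed explicitly: prints yes
    }'
}

demo_missing_param
```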
 
You're right vlad, but the original form function
was as below.

function truncstr(str) {
if (substr(str,1,1) == 0) {
str = substr(str,2,length(str) - 1)
print str
truncstr(str)
} else {
print "Returning: ", str
return str
}
}

Inside:
Returning: ,403
main():
scrap.txt,23/78/2002,0150,,0,0,0,0,0,0,0,2,45,66,123,123,123,322,768,123,321,32,0,0,0,0,0,0,1234





 
Any recursion is 'somewhat' tricky:

function truncstr(str) {
if (substr(str,1,1) == 0) {
str = substr(str,2,length(str) - 1)
printf("Recursing-> [%s]\n", str);
return truncstr(str)
} else {
print "Returning: ", str
return str
}
}

{
$0=truncstr($0)
print $0
}

vlad
+---------------------------+
|#include<disclaimer.h> |
+---------------------------+
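
The only change that matters in the version above is the return in front of the recursive call. Without it, the innermost call returns the stripped string, but the outer call throws that value away and falls off the end of the function, which hands an empty value back to the main block. A stripped-down sketch of the two variants side by side (the names are made up, and the debug prints are left out):

```shell
demo_recursion_return() {
    awk '
    function strip_noreturn(str) {
        if (substr(str, 1, 1) == 0) {
            strip_noreturn(substr(str, 2))   # result of the inner call is discarded
        } else {
            return str
        }
    }
    function strip_return(str) {
        if (substr(str, 1, 1) == 0)
            return strip_return(substr(str, 2))   # pass the result back up
        return str
    }
    BEGIN {
        printf "[%s] [%s]\n", strip_noreturn("0403"), strip_return("0403")
    }'
}

demo_recursion_return   # prints: [] [403]
```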
 
Yep, That's probably it.
Thanks Vlad.
 