
print output to same file

Status
Not open for further replies.

chz013

Programmer
Jan 5, 2003
73
US
Hi,
I have a data file named a_data that contains records like this:
3100340001|02.02.29|28800|03/20/2002|

I managed to write a simple awk script that converts
03/20/2002 to 2002-03-20,
i.e.:
mlc_year = substr($4,7,4)
mlc_month = substr($4,1,2)
mlc_day = substr($4,4,2)
mlc_full_date = mlc_year"-" mlc_month "-" mlc_day
$4 = mlc_full_date
print $4

How do I write the result back to the a_data file so that it contains
3100340001|02.02.29|28800|2002-03-20|

Any help is greatly appreciated.
Thanks



 
You cannot, not with awk. Either:

1. Save the output of awk in a temp file and then 'mv' it over the original file.

2. Use 'ex' with a 'here document' for your string manipulation and save the result to the original file.
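
A minimal sketch of option 1, assuming a shell with mktemp available. The helper name rewrite_dates and the throwaway demo directory are made up for illustration; the sample record is the one from the question.

```shell
#!/bin/sh
# Sketch of option 1: write awk's output to a temp file, then mv it over the original.
# rewrite_dates is a hypothetical helper name, not something from the thread.
rewrite_dates() {
    file=$1
    # Create the temp file next to the original so mv stays on one filesystem.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    # Turn MM/DD/YYYY in field 4 into YYYY-MM-DD; replace the original only on success.
    if awk 'BEGIN { FS = "\\|"; OFS = "|" }
            { split($4, d, "/"); $4 = d[3] "-" d[1] "-" d[2]; print }' "$file" > "$tmp"
    then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a throwaway copy of the sample record from the question.
demo=$(mktemp -d)
printf '%s\n' '3100340001|02.02.29|28800|03/20/2002|' > "$demo/a_data"
rewrite_dates "$demo/a_data"
cat "$demo/a_data"
```

Note that the replaced file takes on the owner and permissions of the temp file, which may matter if the original belonged to someone else.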

BTW, your awk conversion can be simplified:

BEGIN {
FS="\\|"
OFS="|"
}
{
split($4, arr, "/");
$4=arr[3] "-" arr[1] "-" arr[2]
print
}


vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
Vlad
thanks for the quick response.
But I dont know how to copy the entire
record and at the same time, replace $4
with 2003-03-20 in the new file, say b_data ?

 
either:

1. nawk -f myAWK.awk myText.txt > b_data

OR

BEGIN {
FS="\\|"
OFS="|"
outFile="b_data"
}
{
split($4, arr, "/");
$4=arr[3] "-" arr[1] "-" arr[2]
print > outFIle
}

vlad
 
I'm not sure if I completely misunderstood the problem, but as I understand it there is a way to write output to the same file with awk. Simply read the file into an array and output during END into FILENAME. Isn't that right?

Also substituting "/" with "-" can be done using gsub(), can it not?

try this:
[tt]
BEGIN {
FS="|" # input field separator
OFS="|" # output field separator
}
{
if ( gsub(/\//, "-", $4) ) {
# substitute "/" with "-"
myARR[++i]=$0
# write line into array
}
else {
# only applies if no substitution was made
myARR[++i]=$0
}

}
END {
printf("") > FILENAME
# empty file
for( line=1; line<=NR; line++)
print myARR[line] >> FILENAME
}
[/tt]
Remove the red part if you do not have any input which would otherwise be skipped.

There will be an extra empty line at the end of the output, which could be omitted by implementing some minor modifications.
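
A minimal sketch of the same array idea, combined with the date swap the original question asked for. It assumes a single input file that fits in memory; the helper name rewrite_in_end is made up for illustration.

```shell
#!/bin/sh
# Sketch: buffer every record in an awk array, then overwrite the input file in END.
# Assumes one input file that fits in memory; rewrite_in_end is a hypothetical name.
rewrite_in_end() {
    awk 'BEGIN { FS = "\\|"; OFS = "|" }
         {
             if (split($4, d, "/") == 3)      # only touch MM/DD/YYYY values
                 $4 = d[3] "-" d[1] "-" d[2]
             line[NR] = $0                    # remember the (possibly rebuilt) record
         }
         END {
             # The first write truncates FILENAME; later writes go to the same open stream.
             for (i = 1; i <= NR; i++)
                 print line[i] > FILENAME
         }' "$1"
}

# Demo on a throwaway file: one unconverted record and one already converted.
demo=$(mktemp -d)
printf '%s\n' '3100340001|02.02.29|28800|03/20/2002|' \
              '3100340001|02.02.29|28900|2002-03-20|' > "$demo/a_data"
rewrite_in_end "$demo/a_data"
cat "$demo/a_data"
```

Because the file is rewritten through its existing name rather than replaced, its owner and permissions stay as they were, at the cost of holding every record in memory.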

HTH
 
FILENAME: A pathname of the current input file. Inside a BEGIN action the value is undefined. Inside an END action the value is the name of the last input file processed.


Regarding the 'gsub': if you look closer, the OP wanted to change the order of the date components, and for that you need to swap the date sub-fields.

Regarding reading everything into an array: depending on the file size, and given the simplicity of the operation, this might be considered cost-prohibitive. I try to stay away from reading whole files into memory if I can avoid it.

vlad
 
FILENAME works in the tests I made on HP-UX 11.00.

You are absolutely right that reading everything into an array depends on file size. The drawback is that it uses massive resources. But then the ownership of the original is unchanged. Creating a new file and moving it over the original might change the ownership.

I actually overlooked the part about swapping year, month, and day, sorry. Vlad's approach would then have been mine too.

One thing I do not understand, though:
why use [tt]FS="\\|"[/tt] instead of [tt]FS="|"[/tt]? Did I overlook something else?
 
I believe the 'ownership' of a text file will be changed if the original owner is different from the user running the awk script.

Creating a new file and doing the 'mv' will change the ownership as well. That's the reason I've recommended doing everything with 'ex' with the 'here-document' editing the file 'in place' (if the ownership/permissions do matter).
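
For reference, the 'ex' with here-document route might look roughly like this. It is only a sketch: it assumes an ex that accepts -s for batch mode (often the one shipped with vi or vim), and that the only MM/DD/YYYY value on each line is the one in the fourth field. The helper name ex_fix_dates is made up for illustration.

```shell
#!/bin/sh
# Sketch: edit the file in place with ex driven by a here-document.
# Assumes a POSIX-style ex; ex_fix_dates is a hypothetical helper name.
ex_fix_dates() {
    # "," is used as the substitute delimiter so the slashes need no escaping.
    # \1 = month, \2 = day, \3 = year; the first match on each line is rewritten.
    ex -s "$1" <<'EOF'
%s,\([0-9][0-9]\)/\([0-9][0-9]\)/\([0-9][0-9][0-9][0-9]\),\3-\1-\2,
wq
EOF
}
```

It would be called as ex_fix_dates a_data. Since ex writes back to the existing file, no second file is created and moved into place.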

Regarding the FS definition:

FS: Input field separator regular expression; a space character by default.

Given that FS is an ERE, '|' is the ERE alternation (OR) operator. Therefore the safe way is to escape it like so:
FS="\\|"

or

FS="[|]"
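
A quick way to check how a given awk treats these separators is to count the fields of the sample record. This is a sketch: POSIX specifies that a single-character FS other than a space is taken literally, so a plain "|" should behave like the escaped forms, but older awks may differ. The helper name count_fields is made up for illustration.

```shell
#!/bin/sh
# Sketch: compare three ways of writing a literal "|" field separator.
# count_fields is a hypothetical helper; it prints NF for the sample record.
count_fields() {
    printf '%s\n' '3100340001|02.02.29|28800|03/20/2002|' |
        awk -v FS="$1" '{ print NF }'
}

count_fields '|'      # single character: taken literally by a POSIX awk
count_fields '\\|'    # ERE with the "|" escaped
count_fields '[|]'    # ERE bracket expression, no backslashes needed
```

On a POSIX awk all three calls report 5: the four values plus the empty field after the trailing "|".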

vlad
 
What did I do wrong, Vlad?

myawk.awk

#! /usr/bin/awk -f


BEGIN {
FS="\\|"
OFS="|"
outFile="b_data"}
{
split($4, arr, "/");
$4=arr[3] "-" arr[1] "-" arr[2]
print > outFIle
}


bash-2.03# test.awk a_data
awk: can't open file
record number 1

a_data contains
3100340001|02.02.29|28800|2002-03-20|
3100340001|02.02.29|28900|2002-03-20|


 
Oops, sorry, I mistyped 'outFile':

print > outFile

vlad
 
Vlad
I still get the same result

3100340001|02.02.29|28800|03/20/2002|
 

Is your awk file like this?

#!/usr/bin/awk -f


BEGIN {
FS="\\|"
OFS="|"
outFile="b_data"}
{
split($4, arr, "/");
$4=arr[3] "-" arr[1] "-" arr[2]
print > outFile
}

vlad
 
Vlad

1 - Copied from my file:

#! /usr/bin/awk -f


BEGIN {
FS="\\|"
OFS="|"
outFile="b_data"}
{
split($4, arr, "/");
$4=arr[3] "-" arr[1] "-" arr[2]
print > outFile
}


2- If another file has FS as ,
3100340001,02.02.29,28800,03/20/2002,

what has to change in the awk file so this script
can be used for both | as , field separators at the
same time ?

 
typo -
what has to change in the awk file so this script
can be used for both | AND , as field separators at the
same time ?
 
FS="(\\|)|(,)"
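
An equivalent and arguably simpler way to write that separator is a bracket expression. The sketch below makes one assumption beyond the thread: the output separator is taken from whichever separator the record itself uses. The helper name convert_any_sep is made up for illustration.

```shell
#!/bin/sh
# Sketch: one FS that accepts either "|" or "," by using a bracket expression.
# Assumption: each record is written back with the separator it came with.
convert_any_sep() {
    awk 'BEGIN { FS = "[|,]" }
         {
             OFS = (index($0, "|") > 0) ? "|" : ","   # keep the separator the record came with
             if (split($4, d, "/") == 3)              # only touch MM/DD/YYYY values
                 $4 = d[3] "-" d[1] "-" d[2]
             print
         }' "$@"
}

printf '%s\n' '3100340001|02.02.29|28800|03/20/2002|' \
              '3100340001,02.02.29,28800,03/20/2002,' | convert_any_sep
```

With no file arguments the function reads standard input, so it can also be used as convert_any_sep a_data > b_data.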

vlad
 
Vlad,
1 - It doesn't work; the output is unchanged from the input file after running the awk script with
FS="(\\|)|(,)"

2 - When I ran this awk script with another input file (test.awk input_file),
I got the error shown below:

awk: record `3111200001|02.02.29|...' has too many fields
record number 1

Is there a workaround for this?


Thanks
 
For #2, each record in the input file has
303 columns.
 
If you're on Solaris, make sure you're NOT using the old 'awk', but rather either:
#!/usr/bin/nawk -f

OR

#!/usr/xpg4/bin/awk -f
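
If the script has to run on more than one machine, a wrapper can pick the interpreter instead of hard-coding it in the shebang. This is only a sketch: the candidate list follows the paths named above, the fallback to plain awk assumes that on non-Solaris systems it is already a modern implementation, and pick_awk is a made-up helper name.

```shell
#!/bin/sh
# Sketch: prefer a POSIX-style awk over the legacy Solaris /usr/bin/awk.
# pick_awk is a hypothetical helper; it prints the first candidate that exists.
pick_awk() {
    for candidate in /usr/xpg4/bin/awk nawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }

# Build one record with 303 "|"-separated columns and let the chosen awk count them.
"$AWK" 'BEGIN { for (i = 1; i <= 303; i++) printf "%s%d", (i > 1 ? "|" : ""), i; print "" }' |
    "$AWK" -F'|' '{ print NF }'
```

The conversion script would then be run as "$AWK" -f myawk.awk input_file rather than relying on its shebang line.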

vlad
 