Field removal using awk

Status
Not open for further replies.

mag01

MIS
Jan 29, 2002
10
GB
I have an input file in this format :

C^3^^^6036^1^19920401^39991231
C^3^^^52604^1^20021017^39991231
E^1^^^2224^6036^1
E^1^^^7187^52604^1
E^12^^^2224^6036^27083^20021129^1^20021129^^^706895^2619028^
E^12^^^2224^6036^27085^20021203^1^20021203^^^1964367^4583395^
E^15^^^7187^52604^27082^20021121^1^20021121^^^21764706^82080711^
E^20^^^27084
What I need to do is this: for any line that starts with E and whose 2nd field is between 2 and 19, remove field 5.

Thanks in advance,
Mark.
 
Pick up a good book on awk, such as "Effective Awk Programming".
There are plenty of examples of things like this in it.

You could try this:

BEGIN {
OFS = FS ="^"
}
{
if ($0 ~ /^E/ && $5 > 2 && $5 < 19) {
$5 = ""
}
print
}
 
Thanks,

I will check out the book recommendation as I currently have no books on awk.

Mark.
 
Looking at your data again, you will have problems with
the ^ FS due to the runs of three, then one, ^ separators, and I misread
your requirement.
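
A quick way to see how awk numbers the fields in this data is to print each field with its index. This is only a sketch: the helper name show_fields is made up for illustration, and it assumes an awk that treats a single-character FS of ^ as a literal separator (as the scripts in this thread do).

```sh
# show_fields is a hypothetical helper, not part of the original scripts.
# It prints every ^-separated field of each input line as "index=[value]".
show_fields() {
    awk 'BEGIN { FS = "^" }
         { for (i = 1; i <= NF; i++) { printf "%d=[%s]\n", i, $i } }'
}

# One of the sample records from the question, truncated for brevity.
printf '%s\n' 'E^12^^^2224^6036^27083^20021129^1' | show_fields
```

With a single-character separator, each ^ ends a field, so the run ^^^ produces two empty fields (3 and 4). That makes 2224 field 5 and 27083 field 7, even though 27083 is the fifth non-empty value on the line.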

You could do this which I think is what you intended:
{
OFS = FS = "^"
i = 0
if ($0 ~ /^E/ && $2 >=2 && $2 <=19) {
gsub(/\^+/,"^",$0)
$5 = ""
}
print
}

The problem is that the output (OP) is now wrong:
C^3^^^6036^1^19920401^39991231
C^3^^^52604^1^20021017^39991231
E^1^^^2224^6036^1
E^1^^^7187^52604^1
E^12^2224^6036^ ^20021129^1^20021129^706895^2619028^
E^12^2224^6036^ ^20021203^1^20021203^1964367^4583395^
E^15^7187^52604^ ^20021121^1^20021121^21764706^82080711^
E^20^^^27084

Your original data had to be amended to
find the appropriate record.
I'll have to work on this some more.
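
One likely contributor to output like this: assigning an empty string to a field blanks its contents but keeps both surrounding separators, so the record still has the same number of fields. A minimal sketch on made-up data (the helper name blank_field5 and the sample record are assumptions for illustration only):

```sh
# blank_field5 is a hypothetical helper; it empties field 5 of every line.
blank_field5() {
    awk 'BEGIN { OFS = FS = "^" }
         { $5 = ""; print }'
}

# Made-up six-field record: the value "e" disappears, its separators do not.
printf '%s\n' 'a^b^c^d^e^f' | blank_field5
```

The record is rebuilt as a^b^c^d^^f, so the field is emptied rather than removed. Removing it means rebuilding the line without that field.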
 
Thanks for your efforts so far. I have your second script working, but as you say the output is not quite what I need :
Input -
E^4^^^2359^6244^27081^20021121^1^20021128^0.977402^^^3913100^
Required output -
E^4^^^2359^6244^20021121^1^20021128^0.977402^^^39131000^

Mark.
 
This is ugly, and it is a throwaway solution, since it assumes that the
fields and records to be replaced are always the same length. If you
change the data, you have to amend the code.
It does seem to work with your data.

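# Cut 6 characters starting at the 6th separator (the separator plus a 5-character field).
function parseLine(str,chn, mstr,cnt,i) {
#print "Working", str
mstr = str   # return the line unchanged if it has no 6th separator
for (i=1 ; i <= length(str) ; i++) {
if (substr(str,i,1) == chn) {
cnt++
if (cnt == 6) {
mstr = substr(str,1,i - 1) substr(str,(i + 6) ,length(str))
}
}
}
return mstr
}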




{
OFS = FS = "^"
if ($0 ~ /^E/ && $2 >=2 && $2 <=19) {
line = parseLine($0,"^")
$0 = line
}
print
}

Output:
C^3^^^6036^1^19920401^39991231
C^3^^^52604^1^20021017^39991231
E^1^^^2224^6036^1
E^1^^^7187^52604^1
E^12^^^2224^6036^20021129^1^20021129^^^706895^2619028^
E^12^^^2224^6036^20021203^1^20021203^^^1964367^4583395^
E^15^^^7187^52604^20021121^1^20021121^^^21764706^82080711^
E^20^^^27084
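
The script above depends on the removed field being exactly five characters wide. A sketch that does not depend on width is to rebuild the line field by field and skip the unwanted one. It assumes the field to drop is awk's field 7 (the "field 5" of the question if the two empty fields are not counted) and that the awk in use treats a single-character FS of ^ literally. The name drop_field7 is made up for illustration.

```sh
# drop_field7 is a hypothetical helper, not part of the original thread.
# For lines starting with E whose 2nd field is 2..19, print every field
# except field 7; all other lines pass through unchanged.
drop_field7() {
    awk 'BEGIN { FS = "^" }
         /^E/ && $2 >= 2 && $2 <= 19 && NF >= 7 {
             out = $1
             for (i = 2; i <= NF; i++) {
                 if (i != 7) { out = out FS $i }   # skip field 7 only
             }
             print out
             next
         }
         { print }'
}

# The sample input line from the follow-up post.
printf '%s\n' 'E^4^^^2359^6244^27081^20021121^1^20021128^0.977402^^^3913100^' | drop_field7
```

Because the line is rebuilt by hand with FS between fields, the empty fields and the trailing separator are preserved, and the result does not rely on any awk rebuilding $0 after NF changes.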
 