Stripping White Space from Fields 2

reallyawkward · Feb 17, 2012

Hi,
I am reading in a file which consists of fixed-length records with white spaces and tabs separating fields. Some fields have white spaces within them.
Here are 2 sample records:
----+----|----+----|----+----|----+----|----+----|----+----|----+---
John Smith 5551111212 TECHNICAL LEAD DATABASE
MICHAEL ANDERSON-KLEIN 5555678765 ADMIN ASSISTANT
I am currently reading the records as follows:
{
FIRST=substr($0,1,12)
LAST=substr($0,13,18)
PHONE=substr($0,31,10)
TITLE=substr($0,41,22)
}

How can I strip the trailing spaces from each field?

Thanks in advance.

Really Awkward
;-)

PHV · Feb 17, 2012

A starting point:

Code:

FIRST=substr($0,1,12); sub(/[ \t]*$/,"",FIRST)

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886

mikrom · Feb 18, 2012

reallyawkward,
You are making it a little more complicated than necessary.
Instead of using substr on $0, use the columns (i.e. $1, $2, ...); awk splits them on spaces and tabs for you, so you don't have to deal with the white space yourself. For example:

Code:

{
   FIRST = $1
   LAST  = $2
   PHONE = $3
   TITLE = $4
   # append the remaining columns to TITLE
   for (i = 5; i <= NF; i++)
     TITLE = TITLE " " $i
   print "record #" NR ":"
   print "FIRST = '" FIRST "'"
   print "LAST = '" LAST "'"
   print "PHONE = '" PHONE "'"
   print "TITLE = '" TITLE "'"
}

Output:

Code:

record #1:
FIRST = 'John'
LAST = 'Smith'
PHONE = '5551111212'
TITLE = 'TECHNICAL LEAD DATABASE'
record #2:
FIRST = 'MICHAEL'
LAST = 'ANDERSON-KLEIN'
PHONE = '5555678765'
TITLE = 'ADMIN ASSISTANT'

LKBrwnDBA · Feb 21, 2012

Sorry mikrom, but it will not work if the first or last name contain spaces, and there are many out there...

----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

mikrom · Feb 21, 2012

I mixed spaces and tabs between the fields in the input file, and it still works for me. I get the output I posted above.

mikrom · Feb 21, 2012

LKBrwnDBA said:
Sorry mikrom, but it will not work if the first or last name contain spaces, and there are many out there...

Sorry, now I understand what you mean (English isn't my mother tongue).
Yes, you are right: in that case it will not work, but the file posted by the OP doesn't contain such a case.
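To make the failure case concrete, here is a small sketch with a made-up record (the name "Mary Ann" and the padding via printf are assumptions, not data from the OP). It compares whitespace splitting with the fixed column positions from the first post:

```shell
# Hypothetical record: the first name "Mary Ann" contains a space.
# printf pads each field to the widths used in the thread (12, 18, 10, 22).
record=$(printf '%-12s%-18s%-10s%-22s' 'Mary Ann' 'Smith' '5551111212' 'TECHNICAL LEAD')

# Whitespace splitting: $2 is the second word of the first name.
by_field=$(printf '%s\n' "$record" | awk '{ print $2 }')

# Fixed columns: the last name is still found at position 13.
by_column=$(printf '%s\n' "$record" | awk '{
    last = substr($0, 13, 18)
    sub(/[ \t]+$/, "", last)   # strip trailing blanks and tabs
    print last
}')

echo "by field:  $by_field"
echo "by column: $by_column"
```

With this record the field-based version picks up "Ann" as the last name, while the column-based version returns "Smith".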

LKBrwnDBA · Feb 22, 2012

mikrom said:
..., but the file posted by OP doesn't contain such case.

True, but that is why we are experts, we look ahead for all possibilities.

mikrom · Feb 22, 2012

Then I would try what PHV suggested - i.e. something like this:

Code:

{
   FIRST=trim(substr($0,1,12))
   LAST=trim(substr($0,13,18))
   PHONE=trim(substr($0,31,10))
   TITLE=trim(substr($0,41,22))
   print "record #" NR ":"
   print "FIRST = '" FIRST "'"
   print "LAST = '" LAST "'"
   print "PHONE = '" PHONE "'"
   print "TITLE = '" TITLE "'"
}
#----------------------------------------------------------
function trim(str) {
  # remove blanks from beginning
  gsub(/^[ \t]+/,"", str)
  # remove blanks from end
  gsub(/[ \t]+$/,"", str)
  return str
}
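
If the record has many more fields, one possible variation is to keep the names and widths in one place and loop over them instead of writing one substr line per field. This is only a sketch: the name and width lists, the bracket-style output and the sample line are assumptions made for illustration.

```shell
# Hypothetical sample line, padded to the widths from the thread.
record=$(printf '%-12s%-18s%-10s%-22s' 'MICHAEL' 'ANDERSON-KLEIN' '5555678765' 'ADMIN ASSISTANT')

out=$(printf '%s\n' "$record" | awk '
BEGIN {
    # Field names and widths in record order (assumed layout).
    n = split("FIRST LAST PHONE TITLE", name, " ")
    split("12 18 10 22", width, " ")
}
{
    pos = 1
    for (i = 1; i <= n; i++) {
        value = trim(substr($0, pos, width[i]))
        print name[i] " = [" value "]"
        pos += width[i]        # next field starts right after this one
    }
}
function trim(str) {
    sub(/^[ \t]+/, "", str)    # leading blanks and tabs
    sub(/[ \t]+$/, "", str)    # trailing blanks and tabs
    return str
}')

echo "$out"
```

Adding a field then means extending the two lists in BEGIN. Because both patterns are anchored, sub is enough here; gsub gives the same result.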

reallyawkward · Feb 23, 2012

The TITLE field is not the only field which contains data. The data I presented here are just an example.
The actual data record is more than 100 bytes in length.

reallyawkward · Feb 23, 2012

Sorry. I should have read this before I posted it. What I really wanted to say is:
The TITLE field is not the only field which contains spaces. The data I presented here are just an example.
The actual data record is more than 100 bytes in length.

mikrom · Feb 23, 2012

reallyawkward,
OK, but tell us whether the code we posted works for you or not.

reallyawkward · Feb 23, 2012

I had the awk statement embedded within a bash script.
What would the whole script look like if I actually created it as an awk script?
Assume that the data file is called phone.data

Sorry, but as a bash scripter I am still awkward (sorry) around awk scripts.

Annihilannic · Feb 23, 2012

Normally it's fine to embed awk scripts within other shell scripts using awk 'awk script here' inputfile. However, in this case it might get quite messy, because your script contains single quotes and would require some escaping.
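
One way around the escaping, sketched here under the assumption that the quotes are only needed in the printed output, is to hand the single quote to awk as a variable. The variable name q and the two-field sample line are made up for this example.

```shell
# Pass a single quote in as an awk variable, so the program text itself
# needs no single quotes and can stay inside one pair of shell quotes.
out=$(printf '%-12s%-18s\n' 'John' 'Smith' | awk -v q="'" '
{
    first = substr($0, 1, 12)
    sub(/[ \t]+$/, "", first)    # strip trailing blanks and tabs
    print "FIRST = " q first q
}')

echo "$out"
```

This should print FIRST = 'John' with the quotes supplied by q.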

You can place your script, like mikrom's solution above, in a file, say mikrom.awk, and either invoke it using awk -f mikrom.awk inputfile, or add a shebang line such as #!/usr/bin/awk -f to the beginning, make the file executable with chmod +x, and invoke it using /path/to/mikrom.awk inputfile.

When it is a pure awk script, hard-coding the input file is a little messy because you need to open and read the file manually... or maybe stuff the input filename into the positional parameters before processing.
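
As a rough sketch of the "stuff the filename into the positional parameters" idea: a BEGIN block can set ARGV and ARGC when no input file was given on the command line. The file name phone.awk and the two padded sample records are assumptions; phone.data is the name from the question.

```shell
# Hypothetical data file with two records padded to the thread widths.
printf '%-12s%-18s%-10s%-22s\n' 'John' 'Smith' '5551111212' 'TECHNICAL LEAD' > phone.data
printf '%-12s%-18s%-10s%-22s\n' 'MICHAEL' 'ANDERSON-KLEIN' '5555678765' 'ADMIN ASSISTANT' >> phone.data

# The quoted heredoc delimiter keeps the shell from touching $0 or the quotes.
cat > phone.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    # If no input file was given, fall back to phone.data.
    if (ARGC < 2) { ARGV[1] = "phone.data"; ARGC = 2 }
}
{
    print "LAST = '" trim(substr($0, 13, 18)) "'"
}
function trim(str) {
    sub(/^[ \t]+/, "", str)
    sub(/[ \t]+$/, "", str)
    return str
}
EOF

out=$(awk -f phone.awk)
echo "$out"
```

After chmod +x phone.awk the same file could also be run as ./phone.awk, or as ./phone.awk otherfile to override the default, provided awk really lives at /usr/bin/awk on that system.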

Annihilannic