Stripping White Space from Fields 2

Status
Not open for further replies.

reallyawkward

Technical User
Feb 17, 2012
4
CA
Hi,
I am reading in a file which consists of fixed-length records, with white space and tabs separating the fields. Some fields have white space within them.
Here are 2 sample records:
----+----|----+----|----+----|----+----|----+----|----+----|----+---
John Smith 5551111212 TECHNICAL LEAD DATABASE
MICHAEL ANDERSON-KLEIN 5555678765 ADMIN ASSISTANT
I am currently reading the records as follows:
{
FIRST=substr($0,1,12)
LAST=substr($0,13,18)
PHONE=substr($0,31,10)
TITLE=substr($0,41,22)
}

How can I strip the trailing spaces from each field?

Thanks in advance.

Really Awkward
;-)
 
A starting point:
Code:
FIRST=substr($0,1,12); sub(/[ \t]*$/,"",FIRST)

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
reallyawkward,
You are making it a bit more complicated than it needs to be.
Instead of using substr on $0, use the fields ($1, $2, ...), and you don't have to deal with the white space at all. For example:
Code:
{
   FIRST=$1
   LAST=$2
   PHONE=$3
   TITLE = $4
   # get rest of columns into TITLE
   for (i=5; i<=NF; i++)
     TITLE= TITLE " " $i
   #
   print "record #" NR ":"
   print "FIRST = '" FIRST "'"
   print "LAST = '" LAST "'"
   print "PHONE = '" PHONE "'"
   print "TITLE = '" TITLE "'"
}
Output:
Code:
record #1:
FIRST = 'John'
LAST = 'Smith'
PHONE = '5551111212'
TITLE = 'TECHNICAL LEAD DATABASE'
record #2:
FIRST = 'MICHAEL'
LAST = 'ANDERSON-KLEIN'
PHONE = '5555678765'
TITLE = 'ADMIN ASSISTANT'
 

Sorry mikrom, but it will not work if the first or last name contains spaces, and there are many out there...
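To make the point concrete, here is a small sketch. The record is made up for illustration (it is not from the OP's file), and the 12 + 18 + 10 + 22 column layout is only assumed from the substr calls in the first post.

```shell
# Hypothetical record: the first and last names both contain a space.
# Layout assumed from the thread: 12 + 18 + 10 + 22 columns.
record=$(printf '%-12s%-18s%-10s%-22s' 'Mary Ann' 'Van Der Berg' '5550001111' 'ADMIN ASSISTANT')

# Whitespace splitting: $2 is the second word of the first name.
by_field=$(printf '%s\n' "$record" | awk '{ print $2 }')

# Fixed columns: the whole last name survives; sub() strips the trailing blanks.
by_column=$(printf '%s\n' "$record" | awk '{ s = substr($0, 13, 18); sub(/[ \t]+$/, "", s); print s }')

echo "by field:  $by_field"
echo "by column: $by_column"
```

With whitespace splitting the "last name" comes out as Ann, while the column-based version returns the complete name.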

----------------------------------------------------------------------------
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
 
I mixed spaces and tabs between the fields in the input file, and it still works for me. I get the output I posted above.
 
LKBrwnDBA said:
Sorry mikrom, but it will not work if the first or last name contains spaces, and there are many out there...
Sorry, now I understand what you mean (English isn't my mother tongue).
Yes, you are right: in that case it will not work, but the file posted by the OP doesn't contain such a case.
:)
 
mikrom said:
..., but the file posted by the OP doesn't contain such a case.
True, but that is why we are experts: we look ahead for all possibilities.
 
Then I would try what PHV suggested - i.e. something like this:
Code:
{
   FIRST=trim(substr($0,1,12))
   LAST=trim(substr($0,13,18))
   PHONE=trim(substr($0,31,10))
   TITLE=trim(substr($0,41,22))
   print "record #" NR ":"
   print "FIRST = '" FIRST "'"
   print "LAST = '" LAST "'"
   print "PHONE = '" PHONE "'"
   print "TITLE = '" TITLE "'"
}
#----------------------------------------------------------
function trim(str) {
  # remove blanks from beginning
  gsub(/^[ \t]+/,"",str)
  # remove blanks from end
  gsub(/[ \t]+$/,"",str)
  return str
}
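For a record with many more fields, one possible variation is to keep the widths in a list and loop over them instead of writing one substr line per field. This is only a sketch: the widths and field names below are assumed from the four fields shown in this thread, the sample record is made up, and the brackets in the output are just there to make the trimming visible.

```shell
# Hypothetical sample record; widths 12, 18, 10, 22 are assumed from the thread.
record=$(printf '%-12s%-18s%-10s%-22s' 'John' 'Smith' '5551111212' 'TECHNICAL LEAD')

result=$(printf '%s\n' "$record" | awk '
BEGIN {
    # One entry per field; extend both lists for longer records.
    nf = split("12 18 10 22", width, " ")
    split("FIRST LAST PHONE TITLE", name, " ")
}
{
    pos = 1
    for (i = 1; i <= nf; i++) {
        value = substr($0, pos, width[i])
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim both ends
        printf "%s=[%s]\n", name[i], value
        pos += width[i]
    }
}')

echo "$result"
```

Adding a field then means adding one width and one name, and the start positions are computed rather than typed by hand.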
 
The TITLE field is not the only field which contains data. The data I presented here are just an example.
The actual data record is more than 100 bytes in length.
 
Sorry. I should have read this before I posted it. What I really wanted to say is:
The TITLE field is not the only field which contains spaces. The data I presented here are just an example.
The actual data record is more than 100 bytes in length.
 
reallyawkward,
OK, but tell us whether the code we posted works for you or not.
 
I had the awk statement embedded within a bash script.
What would the whole script look like if I actually created it as an awk script?
Assume that the data file is called phone.data

Sorry, but as a bash scripter I am still awkward (sorry) around awk scripts.

:)
 
Normally it's fine to embed awk scripts within other shell scripts using awk 'awk script here' inputfile; however, in this case it might get quite messy because your script contains single quotes, which would require some escaping.
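If you do want to keep it embedded, one way around the escaping (a sketch, not the only option) is to pass the quote character in as an awk variable, so the program text between the shell's single quotes contains none. The variable name q and the sample record are made up for illustration.

```shell
# Hypothetical sample record; the 12-column FIRST field is assumed from the thread.
record=$(printf '%-12s%-18s' 'John' 'Smith')

# q holds a single quote, so the awk program itself contains none.
first=$(printf '%s\n' "$record" | awk -v q="'" '{
    s = substr($0, 1, 12)
    sub(/[ \t]+$/, "", s)
    print "FIRST = " q s q
}')

echo "$first"
```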

You can place your script, like mikrom's solution above, in a file, say mikrom.awk, and either invoke it using awk -f mikrom.awk inputfile or add a shebang line such as #!/usr/bin/awk -f to the beginning and invoke it using /path/to/mikrom.awk inputfile.

When it is a pure awk script, hard-coding the input file is a little messy because you need to open and read the file manually... or maybe stuff the input filename into the positional parameters before processing.
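As a rough sketch of the last idea: a BEGIN block can put a default filename into ARGV when none was given on the command line. The script name phone.awk and the sample data are made up here, phone.data is the name from the question, and the scratch directory is only there so the example does not overwrite anything.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical data in the assumed 12/18/10/22 layout.
printf '%-12s%-18s%-10s%-22s\n' 'John' 'Smith' '5551111212' 'TECHNICAL LEAD' > phone.data

# The quoted delimiter keeps the shell from touching $0 or the single quotes.
cat > phone.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    # With no file on the command line, fall back to phone.data.
    if (ARGC < 2) { ARGV[1] = "phone.data"; ARGC = 2 }
}
{
    FIRST = trim(substr($0, 1, 12))
    LAST  = trim(substr($0, 13, 18))
    print "FIRST = '" FIRST "', LAST = '" LAST "'"
}
function trim(str) {
    gsub(/^[ \t]+|[ \t]+$/, "", str)
    return str
}
EOF

output=$(awk -f phone.awk)
echo "$output"
```

After chmod +x phone.awk it could also be run as ./phone.awk, or as ./phone.awk otherfile to override the default, assuming awk really lives at /usr/bin/awk on your system.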

Annihilannic