scripting help (awk or sed)

Status
Not open for further replies.

tballs

Technical User
Jul 27, 2011
4
Hello,
I have a file that has something like this:

1245|123 CHURCH CT  STRATFORD CT 02323|$1234.56|MORE INFO
1246|234 GREEN ST  SALEM OR 97305|$11223.43|MORE INFO
1247|12 DAYLIGHT  LN  ALBANY OR 97321|$23.00|MORE INFO

I'm trying to format it as follows so I can use awk to pull fields and place them in a different order if I choose to do so:

number|address|city|st|zip|more info

There are 2 spaces between the address and city (or there should be). But if I key off the 2 spaces, the problem is that there are also sporadic double spaces elsewhere, like in "DAYLIGHT  LN". Any ideas?
Thanks in advance
 
What does your current code for splitting up that field look like?

My suggestion would be to work backwards: anything after the last occurrence of 2 spaces is the city/state/zip combination, and anything before it is the address.
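
As a rough sketch of that idea (not tested against your real file): in awk, a greedy ".*" followed by two spaces matches up to the last double space in the string, so RLENGTH tells you where the address ends. The function name split_last_double_space and the variables street and rest are made up for illustration, and I'm assuming the address is always the second "|"-delimited field.

```shell
# Hypothetical helper: split field 2 at its LAST double space.
# Reads records on stdin and writes them to stdout.
split_last_double_space() {
    awk 'BEGIN { FS = OFS = "|" }
    {
        addr = $2
        # Greedy ".*" takes as much as it can, so the two spaces
        # matched here are the last pair in the field.
        if (match(addr, /.*  /)) {
            street = substr(addr, 1, RLENGTH - 2)
            rest = substr(addr, RLENGTH + 1)
            sub(/ +$/, "", street)   # tidy up runs of 3+ spaces
            $2 = street OFS rest
        }
        print
    }'
}
# usage: split_last_double_space < inputfile
```

Records with no double space in field 2 pass through unchanged, which makes them easy to spot afterwards. The city/state/zip part is still one field at this point.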

Annihilannic.
 
Hello...
Currently I'm running a sed routine that changes each double space to |. The problem is that I found one record (there could be more) that has a double space in the address line somewhere other than between the address and the city/state.

sed -e 's/,/|/g' -e 's/ CT 0/|CT|0/g' -e 's/  /|/g' -e 1,5d HMDA2>HMDA3

I'm new to using awk and sed, so I'm not sure how I'd work backwards to accomplish this.
 
It's not straightforward, but here is a way to do it with awk (using the data in your original post as input):

Code:
awk '
        BEGIN { FS=OFS="|" }
        {
                r=$2
                f=""
                # match each double space, accumulating the
                # first part "f" and the remainder "r"
                while (match(r,"  ")) {
                        f=f substr(r,1,RSTART-1) "  "
                        r=substr(r,RSTART+RLENGTH)
                }
                # drop the double space left at the end of "f"
                sub(/ +$/,"",f)
                # remainder contains city/state/zip, insert
                # field separators
                gsub(/  */,OFS,r)
                # reinsert to original string and output
                $2=f"|"r
                print
        }
' inputfile
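
One thing to watch: splitting the remainder on every space breaks a city name with more than one word. If your data has any, a possible variation is to treat the last two words of the remainder as state and zip and everything before them as the city. This is only a sketch under that assumption. The function name split_city_state_zip, the variable names, and the "NEW YORK" record in the note below are invented for illustration.

```shell
# Hypothetical variant: address | city (may be several words) | st | zip.
# Assumes state and zip are always the last two words of field 2.
split_city_state_zip() {
    awk 'BEGIN { FS = OFS = "|" }
    {
        field = $2
        # Greedy match: RLENGTH ends at the last double space.
        if (match(field, /.*  /)) {
            street = substr(field, 1, RLENGTH)
            sub(/ +$/, "", street)
            tail = substr(field, RLENGTH + 1)
            sub(/ +$/, "", tail)
            n = split(tail, w, / +/)
            if (n >= 3) {
                # Everything except the last two words is the city.
                city = w[1]
                for (i = 2; i <= n - 2; i++) city = city " " w[i]
                $2 = street OFS city OFS w[n - 1] OFS w[n]
            }
        }
        print
    }'
}
# usage: split_city_state_zip < inputfile
```

With a made-up record such as 1250|9 ELM  ST  NEW YORK NY 10001|$5.00|MORE INFO this should give 1250|9 ELM  ST|NEW YORK|NY|10001|$5.00|MORE INFO. Records it can't parse are printed unchanged.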


Annihilannic.
 