Sort Help Request 2

Status
Not open for further replies.

Kipnep70

Technical User
Nov 18, 2003
81
US
This is the type of data I'm working with:

GUEST
DN: apple
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E203-SLS-4
GUEST
DN: grape
DS: G-WE01-E20B-SLS-4
GUEST
DN: banana
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E200-SLS-4


I have three records here; the "GUEST" line can serve as my record separator.

Basically I'd like to organize my records by the line that begins with "DS" and include the "DN" data.

Desired output:


G-WE01-E20A-SLS-4
apple
banana

G-WE01-E20B-SLS-4
apple
grape

G-WE01-E200-SLS-4
banana

In other words I'm trying to sort this so I can see all the "DN" names affiliated with each "DS".




 
A starting point:
Code:
awk '/^DN:/{dn=$2}/^DS:/{t[$2]=t[$2]"\n"dn}END{for(i in t)print i t[i]}' /path/to/input

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
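One thing to be aware of with the one-liner above: awk does not define the order in which `for (i in t)` visits keys, so the DS groups can come out in any order. A minimal sketch of one way around that, assuming an extra `sort` step is acceptable: emit `DS DN` pairs, sort them, and print a header whenever the DS changes. This is a different technique from the array approach, and the function name `group_by_ds` is only for illustration.

```shell
# Hypothetical helper: reads GUEST/DN/DS records on stdin and prints
# each DS followed by its DN names, with the DS groups in sorted order.
group_by_ds() {
    awk '/^DN:/ { dn = $2 } /^DS:/ { print $2, dn }' |
    sort |
    awk '$1 != prev { if (NR > 1) print ""; print $1; prev = $1 }
         { print $2 }'
}

# Sample data copied from the question.
group_by_ds <<'EOF'
GUEST
DN: apple
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E203-SLS-4
GUEST
DN: grape
DS: G-WE01-E20B-SLS-4
GUEST
DN: banana
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E200-SLS-4
EOF
```

The blank line is printed before every group except the first, so the output ends without a trailing empty line.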
 
Thanks PHV .. I'm still a novice when it comes to associative arrays.
 
One more question that I can't figure out..

There's one more field I'd like to track in the equation..

GUEST
DN: apple
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E203-SLS-4
GI: windows
GUEST
DN: grape
DS: G-WE01-E20B-SLS-4
GI: unix
GUEST
DN: banana
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E200-SLS-4
GI: Solaris


I'd like the output to display the GI field as well....

So:

G-WE01-E20A-SLS-4
apple - windows
banana - Solaris

G-WE01-E20B-SLS-4
apple - windows
grape - unix

G-WE01-E200-SLS-4
banana - Solaris



This is what I came up with:

awk '/^DN:/{dn=$2}/^GI:/{os=$2}/^DS:/{t[$2]=t[$2]"\n"dn" - "os}END{for(i in t)print "Datastore: "i t[i]"\n"}'

But it's not associating the os with the correct record.
 
So I ended up trying to rearrange the lines so the last line was moved to the first line of each record (separated by GUEST).

It works, but it's long and ugly:

sed -n -e 's/GUEST/Q/p' -e '/^DN/p' -e '/^DS/p' -e '/^GI/p' /tmp/vStorage | awk 'BEGIN{RS="Q"}{print "GI: "$6,$0}'|awk 'BEGIN{RS=""}{del=NF-1;$NF="";$del="";print}'|sed 's/: /:/g' |tr ' ' '\n' | sed '/^$/d'| sed 's/:/: /g' |awk '/^DN:/{dn=$2}/^GI:/{os=$2}/^DS:/{t[$2]=t[$2]"\n"dn" - "os}END{for(i in t)print "Datastore: "i t[i]"\n"}'
 
Actually I updated this from..

....awk 'BEGIN{RS="Q"}{print "GI: "$6,$0}......

to

....awk 'BEGIN{RS="Q"}{print "GI: "$NF,$0}.......
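The misassociation in the shorter one-liner comes from line order: within each record the DS lines arrive before the GI line, so `os` still holds the previous record's value when `t[$2]` is built. A sketch of one way to avoid rearranging the input, assuming every record has exactly one GI line and that it follows the record's DS lines: buffer the DS values and attach the OS once GI is reached. The function name `group_with_os` and the array `ds` are hypothetical.

```shell
# Assumption: each record has one GI line, and it comes after the record's DS lines.
group_with_os() {
    awk '
    /^DN:/ { dn = $2; nds = 0 }     # new record: remember the name, empty the DS buffer
    /^DS:/ { ds[++nds] = $2 }       # hold DS values until the OS is known
    /^GI:/ { for (k = 1; k <= nds; k++) t[ds[k]] = t[ds[k]] "\n" dn " - " $2 }
    END    { for (i in t) print "Datastore: " i t[i] "\n" }
    '
}

# Sample data copied from the question; group order in the output is not guaranteed.
group_with_os <<'EOF'
GUEST
DN: apple
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E203-SLS-4
GI: windows
GUEST
DN: grape
DS: G-WE01-E20B-SLS-4
GI: unix
GUEST
DN: banana
DS: G-WE01-E20A-SLS-4
DS: G-WE01-E200-SLS-4
GI: Solaris
EOF
```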

 
I would do something like this:

Code:
awk -F ': ' '
        /^DN/ { dn=$2 }
        /^DS/ { dshosts[$2]=dshosts[$2]dn" " }
        /^GI/ { os[dn]=$2 }
        END {
                for (ds in dshosts) {
                        print ds
                        n=split(dshosts[ds],a,/ /)
                        for (i=1;i<n;i++) { print a[i]" - "os[a[i]] }
                        print ""
                }
        }
' inputfile


Annihilannic
tgmlify - code syntax highlighting for your tek-tips posts
 
Can you help me understand how the populating of dshosts[$2] works...

this is what I understand:

If I have:

DN: castle
DS: disk1
DS: disk2
GI: windows

then

/^DS/ { dshosts[$2]=dshosts[$2]dn" " }

really means

/^DS/ { dshosts[disk1]=disk1castle" " }
/^DS/ { dshosts[disk2]=disk2castle" " }

then when I print the values

I would expect something like

for(i in dshosts){print i dshosts[i]}

disk1 disk1castle
disk2 disk2castle

but I'm seeing
disk1 castle
disk2 castle

I don't quite understand.


Thanks
 
Kipnep70 said:
really means

/^DS/ { dshosts[disk1]=disk1castle" " }
/^DS/ { dshosts[disk2]=disk2castle" " }

No, that's not quite correct. I'm using that syntax to append host names to a space-separated list of hosts.

Assuming no hosts have been encountered previously for "disk1", then dshosts[disk1] will be empty, so it would really mean:

Code:
/^DS/ { dshosts[disk1]=""castle" " }

Therefore dshosts[disk1] now contains the word "castle" followed by a space.

Then when you encounter another host, say "fortress" containing "disk1", it would become:

Code:
/^DS/ { dshosts[disk1]="castle "fortress" " }
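To see the append in action, here is a small sketch that prints what each dshosts entry holds once the input has been read. The helper name `show_dshosts` and the brackets around the value are just for illustration; the brackets make the trailing space visible.

```shell
# Hypothetical helper: prints each DS key with its accumulated host string in brackets.
show_dshosts() {
    awk -F ': ' '
    /^DN/ { dn = $2 }
    /^DS/ { dshosts[$2] = dshosts[$2] dn " " }   # old value (empty at first) + name + space
    END   { for (ds in dshosts) printf "%s=[%s]\n", ds, dshosts[ds] }
    ' | sort
}

show_dshosts <<'EOF'
DN: castle
DS: disk1
DS: disk2
DN: fortress
DS: disk1
EOF
# disk1=[castle fortress ]
# disk2=[castle ]
```

The key (disk1) never becomes part of the stored value; only the host names are concatenated.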

I use this method instead of a doubly-subscripted array because in awk it's really awkward (hehe) to iterate through such an array. Instead I just string 'em all together and then split() them up again later.
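For comparison, a rough sketch of what the doubly-subscripted version might look like in POSIX awk, where `seen[ds, dn]` is really one flat key joined with SUBSEP. The names `pairs_with_os`, `seen` and `part` are made up for this example, and the output is deliberately a flat list of pairs because grouping them per DS is the awkward part.

```shell
# Hypothetical helper: one output line per (DS, DN) pair, tab-separated, sorted.
pairs_with_os() {
    awk -F ': ' '
    /^DN/ { dn = $2 }
    /^DS/ { seen[$2, dn] = 1 }      # stored under the single key $2 SUBSEP dn
    /^GI/ { os[dn] = $2 }
    END {
        for (key in seen) {         # visits every pair; POSIX awk cannot loop over just one DS
            split(key, part, SUBSEP)
            print part[1] "\t" part[2] " - " os[part[2]]
        }
    }
    ' | sort
}

pairs_with_os <<'EOF'
DN: castle
DS: disk1
DS: disk2
GI: windows
DN: fortress
DS: disk1
GI: unix
EOF
```

Getting from this flat list back to one heading per DS needs either a second pass or the sorted output, which is why the string-and-split() trick tends to be shorter.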

This is a case when I often switch to perl...

Annihilannic
 
thank you so much for the explanation... I feel almost embarrassed now for not seeing that. I really appreciate your help.
 