Sorting nested columns 2

Status
Not open for further replies.

IMAUser

Technical User
May 28, 2003
121
CH
Hi,
I have a file with data like this:
FamilyP:
: ChildD
: ChildC
: ChildA
FamilyD:
: ChildC
: ChildB
FamilyA:
: ChildP
: ChildB

Is there a way to sort the data by family and, within each family, by child? I tried the sort command, but I could not find a way to tell it to sort on the first column and then, within that, on the second column. The output I need is below:

FamilyA:
: ChildB
: ChildP
FamilyD:
: ChildB
: ChildC
FamilyP:
: ChildA
: ChildC
: ChildD

Any help is much appreciated. If this cannot be done with shell scripting, can it be done using AWK?

Thanks.
 
The easiest way I could think of was to convert it to this format:

FamilyA: ChildB
FamilyA: ChildP
FamilyD: ChildB
FamilyD: ChildC
FamilyP: ChildA
FamilyP: ChildC
FamilyP: ChildD

... sort it, and then convert it back again. So:

Code:
awk '
        BEGIN { FS=OFS=":" }
        /^Fam/ { family=$1 }
        /^:/ { print family,$2 }
' input | sort -t: -k 1,1 -k 2,2 | awk -F: '
        $1 != last { print $1":" }
        { print ":" $2; last = $1 }
'


Annihilannic.
 
Sorry.

I copied the code into a script and called it one.sh

At the command prompt I ran

$ one.sh my_data_file.txt

That came back with an error saying
awk: can't open input.

So obviously I am doing something wrong there. Can you please tell me how I need to run that code?

Thanks.
 
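Replace the word 'input' in the script with $1. It sits outside the quoted awk program, so the shell expands $1 to the first argument on the command line, and your data filename ends up in the correct place.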
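As a rough sketch, one.sh might look like this after the change. The function name sort_families, the #!/bin/sh line, and the double quotes around $1 (so that filenames containing spaces still work) are my own additions for illustration, not part of the code posted above.

```shell
#!/bin/sh
# one.sh: sort families, and the children within each family.
# Usage: one.sh my_data_file.txt

# sort_families is a hypothetical helper name; $1 here is the
# first argument passed to the function, i.e. the data file.
sort_families() {
    awk '
        BEGIN { FS=OFS=":" }
        /^Fam/ { family=$1 }
        /^:/ { print family,$2 }
    ' "$1" | sort -t: -k 1,1 -k 2,2 | awk -F: '
        $1 != last { print $1":" }
        { print ":" $2; last = $1 }
    '
}

# Run only when a filename was given on the command line.
if [ "$#" -ge 1 ]; then
    sort_families "$1"
fi
```

Run it as before, for example: sh one.sh my_data_file.txt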

Annihilannic.
 
I am always amazed at the power of AWK. Thanks. It works very well.
 
Sorry, I understand bits of that script but not every line. Can you maybe add a comment for each line, please?
 
This part converts it to the format shown above:

Code:
awk '
        # Set input and output field separators to ":"
        BEGIN { FS=OFS=":" }
        # If the line begins with Fam, set family
        # variable to that value
        /^Fam/ { family=$1 }
        # If the line begins with :, print the
        # family and child name
        /^:/ { print family,$2 }
' input

This part sorts it first by the family field, then by the child field:

Code:
| sort -t: -k 1,1 -k 2,2
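To see the sort step on its own, here is a small sketch that feeds it a few "family: child" lines in scrambled order. The sample lines and the variable names flat and sorted are made up for the demonstration.

```shell
# Intermediate "family: child" lines, deliberately out of order
flat='FamilyP: ChildD
FamilyA: ChildP
FamilyP: ChildA
FamilyD: ChildC
FamilyA: ChildB'

# -t: makes ":" the field delimiter; -k 1,1 sorts on field 1 only,
# and -k 2,2 then breaks ties using field 2 only
sorted=$(printf '%s\n' "$flat" | sort -t: -k 1,1 -k 2,2)
printf '%s\n' "$sorted"
```

This should print the FamilyA lines first (ChildB, then ChildP), then FamilyD: ChildC, then the FamilyP lines (ChildA, then ChildD).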

And then this part converts it back to your original format (-F: is a shortcut to set the input separator only):

Code:
| awk -F: '
        # If the family name is not the 
        # same as the last one printed,
        # print it.
        $1 != last { print $1":" }
        # Print the child name, store
        # the last family name printed.
        { print ":" $2; last = $1 }
'
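The regrouping step can also be tried in isolation. This sketch pipes three already-sorted lines through the same awk program; the sample data and the variable names sorted and regrouped are only for illustration.

```shell
# Sorted "family: child" lines, as the sort step would produce them
sorted='FamilyA: ChildB
FamilyA: ChildP
FamilyD: ChildB'

regrouped=$(printf '%s\n' "$sorted" | awk -F: '
        # New family: print its heading once
        $1 != last { print $1":" }
        # Always print the child, then remember the family
        { print ":" $2; last = $1 }
')
printf '%s\n' "$regrouped"
```

The heading FamilyA: appears once, followed by its two children, and then FamilyD: with its one child. The variable last starts out empty, which is why the very first family heading is always printed.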


Annihilannic.
 
That's very helpful. Thanks very much.
 
newLisp:
Code:
(while (read-line)
  (if (ends-with (current-line) ":")
    (push (list (current-line)) lst)
    (push (current-line) lst 0 -1)))
(sort lst)
(dolist (fam lst)
  (println (first fam))
  (println (join (sort (rest fam)) "\n")))
Run it with
Code:
newlisp famsort.lsp <infile
 
Looks pretty nifty, this newLisp stuff... even if it does go a bit overboard with the brackets. :)

Annihilannic.
 