AWK - Sorting 1

Status
Not open for further replies.

nvhuser

Programmer
Apr 10, 2015
48
DE
Hello,
I would like to sort the following text:

Code:
GRID         112        1413.987-822.42180.59229
GRID         108         1453.57-820.634 79.0209
GRID         109        1457.868-820.76280.09176
GRID         111        1454.621-819.14574.98186
GRID         113        1408.708-822.44679.95545
GRID         110        1458.919-819.27476.05272
like this:

Code:
GRID         108         1453.57-820.634 79.0209
GRID         109        1457.868-820.76280.09176
GRID         110        1458.919-819.27476.05272
GRID         111        1454.621-819.14574.98186
GRID         112        1413.987-822.42180.59229
GRID         113        1408.708-822.44679.95545
How can I do this with awk, sorting on the field that occupies character columns 9 to 16?
Thank you
 
The Unix shell command sort can do this quite easily. Is there any reason why you need to use awk?
(For help with the sort command, type man sort at a Unix/Linux command prompt.)
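
A minimal sketch of that sort approach, assuming the ID is always the second whitespace-separated field, as it is in the sample. The function name sort_by_id is only for illustration:

```shell
# Sort GRID lines numerically by the second whitespace-separated field.
# -k2,2 restricts the key to field 2 only; the n suffix compares it as a number.
sort_by_id() {
    sort -k2,2n "$@"
}

# Example with three of the sample lines fed on standard input.
printf '%s\n' \
    'GRID         112        1413.987-822.42180.59229' \
    'GRID         108         1453.57-820.634 79.0209' \
    'GRID         109        1457.868-820.76280.09176' |
    sort_by_id
```

With a file, the call would be sort_by_id grids.txt. Note that this splits on whitespace rather than on fixed columns, so it relies on the ID never running into the following field.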

==================================
adaptive uber info galaxies (bigger, better, faster, and more adept than agile big data clouds)


 
Hi Johnherman,

There is no particular reason to use awk, but I would like to learn how to do it in this language (it seems like a suitable problem for awk).
By the way, could you post the solution using the Unix shell? Thank you.
 
Hi

It depends on which Awk implementation you have. GNU Awk provides its own extension functions for such tasks, but in other implementations you may need to code the sorting by hand. (man mawk has example code for this.)

You seem to need simple sorting by the text of each line, for which asort() is enough:
Code:
# Requires GNU Awk (gawk) for asort().
{
    a[NR] = $0              # store every input line
}
END {
    n = asort(a)            # sort the values; returns the element count
    for (i = 1; i <= n; i++)
        print a[i]
}

If you want to sort by the value of a certain column and the values are distinct, it is easy to solve with asorti():
Code:
# Requires GNU Awk (gawk) for asorti().
{
    a[$2] = $0              # index each line by its 2nd field
}
END {
    n = asorti(a, o)        # o[1..n] holds the indexes of a, sorted
    for (i = 1; i <= n; i++)
        print a[o[i]]
}

But if you want to sort by a column and there may be duplicated values, things become more complicated.
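
One way to handle duplicated values is to keep the key and the line in separate arrays and sort by hand. The following is only a sketch, not the example from man mawk: it assumes the key is the number in character columns 9 to 16 (as in the original question), uses a plain insertion sort, and sticks to POSIX awk features. The function name sort_by_cols_9_16 is made up for illustration.

```shell
# Sort lines by the number in character columns 9-16, keeping duplicates.
# Uses only POSIX awk features (no asort), with a hand-written insertion sort.
sort_by_cols_9_16() {
    awk '
    {
        line[NR] = $0                    # keep every line, even with repeated keys
        key[NR]  = substr($0, 9, 8) + 0  # columns 9-16, converted to a number
    }
    END {
        # Insertion sort: shift larger keys right, then drop the saved line in.
        for (i = 2; i <= NR; i++) {
            k = key[i]; l = line[i]
            for (j = i - 1; j >= 1 && key[j] > k; j--) {
                key[j + 1]  = key[j]
                line[j + 1] = line[j]
            }
            key[j + 1]  = k
            line[j + 1] = l
        }
        for (i = 1; i <= NR; i++)
            print line[i]
    }' "$@"
}

# Example: two lines share the ID 112 and both survive the sort.
printf 'GRID    %8d %s\n' 112 first 108 second 112 third | sort_by_cols_9_16
```

Because only strictly larger keys are shifted, lines with equal keys keep their input order. Insertion sort is quadratic, so for large files piping through the sort command is likely the better choice.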


Feherke.
feherke.ga
 
nvhuser said:
I would like to learn how to do it
What have you tried so far, and where in your code are you stuck?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thank you for your answer, feherke. Since I am not very experienced with this language, could you tell me how to run the script on the file "grids.txt"? Normally I use awk in the Linux terminal like this:

Code:
>> cat file.txt | awk '{printf("%s\n",$1)}'

Hi PHV, regarding your comment: I am learning awk in my free time and, being at an early stage, I don't have any code where I am stuck. For now, I would just like to ask some questions and see different approaches to solving a given problem. I think this is a valuable resource for me and for all the users of this forum.



 
Hi

Exactly like that, though you can avoid the UUOC (useless use of cat) by passing the file name to awk directly:
Code:
master # awk '{a[NR]=$0}END{n=asort(a);for(i=1;i<=n;i++)print a[i]}' grids.txt
GRID         108         1453.57-820.634 79.0209
GRID         109        1457.868-820.76280.09176
GRID         110        1458.919-819.27476.05272
GRID         111        1454.621-819.14574.98186
GRID         112        1413.987-822.42180.59229
GRID         113        1408.708-822.44679.95545

Or put the same program in a file, for example grids.awk, and call it like this:
Code:
master # awk -f grids.awk grids.txt
GRID         108         1453.57-820.634 79.0209
GRID         109        1457.868-820.76280.09176
GRID         110        1458.919-819.27476.05272
GRID         111        1454.621-819.14574.98186
GRID         112        1413.987-822.42180.59229
GRID         113        1408.708-822.44679.95545

Or ask the system where your Awk interpreter is (which awk) and add a shebang pointing to it as the first line of grids.awk:
Code:
#!/usr/bin/awk -f

{
    a[NR] = $0              # store every input line
}

END {
    n = asort(a)            # gawk only: sort the values, return the count
    for (i = 1; i <= n; i++)
        print a[i]
}
... then make the file executable (chmod +x grids.awk) and run it:
Code:
master # ./grids.awk grids.txt
GRID         108         1453.57-820.634 79.0209
GRID         109        1457.868-820.76280.09176
GRID         110        1458.919-819.27476.05272
GRID         111        1454.621-819.14574.98186
GRID         112        1413.987-822.42180.59229
GRID         113        1408.708-822.44679.95545

Of course, if you move (or symlink) grids.awk into one of the directories listed in your PATH environment variable, you will be able to run it without the ./ (or any other) path prefix.

Feherke.
feherke.ga
 
Thank you for your reply Feherke!

I still have some questions regarding your script:
1- Why don't I need to use "BEGIN" in this case?

2- How does the script know that I am sorting the lines by field 2?

 
Hi

nvhuser said:
1- Why don't I need to use "BEGIN" in this case?
Well, the code's logic requires nothing to be done before processing any input line.

In theory, a declaration of array a could go in the BEGIN block, but Awk has no variable declarations: the array comes into existence the first time it is used.
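
To illustrate when each kind of rule runs, here is a small sketch (the function name demo_begin_end and the messages are made up for the example). It uses only POSIX awk features:

```shell
# Show when each kind of awk rule runs (POSIX awk, no gawk extensions).
demo_begin_end() {
    awk '
    BEGIN { print "BEGIN: before any input" }   # optional setup goes here
          { a[NR] = $0 }                        # a is created on first use
    END   { print "END: stored " NR " lines" }  # runs after the last line
    ' "$@"
}

# Two input lines: prints the BEGIN message, then "END: stored 2 lines".
printf '%s\n' 'GRID 112' 'GRID 108' | demo_begin_end
```

A BEGIN block becomes useful once there is setup to do, such as printing a header or setting a field separator. The sorting scripts in this thread need neither.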

nvhuser said:
2- How does the script know that I am sorting the lines by field 2?
The first code I posted does not. It works for your sample input because the 1st column is identical on every line, so in this case sorting by whole lines and sorting by the 2nd column give the same result.

The second code I posted sorts by the 2nd column: array a is indexed by $2, and then the array is sorted not by values but by indexes. (As mentioned, indexing by $2 has the drawback that $2 has to be unique.)
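
The uniqueness drawback can be seen with a short sketch (the function name count_distinct_field2 and the sample lines are invented for the example). It needs only POSIX awk, since it just counts array elements:

```shell
# Index lines by field 2 and report how many array elements remain.
count_distinct_field2() {
    awk '
        { a[$2] = $0 }          # a later line with the same $2 overwrites the earlier one
    END {
        n = 0
        for (k in a) n++        # count the surviving elements
        print n
    }' "$@"
}

# Three input lines, but only two distinct IDs, so this prints 2.
printf '%s\n' 'GRID 108 first' 'GRID 108 second' 'GRID 109 third' |
    count_distinct_field2
```

Any line whose key repeats is silently lost before sorting even starts, which is why the asorti() version is only safe when $2 is unique.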


Feherke.
feherke.ga
 
Hi Feherke, thank you for your contribution and your detailed answers.
I will keep learning awk!
 