Intersection of values

rammalamma · May 13, 2010

awk '
FNR==1 { FileCount++ }
{ rpm[$0]++ }
END { for (r in rpm) if (rpm[r] == FileCount) print r }
' file_*.txt

Can anyone explain this awk script to me? I have a directory of files; each file lists the RPMs installed on a specific machine in the cluster. When I run this script, it gives me a single list with the intersection of the RPMs, i.e. those that exist on all machines.

It seems pretty simple, but I have no idea how this script works. I would also like to adjust it to give me the union of all the RPMs.

FNR is the ordinal number of the current record in the current file.

FileCount is being incremented each time through.

rpm[$0]++ uses an array, but I don't know what $0 is.

And this line is a total mystery:
END { for (r in rpm) if (rpm[r] == FileCount) print r }

What is "r in rpm"? What is r?

PHV · May 13, 2010

man awk

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886

Annihilannic · May 17, 2010

Code:

awk '
        # Increment file count each time we hit the first record of an input
        # file
        FNR==1 { FileCount++ }
        # Increment the count of instances of this RPM found ($0 contains
        # entire contents of input line)
        { rpm[$0]++ }
        # When all input files have been processed...
        END {
                # For each index of the rpm array
                for (r in rpm) {
                        # If that RPM occurs as many times as there are input
                        # files, print it.
                        if (rpm[r] == FileCount) print r
                }
        }
' file_*.txt
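
To watch the counting happen, it may help to run the same logic on two tiny made-up files and print every count instead of filtering. This is only a sketch: the temporary directory, the file names file_a.txt and file_b.txt, and the package names are hypothetical stand-ins for the real per-machine lists.

```shell
# Hypothetical sample data: two machines, one package list each.
dir=$(mktemp -d)
printf 'bash\nglibc\nvim\n' > "$dir/file_a.txt"
printf 'bash\nglibc\nzsh\n' > "$dir/file_b.txt"

# Same counting logic as the script above, but print every count
# next to the number of files instead of filtering on it.
counts=$(awk '
FNR==1 { FileCount++ }
{ rpm[$0]++ }
END { for (r in rpm) print r, rpm[r], "of", FileCount }
' "$dir"/file_*.txt | sort)

echo "$counts"
rm -rf "$dir"
```

With this sample data the output is "bash 2 of 2", "glibc 2 of 2", "vim 1 of 2" and "zsh 1 of 2", so only bash and glibc would pass the rpm[r] == FileCount test. The comparison assumes each RPM is listed at most once per file; a name repeated within one file would inflate its count.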

In awk, arrays can be indexed by strings. Other languages usually call these "hashes" or "associative arrays" rather than "arrays" (Perl, for example, calls them hashes). So rpm[] is an array of counts indexed by RPM name, and for (r in rpm) iterates through those indices, with r holding each RPM name in turn.
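
Since the original question also asked for the union, here is a minimal sketch of one way to get it, assuming the same one-RPM-per-line files. Every index of rpm[] was seen in at least one file, so printing all indices without the FileCount test yields the union. The sample files and package names are made up for illustration.

```shell
# Hypothetical sample data: two machines, one package list each.
dir=$(mktemp -d)
printf 'bash\nglibc\nvim\n' > "$dir/file_a.txt"
printf 'bash\nglibc\nzsh\n' > "$dir/file_b.txt"

# Union: record every line as an array index, then print all indices.
# No file counting is needed because nothing is filtered out.
union=$(awk '
{ rpm[$0]++ }
END { for (r in rpm) print r }
' "$dir"/file_*.txt | sort)

echo "$union"
rm -rf "$dir"
```

The order in which for (r in rpm) visits the indices is unspecified, hence the sort. For the union alone, sort -u file_*.txt should give the same list without awk.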

Annihilannic.

rammalamma · May 18, 2010

Thank you very much!!!

My brain doesn't hurt so much when I look at it now.
