rammalamma
Technical User
awk '
FNR==1 { FileCount++ }
{ rpm[$0]++ }
END { for (r in rpm) if (rpm[r] == FileCount) print r }
' file_*.txt
Can anyone explain this awk script to me? I have a directory of files, and each file lists the rpms installed on a specific machine in the cluster. When I run this script it gives me the intersection: the rpms that exist on all machines.
It seems pretty simple, but I have no idea how this script works. I would also like to adjust it to give me the union of all the rpms.
Here is what I understand so far:
FNR is the ordinal number of the current record in the current file.
FileCount++ increments a counter, but I'm not sure exactly when.
rpm[$0]++ updates an array, but I don't know what $0 is.
And this line is a total mystery:
END { for (r in rpm) if (rpm[r] == FileCount) print r }
What is "r in rpm", and what is r?
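For reference, here is a minimal sketch of the script with comments, plus one possible way to get the union. The sample package names, the two sample files, and the `tmpdir`, `intersection`, and `union` variables are made up for illustration; the intersection logic also assumes that no package is listed twice within a single file. Output is piped through `sort` because awk does not guarantee any particular order for `for (r in rpm)`.

```shell
# Hypothetical sample data: package lists for two machines (names are made up).
tmpdir=$(mktemp -d)
printf 'bash\ncoreutils\nvim\n'   > "$tmpdir/file_a.txt"
printf 'bash\ncoreutils\nemacs\n' > "$tmpdir/file_b.txt"

# Intersection: keep lines whose count equals the number of files read.
intersection=$(awk '
FNR == 1 { FileCount++ }   # FNR resets to 1 on the first line of each file
{ rpm[$0]++ }              # $0 is the whole current line, used as the array key
END { for (r in rpm) if (rpm[r] == FileCount) print r }   # r takes each key in turn
' "$tmpdir"/file_*.txt | sort)

# Union: print every distinct line seen in any file, whatever its count.
union=$(awk '
{ rpm[$0]++ }
END { for (r in rpm) print r }
' "$tmpdir"/file_*.txt | sort)

printf 'intersection:\n%s\n' "$intersection"
printf 'union:\n%s\n' "$union"
rm -rf "$tmpdir"
```

With the real data, the same two awk programs would be run against `file_*.txt` in the directory holding the per-machine lists.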