Please explain following awk 1

Status
Not open for further replies.

Kipnep70

Technical User
Nov 18, 2003
81
US
I know what it does, but I don't understand the syntax:

Code:
awk '!x[$0]++'

How does this remove duplicate lines from a file?
 
x[$0]++ increments the value of a hash member (an element of the associative array x) indexed by the contents of the current line, $0. So the first time a line is encountered, its entry changes from "undefined" (or 0 as far as awk is concerned) to 1. The second time a matching line is encountered, it increases to 2, etc.
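To see the counting on its own, here is a small sketch that prints each distinct line with its final count. The sample input (a, b, a, c, b, a) is made up for illustration, and the output is piped through sort only because awk's for-in loop visits array keys in no guaranteed order.

```shell
# Count how many times each line occurs, using the same x[$0]++ idiom.
# The input lines are illustrative, not from the original post.
counts=$(printf 'a\nb\na\nc\nb\na\n' |
    awk '{ x[$0]++ } END { for (line in x) print line, x[line] }' |
    sort)
echo "$counts"
# a 3
# b 2
# c 1
```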

Because the ++ is appended to the variable, the increment occurs after the expression is evaluated, so the expression yields the old value. If it were ++x[$0], the increment would happen before the expression was evaluated, and the expression would yield the new value.
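A quick way to check the difference between the two forms is to capture the value each one yields. This is only a sketch; the variable names n and v are arbitrary. The last command shows what the pre-increment form does to the one-liner: every count is already at least 1 when it is tested, so nothing is printed.

```shell
# Post-increment: v receives the old value, then n is incremented.
post=$(awk 'BEGIN { n = 0; v = n++; print v, n }')
echo "$post"    # 0 1

# Pre-increment: n is incremented first, then v receives the new value.
pre=$(awk 'BEGIN { n = 0; v = ++n; print v, n }')
echo "$pre"     # 1 1

# With pre-increment the negated test is never true, so no line is printed.
none=$(printf 'a\nb\na\n' | awk '!(++x[$0])')
echo "[$none]"  # []
```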

You'll notice there is no code in { braces } after the expression. This means that awk will take the default action of printing the input line whenever the expression is "true".
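The default action is easy to confirm with any other expression. In this sketch the condition NR != 2 (skip the second line) and the three input lines are just stand-ins; the point is that the pattern alone behaves the same as the pattern followed by an explicit { print $0 }.

```shell
# Pattern with no action: awk prints each line for which the pattern is true.
implicit=$(printf 'one\ntwo\nthree\n' | awk 'NR != 2')

# The same program with the default action written out.
explicit=$(printf 'one\ntwo\nthree\n' | awk 'NR != 2 { print $0 }')

echo "$implicit"
# one
# three
[ "$implicit" = "$explicit" ] && echo "same output"
```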

Placing ! in front of the expression negates it: the overall result is "true" when the contents of the hash are undefined (i.e. 0), meaning this is the first time the line has been seen, and "false" when it contains any other value (i.e. an identical line has been encountered previously). In the latter case, nothing is printed.
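Putting the pieces together, the one-liner can be spelled out in long form. This is one possible expansion rather than the only one, and the fruit names are sample data invented for the comparison.

```shell
# Sample input with two duplicated lines (illustrative data).
input='apple
pear
apple
fig
pear'

# The original one-liner.
short=$(printf '%s\n' "$input" | awk '!x[$0]++')

# Long form: print the line only if its count is still 0, then increment.
long=$(printf '%s\n' "$input" | awk '{ if (x[$0] == 0) print $0; x[$0]++ }')

echo "$short"
# apple
# pear
# fig
[ "$short" = "$long" ] && echo "same output"
```

Both versions keep the first occurrence of each line and preserve the original order, which is why no sorting is needed.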

That's a lot of explanation for 8 characters of awk script. :)

Annihilannic.
 