Remove duplicate lines except first and last

Status
Not open for further replies.

motoslide

MIS
Oct 30, 2002
I'm working on a script to monitor POP3 access to one of our servers. For each Client, I'd like to list the first and last instance for each day. The log files look like this:
Code:
Oct 10 08:26:28 aon3 ipop3d[1118]: Login user=jdoe host=jdoe
Oct 10 08:29:45 aon3 ipop3d[1151]: Login user=jdoe host=jdoe
Oct 10 17:08:59 aon3 ipop3d[2531]: Login user=jdoe host=jdoe
Oct 10 17:10:26 aon3 ipop3d[2532]: Login user=jdoe host=jdoe
Oct 11 07:02:07 aon3 ipop3d[3439]: Login user=jdoe host=jdoe
Oct 11 07:05:25 aon3 ipop3d[3440]: Login user=jdoe host=jdoe
Oct 11 15:21:31 aon3 ipop3d[4315]: Login user=jdoe host=jdoe
Oct 11 15:23:02 aon3 ipop3d[4321]: Login user=jdoe host=jdoe
Oct 13 08:47:44 aon3 ipop3d[6936]: Login user=jdoe host=jdoe
Oct 13 08:50:45 aon3 ipop3d[6937]: Login user=jdoe host=jdoe
Oct 13 22:51:19 aon3 ipop3d[7764]: Login user=jdoe host=jdoe
Oct 13 22:54:06 aon3 ipop3d[7765]: Login user=jdoe host=jdoe
Oct 14 09:05:06 aon3 ipop3d[8854]: Login user=jdoe host=jdoe
Oct 14 09:08:15 aon3 ipop3d[8857]: Login user=jdoe host=jdoe

I want the output to be:
Code:
Oct 10 08:26:28 aon3 ipop3d[1118]: Login user=jdoe host=jdoe
Oct 10 17:10:26 aon3 ipop3d[2532]: Login user=jdoe host=jdoe
Oct 11 07:02:07 aon3 ipop3d[3439]: Login user=jdoe host=jdoe
Oct 11 15:23:02 aon3 ipop3d[4321]: Login user=jdoe host=jdoe
Oct 13 08:47:44 aon3 ipop3d[6936]: Login user=jdoe host=jdoe
Oct 13 22:54:06 aon3 ipop3d[7765]: Login user=jdoe host=jdoe
Oct 14 09:05:06 aon3 ipop3d[8854]: Login user=jdoe host=jdoe
Oct 14 17:04:40 aon3 ipop3d[9792]: Login user=jane host=jane

I can eliminate all duplicate date entries using this command, which I found in this forum (thanks to Vlad):
sort -t" " -u -k1,2

Does anybody know how I can retain the FIRST and LAST "duplicate" lines?

 
A starting point, provided your log file is already sorted:
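awk '
# Month or day changed: print the previous day, then start a new one
$1!=f1 || $2!=f2{
if(NR>1){print x; if(n>1)print y}
f1=$1;f2=$2;x=$0;n=1;next
}
# Same day: keep overwriting y so it ends up holding the last line
{ y=$0; n++ }
# Flush the final day
END{if(NR>0){print x; if(n>1)print y}}
' /path/to/input > output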

Hope This Helps, PH.
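
Since the original question asks for the first and last login for each client per day, here is a rough sketch of a per-user variant. It assumes the user name is always the 7th whitespace-separated field (user=...), as in the sample log; the function name first_last_per_user is made up for illustration. Because it keeps everything in arrays, users do not have to be grouped together in the input.

```shell
#!/bin/sh
# first_last_per_user: hypothetical helper, not part of the thread.
# Reads syslog-style POP3 lines on stdin and prints, for each
# (month, day, user) combination, the first and the last line seen.
first_last_per_user() {
  awk '
  {
    key = $1 SUBSEP $2 SUBSEP $7      # month, day, "user=..." field (assumed position)
    if (!(key in first)) {            # first time this key appears
      first[key] = $0
      order[++n] = key                # remember arrival order for the output
    }
    last[key] = $0                    # overwritten until the final match
    count[key]++
  }
  END {
    for (i = 1; i <= n; i++) {
      k = order[i]
      print first[k]
      if (count[k] > 1) print last[k] # a single login is printed only once
    }
  }'
}

# Usage: first_last_per_user < /path/to/input > output
```

Output is grouped by the order in which each day/user pair first appears, so the first and last lines for one user stay next to each other even when other users log in between them.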
 
Perfect! My homework will be to understand how this works.
I have other files with similar needs, and will need to identify the duplicates on different fields. My guess is that I can just change the field labels as desired (say change the $1 & $2 values to $5 & $6).

Thanks much!
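
For the other files mentioned above, swapping the field numbers should work as long as the input is sorted on those fields. As a sketch, the same idea can take the key fields as a parameter instead of hard-coding them. The function name first_last_by_fields and the keys variable are invented for this example.

```shell
#!/bin/sh
# first_last_by_fields: hypothetical helper, not part of the thread.
# $1 is a comma-separated list of field numbers that define a "duplicate".
# Input (stdin) must already be sorted or grouped on those fields.
first_last_by_fields() {
  awk -v keys="$1" '
  BEGIN { nk = split(keys, kf, ",") }   # e.g. "5,6" becomes kf[1]=5, kf[2]=6
  {
    k = ""
    for (i = 1; i <= nk; i++) k = k SUBSEP $(kf[i] + 0)   # build the group key
    if (NR == 1 || k != prev) {          # a new group starts here
      if (NR > 1) { print x; if (n > 1) print y }
      prev = k; x = $0; n = 1
      next
    }
    y = $0; n++                          # same group: track the latest line
  }
  END { if (NR > 0) { print x; if (n > 1) print y } }'
}

# Usage: first_last_by_fields 5,6 < /path/to/input > output
```

Calling it with 1,2 gives the per-day grouping used for the POP3 log, and 5,6 corresponds to the change guessed at above.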
 