awk - find - extract 2


Cimm

Hi, I am wondering how you guys would solve this with awk.

I have a messages file.
Every day it gets a line like this:

"Jun 22 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"

The date needs to be dynamic, though,
i.e. DATE=`date '+%b %e'`

The problem is that this messages file is over 180,000 lines long.

I would like to extract a very small part of it: the line above (with today's date instead)
and the 30-50 lines after it.

Does anyone know a nifty way to do this without creating several tmp files?

Any help is appreciated.
Thanks
 
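Code:
#!/usr/bin/perl
use strict;
use warnings;

# `date` prints e.g. "Mon Jun 27 09:09:26 EDT 2005"
my $today = `date`;
chomp $today;
print "today's date   : $today\n";

# keep only the month and day, e.g. "Jun 27"
$today =~ s/^[A-Z][a-z]{2} ([A-Z][a-z]{2} +\d+) .*$/$1/;
print "date 'cleaned' : $today\n\n";

my @lines = <DATA>;

# print each matching line plus the 4 lines that follow it
for my $x (0 .. $#lines) {
  next unless $lines[$x] =~ m/\Q$today\E/;
  for my $y ($x .. $x + 4) {
    last if $y > $#lines;   # do not run past the end of the data
    print $lines[$y];
  }
}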

__DATA__
"Jun 12 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 13 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 14 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 15 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 16 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 17 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 18 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 19 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 20 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 21 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 22 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 23 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 24 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 25 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 26 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 27 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 28 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 29 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 30 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"


Kind Regards
Duncan
 
Thanks for the reply, but it isn't exactly what I am looking for.

The specified line is the one I want to grep for.
The 30-50 lines after it are what I want in a separate file. That information is completely different from the line itself.

Basically: grep for the line, start from there, and write the next 30 lines to another file,
which will then be sent by mail.

 
I'm not familiar with the output of the Unix "date" command, so its parsing may need adjustment.
Code:
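BEGIN {
  cnt = 30                  # number of lines to print
  FS = "[^A-Za-z0-9]+"      # split fields on anything that is not a letter or digit
  "date" | getline          # read the output of date into $0
  date = $2 " " $3          # month and day, e.g. "Jun 27"
}

# range pattern: print from the first line containing the date
# until the counter reaches zero
index( $0, date ), !--cnt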
 
I can't get this code to work, probably because I don't understand it.

awk 'BEGIN {
cnt = 30
FS = "[^A-Za-z0-9]+"
"date" | getline
date = $2 " " $3
}' /logs/messages

index( $0, date ), !--cnt


date output = Mon Jun 27 09:09:26 EDT 2005
But the log lines start like "Jun 28 21:04:49" (see the posts above).

To make the date output look the same I use
`date '+%b %e'`, which gives output like "Jun 28".

I get this error when running the script

awk: syntax error near line 4
awk: illegal statement near line 4
./mail_backup_log: line 23: syntax error near unexpected token `$0,'
./mail_backup_log: line 23: `index( $0, date ), !--cnt'


line 23

19 "date" | getline
20 date = $2 " " $3
21 }' /logs/messages
22
23 index( $0, date ), !--cnt
24



Any suggestions?
 
I'd try this:
awk 'BEGIN {
cnt = 30
FS = "[^A-Za-z0-9]+"
"date" | getline
date = $2 " " $3
}
index( $0, date ), !--cnt
' /logs/messages


Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
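
For the original question (today's marker line plus the next 30 or so lines), one possible variant is a countdown that restarts on every marker line, rather than a range pattern. This is only a sketch: the function name print_after and its arguments are made up for illustration, it assumes a POSIX awk that accepts -v (on Solaris that probably means nawk or /usr/xpg4/bin/awk), and it assumes each log line starts with the month and day as in the samples above. Comparing $1 and $2 instead of the raw text should also cope with the extra space syslog puts in front of single-digit days.

```shell
# print_after LOGFILE MARKER [COUNT] [STAMP]   (hypothetical helper)
# Prints every line for the given day that contains MARKER, plus the
# COUNT lines after it. COUNT defaults to 30; STAMP defaults to the
# current date in syslog form, e.g. "Jun 27" or "Jun  7".
print_after() {
  awk -v marker="$2" -v count="${3:-30}" -v stamp="${4:-$(date '+%b %e')}" '
    BEGIN { split(stamp, d, " ") }     # d[1] = month, d[2] = day
    # a line for that day holding the marker (re)starts the countdown
    $1 == d[1] && $2 == d[2] && index($0, marker) { left = count + 1 }
    left > 0 { print; left-- }         # the marker line, then COUNT more
  ' "$1"
}
```

Something like print_after /logs/messages 'savegroup: UNIX Daily' 30 would then write the block to stdout, so it can be piped straight into a mail command without tmp files.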
 
Thanks, I guess it's still too early for me.

OK, I see I'm having trouble explaining my situation here.
The above code does the same thing as

grep "Jun 27" /logs/messages , but I get over 3000 lines of logs per day, so I can't mail that output.

I am trying to extract a small part of a log.
I have been looking at the log, and the only thing that is the same every day is

Jun 26 20:00:01 ....... starting UNIX Daily ...........
Jun 26 21:02:39 ....... UNIX Daily completed ........

I want to extract everything between those lines in the logfile.

Anyone know a way to do that?

 
Perhaps something like this?
awk 'BEGIN {
"date" | getline
date = $2 " " $3
}
!index( $0, date ){next}
/starting UNIX Daily/,/UNIX Daily completed/
/UNIX Daily completed/{exit}
' /logs/messages

Hope This Helps, PH.
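
A hedged variation on the same idea, for anyone reusing it: the date is passed in from the shell, and the marker patterns allow more than one space between words, since a doubled space in the log is enough to make an exact pattern miss. The function name between_markers and its arguments are invented for this sketch. It assumes a POSIX awk with -v and log lines that begin with month and day.

```shell
# between_markers LOGFILE [STAMP]   (hypothetical helper)
# Prints the lines of one day from "starting UNIX Daily" through
# "UNIX Daily completed", inclusive. STAMP defaults to the current
# date in syslog form, e.g. "Jun 27".
between_markers() {
  awk -v stamp="${2:-$(date '+%b %e')}" '
    BEGIN { split(stamp, d, " ") }           # d[1] = month, d[2] = day
    $1 != d[1] || $2 != d[2] { next }        # skip lines from other days
    /starting +UNIX +Daily/  { inside = 1 }  # " +" tolerates doubled spaces
    inside                   { print }
    /UNIX +Daily +completed/ { if (inside) exit }  # stop after the end marker
  ' "$1"
}
```

Called as between_markers /logs/messages, it stops reading as soon as the end marker has been printed, so the rest of a large file is not scanned.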
 
PHV,
the code looks good, but I don't get any output.

I have changed the date in the log file from Jun 26 to Jun 27.
 
Never mind, PHV. Apparently there was a double space,
so it didn't find the starting line.

Thanks a lot for the help.
 
While I am at it:
is there a similar way to extract by time frame instead?

Say from 19:00 to 21:00 (same date), in case there is no common pattern for each day. The code you gave will be very useful for extracting from other logs, but for some logs the only thing I know is that the job starts and finishes within that time window.

Once again, thanks a ton.
 
Replace this:
/starting UNIX Daily/,/UNIX Daily completed/
/UNIX Daily completed/{exit}
with something like this:
$3>="19:00:00" && $3<="21:00:00"
$3>"21:00:00"{exit}

Hope This Helps, PH.
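
Put together as a complete script, the time-frame version might look like the sketch below. The function name time_window and its arguments are hypothetical. It assumes a POSIX awk with -v, a log in chronological order, and lines of the form "Mon DD HH:MM:SS ...", so that $3 is the time and plain string comparison orders the times correctly.

```shell
# time_window LOGFILE FROM TO [STAMP]   (hypothetical helper)
# Prints the lines of one day whose time field lies between FROM and TO
# (both HH:MM:SS, inclusive). STAMP defaults to the current date in
# syslog form, e.g. "Jun 27".
time_window() {
  awk -v from="$2" -v to="$3" -v stamp="${4:-$(date '+%b %e')}" '
    BEGIN { split(stamp, d, " ") }      # d[1] = month, d[2] = day
    $1 != d[1] || $2 != d[2] { next }   # skip lines from other days
    $3 > to { exit }                    # past the window: stop reading
    $3 >= from                          # inside the window: print the line
  ' "$1"
}
```

For the example in the question that would be time_window /logs/messages 19:00:00 21:00:00.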
 