awk - find - extract 2

Cimm · Jun 24, 2005

Hi, I am wondering how you guys would solve this with awk.

I have a messages file.
Every day it gets a line like this:

"Jun 22 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"

The date needs to be dynamic, though,
i.e. DATE=`date '+%b %e'`

The problem is that this messages file is over 180,000 lines long.

I would like to extract a very small part of it, which starts with the above line (except that the date would be today's date)
and continues for 30-50 lines after it.

Does anyone know a nifty way to do it without creating several tmp files?

Any help is appreciated.
Thanks

duncdude · Jun 24, 2005

Code:

#!/usr/bin/perl

$today = `date`;
chomp $today;
print "today's date   : $today\n";

$today =~ s/^[A-Z][a-z]{2} ([A-Z][a-z]{2} +\d+) .*$/$1/;
print "date 'cleaned' : $today\n\n";

@lines = <DATA>;

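for ($x=0; $x<=$#lines; $x++) {
  if ($lines[$x] =~ m/\Q$today\E/) {
    # print the matching line and up to 4 lines after it,
    # without running past the end of the data
    for ($y=$x; $y<=$x+4 && $y<=$#lines; $y++) {
      print $lines[$y];
    }
  }
}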

__DATA__
"Jun 12 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 13 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 14 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 15 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 16 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 17 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 18 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 19 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 20 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 21 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 22 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 23 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 24 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 25 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 26 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 27 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 28 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 29 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"
"Jun 30 21:04:49 root: [ID 702911 daemon.notice] Sun StorEdge(TM) Enterprise Backup savegroup: UNIX Daily"

Kind Regards
Duncan

Cimm · Jun 24, 2005

Thanks for the reply, but it isn't exactly what I am looking for.

The specified line is what I would like to grep for.
And the lines after it (30-50) are what I want in a separate file. That information is completely different.

It's basically a grep for the "line", then starting from there a tail -30 of the next lines into another file,
which will be sent away via mail.

futurelet · Jun 24, 2005

I'm not familiar with the output of the Unix "date" command, so its parsing may need adjustment.

Code:

BEGIN {
  cnt = 30
  FS = "[^A-Za-z0-9]+"
  "date" | getline
  date = $2 " " $3
}

index( $0, date ), !--cnt
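
A note on the last line, for readers who have not met it before: `start, end` with no action is an awk range pattern, so every line from a match of `start` through the line where `end` becomes true is printed. Here `!--cnt` becomes true once the counter reaches zero, so the range is cnt lines long. The toy run below is only a sketch; it uses a count of 3 and a made-up marker `START` in place of the date string.

```shell
#!/bin/sh
# Toy input: START stands in for the date string (an assumption for this demo).
# The range opens on START and closes on the line where --cnt reaches 0.
out=$(printf '%s\n' a b START c d e f | awk -v cnt=3 '/START/, !--cnt')
printf '%s\n' "$out"
```

This prints the START line and the two lines after it. The counter is not reset when the range closes, so a later line that matches the start pattern again opens a range that never closes.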

Cimm · Jun 27, 2005

I can't get this code to work. Probably because I don't understand it.

awk 'BEGIN {
cnt = 30
FS = "[^A-Za-z0-9]+"
"date" | getline
date = $2 " " $3
}' /logs/messages

index( $0, date ), !--cnt

The date output is: Mon Jun 27 09:09:26 EDT 2005
But the log shows the date as in Jun 28 21:04:49 (see above posts).

To make the date output look the same, I use
`date '+%b %e'` which gives Jun 28

I get this error when running the script

awk: syntax error near line 4
awk: illegal statement near line 4
./mail_backup_log: line 23: syntax error near unexpected token `$0,'
./mail_backup_log: line 23: `index( $0, date ), !--cnt'

Line 23 is here:

19 "date" | getline
20 date = $2 " " $3
21 }' /logs/messages
22
23 index( $0, date ), !--cnt
24

Any suggestions?

PHV · Jun 27, 2005

I'd try this:
awk 'BEGIN {
cnt = 30
FS = "[^A-Za-z0-9]+"
"date" | getline
date = $2 " " $3
}
index( $0, date ), !--cnt
' /logs/messages

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
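
The shell error on line 23 in the failing script comes from the closing quote: it sits before the pattern line, so the shell tries to run `index( $0, date ), !--cnt` itself. The sketch below keeps the whole awk program inside one pair of single quotes and, as an alternative to parsing `date` inside awk, passes the date in with -v. The helper name `from_date` and the sample lines are made up for illustration, and it assumes each log line begins with the date.

```shell
#!/bin/sh
# Hypothetical helper: print the first line starting with DATE plus the
# lines after it, COUNT lines in total, then stop reading the file.
from_date() {   # usage: from_date DATE COUNT < logfile
  awk -v date="$1" -v cnt="$2" '
    !started && index($0, date) == 1 { started = 1 }
    started { print; if (--cnt <= 0) exit }
  '   # the closing quote comes only after the last awk statement
}

# Real use would take the date from the shell, for example:
#   from_date "$(date "+%b %e")" 30 < /logs/messages
out=$(printf '%s\n' 'Jun 26 x' 'Jun 27 a' 'Jun 27 b' 'Jun 27 c' | from_date "Jun 27" 2)
printf '%s\n' "$out"
```

Because it exits after COUNT lines, this variant does not read the rest of a large file.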

Cimm · Jun 27, 2005

Thanks, I guess it's still too early for me.

OK, I see I have some problem explaining my situation here.
The above code is doing the same thing as

grep "Jun 27" /logs/messages , but I get over 3000 lines of logs per day so I can't mail that output.

I am trying to extract a small part of a log.
I have been looking at the log and the only thing that is the same every day is

Jun 26 20:00:01 ....... starting UNIX Daily ...........
Jun 26 21:02:39 ....... UNIX Daily completed ........

I want to extract everything between those lines in the logfile.

Does anyone know a way to do that?

PHV · Jun 27, 2005

Perhaps something like this ?
awk 'BEGIN {
"date" | getline
date = $2 " " $3
}
!index( $0, date ){next}
/starting UNIX Daily/,/UNIX Daily completed/
/UNIX Daily completed/{exit}
' /logs/messages

Hope This Helps, PH.
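
To see what the three rules do, here is a small self-contained run of the same idea. It is a sketch under assumptions: the date is passed in with -v instead of being read from `date`, lines are assumed to begin with the date, and the helper name `daily_block` and the sample log are made up.

```shell
#!/bin/sh
# Hypothetical helper: print the lines for DATE from the "starting" marker
# through the "completed" marker, then stop.
daily_block() {   # usage: daily_block DATE < logfile
  awk -v date="$1" '
    index($0, date) != 1 { next }                  # skip other days
    /starting UNIX Daily/, /UNIX Daily completed/  # range pattern: print the block
    /UNIX Daily completed/ { exit }                # nothing more to read
  '
}

# Made-up sample log for illustration only.
sample='Jun 26 20:00:01 starting UNIX Daily
Jun 26 20:30:00 old detail
Jun 26 21:02:39 UNIX Daily completed
Jun 27 19:59:00 noise
Jun 27 20:00:01 starting UNIX Daily
Jun 27 20:30:00 detail
Jun 27 21:02:39 UNIX Daily completed
Jun 27 22:00:00 later noise'

out=$(printf '%s\n' "$sample" | daily_block "Jun 27")
printf '%s\n' "$out"
```

Only the three Jun 27 lines from the starting marker through the completed marker are printed.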

Cimm · Jun 27, 2005

PHV,
the code looks good, but I don't get any output.

I have modified the log file date to Jun 27 from Jun 26.

Cimm · Jun 27, 2005

Never mind, PHV. Apparently there was a double space.
Therefore it didn't find the starting part.

Thanks a lot for the help.
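
A related case worth keeping in mind: syslog typically pads single-digit days with an extra blank ("Jun  7"), while awk's default field splitting collapses runs of blanks, so `date = $2 " " $3` always has exactly one space. One way around spacing problems, sketched here with a made-up helper `lines_for_day`, is to compare the month and day fields instead of searching for a substring.

```shell
#!/bin/sh
# Hypothetical helper: keep lines whose first two fields are MONTH and DAY,
# however many blanks separate them.
lines_for_day() {   # usage: lines_for_day MONTH DAY < logfile
  awk -v mon="$1" -v day="$2" '$1 == mon && $2 + 0 == day + 0'
}

# Real use could split the shell date into two arguments, for example:
#   lines_for_day $(date "+%b %e") < /logs/messages
out=$(printf '%s\n' 'Jun  7 20:00:01 padded day' \
                    'Jun 7 20:00:02 single blank' \
                    'Jun 17 20:00:03 other day' | lines_for_day Jun 7)
printf '%s\n' "$out"
```

Both Jun 7 lines are kept and the Jun 17 line is dropped.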

Cimm · Jun 27, 2005

While I am at it:
Is there a similar way to do this but extract by time frame?

Like from 19:00 to 21:00 (same date), in case there is no similar pattern for each day. The code you gave will be very useful for extracting from other logs. But for some logs the only thing I know is that the job starts and finishes within that time frame.

Once again, thanks a ton.

PHV · Jun 27, 2005

Replace this:
/starting UNIX Daily/,/UNIX Daily completed/
/UNIX Daily completed/{exit}
with something like this:
$3>="19:00:00" && $3<="21:00:00"
$3>"21:00:00"{exit}

Hope This Helps, PH.
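
The comparison works because zero-padded HH:MM:SS strings sort in the same order as the times they represent, and awk compares them as strings. A self-contained sketch of the whole filter follows; it assumes the log is in time order and that the time is the third field, and the helper name `time_window` and the sample lines are made up.

```shell
#!/bin/sh
# Hypothetical helper: print lines for DATE whose time field (field 3,
# HH:MM:SS) lies between FROM and TO inclusive, then stop.
time_window() {   # usage: time_window DATE FROM TO < logfile
  awk -v date="$1" -v from="$2" -v to="$3" '
    index($0, date) != 1 { next }   # skip other days
    $3 >= from && $3 <= to          # inside the window: print
    $3 > to { exit }                # past the window: stop
  '
}

# Made-up sample log for illustration only.
sample='Jun 27 18:59:59 before
Jun 27 19:00:00 start
Jun 27 20:15:30 middle
Jun 27 21:00:00 end
Jun 27 21:00:01 after
Jun 27 23:00:00 much later'

out=$(printf '%s\n' "$sample" | time_window "Jun 27" 19:00:00 21:00:00)
printf '%s\n' "$out"
```

Only the lines stamped 19:00:00, 20:15:30 and 21:00:00 are printed.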

Cimm · Jun 27, 2005

Awesome!

Thanks PHV.
