perl regex match carriage return

Status
Not open for further replies.

allensim81

Technical User
Apr 9, 2008
15
MY
Hi,
I want to grab the following, but I have tried many times and it doesn't work. Can you please guide me step by step? Thanks in advance.

HTTrack3.43-5+libhtsjava.so.2 launched on Thu, 23 Jul 2009 11:23:17 at +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
(webhttrack -q -%i -w -O "/data/websites/RTM" -n -%P -N0 -s2 -p7 -D -a -K0 -c4 -%k -r2 -%e2 -A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2006], %s -->" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -%s -%u )



## Grab from the file content ##
#print "[$buf]\n";
if ($buf=~/\(webhttrack (.*?) \)[\r\n]/) {
$command="$httrackpath/httrack $1";
print "[$command]";
}
 
Code:
#!/usr/bin/perl

use warnings;
use strict;

my $httrackpath = q{/Some/Path};
my $buf = q{HTTrack3.43-5+libhtsjava.so.2 launched on Thu, 23 Jul 2009 11:23:17 at http://www.rtm.net.my +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
(webhttrack -q -%i -w http://www.rtm.net.my -O "/data/websites/RTM" -n -%P -N0 -s2 -p7 -D -a -K0 -c4 -%k -r2 -%e2 -A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2006], %s -->" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -%s -%u )};

#if ($buf=~/\(webhttrack (.*?) \)[\r\n]/) {
if ($buf =~ m{\(webhttrack (.*?) \)$}) {
    my $command="$httrackpath/httrack $1";
    print "[$command]";
}


Trojan.
 
I'm not sure whether that helps, because I really have no clue what you're trying to do, but it does appear to match what you wanted (I think!).
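
The difference between the commented-out pattern and the `$`-anchored one can be checked directly. The following is a minimal sketch, not the poster's script: the log line is shortened, and the names `$needs_eol`, `$end_anchor` and `matches` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, hypothetical stand-in for the log line quoted in the thread.
my $line = '(webhttrack -q -%i -w -O "/data/websites/RTM" -%s -%u )';

# Original pattern: requires a literal CR or LF right after " )".
my $needs_eol  = qr{\(webhttrack (.*?) \)[\r\n]};
# Suggested pattern: "$" matches at the end of the string,
# or just before a single trailing newline.
my $end_anchor = qr{\(webhttrack (.*?) \)$};

# Return 1 if the pattern matches the text, 0 otherwise.
sub matches {
    my ($re, $text) = @_;
    return $text =~ $re ? 1 : 0;
}

my @cases = (
    [ 'no line ending', $line ],
    [ 'LF',             "$line\n" ],
    [ 'CRLF',           "$line\r\n" ],
);

for my $case (@cases) {
    my ($label, $text) = @$case;
    print $label, ': char-class pattern = ', matches($needs_eol, $text),
          ', $ pattern = ', matches($end_anchor, $text), "\n";
}
```

In this sketch the character-class pattern fails when the buffer ends directly after `)`, while the `$` pattern fails when the line ends in CRLF, because `$` only tolerates a trailing `\n`. Which case applies depends on how `$buf` is filled, which the thread does not show.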



Trojan.
 
Silly question, but have you "chomped" your input data?
I would expect that to be the first thing you do to each record but with such a small slice of code and data to play with I could be very wrong.
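
To illustrate the "chomp" suggestion, here is a rough sketch of reading the log one record at a time and removing the line ending before matching. It assumes the log is read line by line; the in-memory log text and the helpers `strip_eol` and `extract_args` are hypothetical. It uses a substitution rather than plain `chomp` because `chomp` with the default `$/` removes only `\n` and would leave a `\r` from a CRLF file in place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing LF or CRLF from a record.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Hypothetical helper: return the webhttrack arguments, or undef if the
# record does not end with a "(webhttrack ... )" group.
sub extract_args {
    my ($line) = @_;
    return $line =~ m{\(webhttrack (.*?) \)\z} ? $1 : undef;
}

# In-memory stand-in for the log file (shortened, with CRLF line endings).
my $log = 'HTTrack3.43-5+libhtsjava.so.2 launched on Thu, 23 Jul 2009 11:23:17' . "\r\n"
        . '(webhttrack -q -%i -w -O "/data/websites/RTM" -%s -%u )' . "\r\n";

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
while (my $line = <$fh>) {
    my $args = extract_args(strip_eol($line));
    print "[$args]\n" if defined $args;
}
close $fh;
```

Once the line ending is gone, the pattern can be anchored with `\z` (the true end of the string) and no longer needs to care whether the file used LF or CRLF.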



Trojan.
 
Hi,
Actually, the following is from a log file:
HTTrack3.43-5+libhtsjava.so.2 launched on Thu, 23 Jul 2009 11:23:17 at +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
(webhttrack -q -%i -w -O "/data/websites/RTM" -n -%P -N0 -s2 -p7 -D -a -K0 -c4 -%k -r2 -%e2 -A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2006], %s -->" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -%s -%u )

I am required to write a Perl script to retrieve whatever appears before the closing bracket in the log file above. The URL address will be the parameter.

The following is my Perl code:

#if ($buf=~/\(webhttrack (.*?) \)[\r\n]/) {
if ($buf =~ m{\(webhttrack (.*?) \)$}) {
my $command="$httrackpath/httrack $1";
print "[$command]";
}

Can you please help me? Thanks :-)
 
In the example above, what are you expecting to see in $command ?

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Hi, in $command I expect to see:

httrack -q -%i -w -O "/data/websites/RTM" -n -%P -N0 -s2 -p7 -D -a -K0 -c4 -%k -r2 -%e2 -A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2006], %s -->" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -%s -%u

That is, the Perl I wrote should be able to capture everything above: all the parameters from the log file.

Thanks for your guidance.
 
This is the Perl that I wrote:

if ($buf =~ m{\(webhttrack (.*?) \)$}) {
my $command="$httrackpath/httrack $1";
print "[$command]";
}

I think something is wrong, which is why it cannot capture everything. Can you please guide me? Thanks.
 
My problem now is that when I use this Perl script:

if ($buf =~ m{\(webhttrack (.*?) \)$}) {
my $command="$httrackpath/httrack $1";
print "[$command]";
}

it gives me this result:
[/usr/local/bin/httrack -q -%i -w -O "/data/websites/RTM" -n -%P -N0 -s2 -p7 -D -a -K0 -c4 -%k -r2 -%e2 -A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2006], %s -->" +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -%s -%u][Majlis Daerah Limbang]

But in fact I want the result to end at -%u]; I don't want the [Majlis Daerah Limbang].

Can you guide me on this? Thanks in advance.
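
One possible reading of that output, offered as an assumption because the rest of the script is not shown: since `print "[$command]"` writes the closing `]` immediately after `$1`, the `]` that follows `-%u` suggests the capture already stops where intended, and `[Majlis Daerah Limbang]` may come from a different print statement that runs afterwards without a newline in between. The sketch below uses a shortened log line and a hypothetical helper `build_command`; the second print is only a stand-in for whatever produces the extra text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $httrackpath = '/usr/local/bin';   # path as it appears in the output quoted above

# Hypothetical helper: build the httrack command from one log record,
# or return undef when the record holds no "(webhttrack ... )" group.
sub build_command {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                              # drop LF or CRLF
    return unless $line =~ m{\(webhttrack (.*?) \)\z};
    return "$httrackpath/httrack $1";
}

# Shortened stand-in for the log record in the thread.
my $line    = '(webhttrack -q -%i -w -O "/data/websites/RTM" -%s -%u )' . "\n";
my $command = build_command($line);

# A newline after each bracketed value keeps separate prints visibly apart.
print "[$command]\n" if defined $command;
print "[Majlis Daerah Limbang]\n";    # stand-in for the assumed second print
```

If the two values still appear inside a single pair of brackets after this change, the extra text is part of the capture, and the content of `$buf` is the next thing to inspect.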
 