PCHomepage
Programmer
I'm trying to parse the apache.log file and have a function working that does so, but I know little about regex and could not parse each line directly with a single pattern. I reviewed and tried just about every posting I could find here and on other sites, but nothing worked. Please see $pattern below; can anyone tell me how to parse each section without the ugly workarounds? This is for the local development copy of the log, which is truncated (without the last two columns), but I would like to use the same code to also parse the live NCSA combined log format. Any help is appreciated.
Code:
[COLOR=gray]function ParseLocalToScreen($path) {
global $output;
// Parses the local Windows Apache development Log Format lines:[/color]
// [bold]REF RAW LOG: 127.0.0.1 - - [27/Apr/2014:15:00:24 -0700] "GET / HTTP/1.1" 200 3051[/bold]
// [bold]REF PARSED OUTPUT: 127.0.0.1, -, -, 2014-04-27 15:00:24, GET, /, HTTP/1.1, 200, 3051[/bold]
[bold]$pattern = '/^(\S+)\s '; // Remote Host
$pattern .= '([^\s]+) '; // Log Name
$pattern .= '([^\s]+) '; // User
$pattern .= '\[(\d+)\/(\w+)\/(\d+):(\d{1,2}:\d{1,2}:\d{1,2} '; // Datetime WORKS SO-SO
$pattern .= '?[\+\-]?\d*)\] "(.*)/';[/bold] // Remainder
[COLOR=gray]if (is_readable($path)) :
$fh = fopen($path,'r') or die("Cannot open log file: $path"); // $php_errormsg is not available in current PHP versions
while (!feof($fh)) :
$s = fgets($fh);
if (preg_match($pattern,$s,$matches)) :
[bold]list($whole_match, $remote_host, $logname, $user, $day, $month, $year, $time, $remainder) = $matches;[/color]
$month = date('m', strtotime($month)); // Converts short month to numeric
$time = trim(substr($time,0,-6)); // Removes the trailing UTC offset, e.g. " -0700"
$replacements = array(' ', '"');
$remainder = str_replace($replacements, ', ', $remainder); // Removes extra space and quote
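// REGEX NOT WORKING: remove extra field, build datetime and other output for MySQL
// Output is built inside the if block so that non-matching lines do not reuse values from the previous line
$output .= str_replace(", , ", ", ", "$remote_host, $logname, $user, $year-$month-$day $time, $remainder<br>\n");
endif;[/bold]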
endwhile;
[COLOR=gray]fclose($fh);
echo $output;
else :
echo "Cannot access log file!";
endif;
}[/color]
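
One possible way to avoid the workarounds, offered as a sketch rather than a definitive answer: capture the whole bracketed timestamp as a single group and let PHP's DateTime parse it, then make the last two quoted fields (referer and user agent) optional so the same pattern covers both the truncated local log and the combined format. The function name parseLogLine and the array keys are hypothetical, not part of the original code. The sketch assumes the request field has the form METHOD PATH PROTOCOL and that quoted fields contain no escaped quotes; lines that break these assumptions simply return null.

```php
<?php
// Hypothetical helper: parses one line of NCSA common or combined log format.
// Returns an associative array, or null if the line does not match.
function parseLogLine(string $line): ?array
{
    $pattern = '/^(\S+) (\S+) (\S+) '        // 1-3: remote host, logname, user
             . '\[([^\]]+)\] '               // 4: timestamp including offset
             . '"(\S+) (\S+)(?: (\S+))?" '   // 5-7: method, path, protocol
             . '(\d{3}) (\d+|-)'             // 8-9: status, bytes ("-" if none)
             . '(?: "([^"]*)" "([^"]*)")?'   // 10-11: optional referer, user agent
             . '\s*$/';

    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    // Let DateTime handle the day/month-name/year and the UTC offset.
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[4]);
    if ($dt === false) {
        return null;
    }

    return [
        'host'     => $m[1],
        'logname'  => $m[2],
        'user'     => $m[3],
        'datetime' => $dt->format('Y-m-d H:i:s'), // MySQL-style, log's local time
        'method'   => $m[5],
        'path'     => $m[6],
        'protocol' => $m[7],
        'status'   => $m[8],
        'bytes'    => $m[9],
        'referer'  => $m[10] ?? null,             // absent in the truncated log
        'agent'    => $m[11] ?? null,
    ];
}

// Usage with the reference line from the post:
$row = parseLogLine('127.0.0.1 - - [27/Apr/2014:15:00:24 -0700] "GET / HTTP/1.1" 200 3051');
if ($row !== null) {
    echo implode(', ', array_filter($row, fn($v) => $v !== null)), "\n";
}
```

Inside the existing while loop, each line read by fgets could be passed to this function in place of the preg_match call plus the month, time and remainder clean-up steps.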