Hi,
I'm writing a pretty involved spam filter in awk ... so far so good, for the most part.
Basically, the program reads in a queued mail file, stores each line in an array:
tempMail[i++] = $0
... then scans each line against several spam regexps, and then in the END portion of the script, loops through the temp array and writes out to a new file.
While writing out to this new temporary file, each line is checked to see if it's the "Subject" header ... if it is, two lines are thrown into the temporary file: the overall (X-spam-score) score of the spam tests, and a second header (X-spam-failed) listing out codes for the tests the message failed. AFTER printing out those two lines, the matched line (Subject: *) is then written out, and the program continues writing out lines to the file.
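To make the description concrete, here is a stripped-down sketch of the store-then-rewrite logic, wrapped in a shell snippet so it can be run as-is. It is not the actual filter: the sample message, the score and failed values, the n counter, and the done flag (which limits the insertion to the first Subject line) are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical five-line message; the real input is a queued mail file.
out=$(printf '%s\n' 'From: a@example.com' 'Subject: hello' '' 'body line 1' 'body line 2' |
awk -v score=5 -v failed='A1,B2' '
{ tempMail[n++] = $0 }                 # store every input line, 0-based
END {
    for (j = 0; j < n; j++) {          # indexed loop keeps the original order
        if (!done && tempMail[j] ~ /^Subject:/) {
            print "X-spam-score: " score     # placeholder values, see -v above
            print "X-spam-failed: " failed
            done = 1                         # insert before the first Subject only
        }
        print tempMail[j]
    }
}')
printf '%s\n' "$out"
```

With this sketch, the output has exactly two more lines than the input, and the last line of the message is still present.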
Finally, the temp file is moved to overwrite the original file in the queue, and then the mail is delivered.
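For reference, a minimal sketch of that write-and-move step, assuming the move is done from inside awk with system() (it could just as well be done by a wrapper script). The paths, the tmp and orig variable names, and the sample message are hypothetical. In awk, output sent to a file with print > is buffered until the file is closed or the program exits, so the sketch calls close() before running mv.

```shell
#!/bin/sh
dir=$(mktemp -d)                        # scratch directory for the demo
printf '%s\n' 'Subject: hi' '' 'line 1' 'line 2' > "$dir/msg"

awk -v tmp="$dir/msg.tmp" -v orig="$dir/msg" '
{ tempMail[n++] = $0 }                  # store every input line
END {
    for (j = 0; j < n; j++)
        print tempMail[j] > tmp         # write each stored line to the temp file
    close(tmp)                          # flush and close before the file is moved
    system("mv \"" tmp "\" \"" orig "\"")
}' "$dir/msg"

cat "$dir/msg"
```

After the run, the original path holds all four lines and the temp file no longer exists.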
THE PROBLEM: when I write out the two X-spam headers, I lose the last two lines of the original message!
I've done some bug testing here and looped through withOUT printing out my two X-spam headers ... and when I don't print those out, I get all of the lines of the original file from the array.
Any ideas on why that would be happening, and how I might prevent it?
Thanks in advance -- once the script is final and bug-tested, I plan to release it publicly. (If anyone here is using CommuniGate Pro on a UNIX system and would like to beta-test, please let me know.)
Program tested on: gawk 3.1.0, mawk 1.3.3
-Clay