I've been tasked with fixing someone else's VB.Net application, and this is the first time I've ever used regex. I understand some of the basics, but I'm having trouble doing what I need to do. I have to parse an email header and extract the email addresses, along with the "From", "To", and "CC" header names, so that I can determine which addresses fall under which header (i.e. which email addresses are in the "From" line, which are in the "To" line, and which are in the "CC" line).
So far, I have this:
Code:
(?i)(from:|to:|cc:).*([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})
The email pattern was written by someone else in my company, and it seems to work for what I need. I just added (?i)(from:|to:|cc:).* to the front of it.
I'm having two problems:
[ol]
[li][ignore]I'm getting all the text between the header and the email address, in addition to the two patterns I want (e.g. From: "John Doe" <john.doe@email.com). I need to get only the "From:" and the "john.doe@email.com" parts. I know that .* is causing this, but when I left it out, I got no results at all.[/ignore][/li]
[li]I'm only getting one email address per header, even though a header can contain multiple addresses.[/li]
[/ol]
I think if I can get only the patterns I need, then I can do the rest as far as sorting out which addresses are recipients, which are senders, and which are carbon copies.
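One direction I'm considering, as a rough sketch rather than a confirmed fix: instead of one combined pattern, match each header line first, then run the existing email pattern over just that line's value, so every address gets collected under its header. The names HeaderAddressSketch, HeaderPattern, and ExtractAddresses, and the sample addresses, are made up for illustration. It also assumes each header sits on a single line, so folded (multi-line) headers are not handled.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HeaderAddressSketch
    ' Pass 1 (hypothetical pattern): a header name at the start of a line,
    ' followed by the rest of that line. Folded headers are NOT handled.
    Private Const HeaderPattern As String = "^(from|to|cc):(.*)$"

    ' Pass 2: the existing email pattern from the post, unchanged.
    Private Const EmailPattern As String = "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})"

    ' Returns header name -> every address found on that header's line(s).
    Function ExtractAddresses(ByVal headerText As String) As Dictionary(Of String, List(Of String))
        ' Case-insensitive keys so "From" and "FROM" land in the same list.
        Dim result As New Dictionary(Of String, List(Of String))(StringComparer.OrdinalIgnoreCase)

        ' Multiline makes ^ and $ match at line boundaries;
        ' IgnoreCase replaces the inline (?i).
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Multiline

        For Each headerMatch As Match In Regex.Matches(headerText, HeaderPattern, options)
            Dim name As String = headerMatch.Groups(1).Value
            Dim value As String = headerMatch.Groups(2).Value

            If Not result.ContainsKey(name) Then
                result(name) = New List(Of String)()
            End If

            ' Regex.Matches returns every non-overlapping match,
            ' so multiple addresses on one line are all collected.
            For Each emailMatch As Match In Regex.Matches(value, EmailPattern)
                result(name).Add(emailMatch.Value)
            Next
        Next

        Return result
    End Function

    Sub Main()
        ' Made-up sample header text for illustration only.
        Dim sample As String = _
            "From: ""John Doe"" <john.doe@email.com>" & Environment.NewLine & _
            "To: jane@example.org, ""Bob"" <bob@example.net>" & Environment.NewLine & _
            "Subject: hello"

        For Each entry As KeyValuePair(Of String, List(Of String)) In ExtractAddresses(sample)
            Console.WriteLine(entry.Key & ": " & String.Join(", ", entry.Value.ToArray()))
        Next
    End Sub
End Module
```

If I understand it correctly, splitting the work this way avoids the .* between the header name and the address altogether, which is what seems to be swallowing the extra text in the combined pattern.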
Any help would be much appreciated.
Jas