Bear with me... I think my question is really about comparing base64-encoded strings...
In a mail parsing app, I've come across incoming mail where the In-Reply-To and References headers have been stripped out and replaced by a Microsoft/Outlook Thread-Index header. I need to start generating my own Thread-Index headers on outgoing mail so that, when I receive a reply without the standard threading headers, I can still match the reply to a thread using Thread-Index.
I found a PHP function which creates a valid Thread-Index header, and I am storing that header in a MySQL table. According to the function's author:
* These headers are base64 encoded 22-byte binary strings in the format:
* 6 bytes: The first 6 significant bytes from a FILETIME timestamp.
* 16 bytes: A unique GUID in hex.
So... a Thread-Index header value apparently looks like this:
Code:
AdH1tsVUVHkXt/ZLS4eksRmXC4Q5Ig==
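To check my understanding of that layout, here is a minimal Python sketch that decodes a value and splits it the way the author's comment describes. The function name `split_thread_index` is made up for illustration, and the 6 + 16 byte split is taken from the quoted comment, not from any Microsoft documentation I have verified.

```python
import base64

HEADER_LEN = 22  # 6 timestamp bytes + 16 GUID bytes, per the quoted comment


def split_thread_index(value: str):
    """Decode a Thread-Index value and split it into its assumed parts."""
    raw = base64.b64decode(value)
    header, suffixes = raw[:HEADER_LEN], raw[HEADER_LEN:]
    timestamp = header[:6]   # leading bytes of the FILETIME timestamp
    guid = header[6:]        # the 16-byte GUID
    return timestamp, guid, suffixes


timestamp, guid, suffixes = split_thread_index("AdH1tsVUVHkXt/ZLS4eksRmXC4Q5Ig==")
print(timestamp.hex(), guid.hex(), len(suffixes))
```

For the value above this gives 6 timestamp bytes, 16 GUID bytes and no suffix bytes, which at least agrees with the 22-byte description.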
Outlook appends 5-byte suffixes to subsequent thread members, so a thread reply would be coded like so:
Code:
AdH1tsVUVHkXt/ZLS4eksRmXC4Q5IgAiTOHA
Note that the first 30 characters are the same (AdH1tsVUVHkXt/ZLS4eksRmXC4Q5Ig), but the reply has dropped the two original trailing equals signs and added the characters AiTOHA.
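One thing I noticed about comparing characters: base64 packs 3 bytes into 4 characters, so in a reply the 30th character holds the last 2 bits of byte 22 plus the top 4 bits of the first suffix byte. It stays "g" in my examples because that suffix byte happens to be 0x00 in both replies, and I don't know whether that always holds. Decoding and comparing the first 22 bytes might be a safer test. A minimal Python sketch of that idea (the names `thread_root` and `same_thread` are hypothetical):

```python
import base64

HEADER_LEN = 22  # assumed length of the original (root) Thread-Index


def thread_root(value: str) -> bytes:
    """Return the first 22 decoded bytes, which replies appear to share."""
    raw = base64.b64decode(value)
    if len(raw) < HEADER_LEN:
        raise ValueError("Thread-Index is shorter than 22 bytes")
    return raw[:HEADER_LEN]


def same_thread(a: str, b: str) -> bool:
    """Compare two Thread-Index values on their decoded 22-byte prefix."""
    return thread_root(a) == thread_root(b)


original = "AdH1tsVUVHkXt/ZLS4eksRmXC4Q5Ig=="
reply = "AdH1tsVUVHkXt/ZLS4eksRmXC4Q5IgAiTOHA"
print(same_thread(original, reply))
```

If this holds up, the hex form of `thread_root(...)` could be stored in its own indexed column so that the MySQL lookup becomes a plain equality match. That extra column is my assumption, not something the function I found provides.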
What I really need to be able to do is match up emails containing the original thread index using a MySQL query, but I don't understand what's going on with the base conversions and encoding in the PHP function.
Do you think it's safe to just match on the first 30 characters of the thread index in order to identify messages from the same thread? I'd be grateful for any advice or suggestions!
More examples of Thread-Index values:
Code:
AAAAHomOX/VlopU2wo+fFE1Bko39Cw==
AAABMV+Taic/ZZdmYJJfphDCqDHr3A==
AAABMV8OU5mSg/7oV6a8xlfEQ7kf5w==
AAABMV8PU5mSg/7oV6a8xlfEQ7kf5w==
AAABMV8U4KsZxQheHAiU3/alJNqcXQ==
AAABMV8UdFBCUFnlrAhDixq8PgSEqg==
AAABMV8vBEZjaDr8P0KrKK8KuJ3JSA==
AAABMV8vfQVtOjhVMiyPRf32ThjaOA==
AAABMV8vGK8E4NQKPshHoM6cj6W/iA==
AAABMV8vWPmyCure5b9P0thcJxfQ0g==
AAABMV8vwWsSxiCWEb7Ma5oSZfBnXw==
AdH1tsVUVHkXt/ZLS4eksRmXC4Q5Ig==
AdH1tsVUVHkXt/ZLS4eksRmXC4Q5IgAiTOHA
AdH1tsVUVHkXt/ZLS4eksRmXC4Q5IgAiyCOQ