spamd Syslog Format
SpamAssassin 2.x Format
The traditional spamd syslog format looks like this:
Each line gives the time, hostname, and process ID in normal syslog format, followed by whether the message was spam or nonspam, the points scored and the threshold used, the username and uid, the time taken to scan the message, and the size of the message in bytes. This is what a ham line looks like:
SpamAssassin 3.0.0 Format
As of SpamAssassin 3.0.0, spamd also produces a syslog line for each scanned message that is designed to be machine-parseable and extensible. It looks like this:
(In this example, I've broken it into multiple lines by inserting newlines, but in syslog it's all one long line.)
Again, the normal syslog data (time, hostname, process ID) is present. In addition, the line contains "result: ", followed by essentially the same format used by the MassCheck tool: "Y" for spam or "." for nonspam; the points, rounded to an integer; a skipped field ("-"); the comma-separated list of tests hit; and then a comma-separated list of name=value pairs.
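As a rough sketch of how the fields after "result: " might be split apart, the Python below assumes that the fields are separated by single spaces, which the description does not state explicitly. The sample line is constructed from the field description only (the test names and values are made up), and `parse_result` and `SpamdResult` are hypothetical names, not part of SpamAssassin.

```python
from typing import List, NamedTuple


class SpamdResult(NamedTuple):
    is_spam: bool
    score: int
    tests: List[str]
    pairs: str  # raw name=value list, left unparsed here


def parse_result(line: str) -> SpamdResult:
    """Split the part of a spamd syslog line that follows 'result: '."""
    marker = "result: "
    start = line.find(marker)
    if start < 0:
        raise ValueError("not a spamd result line")
    # Assumption: four space-separated fields precede the name=value list.
    flag, score, _skipped, tests, pairs = line[start + len(marker):].split(" ", 4)
    return SpamdResult(
        is_spam=(flag == "Y"),
        score=int(score),
        tests=[t for t in tests.split(",") if t],  # empty if no tests hit
        pairs=pairs,
    )


# Hypothetical line, constructed from the field description only.
sample = ("spamd[1234]: result: Y 15 - TEST_ONE,TEST_TWO "
          "scantime=3.2,size=4096,autolearn=no")
result = parse_result(sample)
print(result.is_spam, result.score, result.tests)
```

Searching for the "result: " marker rather than counting syslog fields keeps the sketch independent of how the local syslog daemon formats the timestamp and hostname.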
The name=value pairs may appear in any order, and new ones may be introduced without warning; but they'll always be comma-separated. They include the following items:
- scantime: the time in seconds taken to scan this message.
- size: the size in bytes of the message.
- mid: the Message-ID.
- bayes: the score output by the Bayes classifier (NOTE: this may be in scientific notation, e.g. '2.22044604925031e-16', a plain decimal such as '0.00437882812836221', or even an integer such as '1').
- autolearn: whether the message was autolearned or not (ham, spam, no, failed, disabled, unavailable).
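Because the pairs can appear in any order and new names may be added, a parser is probably safest if it collects them into a dictionary and ignores names it does not recognise. The following is a minimal sketch along those lines. `parse_pairs` is a hypothetical helper, the input string is made up for illustration, and the code assumes that no value contains a comma.

```python
def parse_pairs(pairs: str) -> dict:
    """Turn 'name=value,name=value' into a dict, tolerating unknown names."""
    fields = {}
    for item in pairs.split(","):
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = item.partition("=")
        if sep:  # skip anything that is not name=value
            fields[name] = value
    return fields


# Hypothetical input; the Message-ID and numbers are made up.
fields = parse_pairs(
    "size=2048,mid=<abc@example.com>,bayes=2.22044604925031e-16,"
    "scantime=1.5,autolearn=ham"
)

scantime = float(fields["scantime"])
size = int(fields["size"])
# float() accepts all three documented forms: '1', '0.0043...', '2.2e-16'.
bayes = float(fields["bayes"])
autolearn = fields.get("autolearn", "unavailable")  # default is an assumption
print(size, scantime, bayes, autolearn)
```

Looking names up with `fields.get(...)` rather than by position means the code keeps working if the order changes or a pair is absent.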
Here's another example, this time from a ham mail: