ManualWhitelist

Whitelisting a user

Adding a user to your whitelist subtracts 100 points from the score of their mail, which in practice always marks it as non-spam.

To manually whitelist a particular address, say d.cary@sparkingwire.com, edit your local user prefs file ~/.spamassassin/user_prefs (or global /etc/mail/spamassassin/local.cf):

# whitelist David Cary:
whitelist_from  d.cary@sparkingwire.com

Whitelist and blacklist addresses are file-glob-style patterns, so
friend@somewhere.com, *@isp.com, or *.domain.net will all work.

# whitelist everyone at sparkingwire.com:
whitelist_from  *@sparkingwire.com
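
To get a feel for which senders a pattern covers, the following sketch tests addresses against patterns with the shell's own case globbing. This only approximates SpamAssassin's matcher: the shell also understands bracket expressions such as [abc], whereas SpamAssassin allows only * and ?. The function name matches_pattern is made up for this illustration, and the lowercasing assumes that matching is case-insensitive.

```shell
#!/bin/sh
# Approximation only: shell globbing stands in for SpamAssassin's pattern matcher.
matches_pattern() {
    # $1 = sender address, $2 = whitelist/blacklist pattern
    # Lowercase both sides (assumption: matching ignores case).
    addr=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    pat=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    case "$addr" in
        $pat) return 0 ;;   # unquoted on purpose so * and ? act as wildcards
    esac
    return 1
}

# Try a few addresses against the two wildcard patterns mentioned above.
for addr in d.cary@sparkingwire.com someone@mail.domain.net stranger@example.org; do
    for pat in '*@sparkingwire.com' '*.domain.net'; do
        if matches_pattern "$addr" "$pat"; then
            echo "$addr matches $pat"
        fi
    done
done
```

Note that *.domain.net covers any address ending in .domain.net, so it matches senders on subdomains such as mail.domain.net but not user@domain.net itself.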

To blacklist an address manually, use blacklist_from in the same way.

If the sender is at all well known (such as a mailing list), you should use whitelist_from_rcvd instead, so that a spammer can't forge mail to look as if it came from the whitelisted address. More information on whitelist_from, whitelist_from_rcvd, and blacklist_from is available on the web, or from your local documentation by typing perldoc Mail::SpamAssassin::Conf.
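
As a rough sketch of what a whitelist_from_rcvd entry looks like, the helper below appends one to a prefs file. The directive takes the address pattern followed by a domain that the reverse DNS name of the delivering relay has to match. The helper name add_rcvd_whitelist, the PREFS variable, and the example list address are all placeholders invented here.

```shell
#!/bin/sh
# PREFS defaults to the per-user prefs file; point it elsewhere to experiment.
PREFS="${PREFS:-$HOME/.spamassassin/user_prefs}"

add_rcvd_whitelist() {
    # $1 = sender address pattern
    # $2 = domain the reverse DNS name of the delivering relay must match
    printf 'whitelist_from_rcvd  %s  %s\n' "$1" "$2" >> "$PREFS"
}

# Example (placeholder list address and relay domain):
#   add_rcvd_whitelist 'announce@lists.example.org' 'example.org'
#   spamassassin --lint    # check that the edited configuration still parses
```

Running spamassassin --lint after any manual edit is a cheap way to catch typos in the configuration before they affect scanning.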

Some good, free web-based tools are available to put a friendly user interface on whitelists (and blacklists) and allow users to edit their own. See WebUserInterfaces.

With the AutoWhitelist and TxRep plugins, besides their automated function, you can whitelist and blacklist email addresses, or (in the case of TxRep) also domain names, IP addresses, or NetBIOS/HELO names, using the command-line options --add-addr-to-whitelist and --add-addr-to-blacklist of the main spamassassin script. Whitelisting and blacklisting with TxRep is documented in detail on its Wiki and POD pages.
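
The two command-line options can be wrapped in a small helper, sketched below. The function name sa_list_addr is hypothetical; it only assembles the option named above and passes it to the spamassassin script, which must be installed with one of the two plugins enabled.

```shell
#!/bin/sh
sa_list_addr() {
    # $1 = "whitelist" or "blacklist", $2 = e-mail address
    case "$1" in
        whitelist|blacklist) ;;
        *) echo "usage: sa_list_addr whitelist|blacklist ADDRESS" >&2
           return 2 ;;
    esac
    # Expands to --add-addr-to-whitelist=ADDRESS or --add-addr-to-blacklist=ADDRESS
    spamassassin "--add-addr-to-$1=$2"
}

# Example:
#   sa_list_addr whitelist d.cary@sparkingwire.com
#   sa_list_addr blacklist spammer@example.com
```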

What is AutoWhitelist?

Another feature of SpamAssassin is the "auto-whitelist", but the name is a misnomer. The AutoWhitelist is designed as an automatic score averaging system, and is just as likely to penalize or blacklist an address as it is to benefit or whitelist it. If you want to whitelist, you should use the directions above.

Alternatively, there is the successor to AutoWhitelist: the TxRep Sender Reputation plugin. It whitelists and blacklists, automatically and manually, not only email addresses but also senders' domain names, IP addresses, and NetBIOS/HELO names, in combination with IP blocks, DKIM signatures, and SPF passes. TxRep also allows whitelisting and blacklisting of senders through the sa-learn tool. This happens automatically when training Bayes with spam/ham. Depending on your configuration, it can be done by feeding sa-learn individual messages or entire mailboxes and folders manually on the command line, by user input in webmail software, through a cron job from IMAP folders, or in other similar ways.

Additionally, TxRep can, in a similar way to the Bayes plugin, reinforce the automated whitelisting/blacklisting at scan time, when the score of a message triggers the auto-learn process. To activate this feature, enable the option txrep_autolearn. Do not activate it before SpamAssassin is well tuned and sorts spam and ham correctly: with a poorly trained SpamAssassin, the auto-learn function of TxRep would reinforce the false results as well. First add the autolearn value to the email headers (i.e. "add_header all Status ... autolearn=_AUTOLEARN_" in local.cf), and activate txrep_autolearn only after you have verified that SA triggers the autolearn process only in cases where you clearly want to boost the sender's reputation (in one direction or the other).
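
Once the autolearn value appears in the headers, a quick tally of the values seen in already-scanned mail can serve as a first sanity check. The sketch below is only that: the function name autolearn_summary and the folder path in the example are made up, and because it searches whole files it will also count any autolearn=... string that happens to occur in a message body.

```shell
#!/bin/sh
autolearn_summary() {
    # $@ = one or more message or mbox files already tagged by SpamAssassin
    # Prints each autolearn value with the number of times it occurs,
    # most frequent first.
    grep -ohE 'autolearn=[a-z_]+' "$@" | sort | uniq -c | sort -rn
}

# Example (hypothetical folder path):
#   autolearn_summary ~/mail/Inbox
```

A tally does not show whether the right messages were learned, so it is still worth opening a sample of the messages marked autolearn=ham and autolearn=spam before enabling txrep_autolearn.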

Automatically whitelisting people you've emailed

If your outbound email also passes through SpamAssassin, you can automatically whitelist all recipients of outgoing messages with the help of the TxRep plugin. To activate this feature, install the plugin and enable the option txrep_whitelist_out.

If you do not use TxRep, or your outgoing email does not pass through SpamAssassin, you can use the following method to extract a unique list of e-mail addresses from your 'Sent' folder (in mbox format). You can also use both methods simultaneously.

In your ~/.spamassassin/user_prefs file, add this line:

include sent_whitelist

The following script creates the sent_whitelist file with 100 addresses per line:

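# Directory holding user_prefs, and the mbox file with sent mail
SADIR=~/.spamassassin
SENTMAIL=~/mail/Sent

# Pick recipient header lines, pull out the addresses, lowercase and
# de-duplicate them, then write them out 100 per whitelist_from line
cat "$SENTMAIL" |
	grep -Ei '^(To|cc|bcc):' |
	grep -oEi '[a-z0-9_.=/#+-]+@([a-z0-9-]+\.)+[a-z]{2,}' |
	tr "A-Z" "a-z" |
	sort -u |
	xargs -n 100 echo "whitelist_from" > "$SADIR/sent_whitelist"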

This can be adapted as necessary, and executed as a cron job.

The script is simple and fast, but not very accurate. It extracts strings that look like e-mail addresses from lines starting with To:, CC: or BCC:. It does not handle continuation lines (addresses on continuation lines are not added to the whitelist), and it also picks up addresses from the message body if a body line starts with To:, CC: or BCC:, which often happens when forwarding e-mails.
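
Both limitations are easy to reproduce on a small test message. The sketch below runs the same filters as the script on a made-up message whose To: header is folded onto a second line and whose body contains a forwarded To: line. The function names and all addresses are invented for the demonstration.

```shell
#!/bin/sh
# Same filter chain as the script above, reading an mbox on standard input.
extract_naive() {
    grep -Ei '^(To|cc|bcc):' |
        grep -oEi '[a-z0-9_.=/#+-]+@([a-z0-9-]+\.)+[a-z]{2,}' |
        tr 'A-Z' 'a-z' |
        sort -u
}

# A made-up message: bob is on a continuation line of the To: header,
# carol appears only in the body.
sample_message() {
    cat <<'EOF'
From sender@example.org Mon Jan  1 00:00:00 2024
From: sender@example.org
To: Alice@example.com,
	bob@example.net
Subject: fwd

To: carol@example.org
EOF
}

sample_message | extract_naive
# prints:
#   alice@example.com
#   carol@example.org
```

The real recipient bob@example.net is missed because it sits on a continuation line, while carol@example.org is picked up even though it appears only in the body.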

To make the script more accurate, but much slower, replace the line "grep -Ei '^(To|cc|bcc):' |" with a call to formail (part of the procmail package):

	formail -s formail -czx 'To:' -x 'CC:' -x 'BCC:' |

formail extracts fields only from the RFC 822 header and joins continued fields into a single line, which yields an accurate list of all addresses.

Building an auto-whitelist from LDAP

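If you run an LDAP-based address book, you can use the following simple cron job to rebuild a whitelist regularly. As written, the entry runs at the start of every hour; change the schedule to 0 0 * * * for a nightly run: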

# Create a SpamAssassin whitelist
0 * * * *       /usr/pkg/bin/ldapsearch -LLL -b dc=domain,dc=net,dc=au mail | awk '/^mail:/ {print "whitelist_from " $2}' > $HOME/.spamassassin/whitelist_from_ldap.cf
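
To see what the awk step of the cron job produces without querying a directory, it can be fed a few lines of sample LDIF. The sketch below uses invented entries and hypothetical function names; real ldapsearch output may also contain base64-encoded values (written as "mail:: ..."), which this simple filter would not decode.

```shell
#!/bin/sh
# The awk filter from the cron job: one whitelist_from line per mail attribute.
ldif_to_whitelist() {
    awk '/^mail:/ {print "whitelist_from " $2}'
}

# Invented LDIF entries standing in for ldapsearch -LLL output.
sample_ldif() {
    cat <<'EOF'
dn: cn=Alice Example,dc=domain,dc=net,dc=au
mail: alice@domain.net.au

dn: cn=Bob Example,dc=domain,dc=net,dc=au
mail: bob@domain.net.au
EOF
}

sample_ldif | ldif_to_whitelist
# prints:
#   whitelist_from alice@domain.net.au
#   whitelist_from bob@domain.net.au
```

The generated whitelist_from_ldap.cf presumably still has to be pulled in from user_prefs with an include line, in the same way as sent_whitelist above.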