The DumpText Plugin

DumpText is a demo plugin, but it is quite useful in its own right: it dumps the "decoded, stripped" rendering of the message text, as used by body rules, to STDERR. This is handy for rule developers (and the curious).

Code

dumptext.cf:

loadplugin Dumptext dumptext.pm
body DUMPTEXT eval:dumptext()

dumptext.pm:

package Dumptext;
use strict;
use Mail::SpamAssassin;
use Mail::SpamAssassin::Plugin;
our @ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
  my ($class, $mailsa) = @_;
  $class = ref($class) || $class;
  my $self = $class->SUPER::new($mailsa);
  bless ($self, $class);
  # Register dumptext() so the .cf file can call it as eval:dumptext().
  $self->register_eval_rule ("dumptext");
  return $self;
}

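sub dumptext {
  my ($self, $permsgstatus) = @_;
  # Fetch the decoded, stripped body text that body rules are matched against.
  my $array = $permsgstatus->get_decoded_stripped_body_text_array();
  # Join the pieces into one string and collapse runs of whitespace.
  my $str = join (' ', @$array);
  $str =~ s/\s+/ /g;
  print STDERR "text: $str\n";
  # Always return false, so the DUMPTEXT rule itself never hits.
  return 0;
}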

1;

How To Use It

Save the two files above into a new directory called dumptext, then use spamassassin's -c switch so that it reads that directory for rules:

./spamassassin -L -t -c dumptext < spammail > /dev/null

Example:

./spamassassin -L -t -c dumptext < sample-nonspam.txt > /dev/null
--------------------------------------------------------------------------------
text: TBTF ping for 2001-04-20: Reviving -----BEGIN PGP SIGNED MESSAGE----- TBTF
ping for 2001-04-20: Reviving T a s t y B i t s f r o m t h e T e c h n o l o g
y F r o n t Timely news of the bellwethers in computer and communications techno
logy that will affect electronic commerce -- since 1994 Your Host: Keith Dawson 
ISSN: 1524-9948 This issue: [....etc.]
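
Since the dump is written to STDERR while the normal report goes to STDOUT, the two streams can be separated so the text ends up in a file. The following is a minimal shell sketch rather than part of the plugin: the SA variable and the dump_text function name are hypothetical, and only the switches already shown above are used.

```shell
# Path to the spamassassin script; override SA if it lives elsewhere.
# (SA and dump_text are illustrative names, not part of SpamAssassin.)
SA=${SA:-./spamassassin}

# dump_text MESSAGE OUTFILE
# Runs one message through the dumptext rules directory, saves the
# "text: ..." line from STDERR into OUTFILE and discards the report.
dump_text() {
  "$SA" -L -t -c dumptext < "$1" 2> "$2" > /dev/null
}

# Usage:
#   dump_text sample-nonspam.txt body.txt
```

Afterwards body.txt holds the same "text: ..." line that would otherwise have scrolled past on the terminal.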

You can also scan a large collection of mail, using either mass-check's -c switch or the mailbox-scanning support added to the spamassassin script in SpamAssassin 3.0.0.
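
For a directory holding one message per file, a plain shell loop around the single-message command is a third option that relies on neither of those features. This is only a sketch under assumptions: the corpus directory name, the SA and CORPUS variables, and the dump_corpus function are all hypothetical, and the switches are the ones from the example above.

```shell
# Path to the spamassassin script and to a directory of message files.
# (SA, CORPUS and dump_corpus are illustrative names only.)
SA=${SA:-./spamassassin}
CORPUS=${CORPUS:-corpus}

# dump_corpus OUTFILE
# Appends one "text: ..." line per message in $CORPUS to OUTFILE.
dump_corpus() {
  for msg in "$CORPUS"/*; do
    [ -f "$msg" ] || continue   # skip subdirectories and an empty glob
    "$SA" -L -t -c dumptext < "$msg" 2>> "$1" > /dev/null
  done
}

# Usage:
#   dump_corpus all-bodies.txt
```

This starts a new spamassassin process for every message, so for a really large corpus mass-check is likely to be the faster route.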