The SpamAssassin "masses" directory contains a tool called 'overlap', which reports how often pairs of rules in the ruleset hit the same messages.
For example, let's say I have a log file called spam.log, and want to examine how much the rules whose names start with T_DRUG overlap with each other. I run overlap like so:
Which in this case produces this output:
Explanation of the columns: the first number is how many mails hit both rules; the second is the proportion of the first rule's hits that also hit the second rule; the third is the proportion of the second rule's hits that also hit the first.
So in the case of this line:
87 mails hit both rules; all of the mails that hit T_DRUGS_SLEEP_EREC also hit T_DRUGS_SLEEP; and 69% of the mails that hit T_DRUGS_SLEEP also hit T_DRUGS_SLEEP_EREC.
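To make the column arithmetic concrete, here is a minimal Python sketch of the same calculation. It is not the overlap script itself and does not parse the log file: the function `overlap_stats` and its input (a dict mapping each rule name to the set of message IDs that hit it) are hypothetical, and the figure of 126 hits for T_DRUGS_SLEEP is back-computed from the 69% above (87 / 0.69 is about 126) rather than taken from real output.

```python
from itertools import combinations


def overlap_stats(hits):
    """Compute pairwise overlap between rules.

    hits: dict mapping rule name -> set of message IDs that hit the rule
          (a hypothetical in-memory form, not the real log format).
    Returns a list of (both, frac_of_a, frac_of_b, rule_a, rule_b) tuples,
    sorted with the largest overlaps first.
    """
    rows = []
    # Visit each unordered pair of rules once, in alphabetical order.
    for rule_a, rule_b in combinations(sorted(hits), 2):
        both = len(hits[rule_a] & hits[rule_b])
        if both == 0:
            continue  # the two rules never hit the same message
        frac_of_a = both / len(hits[rule_a])  # share of A's hits that also hit B
        frac_of_b = both / len(hits[rule_b])  # share of B's hits that also hit A
        rows.append((both, frac_of_a, frac_of_b, rule_a, rule_b))
    rows.sort(reverse=True)
    return rows


# Assumed sample data: message IDs 0..86 hit both rules,
# and IDs 87..125 hit only T_DRUGS_SLEEP.
sample_hits = {
    "T_DRUGS_SLEEP_EREC": set(range(87)),
    "T_DRUGS_SLEEP": set(range(126)),
}

for both, frac_a, frac_b, rule_a, rule_b in overlap_stats(sample_hits):
    print(f"{both} {frac_a:.3f} {frac_b:.3f} {rule_a} {rule_b}")
```

Note that this sketch lists each pair in alphabetical order, so T_DRUGS_SLEEP comes first here and the two proportions appear in the opposite order from the line discussed above.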
Overlap is very useful if you suspect that several rules are all hitting the same spam messages.