...
This section of scripts publishes new ruleset updates to the mirrors. There are currently (June 2017) two different rule daily updates. Both do lint tests against the latest version of SpamAssassin but the first one updates the 72_scores.cf based on the masscheck contributions while the second one is a "blind" rule promotion and tagged build of SVN rules for the masscheck area setup later.
Wiki Markup |
---|
*25 2 \* \* \* automc *~/svn/trunk/build/mkupdates/do-stable-update-with-scores*
**~/svn/masses/rule-update-score-gen/do-nightly-rescore-example.sh*
**~/svn/masses/rule-update-score-gen/generate-new-scores.sh*
*uses ~/tmp/generate-new-scores for SVN work area
*sorts out the usable corpus from the latest 'SVN revision' at the top of the submitter's log file which should match the latest tagged build of SVN rules
*$\{REVISION\} LINE 123 NEEDS IMPROVEMENT!!! THIS SVN REVISION NEEDS TO BE CLOSELY TIED TO THE REVISION THAT WAS STAGED IN THE MASSCHECK RSYNC DIR.
*checks the sorted corpus for a minimum number of valid contributors and ham/spam
**~/svn/trunk/build/mkupdates/mkupdate-with-scores*
*uses ~/tmp/sa-mkupdate for SVN working area
*gets latest SVN $\{REVISION\} from rulesrc/scores/score-set\*
*masses \-> perl Makefile.PL && make (complete build of SA and test)
*perl hit-frequencies
*garescorer - compiles and runs it, requires build/pga
*sends email if not enough masscheck submitters or usuable ham/spam for the latest SVN revision
*creates $\{REVISION\}.tar.gz $\{REVISION\}.tar.gz.sha1 and $\{REVISION\}.tar.gz.asc in /var/www/automc.spamassassin.org/updates for mirrors to pull
*updates DNS TXT entries \[0-3\].3.3.updates.spamassassin.org and 0.4.3.updates.spamassassin.org -- versions >= 3.4.1 have a CNAME to 3.3.3.updates.spamassassin.org
*Script rewrite notes:
*Make each primary step modular since these steps are commmon in other scripts
*Should check for minimum contributors of ham/spam up front and not waste resources if requirements not met
*These 3 scripts above all share the same temp working dir. This should be determined from config file or relative path of user's home dir for flexibility.
*Should be able to run the ham/spam processing in parallel and merge the results together to cut this time in half
*Temp working dir for the corpus should be persistent so the rsync copy will be faster.
*Usuable corpus symlink setup could be improved. Invalid stale corpus should be removed into an archive/excluded dir. |
Wiki Markup |
---|
*30 8 \* \* \* automc *~/svn/trunk/build/mkupdates/run_nightly* > /var/www/automc.spamassassin.org/mkupdates/mkupdates.txt
*Currently $\{SA_VERSION\} = "3.4.2"
*$\{REVISION\} = latest SVN revision THIS NEEDS TO BE ADDRESSED!!! NEED TO PREVENT REVISION FROM MESSING UP THE MASSCHECK PROCESSING.
*creates new rules/active.list
*commits new rules/active.list
*runs spamassassin lint against the updated rules and checks in a tagged version of 'sa-update_$\{SA_VERSION\}_$\{TSTAMP\}'
*commits "promotions validated" and emails dev@spamassassin.apache.org
*creates $\{REVISION\}.tar.gz $\{REVISION\}.tar.gz.sha1 and $\{REVISION\}.tar.gz.asc in /var/www/automc.spamassassin.org/updates for mirrors to pull
*updates DNS TXT entries \[0-3\].3.3.updates.spamassassin.org and 0.4.3.updates.spamassassin.org -- versions >= 3.4.1 have a CNAME to 3.3.3.updates.spamassassin.org
*Script rewrite notes:
*Uses many of the same primary steps previous section so reuse the code and not have to maintain multiple versions
*Should be turned into generic script that can be run on demand via SVN trigger/polling |
nitemc
These run shortly after the build/mkupdates/run_nightly to setup the masscheck download area based on the latest tagged build of SVN rules.
34 8 * * 0-5 automc *~/svn/nitemc/corpora_runs >> ~/rsync/corpus/nightly-versions.txt
36 8 * * 0-5 automc *~/svn/nitemc/extract_to_rsync_dir nightly ~/rsync/corpus/nightly-versions.txt
34 8 * * 6 automc *~/svn/nitemc/corpora_runs >> ~/rsync/corpus/weekly-versions.txt
36 8 * * 6 automc *~/svn/nitemc/extract_to_rsync_dir weekly ~/rsync/corpus/weekly-versions.txt
...