Hi,

Currently I have MIMEDefang set up to call SpamAssassin for all incoming messages. I am trying to set up Bayes for Mailman lists, so I have the script 'mmlearn' (attached), which runs sa-learn on pickled emails.

The problem is that for certain messages sa-learn crashes. I have attached a tar file of 2 examples (so they don't get marked as spam :)

[midget 14:31] ~ >sudo su -m mailman -c 'env HOME=/usr/local/mailman sa-learn -u mailman -D --showdots --mbox --spam' <crashmsg1
debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/home/darius/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/games', keeping.
debug: PATH included '/usr/local/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/X11R6/bin', keeping.
debug: PATH included '/home/darius/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: Final PATH set to: /home/darius/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/home/darius/bin:/usr/sbin:/sbin:/usr/sbin:/sbin
debug: using "/usr/local/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /usr/local/etc/mail/spamassassin/init.pre
debug: using "/usr/local/share/spamassassin" for default rules dir
debug: config: read file /usr/local/share/spamassassin/10_misc.cf
debug: config: read file /usr/local/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/local/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_compensate.cf
debug: config: read file /usr/local/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_drugs.cf
debug: config: read file /usr/local/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_phrases.cf
debug: config: read file /usr/local/share/spamassassin/20_porn.cf
debug: config: read file /usr/local/share/spamassassin/20_ratware.cf
debug: config: read file /usr/local/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/local/share/spamassassin/23_bayes.cf
debug: config: read file /usr/local/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/local/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/local/share/spamassassin/25_spf.cf
debug: config: read file /usr/local/share/spamassassin/25_uribl.cf
debug: config: read file /usr/local/share/spamassassin/30_text_de.cf
debug: config: read file /usr/local/share/spamassassin/30_text_fr.cf
debug: config: read file /usr/local/share/spamassassin/30_text_nl.cf
debug: config: read file /usr/local/share/spamassassin/30_text_pl.cf
debug: config: read file /usr/local/share/spamassassin/50_scores.cf
debug: config: read file /usr/local/share/spamassassin/60_whitelist.cf
debug: using "/usr/local/etc/mail/spamassassin" for site rules dir
debug: using "/usr/local/mailman/.spamassassin/user_prefs" for user prefs file
debug: config: read file /usr/local/mailman/.spamassassin/user_prefs
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8a0bd90)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x8a1b794)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x8a33edc)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8a0bd90) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x8a1b794) implements 'parse_config'
debug: bayes: 98028 tie-ing to DB file R/O /usr/local/mailman/.spamassassin/bayes_toks
debug: bayes: 98028 tie-ing to DB file R/O /usr/local/mailman/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 2 chosen.
debug: Initialising learner
debug: Syncing Bayes and expiring old tokens...
debug: lock: 98028 created /usr/local/mailman/.spamassassin/bayes.lock.midget.dons.net.au.98028
debug: lock: 98028 trying to get lock on /usr/local/mailman/.spamassassin/bayes with 0 retries
debug: lock: 98028 link to /usr/local/mailman/.spamassassin/bayes.lock: link ok
debug: bayes: 98028 tie-ing to DB file R/W /usr/local/mailman/.spamassassin/bayes_toks
debug: bayes: 98028 tie-ing to DB file R/W /usr/local/mailman/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: refresh: 98028 refresh /usr/local/mailman/.spamassassin/bayes.lock
debug: Syncing complete.
debug: Learning Spam
debug: received-header: parsed as [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= [EMAIL PROTECTED] intl=0 id=j610d5Id081088 auth= ]
debug: received-header: parsed as [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ]
debug: is DNS available? 0
debug: received-header: parsed as [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]
debug: received-header: cannot use DNS, do not trust any hosts from here on
debug: received-header: relay 80.68.88.245 trusted? no internal? no
debug: received-header: relay 195.47.42.5 trusted? no internal? no
debug: received-header: relay 92.132.203.224 trusted? no internal? no
debug: metadata: X-Spam-Relays-Trusted:
debug: metadata: X-Spam-Relays-Untrusted: [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= [EMAIL PROTECTED] intl=0 id=j610d5Id081088 auth= ] [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ] [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]
debug: ---- MIME PARSER START ----
debug: main message type: text/plain
debug: parsing normal part
debug: added part, type: text/plain
debug: ---- MIME PARSER END ----
debug: decoding: other encoding type (7bit), ignoring
debug: uri found: http://uhdzu.azwpd9alp2az7ts.zorromf.info
debug: refresh: 98028 refresh /usr/local/mailman/.spamassassin/bayes.lock
debug: tokenize: header tokens for Mime-Version = " 1.0 (Apple Message framework v728)"
debug: tokenize: header tokens for Content-Transfer-Encoding = " 7bit"
debug: tokenize: header tokens for *m = " 1681980078 575569277 195 47 42 5 "
debug: tokenize: header tokens for *c = " /plain; charset=US-ASCII; format=flowed"
debug: tokenize: header tokens for To = "U*all D*fucs.org.au D*org.au D*au"
debug: tokenize: header tokens for *F = "U*entertainers D*artdirectors.com D*com"
debug: tokenize: header tokens for *x = " Apple Mail (2.728)"
debug: tokenize: header tokens for *RT = " "
debug: tokenize: header tokens for *RU = " [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= [EMAIL PROTECTED] intl=0 id=j610d5Id081088 auth= ] [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ] [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]"
debug: tokenize: header tokens for *r = " [92.132.203 ip*92.132.203.224 ] (port=4461 helo=[homesteaders]) by 195.47.42 ip*195.47.42.5 esmtp id 2142659851detailing23659 [EMAIL PROTECTED]; "
debug: tokenize: header tokens for *r = " [92.132.203 ip*92.132.203.224 ] (port=4461 helo=[homesteaders]) by 195.47.42 ip*195.47.42.5 esmtp id 2142659851detailing23659 [EMAIL PROTECTED]; 195.47.42 ip*195.47.42.5 ([195.47.42 ip*195.47.42.5 ]) by server1.aladan.net (8.13.1/8.13.1) <[EMAIL PROTECTED]>; "
Segmentation fault

If I move the bayes_toks file out of the way it doesn't crash - I could accept that it's a corrupt file but I get the same result with 2 separate systems so perhaps something is causing the toks file to become bogus.

I am running SA v3.0.4 on both systems. One system is FreeBSD 4.11 with Perl 5.6.2, and the other is FreeBSD 5.4 with Perl 5.8.7 (both built from ports).

Any help greatly appreciated!

Thanks.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

mmlearn
Description: application/shellscript
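#!/usr/bin/env PYTHONPATH=/usr/local/mailman/pythonlib:/usr/local/mailman python
# Write Mailman pickled messages (and plain message files) out as one mbox.

import sys
import os
import pickle
import email
from Mailman.mm_cfg import DATA_DIR

if len(sys.argv) < 3:
    print 'Incorrect usage'
    print '  %s mbox pickle [pickle ...]' % sys.argv[0]
    sys.exit(1)

# '-' means write the mbox to stdout instead of a named file.
if sys.argv[1] == '-':
    mbox = sys.stdout
else:
    mbox = open(sys.argv[1], 'w')

for filename in sys.argv[2:]:
    if filename.endswith('.pck'):
        # Pickled message object: flatten it, including the "From " line.
        msg = pickle.load(open(filename, 'rb')).as_string(unixfrom=True)
    else:
        # Anything else is taken to be a raw message and copied as-is.
        msg = open(filename, 'r').read()

    mbox.write(msg)
    # Make sure the message ends with a newline (an empty file no longer
    # raises IndexError), then add the blank line that separates entries.
    if not msg.endswith('\n'):
        mbox.write('\n')
    mbox.write('\n')

mbox.close()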
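
For comparison, here is a minimal sketch of writing the same kind of mbox with Python's standard `mailbox` module instead of joining strings by hand. This is not what mmlearn does, and it is written for Python 3 rather than the Python 2 that mmlearn uses; the function name `append_to_mbox` is made up for illustration. The module adds a "From " separator line when a message has none and escapes body lines that start with "From ", two details the hand-rolled version leaves to the input files.

```python
import mailbox
from email.message import Message


def append_to_mbox(mbox_path, messages):
    """Append email.message.Message objects to the mbox at mbox_path.

    Hypothetical helper for illustration; the file is created if missing.
    """
    box = mailbox.mbox(mbox_path)
    try:
        for msg in messages:
            # add() writes a "From " line if the message has no unixfrom
            # and escapes body lines beginning with "From " as ">From ".
            box.add(msg)
    finally:
        # close() flushes pending writes before closing the file.
        box.close()
```

With unpickled Mailman messages in a list, the call would look like `append_to_mbox('learn.mbox', msgs)`, where `learn.mbox` is a placeholder file name.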
crash-salearn.tbz
Description: BZip2 compressed data
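
To find which messages in a larger mbox trigger the segmentation fault described above, one option is to feed the messages to the command one at a time and note which runs are killed by a signal. The following is only a sketch under assumptions: it is Python 3, the name `find_crashing_messages` is invented here, and it relies on `subprocess` reporting a negative return code when the child dies from a signal on POSIX systems. Note that pointing it at sa-learn really does train the Bayes database with every message that does not crash, so it is best run against a copy of the database.

```python
import mailbox
import subprocess


def find_crashing_messages(mbox_path, command):
    """Run `command` once per message in mbox_path, message on stdin.

    Returns a list of (index, signal_number) pairs for every run that
    was killed by a signal, for example SIGSEGV.
    """
    crashed = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, key in enumerate(box.keys()):
            # from_=True keeps the "From " line so the single message
            # is still a valid one-entry mbox.
            raw = box.get_bytes(key, from_=True)
            result = subprocess.run(command, input=raw,
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
            # On POSIX a negative return code -N means "killed by signal N".
            if result.returncode < 0:
                crashed.append((index, -result.returncode))
    finally:
        box.close()
    return crashed
```

A call matching the command line in this report might be `find_crashing_messages('learn.mbox', ['sa-learn', '-u', 'mailman', '--mbox', '--spam'])`, with `learn.mbox` as a placeholder path; whether a message that crashes in a batch also crashes on its own depends on the state of bayes_toks, so an empty result does not prove the messages are harmless.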