So, days later, it is still chunking away, not making much progress. If I kill the process (which doesn't stop sa-learn, it just kills the current script), it always returns

^Cplugin: eval failed: interrupted at /usr/local/bin/sa-learn line 511.
which is

0509 sub killed {
0510   $spamtest->finish_learner();
0511   die "interrupted";
0512 }

The only difference between the sa-learn I'm running and the 3.4.1 one at
https://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_4_1/
is line 50

0050 $searchrelative = 1;    # disabled during "make install": REMOVEFORINST

(which I assume is removed at install time, given "REMOVEFORINST"). So I assume, given the changes in lines 19-21, that my server is running the 3.4.1 release.

I note that 3.4.2p3 has one difference from 3.4.1, which is that use bytes; is commented out at line 21 (this has been there or not there a few times over various versions and so may be slightly meaningful to something)

0021 # use bytes;

I'm not sufficiently perl savvy to have any idea whether that's useful to my performance issues or not, but it is an easy enough mod to try.

Any thoughts?

-David

-------- Original Message --------
Subject: Re: very basic SA-Learn performance question: is 90 seconds or so per token really, really slow or roughly normal?
From: David Gessel <ges...@blackrosetech.com>
To: David Jones <djo...@ena.com>, users@spamassassin.apache.org
Date: Thu Nov 02 2017 01:29:42 GMT+0300 (AST)

> Oh, I wiped the bayes data and started over already once, it isn't (or
> shouldn't be) that big a deal.
>
> Disk performance: seems OK to me.
>
> # diskinfo -t /dev/aacd0
> /dev/aacd0
>         512             # sectorsize
>         73295462400     # mediasize in bytes (68G)
>         143155200       # mediasize in sectors
>         0               # stripesize
>         0               # stripeoffset
>         8910            # Cylinders according to firmware.
>         255             # Heads according to firmware.
>         63              # Sectors according to firmware.
>                         # Disk ident.
>
> Seek times:
>         Full stroke:      250 iter in   2.966242 sec =   11.865 msec
>         Half stroke:      250 iter in   2.126653 sec =    8.507 msec
>         Quarter stroke:   500 iter in   3.616484 sec =    7.233 msec
>         Short forward:    400 iter in   1.540087 sec =    3.850 msec
>         Short backward:   400 iter in   1.104617 sec =    2.762 msec
>         Seq outer:       2048 iter in   0.546351 sec =    0.267 msec
>         Seq inner:       2048 iter in   0.726598 sec =    0.355 msec
> Transfer rates:
>         outside:       102400 kbytes in   2.103472 sec =    48681 kbytes/sec
>         middle:        102400 kbytes in   2.300709 sec =    44508 kbytes/sec
>         inside:        102400 kbytes in   3.192841 sec =    32072 kbytes/sec
>
>
> nothing amazing, but nothing unexpectedly bad either.
>
> -------- Original Message --------
> Subject: Re: very basic SA-Learn performance question: is 90 seconds or so
> per token really, really slow or roughly normal?
> From: David Jones <djo...@ena.com>
> To: users@spamassassin.apache.org
> Date: Thu Nov 02 2017 01:00:40 GMT+0300 (AST)
>
>> If you want to try to keep your existing Bayes data, try dumping it to a
>> backup file, clear the DB, then restore it back to see if this resets things
>> properly. Hopefully this won't take weeks to dump. :)
>>
>> https://wiki.apache.org/spamassassin/BayesMigration
>>
>> BTW, do you have normal file IO performance? Have you checked iotop and
>> iostat to see what kind of IOPs/Mbps you are getting on your filesystem
>> where the Bayes DB files are?
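
For the IOPs question, the seek figures in the diskinfo output quoted above can be turned into a rough ceiling on random operations per second. The sketch below is only an illustration: the timings are copied from that output, but the function names are made up here, and the conversion assumes one outstanding request at a time with seek time dominating (no caching, no queueing), which is a simplification and not a measurement of the Bayes DB workload.

```python
# Seek timings copied from the quoted diskinfo -t output:
# (label, iterations, total seconds)
SEEK_TESTS = [
    ("Full stroke", 250, 2.966242),
    ("Half stroke", 250, 2.126653),
    ("Quarter stroke", 500, 3.616484),
    ("Short forward", 400, 1.540087),
    ("Short backward", 400, 1.104617),
]


def per_op_ms(iterations, total_seconds):
    """Average time per seek in milliseconds."""
    return total_seconds / iterations * 1000.0


def approx_iops(latency_ms):
    """Rough operations per second, ASSUMING a single outstanding
    request and that seek latency is the only cost (hypothetical model)."""
    return 1000.0 / latency_ms


def summarize(tests=SEEK_TESTS):
    """Return (label, msec per seek, estimated ops/sec) for each test."""
    rows = []
    for label, iterations, seconds in tests:
        ms = per_op_ms(iterations, seconds)
        rows.append((label, round(ms, 3), round(approx_iops(ms))))
    return rows


if __name__ == "__main__":
    for label, ms, iops in summarize():
        print(f"{label:15s} {ms:8.3f} msec  ~{iops} ops/sec")
```

Under those assumptions the long-stroke cases work out to somewhere around 85 to 140 seeks per second, which is ordinary for a single spinning disk and would not by itself explain learning times measured in days.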