hmmm.... my hardware shouldn't be improper... the only quirk is that it is the very rare, pre-Intel agreement 2.1GHz QC AMD 2352-based dual socket mobo. It's a somewhat older IBM 3655 server, but it has 64GB of RAM, ServeRAID, dual socket, 8 cores, etc. I could, I'm sure, pay back the cost of replacing it with a newer, faster box in saved electricity in a few months, but I do rather like these pre-Lenovo IBM boxen with their lovely internal mechanics and lightpath diagnostics and redundancy and RSA-II and such.
Nothing else seems quite so remarkably slow. I realize this is a relative comparison and the last benchmarks I did were in 2013 when this box was brand new and exciting:

# grep CPU /var/run/dmesg.boot
CPU: Quad-Core AMD Opteron(tm) Processor 2352 (2100.12-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs

And it has "classic coke" rotating media (punch card reader is disconnected):

ServeRAID 8k with 8x 2.5" disks RAID 6

Not exactly state of the art, but I mean... the third generation Opteron 2352 was introduced in 2009, so less than a decade old! still! I have stuff in my fridge older than that.

-------- Original Message --------
Subject: Re: very basic SA-Learn performance question: is 90 seconds or so per token really, really slow or roughly normal?
From: Reindl Harald <h.rei...@thelounge.net>
To: David Gessel <ges...@blackrosetech.com>, users@spamassassin.apache.org
Date: Tue Oct 31 2017 06:12:43 GMT+0300 (AST)

> On 30.10.2017 at 23:35, David Gessel wrote:
>> FreeBSD 10.3-RELEASE FreeBSD 10.3-RELEASE #0 r322073: Sat Aug 5 01:44:09
>> PDT 2017
>> spamassassin-3.4.1_10
>> amavisd-new-2.11.0_2,1
>>
>> I'm finding the command /usr/local/bin/sa-learn --spam --showdots
>> /mail/blackrosetech.com/gessel/.Junk/{cur,new} is taking a while to
>> complete... by a while I mean it has been running for 3 days. The folder
>> has a few months of spam in it, 4760 "conversations" according to
>> Thunderbird, which is roughly the message count since spam doesn't tend to
>> thread deeply.
>
> no it is not, on proper hardware you have around 150 *messages* per second
> and a rebuild of that bayes from scratch (corpus of eml files) takes 20
> minutes or so
>
> 0 98479 SPAM
> 0 29017 HAM
> 0 3466912 TOKEN
>
> 4,0K -rw-r----- 1 sa-milt sa-milt 360 2017-10-31 02:30 bayes_journal
> 12K -rw-r----- 1 sa-milt sa-milt 12K 2017-10-30 21:05 bayes_seen
> 65M -rw-r----- 1 sa-milt sa-milt 80M 2017-10-30 21:05 bayes_toks
>
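
For scale, here is a rough back-of-the-envelope comparison of the two rates mentioned in this thread, as a small Python sketch. It uses only the figures quoted above (4760 messages, 3 days of runtime, 150 messages per second). It assumes, generously, that the run had just finished the whole folder at the 3 day mark; the run was in fact still going, so the real per-message time is worse than what this prints. The variable names are made up for illustration.

```python
# Figures quoted in this thread
MESSAGES = 4760          # approximate message count in the .Junk folder
ELAPSED_DAYS = 3         # how long sa-learn had been running so far
REFERENCE_RATE = 150.0   # messages/second quoted for "proper hardware"

elapsed_seconds = ELAPSED_DAYS * 24 * 60 * 60

# Assumption: treat the run as if it completed all messages at the 3 day
# mark. This is a best case for the observed rate, since it was still running.
seconds_per_message = elapsed_seconds / MESSAGES

# Time the same folder would take at the quoted reference rate.
expected_seconds = MESSAGES / REFERENCE_RATE

# How many times slower the observed run is than the reference rate.
slowdown = elapsed_seconds / expected_seconds

print(f"observed: at least {seconds_per_message:.1f} s per message")
print(f"expected at {REFERENCE_RATE:.0f} msg/s: {expected_seconds:.1f} s for the folder")
print(f"slowdown: roughly {slowdown:.0f}x or more")
```

A gap of that size is well beyond what an older CPU or rotating disks would normally explain on their own, which is why the comparison seems worth spelling out.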