> On Feb 29, 2016, at 3:18 PM, Reindl Harald <h.rei...@thelounge.net> wrote:
>
> On 29.02.2016 at 21:05, Charles Sprickman wrote:
>>> On Feb 29, 2016, at 4:23 AM, Reindl Harald <h.rei...@thelounge.net> wrote:
>>>
>>> On 29.02.2016 at 06:24, Charles Sprickman wrote:
>>>> I’ve not had much luck with Bayes - when I had it enabled recently on a
>>>> per-user basis it was just hitting the master DB server too hard with
>>>> updates
>>>
>>> just make a sitewide bayes
>>> (https://wiki.apache.org/spamassassin/SiteWideBayesSetup) without autolearn
>>> / autoexpire and the default database in a folder read-only for the daemon
>>>
>>
>> I think I still have to stick with a db-backed option since I need to keep
>> two SA servers in sync.
>
> and i know that it don't matter
>
> nothing easier than rsync the bayes-folder to several machines at the end of
> the learning script, we even share the site-wide bayes over webservices to
> external entities and so it covers around 5000 users at the moment in summary

I’m not seeing much of a change in load after enabling this with a global
user and no autolearn. I think the db was really only constrained on the
inserts/updates.

>
>> I’ll try that today and see how the load looks. My concern with disabling
>> autolearn is that then I’m the only one training. My spam probably looks
>> like everyone else’s, but my ham is very different, lots of list traffic
>> and such.
>
> you should be the only one who trains in most cases for several reasons
>
> * few to zero users train enough ham and spam for a proper bayes
> * wrong classified autolearn takes a wrong direction sooner or later
>
> given that we now for more than a year maintain a site-wide bayes for inbound
> MX re-used on submission servers to minimize the impact of hacked accounts
> and it works so much better than all the "user bayes" solutions the last
> decade it's the way to go if you *really* want proper operations

I’ve been running with some daily training for a little over a week and I’m
seeing less spam in my inbox. A few things have slipped through because
Bayes tipped them below the default score; these were two phishing emails.
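
To make the "tipped below the default score" point concrete, here is a rough sketch of how a negative Bayes hit can pull an otherwise-spam total under the threshold. The rule names are real SpamAssassin rules, but the per-rule scores are placeholder values I picked for illustration (not the shipped defaults or my own config), and the helper names are made up. The only value taken as given is the default required_score of 5.0.

```python
# Sketch only: the scores below are illustrative placeholders,
# not the values shipped with SpamAssassin or used on my servers.
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score

scores = {
    "URIBL_BLACK": 1.7,
    "RDNS_NONE": 0.8,
    "FROM_NEWDOMAIN_FMBLA": 2.6,
    "HTML_MESSAGE": 0.001,
    "BAYES_00": -1.9,
}


def total_score(hits, scores):
    """Sum the scores of every rule that fired on a message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores, required=REQUIRED_SCORE):
    """A message is tagged as spam once its total reaches the threshold."""
    return total_score(hits, scores) >= required


# Same hypothetical phishing message, scored without and with a Bayes hit.
without_bayes = ["URIBL_BLACK", "RDNS_NONE", "FROM_NEWDOMAIN_FMBLA", "HTML_MESSAGE"]
with_bayes = without_bayes + ["BAYES_00"]

print("without bayes:", round(total_score(without_bayes, scores), 3),
      is_spam(without_bayes, scores))
print("with bayes:   ", round(total_score(with_bayes, scores), 3),
      is_spam(with_bayes, scores))
```

With these made-up numbers the network rules alone put the message just over the threshold, and a single BAYES_00 hit is enough to drop it back under. That is the pattern I would expect more ham-heavy training data to produce on phishing that resembles list traffic.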

Here’s some rule stats for anyone interested:

TOP SPAM RULES FIRED
RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  TXREP                 13171      8.47    40.38    91.00   72.91
   2  HTML_MESSAGE          12714      8.18    38.98    87.85   90.80
   3  DCC_CHECK             10593      6.81    32.48    73.19   33.78
   4  RDNS_NONE             10269      6.60    31.48    70.95    5.63
   5  SPF_HELO_PASS         10070      6.48    30.87    69.58   23.41
   6  URIBL_BLACK            9711      6.25    29.77    67.10    1.58
   7  BODY_NEWDOMAIN_FMBLA   9550      6.14    29.28    65.98    1.64
   8  FROM_NEWDOMAIN_FMBLA   9483      6.10    29.07    65.52    1.36
   9  BAYES_99               8486      5.46    26.02    58.63    1.18
  10  BAYES_999              8141      5.24    24.96    56.25    1.06

TOP HAM RULES FIRED
RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  HTML_MESSAGE          16473      9.13    50.51    87.85   90.80
   2  DKIM_SIGNED           13776      7.64    42.24    13.81   75.93
   3  TXREP                 13228      7.33    40.56    91.00   72.91
   4  DKIM_VALID            12962      7.19    39.74    11.93   71.44
   5  RCVD_IN_DNSWL_NONE     9941      5.51    30.48     8.08   54.79
   6  DKIM_VALID_AU          8711      4.83    26.71     7.99   48.01
   7  BAYES_00               8390      4.65    25.72     1.84   46.24
   8  RCVD_IN_JMF_W          7369      4.09    22.59     2.54   40.62
   9  RCVD_IN_MSPIKE_WL      6713      3.72    20.58     4.39   37.00
  10  BAYES_50               6201      3.44    19.01    25.56   34.18

Charles
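
P.S. One way to read those tables is to compare %OFSPAM against %OFHAM for each rule, since a rule that fires on most of both (TXREP, HTML_MESSAGE) says little by itself. The sketch below is just that arithmetic on a handful of rows copied from the stats above. The ratio is my own ad hoc measure, not something the stats script reports, and the function name is made up.

```python
# (rule name, %OFSPAM, %OFHAM), copied from the rule stats above.
rows = [
    ("TXREP", 91.00, 72.91),
    ("HTML_MESSAGE", 87.85, 90.80),
    ("DCC_CHECK", 73.19, 33.78),
    ("RDNS_NONE", 70.95, 5.63),
    ("URIBL_BLACK", 67.10, 1.58),
    ("BAYES_99", 58.63, 1.18),
    ("BAYES_00", 1.84, 46.24),
]


def spam_ham_ratio(pct_spam, pct_ham):
    """How many times more often a rule fires on spam than on ham.

    Assumes pct_ham is non-zero, which holds for every row used here.
    """
    return pct_spam / pct_ham


# Most spam-leaning rules first, most ham-leaning last.
ranked = sorted(rows, key=lambda r: spam_ham_ratio(r[1], r[2]), reverse=True)

for name, pct_spam, pct_ham in ranked:
    print(f"{name:<14} {spam_ham_ratio(pct_spam, pct_ham):7.2f}")
```

On these rows BAYES_99 and URIBL_BLACK come out as the most spam-leaning, BAYES_00 as the most ham-leaning, and TXREP and HTML_MESSAGE sit close to 1.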