Re: bayes/txrep questions

2025-02-16 Thread Bill Cole
On 2025-02-16 at 10:38:38 UTC-0500 (Sun, 16 Feb 2025 10:38:38 -0500) Alex is rumored to have said: [quoting me] TxRep (like AWL) is fed not by Bayes learning (sa-learn) but rather it tracks the combination of an address and a source IP range (/24) with a tally of the SA scores of messages
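A rough illustration of the keying Bill describes. This is not TxRep's code or storage format, only a toy tally keyed on a sender address plus the /24 network of the source IP. The names `reputation_key` and `ReputationTally` are made up for this example, and it assumes IPv4 sources.

```python
import ipaddress
from collections import defaultdict


def reputation_key(address: str, source_ip: str) -> tuple[str, str]:
    """Combine a sender address with the /24 network of the source IP.

    Hypothetical helper: real TxRep keys may differ in detail.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{source_ip}/24", strict=False)
    return (address.lower(), str(network))


class ReputationTally:
    """Toy tally of message scores per (address, /24) key."""

    def __init__(self) -> None:
        # Each entry holds [sum of scores, number of messages].
        self._totals = defaultdict(lambda: [0.0, 0])

    def record(self, address: str, source_ip: str, score: float) -> None:
        entry = self._totals[reputation_key(address, source_ip)]
        entry[0] += score
        entry[1] += 1

    def mean(self, address: str, source_ip: str):
        """Return the mean recorded score for this key, or None if unseen."""
        total, count = self._totals.get(
            reputation_key(address, source_ip), (0.0, 0)
        )
        return total / count if count else None


tally = ReputationTally()
tally.record("sender@example.com", "192.0.2.10", 6.0)
tally.record("sender@example.com", "192.0.2.200", 2.0)  # same /24, same key
print(tally.mean("sender@example.com", "192.0.2.5"))
```

The point of the sketch is only that two messages from different hosts in the same /24 land on the same key, while sa-learn training never touches this tally.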

Re: bayes/txrep questions

2025-02-16 Thread Alex
> > > > > Is there any benefit to training an email that's already hitting > > bayes99? > > Yes. The tokens which made it hit 99% are already doing their jobs, but > the rest of the message that Bayes isn't seeing as spammy may turn out > to be what make

Re: bayes/txrep questions

2025-02-14 Thread Bill Cole
r "salesforce optimization" or "HR consulting" that already hit bayes99 (and bayes999) but are still just shy of 5 points. Is there any benefit to training an email that's already hitting bayes99? Yes. The tokens which made it hit 99% are already doing their jobs, but th

Re: bayes/txrep questions

2025-02-14 Thread Greg Troxel
Alex writes: > These also aren't always one-offs, but maybe a dozen or twenty of each over > a short period that get through, likely before the URIs are blocked through > other means. Other times they don't have a link at all. Sounds like fairly aggressive greylisting is in order.

Re: bayes/txrep questions

2025-02-14 Thread John Hardin
On Fri, 14 Feb 2025, Alex wrote: Hi, I'm using SA v4 and trying to find ways to minimize the amount of junk that isn't tagged. Emails like "1-hour free consultation" or "buy this event list" or "salesforce optimization" or "HR consulting" that already hit bayes99 (and bayes999) but are still jus

bayes/txrep questions

2025-02-14 Thread Alex
Hi, I'm using SA v4 and trying to find ways to minimize the amount of junk that isn't tagged. Emails like "1-hour free consultation" or "buy this event list" or "salesforce optimization" or "HR consulting" that already hit bayes99 (and bayes999) but are still just shy of 5 points. Is there any ben

Re: Strategy for collecting spam to feed Bayes?

2025-01-13 Thread Bill Cole
On 2025-01-13 at 03:12:25 UTC-0500 (Mon, 13 Jan 2025 02:12:25 -0600 (CST)) Dave Funk is rumored to have said: It's also possible for the messages to differ by things such as network routing headers, better to feed it all to bayes and let it get parsed/scored. That's an important

Re: Strategy for collecting spam to feed Bayes?

2025-01-13 Thread Bill Cole
On 2025-01-13 at 01:51:17 UTC-0500 (Mon, 13 Jan 2025 08:51:17 +0200) Anders Gustafsson is rumored to have said: Hi! When collecting spam I frequently see multiple copies of the same message, but with different fake senders. In this case, should I feed just one or all to Bayes? All. Also

Sv: Re: Strategy for collecting spam to feed Bayes?

2025-01-13 Thread Anders Gustafsson
Thanks! -- Regards, Anders >>> Dave Funk 2025-01-13 10:12 >>> On Mon, 13 Jan 2025, Anders Gustafsson wrote:

Re: Strategy for collecting spam to feed Bayes?

2025-01-13 Thread Dave Funk
On Mon, 13 Jan 2025, Anders Gustafsson wrote: Hi! When collecting spam I frequently see multiple copies of the same message, but with different fake senders. In this case, should I feed just one or all to Bayes? Yes, feed all copies of verified spam to Bayes. As it is a weighted score per

Strategy for collecting spam to feed Bayes?

2025-01-12 Thread Anders Gustafsson
Hi! When collecting spam I frequently see multiple copies of the same message, but with different fake senders. In this case, should I feed just one or all to Bayes? Also: Is there a point in feeding such spam that is already flagged by other rules than Bayes and if so, should I remove the

Re: SA 4.0.1 Bayes in SQL: MYSQL_OPT_RECONNECT is deprecated

2025-01-10 Thread John Wilcock
On 10/01/2025 at 15:35, Bill Cole wrote: On 2025-01-10 at 08:49:04 UTC-0500 (Fri, 10 Jan 2025 14:49:04 +0100) John Wilcock is rumored to have said: Hi all, I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to using MySQL (actually Mariadb 10.6) for Bayes stor

Re: SA 4.0.1 Bayes in SQL: MYSQL_OPT_RECONNECT is deprecated

2025-01-10 Thread Bill Cole
On 2025-01-10 at 08:49:04 UTC-0500 (Fri, 10 Jan 2025 14:49:04 +0100) John Wilcock is rumored to have said: Hi all, I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to using MySQL (actually Mariadb 10.6) for Bayes storage. I'm seeing "WARNING: MY

SA 4.0.1 Bayes in SQL: MYSQL_OPT_RECONNECT is deprecated

2025-01-10 Thread John Wilcock
Hi all, I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to using MySQL (actually Mariadb 10.6) for Bayes storage. I'm seeing "WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version" warnings. $ spamassassin --lint -

Re: Bayes in V4 compared to V3

2024-11-13 Thread Grega via users
Well, I didn't re-learn it and now it works awesome. Also, when I try to learn already-learned mail, I get that it didn't learn, as the DB already contains the same tokens. So the config error was storing tokens the OK way, but reading them wrong. Now all my bayes issues are gone; this is like night and

Re: Bayes in V4 compared to V3

2024-11-12 Thread Shawn Iverson
> > As documented: > > # perldoc Mail::SpamAssassin::BayesStore::SQL > NAME > Mail::SpamAssassin::BayesStore::SQL - SQL Bayesian Storage Module > Implementation > > DESCRIPTION > This module implements a SQL based bayesian storage module. It's > compatible with SQLite and possibly other standard SQ

Re: Bayes in V4 compared to V3

2024-11-12 Thread hg user
Yes. On Tue, 12 Nov 2024, 13:53, Grega via users wrote: > If we used SQL and now switched to MySQL do we have to re-train bayes? > > -- > *From:* Bill Cole > *Sent:* Tuesday, 12 November 2024 13:35 > *To:* users@spamassassin.apache.org > *Subj

Re: Bayes in V4 compared to V3

2024-11-12 Thread Grega via users
If we used SQL and now switched to MySQL do we have to re-train bayes? From: Bill Cole Sent: Tuesday, 12 November 2024 13:35 To: users@spamassassin.apache.org Subject: Re: Bayes in V4 compared to V3 On 2024-11-12 at 00:33:13 UTC-0500 (Tue, 12 Nov 2024 00:33:13

Re: Bayes in V4 compared to V3

2024-11-12 Thread Bill Cole
On 2024-11-12 at 00:33:13 UTC-0500 (Tue, 12 Nov 2024 00:33:13 -0500) Shawn Iverson is rumored to have said: [...] > The "-D bayes" parameter was quite informative. Thank you. Turns out the > database wasn't being read properly with the bayes_store_module in use. > Mayb

Re: Bayes in V4 compared to V3

2024-11-11 Thread Grega via users
: users@spamassassin.apache.org; Grega Cc: hg user Subject: Re: Bayes in V4 compared to V3 On Mon, Nov 11, 2024 at 4:48 PM hg user <mercurialu...@gmail.com> wrote: In spamassassin 3 you could debug bayes points running command line spamassassin with "-D bayes" parameter. I

Re: Bayes in V4 compared to V3

2024-11-11 Thread Shawn Iverson
On Mon, Nov 11, 2024 at 4:48 PM hg user wrote: > In spamassassin 3 you could debug bayes points running command line > spamassassin with "-D bayes" parameter. I think you can in version 4 too. > > In the log all the tokens extracted from the message are listed with the >

Re: Bayes in V4 compared to V3

2024-11-11 Thread hg user
In spamassassin 3 you could debug bayes points running command line spamassassin with "-D bayes" parameter. I think you can in version 4 too. In the log all the tokens extracted from the message are listed with the points assigned, so you can see exactly how the score is calculated

Re: Bayes in V4 compared to V3

2024-11-05 Thread Grega via users
Hi all. I`m posting this here as well. Now I sorted out this score and disabled rules thing, but I still have bayes issues even after I re-trained the whole thing by hand AND disabled autolearn. Here is post: https://forum.efa-project.org/viewtopic.php?p=20226#p20226 I just don`t get 2

Re: training bayes and newsletters

2024-10-16 Thread Greg Troxel
I think you are missing that a particular newsletter is not intrinsically ham or spam. It is ham if the user has subscribed, and spam if they have not affirmatively subscribed. I have seen the very same content arrive at my mailserver for 2 users. For one it is ham and the other it is spam. Ther

Re: training bayes and newsletters

2024-10-16 Thread Bill Cole
On 2024-10-15 at 22:05:07 UTC-0400 (Tue, 15 Oct 2024 22:05:07 -0400) Alex is rumored to have said: I can imagine the newsletter template is somewhat common, but does bayes have any ability to distinguish a junk newsletter from a legitimate newsletter? Not if it has never seen either of them

RE: training bayes and newsletters

2024-10-16 Thread Marc
> I can imagine the newsletter template is somewhat common, but does bayes > have any ability to distinguish a junk newsletter from a legitimate > newsletter? How can bayes, if you also can't? My advice would be to mark e.g. everything from mailchimp and then whitelist what you in

Re: training bayes and newsletters

2024-10-15 Thread Axb
On 10/16/24 04:05, Alex wrote: Would I benefit from training known trustworthy newsletters such as ham? Yes, you would.

training bayes and newsletters

2024-10-15 Thread Alex
Hi, I've just retrained my bayes database (stored in SQL) with 10k hams and about 6k spams. I tried to make sure there were no newsletters in either corpus, but some emails present as newsletters but really are spam. However, many legitimate newsletters are hitting BAYES_99 even though I ha

Re: Whitelist or BAYES?

2024-10-03 Thread Bowie Bailey
) mails before it has effect. X-Spam-Report: *  4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% *  [score: 1.] *  5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100

Re: Whitelist or BAYES?

2024-10-01 Thread Bill Cole
HAM for re-learning. Which is the best approach? so far, both. You may need to relearn multiple of their (monthly) mails before it has effect. X-Spam-Report: * 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% * [score: 1.] * 5.0 BAYES_999 BODY: Bayes spam probability

Re: Whitelist or BAYES?

2024-09-30 Thread joe a
may need to relearn multiple of their (monthly) mails before it has effect. X-Spam-Report: * 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% * [score: 1.] * 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% * [score: 1.] You have raised

Re: Whitelist or BAYES?

2024-09-30 Thread joe a
(monthly) mails before it has effect. X-Spam-Report: *  4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% *  [score: 1.] *  5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% *  [score: 1.] You have raised BAYES_99 and BAYES_999 to huge values so I

Re: Whitelist or BAYES?

2024-09-27 Thread Matus UHLAR - fantomas
: * 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% * [score: 1.] * 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% * [score: 1.] You have raised BAYES_99 and BAYES_999 to huge values so I recommend to rethink that
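To make the arithmetic behind that remark concrete, here is a small sketch that pulls the per-rule scores out of the X-Spam-Report text quoted above and adds up the Bayes rules. The regular expression and the function name `rule_scores` are assumptions made for this example, not part of SpamAssassin; the threshold of 5.0 is the `required=5.0` value from the original post.

```python
import re

# Matches report entries such as "* 4.1 BAYES_99 BODY: ..." and skips
# detail lines such as "* [score: 1.]" (no numeric score after the "*").
REPORT_LINE = re.compile(r"\*\s+(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def rule_scores(report: str) -> dict[str, float]:
    """Return {rule name: score} for every scored entry in the report text."""
    return {name: float(score) for score, name in REPORT_LINE.findall(report)}


# Report text as quoted in the thread (flattened onto one line by the archive).
report = (
    "* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100% "
    "* [score: 1.] "
    "* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% "
    "* [score: 1.]"
)

scores = rule_scores(report)
bayes_total = sum(v for k, v in scores.items() if k.startswith("BAYES_"))
required = 5.0  # from "required=5.0" in the original post

print(scores)
print("Bayes alone exceeds the threshold:", bayes_total > required)
```

With these values the two Bayes rules together are already well over the threshold, so a single Bayes misjudgment tags the message regardless of what any other rule says.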

RE: Whitelist or BAYES?

2024-09-27 Thread Marc
> --- >If guns kill people, then... > -- pencils miss spel words. > -- cars make people drive drunk. > -- spoons make people fat. > --- :) I wa

RE: Whitelist or BAYES?

2024-09-27 Thread Marc
> > > So, on the one hand I can add them to whitelist and be done with it, or > > I can add them to missed HAM for re-learning. > > > > Which is the best approach? > > Do both. > You will always have work. One user's spam is another user's delight. I have switched to having frontend serve

Re: Whitelist or BAYES?

2024-09-26 Thread John Hardin
On Thu, 26 Sep 2024, joe a wrote: So, on the one hand I can add them to whitelist and be done with it, or I can add them to missed HAM for re-learning. Which is the best approach? Do both. -- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org

Re: Whitelist or BAYES?

2024-09-26 Thread Kris Deugau
one of those tests. So, on the one hand I can add them to whitelist and be done with it, or I can add them to missed HAM for re-learning. Which is the best approach? Both. Feeding it to Bayes helps to correct its behaviour for both future messages from this sender and similar mail from others

Whitelist or BAYES?

2024-09-26 Thread joe a
required=5.0 tests=BAYES_99,BAYES_999, DKIMWL_WL_MED,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,SPF_HELO_NONE,SPF_SOFTFAIL, T_KAM_HTML_FONT_INVALID autolearn=disabled version=3.4.5 X-Spam-Report: * 4.1 BAYES_99 BODY: Bayes spam

Re: Bayes in V4 compared to V3

2024-09-25 Thread Grega via users
Oh god I`m idiot... I had: score BAYES_20 0.0 So now every mail has bayes score in it (changed it to score BAYES_20 0.1) Still puzzling why I have no extreme low or extreme high values. Also still puzzling why out of 3 identical mails one had bayes_60 and other 2 bayes_20. Autolearn is

Re: Bayes in V4 compared to V3

2024-09-25 Thread Grega via users
Hi. Im on mysql backend. Load is none .. From: Matija Nalis Sent: Wednesday, September 25, 2024 18:24 To: users@spamassassin.apache.org Subject: Re: Bayes in V4 compared to V3 On Mon, Sep 23, 2024 at 01:14:25PM +, Grega via users wrote: > Why one

Re: Bayes in V4 compared to V3

2024-09-25 Thread Matija Nalis
On Tue, Sep 24, 2024 at 08:10:38AM +, Grega via users wrote:
> Also this:
>
> Rule      Description                          Score  Total  Ham    Col6  Spam  Col8
> BAYES_40  Bayes spam probability is 20 to 40%  0.00   2,784  2,721  97.7  63    2.3
> BAYES_50  Bayes spam pr

Re: Bayes in V4 compared to V3

2024-09-25 Thread Matija Nalis
local file storage (BDB?) which used file locking, and that locking was prone to timing out when several mails came in quick succession. For me, switching to MySQL backend for Bayes (and AWL) fixed such issues... -- Opinions above are GNU-copylefted.

Re: Bayes in V4 compared to V3

2024-09-24 Thread Grega via users
Also this:

Rule      Description                          Score  Total  Ham    Col6  Spam  Col8
BAYES_40  Bayes spam probability is 20 to 40%  0.00   2,784  2,721  97.7  63    2.3
BAYES_50  Bayes spam probability is 40 to 60%  0.80   126    93     73.8  33    26.2
BAYES_60  Bayes spam

Re: Bayes in V4 compared to V3

2024-09-23 Thread Grega via users
Hi again. In V4 there is something wrong with bayes... I received 3 identical mails (1 external sender, 3 internal recipients) and scores are like this: 2 X like: 0.00 ARC_SIGNED Message has a ARC signature -0.10 ARC_VALID Message has a valid ARC signature -0.40

Re: SPAM-DETECTOR Re: Tips on training bayes?

2024-09-19 Thread natan
fly With SA 4.X - on average 2-6GB and I had to do a quick fix: 59 23 * * * root find /var/lib/amavis/tmp/ -mtime +0 -delete; On 18.09.2024 at 16:09, Matus UHLAR - fantomas wrote: On 18.09.24 13:42, Grega via users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to tr

Re: Tips on training bayes?

2024-09-19 Thread Bill Cole
ent directed at the widest audience, e.g. commercial or political advertising. Email: obvious. Judging that requires some knowledge of the target. I can't tell you whether your borderline email is spam. Neither can SA, but Bayes is one way it tries to guess. Is the goal to have every me

Re: Tips on training bayes?

2024-09-18 Thread Greg Troxel
_50, and let other rules, > like network checks, determine the score? In general, the closer to the edge something is, the more useful the score, but you can't actually push them all to 00/99. There could be a newsletter that user A asked for and is thus ham but user B did not and when it arrives to

Re: Tips on training bayes?

2024-09-18 Thread Benny Pedersen
Jared Hall via users wrote on 2024-09-18 20:08: On Deb-based distros, you can add this in /etc/amavis/conf.d/50-user under the $max_servers parameter. also remember it's safe to use tmpfs for tmp dir in amavisd no joke

Re: Tips on training bayes?

2024-09-18 Thread Jared Hall via users
On 9/18/2024 10:19 AM, natan wrote: Hi I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ With SA 3.4.X - on average 100MB and it deletes on the fly With SA 4.X - on average 2-6GB and I had to do a quick fix: 59 23 * * * root find /var/lib/amavis/tmp/

Re: Tips on training bayes?

2024-09-18 Thread Benny Pedersen
natan wrote on 2024-09-18 16:36: On 18.09.2024 at 16:30, Reindl Harald (privat) wrote: who reply here ? :) don't blame SA when a blind man can see that your problem is on the Amavis side - why do one need Amavis to begin with when there is SA and spamass-milter yes yes everyone know

Re: Tips on training bayes?

2024-09-18 Thread natan
for testing and this bad amavis also works correctly On 18.09.2024 at 16:09, Matus UHLAR - fantomas wrote: On 18.09.24 13:42, Grega via users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, a

Re: Tips on training bayes?

2024-09-18 Thread Matus UHLAR - fantomas
users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, and it behaved really odd. I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it was PAIN) and I cant get either BAYES_00 or

Re: Tips on training bayes?

2024-09-18 Thread natan
18.09.2024 at 16:09, Matus UHLAR - fantomas wrote: On 18.09.24 13:42, Grega via users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, and it behaved really odd. I trained it on new set of mails 3000 spam and 3000

Re: Tips on training bayes?

2024-09-18 Thread Matus UHLAR - fantomas
On 18.09.24 13:42, Grega via users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, and it behaved really odd. I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it was PAIN) and I

Re: Tips on training bayes?

2024-09-18 Thread Grega via users
Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, and it behaved really odd. I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it was PAIN) and I cant get either BAYES_00 or BAYES_99 :) I

Re: Tips on training bayes?

2024-09-17 Thread Alex
> > > It is up to the user, ie you, what is and what is not spam. > Well, yes, and no. Of course it's my own system and I can define these terms however I wish. I'm also familiar with the need to investigate every message - perhaps I should have made that clear initially. It's only these few typ

Re: Tips on training bayes?

2024-09-17 Thread Benny Pedersen
Jared Hall via users wrote on 2024-09-17 08:15: On 9/16/2024 8:48 PM, Alex wrote: Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be sp

Re: Tips on training bayes?

2024-09-16 Thread Jared Hall via users
On 9/16/2024 8:48 PM, Alex wrote: Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be spam and ham. Are reliable newsletters, like

Tips on training bayes?

2024-09-16 Thread Alex
Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be spam and ham. Are reliable newsletters, like those from, say, a trusted news source wher

Re: Bayes in V4 compared to V3

2024-09-13 Thread John Hardin
On Fri, 13 Sep 2024, Bill Cole wrote: Please send any replies to the list only. ...or to Harald only. -- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C

Re: Bayes in V4 compared to V3

2024-09-13 Thread Benny Pedersen
Grega via users wrote on 2024-09-13 16:16: Sorry guys if I replied to all, my intentions were not to spam :) top posters :) imho not impossible to request 3rd party list archives to make a password for users, never mind eggs came before chickens :=)

Re: Bayes in V4 compared to V3

2024-09-13 Thread Grega via users
Sorry guys if I replied to all, my intentions were not to spam :) From: Benny Pedersen Sent: Friday, 13 September 2024 15:13 To: users@spamassassin.apache.org Subject: Re: Bayes in V4 compared to V3 Bill Cole wrote on 2024-09-13 15:03: > Please send

Noise Around This List (was Re: Bayes in V4 compared to V3)

2024-09-13 Thread Bill Cole
On 2024-09-13 at 09:13:58 UTC-0400 (Fri, 13 Sep 2024 15:13:58 +0200) Benny Pedersen is rumored to have said: Bill Cole wrote on 2024-09-13 15:03: Please send any replies to the list only. unsubscribe listarchivers ? and make archived on apache.org with bugzilla login don't know if it wil

Re: Bayes in V4 compared to V3

2024-09-13 Thread Antony Stone
On Friday 13 September 2024 at 15:13:58, Benny Pedersen wrote: > Bill Cole wrote on 2024-09-13 15:03: > > Please send any replies to the list only. > > unsubscribe listarchivers ? > and make archived on apache.org with bugzilla login > don't know if it will help or not, but chicken and egg I do

Re: Bayes in V4 compared to V3

2024-09-13 Thread Benny Pedersen
Bill Cole wrote on 2024-09-13 15:03: Please send any replies to the list only. unsubscribe listarchivers ? and make archived on apache.org with bugzilla login don't know if it will help or not, but chicken and egg

Re: Bayes in V4 compared to V3

2024-09-13 Thread Bill Cole
9-13 at 05:00:17 UTC-0400 (Fri, 13 Sep 2024 09:00:17 +) Grega is rumored to have said: Do you have V3 or V4 SA? From: Reindl Harald (privat) Sent: Friday, 13 September 2024 10:57 To: Grega; Bill Cole; Grega via users Subject: Re: Bayes in V4 compared to V3

Re: Bayes in V4 compared to V3

2024-09-13 Thread Grega via users
Do you have V3 or V4 SA? From: Reindl Harald (privat) Sent: Friday, 13 September 2024 10:57 To: Grega; Bill Cole; Grega via users Subject: Re: Bayes in V4 compared to V3 autolearn was always a blackbox that below are the stats for the current month and that

Re: Bayes in V4 compared to V3

2024-09-13 Thread Grega via users
This strategy worked really great in V3 and bayes was excellent even with autotrain and occasional manual training. Now it's indecisive and useless, at least for me. We have around 5k-7k daily mails... From: Reindl Harald (privat) Sent: Friday, 13

Re: Bayes in V4 compared to V3

2024-09-12 Thread Grega via users
Hi. I just filtered in last week and I have BAYES_20 BAYES_40 BAYES_50 BAYES_80 So no BAYES_00, _05, _90,_95 etc... All extreme values which are the only one useful to do real scoring and marking are missing. Today I`m going to train bayes manually with around 4000 SPAM and 4000 HAM

Re: Bayes in V4 compared to V3

2024-09-12 Thread Bill Cole
On 2024-09-12 at 14:05:11 UTC-0400 (Thu, 12 Sep 2024 18:05:11 +) Grega via users is rumored to have said: Hi. I have SA 4.0.1 configured it, all is good, except for bayes. It IS working, it IS learning but when it classifies mail it is really not so decisive as it was in V3. I have

Bayes in V4 compared to V3

2024-09-12 Thread Grega via users
Hi. I have SA 4.0.1 configured it, all is good, except for bayes. It IS working, it IS learning but when it classifies mail it is really not so decisive as it was in V3. I have: dbg: bayes: corpus size: nspam = 1190, nham = 12441 dbg: bayes: DB expiry: tokens in DB: 979401, Expiry max size
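For reference, the corpus line in that debug output can be read mechanically. The sketch below parses the quoted `corpus size` line and computes the ham-to-spam ratio; the function name `corpus_counts` is made up for this example, and whether the imbalance explains the indecisive scores is not established by the snippet.

```python
import re

# Pattern for the debug line quoted in the post:
#   dbg: bayes: corpus size: nspam = 1190, nham = 12441
CORPUS_LINE = re.compile(r"bayes: corpus size: nspam = (\d+), nham = (\d+)")


def corpus_counts(debug_line: str):
    """Return (nspam, nham) from a 'corpus size' debug line, or None."""
    match = CORPUS_LINE.search(debug_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "dbg: bayes: corpus size: nspam = 1190, nham = 12441"
nspam, nham = corpus_counts(line)
ham_per_spam = nham / nspam

print(f"nspam={nspam} nham={nham} ham per spam={ham_per_spam:.1f}")
```

For the numbers quoted, the database has seen roughly ten ham messages for every spam message.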

Re: Bayes "corpus" - how old?

2024-01-31 Thread Bill Cole
On 2024-01-31 at 08:16:13 UTC-0500 (Wed, 31 Jan 2024 14:16:13 +0100) Matus UHLAR - fantomas is rumored to have said: On 2024-01-30 at 12:08:18 UTC-0500 (Tue, 30 Jan 2024 18:08:18 +0100) Matus UHLAR - fantomas is rumored to have said: [...] autolearn may help if your DB is well maintained, alt

Re: Bayes "corpus" - how old?

2024-01-31 Thread Matus UHLAR - fantomas
On 2024-01-30 at 12:08:18 UTC-0500 (Tue, 30 Jan 2024 18:08:18 +0100) Matus UHLAR - fantomas is rumored to have said: [...] autolearn may help if your DB is well maintained, although I have disabled nearly all rules with negative scores, like RCVD_IN_DNSWL_* RCVD_IN_IADB_* DKIMWL_WL_* RCVD_IN_

Re: Bayes "corpus" - how old?

2024-01-30 Thread Bill Cole
On 2024-01-30 at 12:08:18 UTC-0500 (Tue, 30 Jan 2024 18:08:18 +0100) Matus UHLAR - fantomas is rumored to have said: [...] autolearn may help if your DB is well maintained, although I have disabled nearly all rules with negative scores, like RCVD_IN_DNSWL_* RCVD_IN_IADB_* DKIMWL_WL_* RCVD_IN_

Re: Bayes "corpus" - how old?

2024-01-30 Thread Matus UHLAR - fantomas
On 30.01.24 09:59, joe a wrote: Advisable to "prune" Bayes data based on age? While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to 2013. Why that's over . . . wait, I need to take off my socks . . . So, how old is "too old".  For saved SP

Re: Bayes "corpus" - how old?

2024-01-30 Thread joe a
On 1/30/2024 10:58:52, Matus UHLAR - fantomas wrote: On 30.01.24 09:59, joe a wrote: Advisable to "prune" Bayes data based on age? While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to 2013. Why that's over . . . wait, I need to take off my socks .

Re: Bayes "corpus" - how old?

2024-01-30 Thread Bill Cole
On 2024-01-30 at 09:59:52 UTC-0500 (Tue, 30 Jan 2024 09:59:52 -0500) joe a is rumored to have said: Advisable to "prune" Bayes data based on age? Yes. That is why it has an expiration model. Expiration may be de facto blocked on some busy systems so you may need to explicitl

Re: Bayes "corpus" - how old?

2024-01-30 Thread Matus UHLAR - fantomas
On 30.01.24 09:59, joe a wrote: Advisable to "prune" Bayes data based on age? While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to 2013. Why that's over . . . wait, I need to take off my socks . . . So, how old is "too old". For saved S

Bayes "corpus" - how old?

2024-01-30 Thread joe a
Advisable to "prune" Bayes data based on age? While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to 2013. Why that's over . . . wait, I need to take off my socks . . . So, how old is "too old". For saved SPAM?

Re: Bayes Stopword

2023-12-29 Thread Jimmy
a...@paclan.it <mailto:giova...@paclan.it>>> > wrote: > > > > > > To create the stopwords regexp I used the script I shared in > a previous email and a list of words one per line. > > > Could you share the list you are using ? > >

Re: Bayes Stopword

2023-12-29 Thread giovanni
ld you share the list you are using ? > >         Giovanni > >     On 12/29/23 09:22, Jimmy wrote: >      > I use SpamAssassin 4.0.0 (2022-12-14) >      > >      > $ spamassassin -D --lint 2>&1 | grep bayes: >      > Dec

Re: Bayes Stopword

2023-12-29 Thread Jimmy
ist you are using ? > > > > Giovanni > > > > On 12/29/23 09:22, Jimmy wrote: > > > I use SpamAssassin 4.0.0 (2022-12-14) > > > > > > $ spamassassin -D --lint 2>&1 | grep bayes: > > > Dec 29 15:17:5

Re: Bayes Stopword

2023-12-29 Thread giovanni
Could you share the list you are using ?    Giovanni On 12/29/23 09:22, Jimmy wrote: > I use SpamAssassin 4.0.0 (2022-12-14) > > $ spamassassin -D --lint 2>&1 | grep bayes: > Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en &g

Re: Bayes Stopword

2023-12-29 Thread Jimmy
you share the list you are using ? > >Giovanni > > On 12/29/23 09:22, Jimmy wrote: > > I use SpamAssassin 4.0.0 (2022-12-14) > > > > $ spamassassin -D --lint 2>&1 | grep bayes: > > Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en > > D

Re: Bayes Stopword

2023-12-29 Thread giovanni
To create the stopwords regexp I used the script I shared in a previous email and a list of words one per line. Could you share the list you are using ? Giovanni On 12/29/23 09:22, Jimmy wrote: I use SpamAssassin 4.0.0 (2022-12-14) $ spamassassin -D --lint 2>&1 | grep bayes: Dec 2

Re: Bayes Stopword

2023-12-29 Thread Jimmy
I use SpamAssassin 4.0.0 (2022-12-14) $ spamassassin -D --lint 2>&1 | grep bayes: Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=th Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=ru Dec 29 15:17:56.919
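A quick way to check which languages show up in that debug output is to collect the `lang=` values. This is a sketch only: the line format is taken from the three lines quoted above, the reading that each line reports a stopword list for that language is an assumption, and `stopword_languages` is a hypothetical helper name.

```python
import re

# Format assumed from the quoted output, e.g.
#   Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
STOPWORD_LINE = re.compile(r"dbg: bayes: stopword found lang=([A-Za-z_-]+)")


def stopword_languages(debug_output: str) -> list[str]:
    """Return the lang= values in the order they appear in the debug output."""
    return [m.group(1) for m in STOPWORD_LINE.finditer(debug_output)]


sample = """\
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=th
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=ru
"""

print(stopword_languages(sample))
```

In practice the input would be the captured stderr of the `spamassassin -D --lint 2>&1` command shown in the post.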

Re: Bayes Stopword

2023-12-28 Thread giovanni
xt and it produces a working regexp. Bayes stopwords languages must also be enabled using "bayes_stopword_languages" config keyword, by default only english is enabled. Giovanni On 12/28/23 17:06, Jimmy wrote: bayes_stopword_th https://pastebin.pl/view/0838138d <https://pastebin.p

Re: Bayes Stopword

2023-12-28 Thread Jimmy
one that, and I am also editing Plugin/Bayes.pm to > investigate why it is not being skipped. I suspect that if words are not > separated by spaces, longer words may not match those patterns. > > > > Jimmy > > > > On Thu, Dec 28, 2023 at 10:13 PM giova...@paclan.it>>

Re: Bayes Stopword

2023-12-28 Thread giovanni
patterns. Jimmy On Thu, Dec 28, 2023 at 10:13 PM <giova...@paclan.it> wrote: "spamassassin -D bayes" will tell you, you should see a line like: bayes: skipped token 'from' because it's in stopword list for language 'en'   Gio

Re: Bayes Stopword

2023-12-28 Thread Jimmy
Yes, I have done that, and I am also editing Plugin/Bayes.pm to investigate why it is not being skipped. I suspect that if words are not separated by spaces, longer words may not match those patterns. Jimmy On Thu, Dec 28, 2023 at 10:13 PM wrote: > "spamassassin -D bayes" will

Re: Bayes Stopword

2023-12-28 Thread giovanni
"spamassassin -D bayes" will tell you, you should see a line like: bayes: skipped token 'from' because it's in stopword list for language 'en' Giovanni On 12/28/23 15:45, Jimmy wrote: The pattern has successfully passed the test script, but it needs to

Re: Bayes Stopword

2023-12-28 Thread Jimmy
The pattern has successfully passed the test script, but it needs to check whether Bayes learning will identify and possibly exclude the word from matching this pattern. Thank you. On Thu, Dec 28, 2023 at 9:22 PM wrote: > On 12/28/23 12:59, Jimmy wrote: > > Hi, > > > > I

Re: Bayes Stopword

2023-12-28 Thread giovanni
. I have used Regexp::Trie to create Bayes stopwords in the past, code is similar to: --- use strict; use warnings; use Encode; use Regexp::Trie; my @input = ; my $rt = Regexp::Trie-&g
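For readers without Perl at hand, the same idea (one word per line in, one regular expression out) can be sketched in Python. This is not a port of Regexp::Trie: it builds a plain alternation instead of an optimised trie, and the names `stopword_pattern` and `is_stopword` are invented for the example. Matching the whole token, rather than a substring, is also an assumption here.

```python
import re


def stopword_pattern(words) -> re.Pattern:
    """Build one regex from an iterable of words (one word per line).

    Plain alternation, longest words first; not a trie-optimised pattern.
    """
    unique = sorted(
        {w.strip() for w in words if w.strip()},  # drop blanks and duplicates
        key=lambda w: (-len(w), w),
    )
    alternation = "|".join(re.escape(w) for w in unique)
    return re.compile(f"(?:{alternation})")


def is_stopword(pattern: re.Pattern, token: str) -> bool:
    """True only if the entire token is one of the listed words."""
    return pattern.fullmatch(token) is not None


# Word list as it might be read from a file, including a Thai entry.
pattern = stopword_pattern(["from\n", "the\n", "และ\n", "\n"])

print(is_stopword(pattern, "from"))
print(is_stopword(pattern, "และ"))
print(is_stopword(pattern, "fromage"))
```

Under whole-token matching, a token that contains a listed word plus extra characters is not treated as a stopword, which is one way text without spaces between words could slip past such a list.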

Bayes Stopword

2023-12-28 Thread Jimmy
Hi, I'm seeking assistance in incorporating a stopword for Asian languages in Unicode. Although I possess comprehensive word lists, my attempts to generate a regex pattern and test it have been unsuccessful; the pattern fails to match or skips tokens in the newly added stopword list. I created th

Re: Bayes always reject.

2023-12-13 Thread Jeff Mincy
u do something to strip off all of the email headers? For the BAYES_99, as already mentioned you probably need to retrain bayes, making sure to correct any incorrectly trained email messages. -jeff

Re: Bayes always reject.

2023-12-13 Thread Bill Cole
On 2023-12-13 at 01:49:24 UTC-0500 (Wed, 13 Dec 2023 07:49:24 +0100) Pierluigi Frullani is rumored to have said: Hello all, I'm facing a strange problem. Not really. MANY people run into this issue... I've feed the bayes db for a while and now I would like to put it in us

Bayes always reject.

2023-12-12 Thread Pierluigi Frullani
Hello all, I'm facing a strange problem. I've feed the bayes db for a while and now I would like to put it in use but all messages get a BAYES_99 and very high spam point. I would like to understand why, and troubleshoot this problem but I can't find a way. Spamassassin versio

Re: Share bayes database between servers

2023-07-09 Thread Matija Nalis
On Sun, Jul 09, 2023 at 07:06:10PM +0200, Robert Senger wrote: > I've set up a testing environment that also uses master-master > replication of the mysql bayes database, with priority in dns set to > equal for both mx to get incoming mail distributed evenly to both > systems. S

Re: Share bayes database between servers

2023-07-09 Thread Robert Senger
On Sunday, 09.07.2023 at 19:21 +0200, Reindl Harald wrote: > > > On 09.07.23 at 19:06, Robert Senger wrote: > > But bayes data may be updated by either the primary mx or the > > backup > > mx, since email may arrive at either server. > > in a smart setup

Share bayes database between servers

2023-07-09 Thread Robert Senger
Hi there, I am running two mailservers, first one serving two domains, other one serving one domain. Both serve as backup mx for each other. Both know about users and aliases of the other domain(s). On both systems, spamassassin is configured to read/store userprefs and bayes data (per user) in
