On 2025-02-16 at 10:38:38 UTC-0500 (Sun, 16 Feb 2025 10:38:38 -0500)
Alex
is rumored to have said:
[quoting me]
TxRep (like AWL) is fed not by Bayes learning (sa-learn) but rather it
tracks the combination of an address and a source IP range (/24) with a
tally of the SA scores of messages
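To make the idea above concrete, here is a minimal sketch of a tally keyed on (address, /24). It is not TxRep's actual code or storage format; the names reputation_key and ScoreTally are hypothetical, and it only handles IPv4.

```python
import ipaddress
from collections import defaultdict


def reputation_key(address, source_ip):
    """Hypothetical key: lowercased sender address plus the /24 of the IPv4 source."""
    network = ipaddress.ip_network(f"{source_ip}/24", strict=False)
    return address.lower(), str(network)


class ScoreTally:
    """Keeps a running total and count of scores per (address, /24) key."""

    def __init__(self):
        # key -> [sum of scores, number of messages]
        self._totals = defaultdict(lambda: [0.0, 0])

    def record(self, address, source_ip, score):
        entry = self._totals[reputation_key(address, source_ip)]
        entry[0] += score
        entry[1] += 1

    def mean(self, address, source_ip):
        """Average score seen so far for this key, or None if it is unknown."""
        key = reputation_key(address, source_ip)
        if key not in self._totals:
            return None
        total, count = self._totals[key]
        return total / count


tally = ScoreTally()
tally.record("Alice@example.com", "192.0.2.10", 6.0)
tally.record("alice@example.com", "192.0.2.200", 2.0)
```

Both recorded messages land on the same key because they share a sender and a /24, which is why training Bayes does not change this kind of tally.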
> > Is there any benefit to training an email that's already hitting
> > bayes99?
>
> Yes. The tokens which made it hit 99% are already doing their jobs, but
> the rest of the message that Bayes isn't seeing as spammy may turn out
> to be what make
r "salesforce optimization" or "HR consulting" that already hit
bayes99 (and bayes999) but are still just shy of 5 points.
Is there any benefit to training an email that's already hitting
bayes99?
Yes. The tokens which made it hit 99% are already doing their jobs, but
th
Alex writes:
> These also aren't always one-offs, but maybe a dozen or twenty of each over
> a short period that get through, likely before the URIs are blocked through
> other means. Other times they don't have a link at all.
Sounds like fairly aggressive greylisting is in order.
On Fri, 14 Feb 2025, Alex wrote:
Hi,
I'm using SA v4 and trying to find ways to minimize the amount of junk that
isn't tagged. Emails like "1-hour free consultation" or "buy this event
list" or "salesforce optimization" or "HR consulting" that already hit
bayes99 (and bayes999) but are still jus
Hi,
I'm using SA v4 and trying to find ways to minimize the amount of junk that
isn't tagged. Emails like "1-hour free consultation" or "buy this event
list" or "salesforce optimization" or "HR consulting" that already hit
bayes99 (and bayes999) but are still just shy of 5 points.
Is there any ben
On 2025-01-13 at 03:12:25 UTC-0500 (Mon, 13 Jan 2025 02:12:25 -0600
(CST))
Dave Funk
is rumored to have said:
It's also possible for the messages to differ by things such as
network routing headers, better to feed it all to bayes and let it get
parsed/scored.
That's an important
On 2025-01-13 at 01:51:17 UTC-0500 (Mon, 13 Jan 2025 08:51:17 +0200)
Anders Gustafsson
is rumored to have said:
Hi!
When collecting spam I frequently see multiple copies of the same
message, but with different fake senders.
In this case, should I feed just one or all to Bayes?
All.
Also
Thanks!
--
Regards, Anders
>>> Dave Funk 2025-01-13 10:12 >>>
On Mon, 13 Jan 2025, Anders Gustafsson wrote:
Hi!
When collecting spam I frequently see multiple copies of the same message, but
with different fake senders.
In this case, should I feed just one or all to Bayes?
Yes, feed all copies of verified spam to Bayes. As it is a weighted score per
Hi!
When collecting spam I frequently see multiple copies of the same message, but
with different fake senders.
In this case, should I feed just one or all to Bayes?
Also: Is there a point in feeding such spam that is already flagged by other
rules than Bayes and if so,
should I remove the
Le 10/01/2025 à 15:35, Bill Cole a écrit :
On 2025-01-10 at 08:49:04 UTC-0500 (Fri, 10 Jan 2025 14:49:04 +0100)
John Wilcock
is rumored to have said:
Hi all,
I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to
using MySQL (actually Mariadb 10.6) for Bayes stor
On 2025-01-10 at 08:49:04 UTC-0500 (Fri, 10 Jan 2025 14:49:04 +0100)
John Wilcock
is rumored to have said:
Hi all,
I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to
using MySQL (actually Mariadb 10.6) for Bayes storage.
I'm seeing "WARNING: MY
Hi all,
I'm using Spamassassin 4.0.1 on Gentoo and I've recently switched to
using MySQL (actually Mariadb 10.6) for Bayes storage.
I'm seeing "WARNING: MYSQL_OPT_RECONNECT is deprecated and will be
removed in a future version" warnings.
$ spamassassin --lint -
Well, I didn't re-learn it and now it works awesome.
Also, when I try to learn already-learned mail, I get that it didn't learn as the DB
already contains the same tokens.
So the config error was storing tokens the right way, but it was reading them wrong.
Now all my bayes issues are gone this is like night and
>
> As documented:
>
> # perldoc Mail::SpamAssassin::BayesStore::SQL
> NAME
> Mail::SpamAssassin::BayesStore::SQL - SQL Bayesian Storage Module
> Implementation
>
> DESCRIPTION
> This module implements a SQL based bayesian storage module. It's
> compatible with SQLite and possibly other standard SQ
Yes
Il Mar 12 Nov 2024, 13:53 Grega via users
ha scritto:
> If we used SQL and now switched to MySQL do we have to re-train bayes?
>
> --
> *From:* Bill Cole
> *Sent:* Tuesday, 12 November 2024 13:35
> *To:* users@spamassassin.apache.org
> *Subj
If we used SQL and now switched to MySQL do we have to re-train bayes?
From: Bill Cole
Sent: Tuesday, 12 November 2024 13:35
To: users@spamassassin.apache.org
Subject: Re: Bayes in V4 compared to V3
On 2024-11-12 at 00:33:13 UTC-0500 (Tue, 12 Nov 2024 00:33:13
On 2024-11-12 at 00:33:13 UTC-0500 (Tue, 12 Nov 2024 00:33:13 -0500)
Shawn Iverson
is rumored to have said:
[...]
> The "-D bayes" parameter was quite informative. Thank you. Turns out the
> database wasn't being read properly with the bayes_store_module in use.
> Mayb
: users@spamassassin.apache.org; Grega
Cc: hg user
Subject: Re: Bayes in V4 compared to V3
On Mon, Nov 11, 2024 at 4:48 PM hg user wrote:
In spamassassin 3 you could debug bayes points running command line
spamassassin with "-D bayes" parameter. I
On Mon, Nov 11, 2024 at 4:48 PM hg user wrote:
> In spamassassin 3 you could debug bayes points running command line
> spamassassin with "-D bayes" parameter. I think you can in version 4 too.
>
> In the log all the tokens extracted from the message are listed with the
>
In spamassassin 3 you could debug bayes points running command line
spamassassin with "-D bayes" parameter. I think you can in version 4 too.
In the log all the tokens extracted from the message are listed with the
points assigned, so you can see exactly how the score is calculated
Hi all.
I`m posting this here as well.
Now I sorted out this score and disabled rules thing, but I still have bayes
issues even after I re-trained the whole thing by hand AND disabled autolearn.
Here is post: https://forum.efa-project.org/viewtopic.php?p=20226#p20226
I just don`t get 2
I think you are missing that a particular newsletter is not
intrinsically ham or spam. It is ham if the user has subscribed, and
spam if they have not affirmatively subscribed.
I have seen the very same content arrive at my mailserver for 2 users.
For one it is ham and the other it is spam.
Ther
On 2024-10-15 at 22:05:07 UTC-0400 (Tue, 15 Oct 2024 22:05:07 -0400)
Alex
is rumored to have said:
I can imagine the newsletter template is somewhat common, but does
bayes
have any ability to distinguish a junk newsletter from a legitimate
newsletter?
Not if it has never seen either of them
> I can imagine the newsletter template is somewhat common, but does bayes
> have any ability to distinguish a junk newsletter from a legitimate
> newsletter?
How can bayes, if you also can't? My advice would be to mark eg everything from
mailchimp and than whitelist what you in
On 10/16/24 04:05, Alex wrote:
Would I benefit from training known trustworthy newsletters such as ham?
Yes, you would.
Hi,
I've just retrained my bayes database (stored in SQL) with 10k hams and
about 6k spams. I tried to make sure there were no newsletters in either
corpus, but some emails present as newsletters but really are spam.
However, many legitimate newsletters are hitting BAYES_99 even though I
ha
)
mails before it has effect.
X-Spam-Report:
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to
100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9
to 100
HAM for re-learning.
Which is the best approach?
So far, both. You may need to relearn several of their (monthly) mails
before it has an effect.
X-Spam-Report:
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability
may need to relearn multiple their (monthly) mails
before it has effect.
X-Spam-Report:
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.]
You have raised
(monthly) mails
before it has effect.
X-Spam-Report:
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.]
You have raised BAYES_99 and BAYES_999 to huge values so I
:
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.]
You have raised BAYES_99 and BAYES_999 to huge values so I recommend to
rethink that
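As a rough illustration of why those two scores matter, the sketch below sums the Bayes rule scores from the report lines quoted above. The parser and the name rule_scores are hypothetical, and the regular expression only assumes the "* score RULE ..." layout shown in this thread.

```python
import re

# Matches report lines such as "* 4.1 BAYES_99 BODY: ..." and skips "* [score: 1.]".
REPORT_LINE = re.compile(r"^\*\s+(-?\d+(?:\.\d+)?)\s+([A-Z0-9_]+)\b")


def rule_scores(report):
    """Return {rule name: score} for every scored line in an X-Spam-Report body."""
    scores = {}
    for line in report.splitlines():
        match = REPORT_LINE.match(line.strip())
        if match:
            scores[match.group(2)] = float(match.group(1))
    return scores


SAMPLE = """\
* 4.1 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.]
* 5.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.]
"""

scores = rule_scores(SAMPLE)
bayes_total = sum(value for name, value in scores.items() if name.startswith("BAYES_"))
print(f"Bayes rules alone contribute {bayes_total:.1f} points")
```

With these values the two Bayes rules together exceed the required=5.0 threshold seen elsewhere in this thread, so a single Bayes misjudgement is enough to tag a message.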
> ---
>If guns kill people, then...
> -- pencils miss spel words.
> -- cars make people drive drunk.
> -- spoons make people fat.
> ---
:) I wa
>
> > So, on the one hand I can add them to whitelist and be done with it, or
> > I can add them to missed HAM for re-learning.
> >
> > Which is the best approach?
>
> Do both.
>
You will always have work. One user's SPAM is another user's delight. I
have switched to having frontend serve
On Thu, 26 Sep 2024, joe a wrote:
So, on the one hand I can add them to whitelist and be done with it, or
I can add them to missed HAM for re-learning.
Which is the best approach?
Do both.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org
one of those tests.
So, on the one hand I can add them to whitelist and be done with it, or I can
add
them to missed HAM for re-learning.
Which is the best approach?
Both. Feeding it to Bayes helps to correct its behaviour for both
future messages from this sender and similar mail from others
required=5.0 tests=BAYES_99,BAYES_999,
DKIMWL_WL_MED,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,
HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,SPF_HELO_NONE,SPF_SOFTFAIL,
T_KAM_HTML_FONT_INVALID autolearn=disabled version=3.4.5
X-Spam-Report:
* 4.1 BAYES_99 BODY: Bayes spam
Oh god I`m idiot...
I had:
score BAYES_20 0.0
So now every mail has bayes score in it (changed it to score BAYES_20 0.1)
Still puzzling why I have no extreme low or extreme high values.
Also still puzzling why out of 3 identical mails one had bayes_60 and other 2
bayes_20.
Autolearn is
Hi.
Im on mysql backend.
Load is none ..
From: Matija Nalis
Sent: Wednesday, September 25, 2024 18:24
To: users@spamassassin.apache.org
Subject: Re: Bayes in V4 compared to V3
On Mon, Sep 23, 2024 at 01:14:25PM +, Grega via users wrote:
> Why one
On Tue, Sep 24, 2024 at 08:10:38AM +, Grega via users wrote:
> Also this:
>
> Rule      Description                          Score  Total  Ham    Col6  Spam  Col8
> BAYES_40  Bayes spam probability is 20 to 40%  0.00   2,784  2,721  97.7  63    2.3
> BAYES_50  Bayes spam pr
local file storage (BDB?) which used file locking, and that locking was
prone to timing out when several mails came in quick succession.
For me, switching to MySQL backend for Bayes (and AWL) fixed such issues...
--
Opinions above are GNU-copylefted.
Also this:
Rule      Description                          Score  Total  Ham    Col6  Spam  Col8
BAYES_40  Bayes spam probability is 20 to 40%  0.00   2,784  2,721  97.7  63    2.3
BAYES_50  Bayes spam probability is 40 to 60%  0.80   126    93     73.8  33    26.2
BAYES_60  Bayes spam
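A quick sanity check on those rows, assuming the columns read as total, ham, ham %, spam, spam % (that reading is an interpretation of the flattened statistics, not something the poster stated). The helper name share is hypothetical.

```python
# (rule, total, ham, spam) taken from the two complete rows above.
ROWS = [
    ("BAYES_40", 2784, 2721, 63),
    ("BAYES_50", 126, 93, 33),
]


def share(part, total):
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


for rule, total, ham, spam in ROWS:
    # Ham and spam counts should add up to the total if the column reading is right.
    assert ham + spam == total
    print(f"{rule}: ham {share(ham, total)}%, spam {share(spam, total)}%")
```

Under that reading, the ham and spam counts add up to the totals and the percentages match the figures in the rows.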
Hi again.
In V4 there is something wrong with bayes...
I received 3 identical mails (1 external sender, 3 internal recipients) and
scores are like this:
2 X like:
0.00ARC_SIGNED Message has a ARC signature
-0.10 ARC_VALID Message has a valid ARC signature
-0.40
fly
With SA 4.X - on average 2-6GB and I had to do a quick fix:
59 23 * * * root find /var/lib/amavis/tmp/ -mtime +0 -delete;
W dniu 18.09.2024 o 16:09, Matus UHLAR - fantomas pisze:
On 18.09.24 13:42, Grega via users wrote:
Right now in SA 4.0.1 bayes at least for me is really challenging
to tr
ent directed at the widest audience, e.g. commercial or
political advertising.
Email: obvious.
Judging that requires some knowledge of the target. I can't tell you
whether your borderline email is spam. Neither can SA, but Bayes is one
way it tries to guess.
Is the goal to have every me
_50, and let other rules,
> like network checks, determine the score?
In general, the closer to the edge something is, the more useful the score,
but you can't actually push them all to 00/99. There could be a
newsletter that user A asked for and is thus ham, but user B did not, and
when it arrives to
Jared Hall via users skrev den 2024-09-18 20:08:
On Deb-based distros, you can add this in /etc/amavis/conf.d/50-user
under the $max_servers parameter.
also remember its safe to use tmpfs for tmp dir in amavisd
no joke
On 9/18/2024 10:19 AM, natan wrote:
Hi
I was very disappointed with spamassassin 4.x because it started to
grow /var/lib/amavis/tmp/
With SA 3.4.X - on average 100MB and it deletes on the fly
With SA 4.X - on average 2-6GB and I had to do a quick fix:
59 23 * * * root find /var/lib/amavis/tmp/
natan skrev den 2024-09-18 16:36:
W dniu 18.09.2024 o 16:30, Reindl Harald (privat) pisze:
who reply here ? :)
don't blame SA when a blind man can see that your problem is on the
Amavis side - why does one need Amavis to begin with when there is SA
and spamass-milter
yes yes everyone know
for testing and this
bad amavis also works correctly
W dniu 18.09.2024 o 16:09, Matus UHLAR - fantomas pisze:
On 18.09.24 13:42, Grega via users wrote:
Right now in SA 4.0.1 bayes at least for me is really challenging
to train and set up.
I had good trained DB from past V3 install, a
users wrote:
Right now in SA 4.0.1 bayes at least for me is really challenging
to train and set up.
I had good trained DB from past V3 install, and it behaved really odd.
I trained it on new set of mails 3000 spam and 3000 ham (HAND
PICKED mail it was PAIN) and I cant get either BAYES_00 or
18.09.2024 o 16:09, Matus UHLAR - fantomas pisze:
On 18.09.24 13:42, Grega via users wrote:
Right now in SA 4.0.1 bayes at least for me is really challenging to
train and set up.
I had good trained DB from past V3 install, and it behaved really odd.
I trained it on new set of mails 3000 spam and 3000
On 18.09.24 13:42, Grega via users wrote:
Right now in SA 4.0.1 bayes at least for me is really challenging to train and
set up.
I had good trained DB from past V3 install, and it behaved really odd.
I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it
was PAIN) and I
Right now in SA 4.0.1 bayes at least for me is really challenging to train and
set up.
I had good trained DB from past V3 install, and it behaved really odd.
I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it
was PAIN) and I cant get either BAYES_00 or BAYES_99 :)
I
>
>
> It is up to the user, ie you, what is and what is not spam.
>
Well, yes, and no.
Of course it's my own system and I can define these terms however I wish.
I'm also familiar with the need to investigate every message - perhaps I
should have made that clear initially.
It's only these few typ
Jared Hall via users skrev den 2024-09-17 08:15:
On 9/16/2024 8:48 PM, Alex wrote:
Hi,
Now that I'm using SA4, and my bayes database is quite old, I'd like
to retrain it with new ham and spam. I hoped someone had some pointers
on some of the gray area and what you consider to be sp
On 9/16/2024 8:48 PM, Alex wrote:
Hi,
Now that I'm using SA4, and my bayes database is quite old, I'd like
to retrain it with new ham and spam. I hoped someone had some pointers
on some of the gray area and what you consider to be spam and ham.
Are reliable newsletters, like
Hi,
Now that I'm using SA4, and my bayes database is quite old, I'd like to
retrain it with new ham and spam. I hoped someone had some pointers on some
of the gray area and what you consider to be spam and ham.
Are reliable newsletters, like those from, say, a trusted news source wher
On Fri, 13 Sep 2024, Bill Cole wrote:
Please send any replies to the list only.
...or to Harald only.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C
Grega via users skrev den 2024-09-13 16:16:
Sorry guys if I replied to all, my intentions were not to spam :)
top posters :)
imho not impossible to request 3rd party list archives to make a
password for users, never mind
eggs came before chickens :=)
Sorry guys if I replied to all, my intentions were not to spam :)
From: Benny Pedersen
Sent: Friday, 13 September 2024 15:13
To: users@spamassassin.apache.org
Subject: Re: Bayes in V4 compared to V3
Bill Cole skrev den 2024-09-13 15:03:
> Please send
On 2024-09-13 at 09:13:58 UTC-0400 (Fri, 13 Sep 2024 15:13:58 +0200)
Benny Pedersen
is rumored to have said:
Bill Cole skrev den 2024-09-13 15:03:
Please send any replies to the list only.
unsubscribe listarchivers ?
and make archived on apache.org with bugzilla login
don't know if it wil
On Friday 13 September 2024 at 15:13:58, Benny Pedersen wrote:
> Bill Cole skrev den 2024-09-13 15:03:
> > Please send any replies to the list only.
>
> unsubscribe listarchivers ?
> and make archived on apache.org with bugzilla login
> don't know if it will help or not, but chicken and egg
I do
Bill Cole skrev den 2024-09-13 15:03:
Please send any replies to the list only.
unsubscribe listarchivers ?
and make archived on apache.org with bugzilla login
don't know if it will help or not, but chicken and egg
9-13 at 05:00:17 UTC-0400 (Fri, 13 Sep 2024 09:00:17 +)
Grega
is rumored to have said:
Do you have V3 or V4 SA?
From: Reindl Harald (privat)
Sent: Friday, 13 September 2024 10:57
To: Grega; Bill Cole; Grega via users
Subject: Re: Bayes in V4 compared to V3
autolearn was always a blackbox
that below are the stats for the current month and that
This strategy worked really well in V3, and bayes was excellent even with
autotrain and occasional manual training.
Now it's indecisive and useless, at least for me.
We have around 5k-7k daily mails...
From: Reindl Harald (privat)
Sent: Friday, 13
Hi.
I just filtered in last week and I have
BAYES_20
BAYES_40
BAYES_50
BAYES_80
So no BAYES_00, _05, _90,_95 etc...
All the extreme values, which are the only ones useful for real scoring and
marking, are missing.
Today I`m going to train bayes manually with around 4000 SPAM and 4000 HAM
On 2024-09-12 at 14:05:11 UTC-0400 (Thu, 12 Sep 2024 18:05:11 +)
Grega via users
is rumored to have said:
Hi.
I have SA 4.0.1 configured it, all is good, except for bayes. It IS
working, it IS learning but when it classifies mail it is really not
so decisive as it was in V3.
I have
Hi.
I have SA 4.0.1 configured it, all is good, except for bayes. It IS working, it
IS learning but when it classifies mail it is really not so decisive as it was
in V3.
I have:
dbg: bayes: corpus size: nspam = 1190, nham = 12441
dbg: bayes: DB expiry: tokens in DB: 979401, Expiry max size
On 2024-01-31 at 08:16:13 UTC-0500 (Wed, 31 Jan 2024 14:16:13 +0100)
Matus UHLAR - fantomas
is rumored to have said:
On 2024-01-30 at 12:08:18 UTC-0500 (Tue, 30 Jan 2024 18:08:18 +0100)
Matus UHLAR - fantomas
is rumored to have said:
[...]
autolearn may help if your DB is well maintained, alt
On 2024-01-30 at 12:08:18 UTC-0500 (Tue, 30 Jan 2024 18:08:18 +0100)
Matus UHLAR - fantomas
is rumored to have said:
[...]
autolearn may help if your DB is well maintained, although I have
disabled nearly all rules with negative scores, like
RCVD_IN_DNSWL_*
RCVD_IN_IADB_* DKIMWL_WL_*
RCVD_IN_
On 30.01.24 09:59, joe a wrote:
Advisable to "prune" Bayes data based on age?
While cleaning up recent Ham/Spam, found my "saved SPAM" goes back
to 2013.
Why that's over . . . wait, I need to take off my socks . . .
So, how old is "too old". For saved SP
On 1/30/2024 10:58:52, Matus UHLAR - fantomas wrote:
On 30.01.24 09:59, joe a wrote:
Advisable to "prune" Bayes data based on age?
While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to
2013.
Why that's over . . . wait, I need to take off my socks .
On 2024-01-30 at 09:59:52 UTC-0500 (Tue, 30 Jan 2024 09:59:52 -0500)
joe a
is rumored to have said:
Advisable to "prune" Bayes data based on age?
Yes. That is why it has an expiration model. Expiration may be de facto
blocked on some busy systems so you may need to explicitl
On 30.01.24 09:59, joe a wrote:
Advisable to "prune" Bayes data based on age?
While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to
2013.
Why that's over . . . wait, I need to take off my socks . . .
So, how old is "too old". For saved S
Advisable to "prune" Bayes data based on age?
While cleaning up recent Ham/Spam, found my "saved SPAM" goes back to
2013.
Why that's over . . . wait, I need to take off my socks . . .
So, how old is "too old". For saved SPAM?
a...@paclan.it <mailto:giova...@paclan.it>>>
> wrote:
> > >
> > > To create the stopwords regexp I used the script I shared in
> a previous email and a list of words one per line.
> > > Could you share the list you are using ?
> >
ld you share the list you are using ?
>
> Giovanni
>
> On 12/29/23 09:22, Jimmy wrote:
> > I use SpamAssassin 4.0.0 (2022-12-14)
> >
> > $ spamassassin -D --lint 2>&1 | grep bayes:
> > Dec
ist you are using ?
> >
> > Giovanni
> >
> > On 12/29/23 09:22, Jimmy wrote:
> > > I use SpamAssassin 4.0.0 (2022-12-14)
> > >
> > > $ spamassassin -D --lint 2>&1 | grep bayes:
> > > Dec 29 15:17:5
Could you share the list you are using ?
Giovanni
On 12/29/23 09:22, Jimmy wrote:
> I use SpamAssassin 4.0.0 (2022-12-14)
>
> $ spamassassin -D --lint 2>&1 | grep bayes:
> Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
>
you share the list you are using ?
>
>Giovanni
>
> On 12/29/23 09:22, Jimmy wrote:
> > I use SpamAssassin 4.0.0 (2022-12-14)
> >
> > $ spamassassin -D --lint 2>&1 | grep bayes:
> > Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
> > D
To create the stopwords regexp I used the script I shared in a previous email
and a list of words one per line.
Could you share the list you are using ?
Giovanni
On 12/29/23 09:22, Jimmy wrote:
I use SpamAssassin 4.0.0 (2022-12-14)
$ spamassassin -D --lint 2>&1 | grep bayes:
Dec 2
I use SpamAssassin 4.0.0 (2022-12-14)
$ spamassassin -D --lint 2>&1 | grep bayes:
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=th
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=ru
Dec 29 15:17:56.919
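If the debug output is saved to a file, the Bayes lines can be pulled out with a few lines of Python. This is only a sketch: it assumes the "dbg: bayes: ..." layout shown in the lines above, and the function names are hypothetical.

```python
import re

# Captures whatever follows the "dbg: bayes:" marker on a debug line.
DEBUG_LINE = re.compile(r"dbg: bayes: (?P<message>.*)$")
STOPWORD_LANG = re.compile(r"stopword found lang=(\w+)")


def bayes_debug_messages(log_text):
    """Return the message part of every 'dbg: bayes:' line in the saved debug log."""
    messages = []
    for line in log_text.splitlines():
        match = DEBUG_LINE.search(line)
        if match:
            messages.append(match.group("message"))
    return messages


def stopword_languages(log_text):
    """List the languages for which a stopword list was reported as found."""
    languages = []
    for message in bayes_debug_messages(log_text):
        match = STOPWORD_LANG.search(message)
        if match:
            languages.append(match.group(1))
    return languages


SAMPLE_LOG = """\
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=en
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=th
Dec 29 15:17:56.919 [17420] dbg: bayes: stopword found lang=ru
"""

print(stopword_languages(SAMPLE_LOG))
```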
xt and
it produces a working regexp.
Bayes stopwords languages must also be enabled using "bayes_stopword_languages"
config keyword, by default only english is enabled.
Giovanni
On 12/28/23 17:06, Jimmy wrote:
bayes_stopword_th https://pastebin.pl/view/0838138d
<https://pastebin.p
one that, and I am also editing Plugin/Bayes.pm to
> investigate why it is not being skipped. I suspect that if words are not
> separated by spaces, longer words may not match those patterns.
> >
> > Jimmy
> >
> > On Thu, Dec 28, 2023 at 10:13 PM giova...@paclan.it>>
patterns.
Jimmy
On Thu, Dec 28, 2023 at 10:13 PM giova...@paclan.it wrote:
"spamassassin -D bayes" will tell you, you should see a line like:
bayes: skipped token 'from' because it's in stopword list for language 'en'
Gio
Yes, I have done that, and I am also editing Plugin/Bayes.pm to investigate
why it is not being skipped. I suspect that if words are not separated by
spaces, longer words may not match those patterns.
Jimmy
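The suspicion above can be reproduced outside SpamAssassin. The sketch below assumes the stopword pattern is applied to whole tokens, which is an assumption about the matching and not a statement of what Plugin/Bayes.pm does. The two Thai words and the helper names are illustrative only.

```python
import re


def build_stopword_pattern(words):
    """Build one alternation from a word list, longest alternatives first."""
    ordered = sorted(set(words), key=lambda word: (-len(word), word))
    return re.compile("(?:" + "|".join(re.escape(word) for word in ordered) + ")")


def is_stopword(token, pattern):
    """True only when the whole token is a stopword (assumed matching mode)."""
    return pattern.fullmatch(token) is not None


# Illustrative list: Thai for "and" and "that".
pattern = build_stopword_pattern(["และ", "ที่"])

spaced_token = "และ"          # a token that is exactly one stopword
unspaced_token = "ฉันและเธอ"  # "I and you" written without spaces

print(is_stopword(spaced_token, pattern))            # whole-token match
print(is_stopword(unspaced_token, pattern))          # no whole-token match
print(pattern.search(unspaced_token) is not None)    # the stopword is still inside
```

If the tokenizer hands over the unspaced run as a single token, a whole-token match never fires even though the stopword is present, which would be consistent with the behaviour described above.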
On Thu, Dec 28, 2023 at 10:13 PM wrote:
> "spamassassin -D bayes" will
"spamassassin -D bayes" will tell you, you should see a line like:
bayes: skipped token 'from' because it's in stopword list for language 'en'
Giovanni
On 12/28/23 15:45, Jimmy wrote:
The pattern has successfully passed the test script, but it needs to
The pattern has successfully passed the test script, but it needs to check
whether Bayes learning will identify and possibly exclude the word from
matching this pattern.
Thank you.
On Thu, Dec 28, 2023 at 9:22 PM wrote:
> On 12/28/23 12:59, Jimmy wrote:
> > Hi,
> >
> > I
.
I have used Regexp::Trie to create Bayes stopwords in the past, code is similar
to:
---
use strict;
use warnings;
use Encode;
use Regexp::Trie;
my @input = ;
my $rt = Regexp::Trie->
Hi,
I'm seeking assistance in incorporating a stopword for Asian languages in
Unicode. Although I possess comprehensive word lists, my attempts to
generate a regex pattern and test it have been unsuccessful; the pattern
fails to match or skips tokens in the newly added stopword list.
I created th
u do something to strip off all of the email headers?
For the BAYES_99, as already mentioned you probably need to retrain
bayes, making sure to correct any incorrectly trained email messages.
-jeff
On 2023-12-13 at 01:49:24 UTC-0500 (Wed, 13 Dec 2023 07:49:24 +0100)
Pierluigi Frullani
is rumored to have said:
Hello all,
I'm facing a strange problem.
Not really. MANY people run into this issue...
I've fed the bayes db for a while and now I would like to put it in
us
Hello all,
I'm facing a strange problem.
I've fed the bayes db for a while and now I would like to put it in use,
but all messages get BAYES_99 and a very high spam score.
I would like to understand why, and troubleshoot this problem but I can't
find a way.
Spamassassin versio
On Sun, Jul 09, 2023 at 07:06:10PM +0200, Robert Senger wrote:
> I've set up a testing environment that also uses master-master
> replication of the mysql bayes database, with priority in dns set to
> equal for both mx to get incoming mail distributed evenly to both
> systems. S
Am Sonntag, dem 09.07.2023 um 19:21 +0200 schrieb Reindl Harald:
>
>
> Am 09.07.23 um 19:06 schrieb Robert Senger:
> > But bayes data may be updated by either the primary mx or the
> > backup
> > mx, since email may arrive at either server.
>
> in a smart setup
Hi there,
I am running two mailservers, first one serving two domains, other one
serving one domain.
Both serve as backup mx for each other. Both know about users and
aliases of the other domain(s).
On both systems, spamassassin is configured to read/store userprefs and
bayes data (per user) in