Re: Can't figure out how to "aggregate" the spam training for aliased users

2025-04-04 Thread Matus UHLAR - fantomas
, and said "editor" address is valid not only for denninger.net, but also for a couple of other domains that I run a web property for on behalf of someone else. If someone spams that "editor" user Spamassassin will use its built-in rules -- but it does /not/ honor the Bayesian cla

Re: Can't figure out how to "aggregate" the spam training for aliased users

2025-03-15 Thread Karl Denninger
, but is in the aliases file to this account, and said "editor" address is valid not only for denninger.net, but also for a couple of other domains that I run a web property for on behalf of someone else. If someone spams that "editor" user Spamassassin will use its built-in r

Re: Can't figure out how to "aggregate" the spam training for aliased users

2025-03-15 Thread Matus UHLAR - fantomas
, and said "editor" address is valid not only for denninger.net, but also for a couple of other domains that I run a web property for on behalf of someone else. If someone spams that "editor" user Spamassassin will use its built-in rules -- but it does /not/ honor the Bayesian cla

Can't figure out how to "aggregate" the spam training for aliased users

2025-03-13 Thread Karl Denninger
not only for denninger.net, but also for a couple of other domains that I run a web property for on behalf of someone else. If someone spams that "editor" user Spamassassin will use its built-in rules -- but it does /not/ honor the Bayesian classifier training that my account ("

Re: training bayes and newsletters

2024-10-16 Thread Greg Troxel
I think you are missing that a particular newsletter is not intrinsically ham or spam. It is ham if the user has subscribed, and spam if they have not affirmatively subscribed. I have seen the very same content arrive at my mailserver for 2 users. For one it is ham and the other it is spam. Ther

Re: training bayes and newsletters

2024-10-16 Thread Bill Cole
't there be enough? Absolutely. But the Bayes classifier can't classify mail of types that have been meticulously excluded from its training corpus. Would I benefit from training known trustworthy newsletters such as ham? Yes. And train the spam ones as spam. -- Bill Cole b...@scconsu

RE: training bayes and newsletters

2024-10-16 Thread Marc
> I can imagine the newsletter template is somewhat common, but does bayes > have any ability to distinguish a junk newsletter from a legitimate > newsletter? How can bayes, if you also can't? My advice would be to mark e.g. everything from mailchimp and then whitelist what you indeed agreed to re

Re: training bayes and newsletters

2024-10-15 Thread Axb
On 10/16/24 04:05, Alex wrote: Would I benefit from training known trustworthy newsletters such as ham? Yes, you would.

training bayes and newsletters

2024-10-15 Thread Alex
ven't trained them. I can imagine the newsletter template is somewhat common, but does bayes have any ability to distinguish a junk newsletter from a legitimate newsletter? I realize there's somewhat of an imbalance between hams and spams, but shouldn't there be enough? Would I b

Re: SPAM-DETECTOR Re: Tips on training bayes?

2024-09-19 Thread natan
On 18.09.2024 at 16:29, Matus UHLAR - fantomas wrote: On 18.09.24 16:19, natan wrote: I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ amavis should clean this itself. which amavis version do you have installed? did you tune it anyhow? amavisd-

Re: Tips on training bayes?

2024-09-19 Thread Bill Cole
On 2024-09-17 at 16:29:52 UTC-0400 (Tue, 17 Sep 2024 16:29:52 -0400) Alex is rumored to have said: It is up to the user, ie you, what is and what is not spam. Well, yes, and no. Of course it's my own system and I can define these terms however I wish. I'm also familiar with the need to i

Re: Tips on training bayes?

2024-09-18 Thread Greg Troxel
Alex writes: > It's only these few types of messages that are very subjective and > experience from the broader open source community would be appreciated. > > If it has a legitimate unsubscribe link, does that make it ham? > > What criteria do you use to determine "spamminess/haminess of EVERY >

Re: Tips on training bayes?

2024-09-18 Thread Benny Pedersen
Jared Hall via users wrote on 2024-09-18 20:08: On Deb-based distros, you can add this in /etc/amavis/conf.d/50-user under the $max_servers parameter. also remember it's safe to use tmpfs for tmp dir in amavisd no joke

Re: Tips on training bayes?

2024-09-18 Thread Jared Hall via users
On 9/18/2024 10:19 AM, natan wrote: Hi I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ With SA 3.4.X - on average 100MB and it deletes on the fly With SA 4.X - on average 2-6GB and I had to do a quick fix: 59 23 * * * root find /var/lib/amavis/tmp/

Re: Tips on training bayes?

2024-09-18 Thread Benny Pedersen
natan wrote on 2024-09-18 16:36: On 18.09.2024 at 16:30, Reindl Harald (privat) wrote: who reply here ? :) don't blame SA when a blind man can see that your problem is on the Amavis side - why do one need Amavis to begin with when there is SA and spamass-milter yes yes everyone know

Re: Tips on training bayes?

2024-09-18 Thread natan
On 18.09.2024 at 16:30, Reindl Harald (privat) wrote: On 18.09.24 at 16:19, natan wrote: Hi I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ With SA 3.4.X - on average 100MB and it deletes on the fly With SA 4.X - on average 2-6GB and I had t

Re: Tips on training bayes?

2024-09-18 Thread Matus UHLAR - fantomas
On 18.09.24 16:19, natan wrote: I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ amavis should clean this itself. which amavis version do you have installed? did you tune it anyhow? Did you enable and configure extracttext plugin? Because that one m

Re: Tips on training bayes?

2024-09-18 Thread natan
Hi I was very disappointed with spamassassin 4.x because it started to grow /var/lib/amavis/tmp/ With SA 3.4.X - on average 100MB and it deletes on the fly With SA 4.X - on average 2-6GB and I had to do a quick fix: 59 23 * * * root find /var/lib/amavis/tmp/ -mtime +0 -delete; On 18.09.202

Re: Tips on training bayes?

2024-09-18 Thread Matus UHLAR - fantomas
On 18.09.24 13:42, Grega via users wrote: Right now in SA 4.0.1 bayes at least for me is really challenging to train and set up. I had good trained DB from past V3 install, and it behaved really odd. I trained it on new set of mails 3000 spam and 3000 ham (HAND PICKED mail it was PAIN) and I

Re: Tips on training bayes?

2024-09-18 Thread Grega via users
on training bayes? It is up to the user, ie you, what is and what is not spam. Well, yes, and no. Of course it's my own system and I can define these terms however I wish. I'm also familiar with the need to investigate every message - perhaps I should have made that clear initially.

Re: Tips on training bayes?

2024-09-17 Thread Alex
> > > It is up to the user, ie you, what is and what is not spam. > Well, yes, and no. Of course it's my own system and I can define these terms however I wish. I'm also familiar with the need to investigate every message - perhaps I should have made that clear initially. It's only these few typ

Re: Tips on training bayes?

2024-09-17 Thread Benny Pedersen
Jared Hall via users wrote on 2024-09-17 08:15: On 9/16/2024 8:48 PM, Alex wrote: Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be spam and ham.

Re: Tips on training bayes?

2024-09-16 Thread Jared Hall via users
On 9/16/2024 8:48 PM, Alex wrote: Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be spam and ham. Are reliable newsletters, like those from, sa

Tips on training bayes?

2024-09-16 Thread Alex
Hi, Now that I'm using SA4, and my bayes database is quite old, I'd like to retrain it with new ham and spam. I hoped someone had some pointers on some of the gray area and what you consider to be spam and ham. Are reliable newsletters, like those from, say, a trusted news source where the user op

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Kris Deugau
RW wrote: On Tue, 09 Mar 2021 08:52:28 -0500 Steve Dondley wrote: I will also be allowing users to flag their own spam using the roundcube webmail client. If you do that you should review the submissions. This. SO much this. ALL THE THIS. If you're using the "Mark as Junk" or "Mark as Jun

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread RW
a specific user? I was really thinking more of an individual running SA for their own mail. It would be unusual for an admin to keep a full archive of trained mail for each account. Per user Bayes can be more accurate, but only if users take the training seriously. > I will also be allowing

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Bill Cole
On 9 Mar 2021, at 7:49, Steve Dondley wrote: I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." Did you read the section on expiration? https://spamassassi

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley
On 2021-03-09 08:28 AM, Greg Troxel wrote: Steve Dondley writes: I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." I would take that with a grain of salt.

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread RW
On Tue, 09 Mar 2021 07:49:38 -0500 Steve Dondley wrote: > I've read through > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which > states that "anything over about 5000 messages does not improve > accuracy significantly in our tests." > > So once I hit 5,000, what do? Do I run -

Re: Training spamassassin past 5,000 emails

2021-03-09 Thread Greg Troxel
Steve Dondley writes: > I've read through > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which > states that "anything over about 5000 messages does not improve > accuracy significantly in our tests." I would take that with a grain of salt. Based on my experience running SA fo

Training spamassassin past 5,000 emails

2021-03-09 Thread Steve Dondley
I've read through https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html which states that "anything over about 5000 messages does not improve accuracy significantly in our tests." So once I hit 5,000, what do? Do I run --forget on say the 500 oldest emails, delete those from my ham/spa

Re: training bayes database

2018-05-16 Thread Alex Woick
David B Funk wrote on 10.05.2018 at 20:23: On Thu, 10 May 2018, John Hardin wrote: On Thu, 10 May 2018, Matthew Broadhead wrote: On 09/05/18 20:43, David Jones wrote: On 05/09/2018 01:29 PM, Matthew Broadhead wrote: On 09/05/18 16:37, Reindl Harald wrote: quoting URIBL_BLOCKED is a joke

Re: training bayes database

2018-05-10 Thread David B Funk
On Thu, 10 May 2018, John Hardin wrote: On Thu, 10 May 2018, Matthew Broadhead wrote: On 09/05/18 20:43, David Jones wrote: On 05/09/2018 01:29 PM, Matthew Broadhead wrote: On 09/05/18 16:37, Reindl Harald wrote: quoting URIBL_BLOCKED is a joke - setup a *recursion* *non-forwarding* namese

Re: training bayes database

2018-05-10 Thread John Hardin
On Thu, 10 May 2018, Matthew Broadhead wrote: On 09/05/18 20:43, David Jones wrote: On 05/09/2018 01:29 PM, Matthew Broadhead wrote: On 09/05/18 16:37, Reindl Harald wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: it looks like it is working. so maybe it is just not flagging or mo

Re: training bayes database

2018-05-10 Thread Reio Remma
On 10.05.18 15:23, David Jones wrote: On 05/10/2018 07:12 AM, Reio Remma wrote: On 10.05.18 15:08, David Jones wrote: On 05/10/2018 07:02 AM, Reio Remma wrote: On a slightly related note. We're running a PFSense firewall with DNS Forwarder (dnsmasq) in front of our mail server. From what I've

Re: training bayes database

2018-05-10 Thread David Jones
On 05/10/2018 07:12 AM, Reio Remma wrote: On 10.05.18 15:08, David Jones wrote: On 05/10/2018 07:02 AM, Reio Remma wrote: On 10.05.18 14:58, Matus UHLAR - fantomas wrote: Am 09.05.2018 um 16:28 schrieb Matthew Broadhead: i guess my dns is set to use my isp's dns server.  do i need to set up d

Re: training bayes database

2018-05-10 Thread Matus UHLAR - fantomas
On 09.05.2018 at 16:28, Matthew Broadhead wrote: i guess my dns is set to use my isp's dns server. do i need to set up dns relay on my machine so it comes from my ip? there is no way we send more than 500k emails from our domain so i should qualify for the free lookup? On 09/05/18 20:43,

Re: training bayes database

2018-05-10 Thread Reio Remma
On 10.05.18 15:08, David Jones wrote: On 05/10/2018 07:02 AM, Reio Remma wrote: On 10.05.18 14:58, Matus UHLAR - fantomas wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: i guess my dns is set to use my isp's dns server. do i need to set up dns relay on my machine so it comes from my

Re: training bayes database

2018-05-10 Thread David Jones
On 05/10/2018 07:02 AM, Reio Remma wrote: On 10.05.18 14:58, Matus UHLAR - fantomas wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: i guess my dns is set to use my isp's dns server. do i need to set up dns relay on my machine so it comes from my ip? there is no way we send more than

Re: training bayes database

2018-05-10 Thread Reio Remma
On 10.05.18 14:58, Matus UHLAR - fantomas wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: i guess my dns is set to use my isp's dns server. do i need to set up dns relay on my machine so it comes from my ip? there is no way we send more than 500k emails from our domain so i should q

Re: training bayes database

2018-05-10 Thread Matus UHLAR - fantomas
On 09.05.2018 at 16:28, Matthew Broadhead wrote: i guess my dns is set to use my isp's dns server. do i need to set up dns relay on my machine so it comes from my ip? there is no way we send more than 500k emails from our domain so i should qualify for the free lookup? On 09/05/18 20:43,

Re: training bayes database

2018-05-10 Thread Matthew Broadhead
On 09/05/18 20:43, David Jones wrote: On 05/09/2018 01:29 PM, Matthew Broadhead wrote: On 09/05/18 16:37, Reindl Harald wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: it looks like it is working. so maybe it is just not flagging or moving the spam? in a different post you showed t

Re: training bayes database

2018-05-09 Thread David Jones
On 05/09/2018 01:29 PM, Matthew Broadhead wrote: On 09/05/18 16:37, Reindl Harald wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: it looks like it is working. so maybe it is just not flagging or moving the spam? in a different post you showed this status header which *clearly* shows

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
On 09/05/18 16:37, Reindl Harald wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: it looks like it is working. so maybe it is just not flagging or moving the spam? in a different post you showed this status header which *clearly* shows bayes is working - bayes alone don't flag, the tot

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
On 09/05/18 16:37, Reindl Harald wrote: On 09.05.2018 at 16:28, Matthew Broadhead wrote: it looks like it is working. so maybe it is just not flagging or moving the spam? in a different post you showed this status header which *clearly* shows bayes is working - bayes alone don't flag, the tot

Re: training bayes database

2018-05-09 Thread John Hardin
On Wed, 9 May 2018, Reio Remma wrote: On 9 May 2018, at 18:33, John Hardin wrote: Also: On Wed, 9 May 2018, Matthew Broadhead wrote: your message has X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2 Setting the threshold higher will result in more spam getting through. The

Re: training bayes database

2018-05-09 Thread Reio Remma
> On 9 May 2018, at 18:33, John Hardin wrote: > > Also: > >> On Wed, 9 May 2018, Matthew Broadhead wrote: >> >> your message has >> >> X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2 > > Setting the threshold higher will result in more spam getting through. The > scores calc

Re: training bayes database

2018-05-09 Thread John Hardin
Also: On Wed, 9 May 2018, Matthew Broadhead wrote: your message has X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2 Setting the threshold higher will result in more spam getting through. The scores calculated by the masscheck processes are based on the assumption that the th

Re: training bayes database

2018-05-09 Thread John Hardin
G!!!) DNS server for the MTA's use so you avoid the URIBL_BLOCKED issue. That will help quite a lot.     autolearn=ham autolearn_force=no (4) around 50 users.  they are all working in same industry OK, that's small enough that manual training should not be an issue. Speculation: A

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
On 09/05/18 16:03, Reio Remma wrote: On 09.05.18 16:59, Matthew Broadhead wrote: setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't seem to make any difference. should i be doing it in /etc/mail/spamassassin/local.cf? See if $sa_debug=1 works (for full debug)? (and restart ama

Re: training bayes database

2018-05-09 Thread Reio Remma
On 09.05.18 16:59, Matthew Broadhead wrote: setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't seem to make any difference. should i be doing it in /etc/mail/spamassassin/local.cf? See if $sa_debug=1 works (for full debug)? (and restart amavisd). Reio ok now i am getting a lot

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
   DBI:mysql:sa_bayes:localhost:3306 it is storing the info to the database ok.  but it doesn't seem to be filtering any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Ar

Re: training bayes database

2018-05-09 Thread Reio Remma
is storing the info to the database ok. but it doesn't seem to be filtering any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hi

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
it doesn't seem to be filtering any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham or spam? Y

Re: training bayes database

2018-05-09 Thread Matthew Broadhead
On 09/05/18 09:09, Reio Remma wrote: On 09.05.18 9:57, Matthew Broadhead wrote: BAYES_00=-1.9 I've personally set *bayes_sql_override_username = amavis* in my local.cf If at all possible, run amavisd with SA bayes debug to see if/how it's using the database. Good luck, Reio Thanks Reio

Re: training bayes database

2018-05-09 Thread Reio Remma
On 09.05.18 9:57, Matthew Broadhead wrote: BAYES_00=-1.9 I've personally set *bayes_sql_override_username = amavis* in my local.cf If at all possible, run amavisd with SA bayes debug to see if/how it's using the database. Good luck, Reio

Re: training bayes database

2018-05-08 Thread Matthew Broadhead
't seem to be filtering any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham or spam? (4) Ho

Re: training bayes database

2018-05-08 Thread John Hardin
.  but it doesn't seem to be filtering any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham or spam? Y
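
For question (1) above, a rough sketch of reading the trained message counts out of `sa-learn --dump magic` output. The sample text is made up for illustration (the counts are borrowed from another post in this listing), the column layout is an assumption about what sa-learn usually prints, and the threshold of 200 reflects the documented defaults of `bayes_min_spam_num` and `bayes_min_ham_num`; check your own version's output before relying on it.

```python
import re

# Made-up sample in the layout sa-learn --dump magic usually prints
# (assumption: the value sits in the third column of each "non-token data" line).
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       5279          0  non-token data: nspam
0.000          0        849          0  non-token data: nham
"""

MAGIC_LINE = re.compile(r"^\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+)$")


def parse_magic(text):
    """Map each 'non-token data' label to its integer value."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2).strip()] = int(match.group(1))
    return values


def bayes_has_enough_training(values, minimum=200):
    """True when both spam and ham counts reach the minimum."""
    return values.get("nspam", 0) >= minimum and values.get("nham", 0) >= minimum


if __name__ == "__main__":
    counts = parse_magic(SAMPLE)
    print(counts)
    print("enough training:", bayes_has_enough_training(counts))
```

If either count is below the minimum, Bayes will not score messages at all, which would look like "not filtering any mail" even though the database is being written.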

Re: training bayes database

2018-05-08 Thread Reio Remma
g any mail. (1) What is the output of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham or spam? You'll probably need to look at yo

Re: training bayes database

2018-05-08 Thread John Hardin
ut of: /usr/bin/sa-learn --dump magic (2) What user are you running sa-learn as for training, and what user is spamd running as? (3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham or spam? (4) How large is your environment (rough # and diversity of users)? I'm no

training bayes database

2018-05-08 Thread Matthew Broadhead
system setup centos-release-7-4.1708.el7.centos.x86_64, spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch /etc/mail/spamassassin/local.cf: required_hits 5 report_safe 0 rewrite_header Subject [SPAM] use_bayes  1 bayes_auto_learn   1 bayes_auto_expire  1 # Store bayesian

Re: txrep training performance

2017-08-01 Thread Jesse Norell
For anyone interested, I largely resolved the performance issues with sa-learn training when using txrep with a little mysql server tuning. As a reference point, training with ~6400 messages (most of which had already been learned) took about 14 minutes for both txrep+bayes, and about 3.5 minutes

Re: txrep training performance

2017-07-12 Thread Jesse Norell
One thing pointing to maybe a need for reworking the training logic is that I have txrep_track_messages at the default (1), and almost every message in my corpus has already been trained; each run brings in only a handful of new messages (usually 10-20, but often 0, and always < 100). It s

txrep training performance

2017-07-12 Thread Jesse Norell
Hello, I have txrep data in a mysql database, and am working on a training script to run sa-learn; with bayes also in MySQL and a corpus size of 5279 nspam and 849 nham, sa-learn takes a full 2 hours to run with txrep enabled (use_txrep 1), but only 13 minutes with txrep disabled (use_txrep 0

Re: training the filter

2016-11-08 Thread RW
On Mon, 7 Nov 2016 09:11:15 -0800 Daniel Ullfig wrote: > Hello: > > I’ve installed spamassassin to work with hMailServer on a windows > server. would like advice on training the filter, as I get a lot of > false positives. Would like to be able to forward ham to some

Re: training the filter

2016-11-07 Thread John Hardin
nt to divide your users into two broad groups: those whose judgement and responsibility you trust and who are allowed to train without review, and the rest, where you review the messages for valid classification before training. So that would be *four* folders: two public folders exposed to your

Re: training the filter

2016-11-07 Thread Sean Greenslade
On November 7, 2016 9:26:29 AM PST, Eric Abrahamsen wrote: >What a lot of people (including myself) do is have two IMAP folders >learn/spam and learn/ham. When a message is incorrectly classified you >put it in the right folder, then run sa-learn on a cron job, looking in >the appropriate folder,
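
The two-folder approach quoted above could be sketched roughly as follows. The Maildir paths are hypothetical and depend entirely on how the IMAP server stores folders; `sa-learn --spam` and `sa-learn --ham` with a directory argument are the only real commands used. The script only prints the commands, so a cron job would have to execute them instead.

```python
from pathlib import PurePosixPath
import shlex

# Hypothetical on-disk locations of the learn/spam and learn/ham IMAP folders.
LEARN_FOLDERS = {
    "spam": PurePosixPath("/home/user/Maildir/.learn.spam/cur"),
    "ham": PurePosixPath("/home/user/Maildir/.learn.ham/cur"),
}


def build_sa_learn_commands(folders):
    """Return one sa-learn argument list per folder, spam first, then ham."""
    commands = []
    for kind in ("spam", "ham"):
        commands.append(["sa-learn", f"--{kind}", str(folders[kind])])
    return commands


if __name__ == "__main__":
    for argv in build_sa_learn_commands(LEARN_FOLDERS):
        # Print rather than execute, so the sketch is safe to run anywhere.
        print(shlex.join(argv))
```

Run it as the same user that scans the mail, otherwise the training may land in a different Bayes database than the one the filter reads.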

Re: training the filter

2016-11-07 Thread Eric Abrahamsen
"Daniel Ullfig" writes: > Hello: > > I’ve installed spamassassin to work with hMailServer on a windows > server. would like advice on training the filter, as I get a lot of > false positives. Would like to be able to forward ham to something > like “h...@mydomain.

training the filter

2016-11-07 Thread Daniel Ullfig
Hello: I’ve installed spamassassin to work with hMailServer on a windows server. would like advice on training the filter, as I get a lot of false positives. Would like to be able to forward ham to something like “h...@mydomain.com”, and false negatives to “s...@mydomain.com”. Can this be done

Re: Training Bayes with BAYES_999 Mail

2015-10-02 Thread Reindl Harald
On 02.10.2015 at 19:15, Andrew Davidson wrote: I'm not an expert on the mechanics of Bayes so I'm wondering how valuable it is to continue training with collected spam that is properly tagged with BAYES_999. Does that help to reinforce the logic or is it overly focusing the d

Re: Training Bayes with BAYES_999 Mail

2015-10-02 Thread Matus UHLAR - fantomas
On 02.10.15 13:15, Andrew Davidson wrote: I'm not an expert on the mechanics of Bayes so I'm wondering how valuable it is to continue training with collected spam that is properly tagged with BAYES_999. Does that help to reinforce the logic or is it overly focusing the database on ema

Training Bayes with BAYES_999 Mail

2015-10-02 Thread Andrew Davidson
I'm not an expert on the mechanics of Bayes so I'm wondering how valuable it is to continue training with collected spam that is properly tagged with BAYES_999. Does that help to reinforce the logic or is it overly focusing the database on emails it can already detect? Should I only b

Re: Re-training

2015-04-16 Thread RW
On Thu, 16 Apr 2015 12:18:21 -0400 Roman Gelfand wrote: > Does sa-learn need read write access to emails or read only will do? Just read access. > In case of false negative, should I use --forget option to retrain? There's no need for that, it will work out what to do for itself.

Re: Re-training

2015-04-16 Thread Roman Gelfand
Does sa-learn need read write access to emails or read only will do? In case of false negative, should I use --forget option to retrain? On Tue, Apr 14, 2015 at 10:48 AM, Axb wrote: > On 04/14/2015 04:44 PM, Roman Gelfand wrote: > >> I received an email which is based on score ham. I would lik

Re: Re-training

2015-04-14 Thread Axb
On 04/14/2015 04:44 PM, Roman Gelfand wrote: I received an email which is based on score ham. I would like to train the bayes db to consider this email as spam. Is it possible to retrain bayes db for just that email without having that email available by providing something like mail id. you

Re-training

2015-04-14 Thread Roman Gelfand
I received an email which is based on score ham. I would like to train the bayes db to consider this email as spam. Is it possible to retrain bayes db for just that email without having that email available by providing something like mail id. Thanks in advance

Re: Quick question about training...

2015-02-22 Thread RW
On Mon, 23 Feb 2015 00:22:31 +0100 Reindl Harald wrote: > >> in doubt the amount of trained ham and spam should be near 50%, > > > > This is myth. What's important is to have enough of each, the actual > > ratio is not important. > > true - but you don't have much to measure the "enough of each"

Re: Quick question about training...

2015-02-22 Thread Reindl Harald
On 23.02.2015 at 00:11, RW wrote: On Fri, 20 Feb 2015 21:36:38 +0100 Reindl Harald wrote: And I'd suggest the same for non-spam, train duplicative ham even if it happens to be similarly addressed to different users. More data is (nearly) always better for bayesian learning systems of course

Re: Quick question about training...

2015-02-22 Thread RW
On Fri, 20 Feb 2015 21:36:38 +0100 Reindl Harald wrote: > > > And I'd suggest the same for non-spam, train duplicative ham even > > if it happens to be similarly addressed to different users. More > > data is (nearly) always better for bayesian learning systems > > of course With the caveat th

Re: Quick question about training...

2015-02-20 Thread Patrick Domack
Quoting Kevin Miller : When a fresh spam flood comes in, sometimes 50 or more of my users will get hit with the same message - just a different user in the To: line. When one trains the bayes database, is there a significant difference between training on all 50+ or just grabbing a few

RE: Quick question about training...

2015-02-20 Thread Kevin Miller
lto:da...@hireahit.com] > Sent: Friday, February 20, 2015 11:30 AM > To: users@spamassassin.apache.org > Subject: Re: Quick question about training... > > On 2015-02-20 09:44, Bowie Bailey wrote: > > On 2/20/2015 12:35 PM, Kevin Miller wrote: > >> When a fresh spam flood comes

Re: Quick question about training...

2015-02-20 Thread Reindl Harald
bayes database, is there a significant difference between training on all 50+ or just grabbing a few of the messages and training on them? Will bayes be more convinced of the spaminess of a particular message if it sees dozens rather than a couple? Yes, there will be a difference. Training the

Re: Quick question about training...

2015-02-20 Thread Dave Warren
difference between training on all 50+ or just grabbing a few of the messages and training on them? Will bayes be more convinced of the spaminess of a particular message if it sees dozens rather than a couple? Yes, there will be a difference. Training the exact same message multiple times will not

Re: Quick question about training...

2015-02-20 Thread Bowie Bailey
On 2/20/2015 12:35 PM, Kevin Miller wrote: When a fresh spam flood comes in, sometimes 50 or more of my users will get hit with the same message - just a different user in the To: line. When one trains the bayes database, is there a significant difference between training on all 50+ or just

Re: Quick question about training...

2015-02-20 Thread Reindl Harald
On 20.02.2015 at 18:35, Kevin Miller wrote: When a fresh spam flood comes in, sometimes 50 or more of my users will get hit with the same message - just a different user in the To: line. When one trains the bayes database, is there a significant difference between training on all 50+ or

Quick question about training...

2015-02-20 Thread Kevin Miller
When a fresh spam flood comes in, sometimes 50 or more of my users will get hit with the same message - just a different user in the To: line. When one trains the bayes database, is there a significant difference between training on all 50+ or just grabbing a few of the messages and training

Re: Training new spamass-milter setup

2015-02-18 Thread @lbutlr
On 18 Feb 2015, at 03:50 , Reindl Harald wrote: > i would find it perverse using /var/spool for the userhome and bayes-database I did not set the home for the spamd user, it was done in the install process. And yes, I found the user or /var/spool/spamd odd as well. --

Re: Training new spamass-milter setup

2015-02-18 Thread Reindl Harald
On 18.02.2015 at 11:41, @lbutlr wrote: On 18 Feb 2015, at 02:06 , Reindl Harald wrote: On 18.02.2015 at 05:50, @lbutlr wrote: On 17 Feb 2015, at 15:46 , Reindl Harald wrote: because in a default milter-setup the one and only user is the user which SA and the milter service are running as

Re: Training new spamass-milter setup

2015-02-18 Thread @lbutlr
> On 18 Feb 2015, at 02:06 , Reindl Harald wrote: > > > On 18.02.2015 at 05:50, @lbutlr wrote: >> On 17 Feb 2015, at 15:46 , Reindl Harald wrote: >>> because in a default milter-setup the one and only user is the user which >>> SA and the milter service are running as, hence my script which n

Re: Training new spamass-milter setup

2015-02-18 Thread Reindl Harald
On 18.02.2015 at 05:50, @lbutlr wrote: On 17 Feb 2015, at 15:46 , Reindl Harald wrote: because in a default milter-setup the one and only user is the user which SA and the milter service are running as, hence my script which needs maybe small adjustments for your environment (--no-sync and s

Re: Training new spamass-milter setup

2015-02-17 Thread @lbutlr
On 17 Feb 2015, at 15:46 , Reindl Harald wrote: > because in a default milter-setup the one and only user is the user which SA > and the milter service are running as, hence my script which needs maybe small > adjustments for your environment (--no-sync and so on depend on the config, > director

Re: Training new spamass-milter setup

2015-02-17 Thread Reindl Harald
spamassassin spamassassin has existing user-specific training already in place. Spamass-milter isn’t using the user DBs. in addition to my previous mail, some technical facts about how the milter works: * postfix connects to the milter * the milter connects to spamd via TCP * spamd fires up if not present

Re: Training new spamass-milter setup

2015-02-17 Thread Reindl Harald
spamassassin spamassassin has existing user-specific training already in place. Spamass-milter isn’t using the user DBs. because in a default milter-setup the one and only user is the user which SA and the milter service are running as, hence my script which needs maybe small adjustments for your

Re: Training new spamass-milter setup

2015-02-17 Thread LuKreme
sin has existing user-specific training already in place. Spamass-milter isn’t using the user DBs. -- Don't just *do* something: *sit* there!

Re: Training new spamass-milter setup

2015-02-17 Thread Matus UHLAR - fantomas
On 17.02.15 08:13, LuKreme wrote: OK, so I have spamass-milter running, but I need to train it. What is the proper way to do this? if you use "-u" parameter (maybe with "-x"), you should train it as the user who receives the mail -- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fanto

Re: Training new spamass-milter setup

2015-02-17 Thread Robert Schetterer
On 17.02.2015 at 16:13, LuKreme wrote: > OK, so I have spamass-milter running, but I need to train it. What is the > proper way to do this? > you don't train spamass-milter, you should train spamassassin http://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html Best Regards MfG Robert S

Re: Training new spamass-milter setup

2015-02-17 Thread Reindl Harald
On 17.02.2015 at 16:13, LuKreme wrote: OK, so I have spamass-milter running, but I need to train it. What is the proper way to do this? cat /var/lib/spamass-milter/training/learn.sh #!/usr/bin/bash # Home directory and name of the milter user SA_MILTER_HOME="/var/lib/spamass-m

Training new spamass-milter setup

2015-02-17 Thread LuKreme
OK, so I have spamass-milter running, but I need to train it. What is the proper way to do this? -- What beep from yonder speaker sounds?

Re: after months of training still most messages treated as SPAM

2015-01-26 Thread Reindl Harald
On 26.01.2015 at 17:17, Benny Pedersen wrote: On 26. jan. 2015 16.57.09 John Hardin wrote: OK, but: why does Bayes saying "it looks as hammy as it looks spammy" score so much when network tests are disabled? dnswl is disabled, or missing training of ham, skip rbl check doe

Re: after months of training still most messages treated as SPAM

2015-01-26 Thread John Hardin
On Mon, 26 Jan 2015, Benny Pedersen wrote: On 26. jan. 2015 17.25.06 John Hardin wrote: I don't quite understand what you're saying, can you unpack that a bit? i have forgot now what the question is and i believe you know what happens if using skip rbl check is 1 I know why that scores
