>https://wiki.apache.org/spamassassin/ImproveAccuracy
>I have gone through this wiki (and ones like it) at least a dozen times.
>My server is blocking about 50% of the spam, thanks to some of the
>other layers of spam protection. It's just bayes that I can't seem to get
>right
Are you getting a
On Wed, 2016-06-01 at 00:38 +, David Jones wrote:
>
> Too bad we couldn't make SA do something very annoying and
> more obvious when the URIBL_BLOCKED rule was hit.
>
I notice, rather to my surprise, that the SA Wiki doesn't seem to have
an entry for the URIBL_BLOCKED rule. However, since pe
On Tue, 2016-05-31 at 17:04 -0700, Peter Carlson wrote:
>
> URIBL_BLOCKED == read some basics
> your reply == useless. You have no idea what I may or may not have
> read. You are under no obligation to provide any help to me or
> anyone
> else. I suggest that if for whatever reason you find m
On Tue, 31 May 2016, Peter Carlson wrote:
I will investigate this (URIBL_BLOCKED) further tomorrow
(https://wiki.apache.org/spamassassin/CachingNameserver),
Note: caching != recursing. You can have a caching forwarding local
nameserver, which won't fix URIBL_BLOCKED.
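For anyone checking this by hand: as far as I understand the URIBL convention, the last octet of the A record returned by the list is a bitmask, and bit 1 is the "your resolver is blocked" answer that URIBL_BLOCKED keys on. The following is only a sketch under that assumption; the helper name `decode_uribl_answer` and the bit table are illustrative and not taken from SA itself.

```python
import ipaddress

# Assumed meaning of the last-octet bits in a URIBL answer (illustrative table).
URIBL_BITS = {1: "blocked", 2: "black", 4: "grey", 8: "red"}


def decode_uribl_answer(address):
    """Decode a 127.0.0.x answer into the list labels whose bits are set."""
    ip = ipaddress.IPv4Address(address)
    if ip.packed[0] != 127:
        raise ValueError("not a DNSBL-style answer: %s" % address)
    last_octet = ip.packed[3]
    return [label for bit, label in sorted(URIBL_BITS.items()) if last_octet & bit]


# An answer of 127.0.0.1 would mean the query was refused, not that the URI is clean.
labels = decode_uribl_answer("127.0.0.1")
```

If the answer decodes to "blocked", the queries are most likely reaching URIBL through a shared forwarder, which is the situation a local recursing (not merely caching) nameserver is meant to avoid.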
however I doubt that it
not everyone is an email
expert that understands how RBLs work and that it's bad
to share a recursive DNS server on an SA server.
I will investigate this (URIBL_BLOCKED) further tomorrow
(https://wiki.apache.org/spamassassin/CachingNameserver), however I
doubt tha
On Tue, 31 May 2016, Peter Carlson wrote:
On 05/31/2016 04:27 PM, Reindl Harald wrote:
On 31.05.2016 at 23:58, Peter Carlson wrote:
> May 30 09:04:53 www amavis[16577]: (16577-03) Passed CLEAN
> {RelayedInbound}, Tests:
> [BAYES_00=-1.9,RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001,UR
>From: Reindl Harald
>Sent: Tuesday, May 31, 2016 6:27 PM
>To: users@spamassassin.apache.org
>Subject: Re: Bayes filter marking everything as ham
>On 31.05.2016 at 23:58, Peter Carlson wrote:
>> May 30 09:04:53 www amavis[16577]: (16577-03) Passed CLEAN
>> {RelayedInbound}, Tests:
>>
Kind of a shot in the dark, but are you sure everyone is promptly
moving their spam out of the inboxes? I worry about automated
learning like this. Even then, it seems unlikely that every mail
would get tagged by bayes as likely ham.
Someone just today suggested in another thread to add the fol
On 05/31/2016 04:27 PM, Reindl Harald wrote:
On 31.05.2016 at 23:58, Peter Carlson wrote:
May 30 09:04:53 www amavis[16577]: (16577-03) Passed CLEAN
{RelayedInbound}, Tests:
[BAYES_00=-1.9,RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001,URIBL_BLOCKED=0.001],
autolearn=ham autolearn_forc
On 31.05.2016 at 23:58, Peter Carlson wrote:
May 30 09:04:53 www amavis[16577]: (16577-03) Passed CLEAN
{RelayedInbound}, Tests:
[BAYES_00=-1.9,RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001,URIBL_BLOCKED=0.001],
autolearn=ham autolearn_force=no, autolearnscore=-0.001, 3992 ms
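When comparing many of these log lines it can help to pull the rule scores out programmatically. A minimal sketch, assuming the amavis line always carries a `Tests: [NAME=score,...]` section shaped like the one quoted above; the function name `parse_amavis_tests` is made up for illustration.

```python
import re


def parse_amavis_tests(line):
    """Return {rule_name: score} from the 'Tests: [...]' part of an amavis log line."""
    match = re.search(r"Tests:\s*\[([^\]]*)\]", line)
    if not match:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        name, sep, value = item.strip().partition("=")
        if sep:  # skip anything that is not NAME=score
            scores[name] = float(value)
    return scores


log_line = (
    "May 30 09:04:53 www amavis[16577]: (16577-03) Passed CLEAN "
    "{RelayedInbound}, Tests: "
    "[BAYES_00=-1.9,RCVD_IN_MSPIKE_H2=-0.001,SPF_PASS=-0.001,URIBL_BLOCKED=0.001], "
    "autolearn=ham autolearn_force=no, autolearnscore=-0.001, 3992 ms"
)
scores = parse_amavis_tests(log_line)
```

Counting how often BAYES_00 and URIBL_BLOCKED appear across a day's log would show quickly whether every message is being scored as ham by Bayes and whether the URIBL lookups are being refused.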
th
>From: RW
>Sent: Tuesday, May 31, 2016 5:20 PM
>To: users@spamassassin.apache.org
>Subject: Re: SA Concepts - plugin for email semantics
>On Tue, 31 May 2016 15:20:56 -0400
>Bill Cole wrote:
>> On 29 May 2016, at 11:07, RW wrote:
>>
>> > Statistical filters are based on some statistical theory
On Tue, 31 May 2016 15:20:56 -0400
Bill Cole wrote:
> On 29 May 2016, at 11:07, RW wrote:
>
> > Statistical filters are based on some statistical theory combined
> > with pragmatic kludges and assumptions. Practical filters have been
> > developed based on what's been found to work, not on what'
(sorry if this is a
repost, I don't see my messages coming through...the irony of
spamassassin.apache.org trapping my request for help as spam. I
have snipped the logfile entries which I think were causing it to
be tagged as spam)
All of my messages
On Tue, 31 May 2016 21:23:11 +0100
Paul Stead wrote:
> The implementation was undertaken from a personal interest - I asked
> the question of what people thought of the implementation and the
> impact to Bayes DB.
I think what the "concepts" concept ends up doing is this: "concepts"
are more-or-
On 31/05/16 20:20, Bill Cole wrote:
It is no shock that while this implementation has Paul Stead's name on
it, it is apparently mostly the product of the anti-spam community's
most spectacular case of Dunning-Kruger Syndrome, who has apparently
figured out that his personal 'brand' has negative
On 29 May 2016, at 11:07, RW wrote:
On Sat, 28 May 2016 15:37:21 -0400
Bill Cole wrote:
More importantly (IMHO) they aren't designed to collide with existing
common tokens and be added back into messages that may contain those
tokens already in order to influence Bayesian classification.
The
Agreed that I do not have experience. I am just playing my cards out here
to get a corpus of mails.
Thanks guys!
On Tue, May 31, 2016 at 11:20 AM, Reindl Harald
wrote:
>
>
> On 31.05.2016 at 20:16, Antony Stone wrote:
>
>> On Tuesday 31 May 2016 at 20:11:14, Shivram Krishnan wrote:
>>
>> In th
On 31.05.2016 at 20:16, Antony Stone wrote:
On Tuesday 31 May 2016 at 20:11:14, Shivram Krishnan wrote:
In the glue - like spamass-mailer, there would be two folders which are
created. One would be the mailbox and the other would be a spambox (don't
know the term). Can't you access the spambox
On 31.05.2016 at 20:11, Shivram Krishnan wrote:
In the glue - like spamass-mailer, there would be two folders which are
created. One would be the mailbox and the other would be a spambox (don't
know the term). Can't you access the spambox to extract the mail?
in the glue there are no folders as
On Tuesday 31 May 2016 at 20:11:14, Shivram Krishnan wrote:
> In the glue - like spamass-mailer, there would be two folders which are
> created. One would be the mailbox and the other would be a spambox (don't
> know the term). Can't you access the spambox to extract the mail?
It sounds to me that y
In the glue - like spamass-mailer, there would be two folders which are
created. One would be the mailbox and the other would be a spambox (don't
know the term). Can't you access the spambox to extract the mail?
On Tue, May 31, 2016 at 11:01 AM, Reindl Harald
wrote:
>
>
> Am 31.05.2016 um 19:55 sch
On 31.05.2016 at 19:55, Shivram Krishnan wrote:
There will be a point where the decision to drop the mail is made based on
the headers. Can't we log it there?
SA doesn't make any decisions to drop / reject
the glue does - spamass-milter, amavis or whatever
and even if - I would find it perverse to
Hello Reindl,
There will be a point where the decision to drop the mail is made based on the
headers. Can't we log it there?
On Tue, May 31, 2016 at 10:30 AM, Reindl Harald
wrote:
>
>
> On 31.05.2016 at 19:25, Shivram Krishnan wrote:
>
>> Thanks guys.
>>
>> What I am going to ask might be a longsh
Hi Antony,
I have an ongoing collection of blacklists since Jan 1, 2016. This way I
would know how long it has stayed on the Blacklist.
"Dealing with email "after the event" (especially with regard to blacklists)
will give you very different results from dealing with it as it happens, if
for no o
On Tuesday 31 May 2016 at 15:47:56, Shivram Krishnan wrote:
> I am using SA as an oracle for blacklisting. Our research is concerned with
> combining multiple blacklist sources and also considering the historical
> importance of an IP in a blacklist to create a very effective master
> blacklist.
>
>
On 31.05.2016 at 19:25, Shivram Krishnan wrote:
Thanks guys.
What I am going to ask might be a longshot.
But is it possible for anyone who is running a mailserver to give a list
of sources of SPAM (recent, anytime this year) and the SA score
associated? It will be extremely useful for my rese
Thanks guys.
What I am going to ask might be a longshot.
But is it possible for anyone who is running a mailserver to give a list of
sources of SPAM (recent, anytime this year) and the SA score associated? It
will be extremely useful for my research and credit would be given.
Example:-
efetunisie.
On 5/31/2016 12:06 PM, Anthony Hoppe wrote:
All,
I accidentally forwarded some spam to this list. Autocomplete got the
best of me and I chose "spamassassin" instead of "spamcop" in the "TO"
field of the message. I haven't received the message myself (not sure
if I will), but wanted to apolo
On Tue, 31 May 2016 12:05:39 -0400
Bill Cole wrote:
> On 31 May 2016, at 2:21, Henrik K wrote:
>
> > On Mon, May 30, 2016 at 06:25:08PM -0400, Dianne Skoll wrote:
> >> On Mon, 30 May 2016 17:45:52 -0400
> >> "Bill Cole" wrote:
> >>
> >>> So you could have 'sex' and 'meds' and 'watches' talli
On Mon, 30 May 2016 17:45:52 -0400
Bill Cole wrote:
> The "Naive Bayes" classification approach is theoretically moored to
> Bayes' Theorem
FWIW Bayes hasn't been "Naive Bayes" for a long time.
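To make that concrete: my understanding is that SA's Bayes combines per-token probabilities with a chi-squared (Fisher/Robinson style) method rather than multiplying them as plain Naive Bayes would. The sketch below follows the commonly published form of that combiner; it is an illustration, not SA's actual code, and the function names are my own.

```python
import math


def chi2q(x2, dof):
    """Upper-tail probability of chi-squared for an even number of degrees of freedom."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine_chi_squared(probs):
    """Combine token spam probabilities (each strictly between 0 and 1) into one score."""
    if not probs:
        return 0.5
    n = len(probs)
    # Evidence for spam: how unlikely the (1 - p) values are under the null hypothesis.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Evidence for ham: the same test applied to the p values themselves.
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0


spammy = combine_chi_squared([0.99] * 5)
hammy = combine_chi_squared([0.01] * 5)
mixed = combine_chi_squared([0.9, 0.1])
```

The practical difference from a naive product is that strong spam and strong ham evidence cancel out towards 0.5 instead of one side swamping the other.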
All,
I accidentally forwarded some spam to this list. Autocomplete got the best of
me and I chose "spamassassin" instead of "spamcop" in the "TO" field of the
message. I haven't received the message myself (not sure if I will), but wanted
to apologize in case any of you got it.
Happy, uh, Tu
On 31 May 2016, at 2:21, Henrik K wrote:
On Mon, May 30, 2016 at 06:25:08PM -0400, Dianne Skoll wrote:
On Mon, 30 May 2016 17:45:52 -0400
"Bill Cole" wrote:
So you could have 'sex' and 'meds' and 'watches' tallied up into
frequency counts that sum up natural (word) and synthetic (concept)
On 5/31/2016 1:38 AM, @lbutlr wrote:
On May 30, 2016, at 11:06 PM, Shivram Krishnan wrote:
2) I have set a threshold of -10 to see how spamassassin assigns a score for
every mail.
No. Do not do this.
Instead, set this option in your local.cf file:
add_header all Report _REPORT_
This will
On 31.05.2016 at 17:13, Shivram Krishnan wrote:
I might be forced to do this. Take the corpus from Mailinator and
manually mark it as SPAM or HAM and use sa-learn to train spamassassin.
But this is what is confusing me. Doesn't SA use a lot more tags to
determine if it is SPAM or HAM? Does
On 5/30/2016 10:35 AM, Nick Howitt wrote:
Just for a bit of closure, it looks like when you use amavisd-new with
SA, it is amavisd-new and not SA which is adding the X-Spam headers.
In /etc/amavisd/api.conf there is a parameter, $sa_tag_level_deflt,
defaulted to -99, below which no X-Spam heade
I might be forced to do this. Take the corpus from Mailinator and manually
mark it as SPAM or HAM and use sa-learn to train spamassassin.
But this is what is confusing me. Doesn't SA use a lot more tags to
determine if it is SPAM or HAM? Does this mean that sa-learn is not only
for bayes but als
On Tuesday 31 May 2016 at 17:02:26, Reindl Harald wrote:
> On 31.05.2016 at 16:59, Antony Stone wrote:
> >
> > I had read SA documentation such as
> > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html
> that's all based on opinions - the only question is the quality of
> training and
On 31.05.2016 at 16:59, Antony Stone wrote:
On Tuesday 31 May 2016 at 15:32:49, Reindl Harald wrote:
On 31.05.2016 at 15:28, Antony Stone wrote:
2. You should be aware (*especially* if using this stuff as the basis of
a research project - any competent referee should pick up on something
li
On Tuesday 31 May 2016 at 15:32:49, Reindl Harald wrote:
> On 31.05.2016 at 15:28, Antony Stone wrote:
> > 2. You should be aware (*especially* if using this stuff as the basis of
> > a research project - any competent referee should pick up on something
> > like this) that SA works best when the
BTW I am using SA as an oracle for blacklisting. Our research is concerned with
combining multiple blacklist sources and also considering the historical
importance of an IP in a blacklist to create a very effective master
blacklist.
Let me give you an example.
Suppose an IP address 1.2.3.4 appeared on
The data set which I use for Bayes consists of both ham and spam
(https://www.cs.cmu.edu/~./enron/).
Let's consider a scenario where I have a domain and I point it to a
mailserver. It might take a while for me to generate 50,000 mails a day
(mailinator provides me this). I need to embed multipl
On 31.05.2016 at 15:28, Antony Stone wrote:
2. You should be aware (*especially* if using this stuff as the basis of a
research project - any competent referee should pick up on something like
this) that SA works best when the emails it is asked to process are from the
same source as it has be
On 31.05.2016 at 15:21, Shivram Krishnan wrote:
Here is my scenario. I am using SA as an oracle/ground truth for a
research project. It is generally hard to get hold of a real time mail
corpus
nope, just point a cheap domain to a mailserver accepting all incoming
stuff and spread some hidden
On Tuesday 31 May 2016 at 15:21:19, Shivram Krishnan wrote:
> Here is my scenario. I am using SA as an oracle/ground truth for a research
> project.
Okay.
> It is generally hard to get hold of a real time mail corpus
Er, what??
> I opted for a service provided by mailinator.
> I have also trai
Here is my scenario. I am using SA as an oracle/ground truth for a research
project. It is generally hard to get hold of a real-time mail corpus, so I
opted for a service provided by mailinator. Mailinator is a company which
provides users with disposable email IDs and it offers an API to obtain
th
On 31.05.2016 at 10:43, Matus UHLAR - fantomas wrote:
On 30 May 2016, at 15:07, Alex wrote:
Yeah, that's it exactly. Particularly overseas where it doesn't appear
NAT and/or submission are used as readily as they are here.
On 31.05.2016 at 03:09, Bill Cole wrote:
Irrelevant in this case
On 31.05.2016 at 08:18, Shivram Krishnan wrote:
It is not on production. I am using this to evaluate spamassassin.
how will you evaluate something when you slay your setup that way?
On Mon, May 30, 2016 at 10:38 PM, @lbutlr wrote:
On May 30, 2016, at 11:06 P
On 30 May 2016, at 15:07, Alex wrote:
Yeah, that's it exactly. Particularly overseas where it doesn't appear
NAT and/or submission are used as readily as they are here.
On 31.05.2016 at 03:09, Bill Cole wrote:
Irrelevant in this case because if you trust that header not to be an
intentionall
On 31.05.2016 at 04:24, Shivram Krishnan wrote:
I am testing spamassassin on a SPAM/HAM corpus of mails. Spamassassin is
not picking up an obvious spam like in this case
http://pastebin.com/MbNRNFWy .
your sample is mangled and hence crap
it's even damaged because of a leading newline
frankly y
On 31.05.2016 at 03:09, Bill Cole wrote:
On 30 May 2016, at 15:07, Alex wrote:
Yeah, that's it exactly. Particularly overseas where it doesn't appear
NAT and/or submission are used as readily as they are here.
Irrelevant in this case because if you trust that header not to be an
intentiona
On 31.05.2016 at 02:30, Bill Cole wrote:
On 30 May 2016, at 18:25, Dianne Skoll wrote:
On Mon, 30 May 2016 17:45:52 -0400
"Bill Cole" wrote:
So you could have 'sex' and 'meds' and 'watches' tallied up into
frequency counts that sum up natural (word) and synthetic (concept)
occurrences,
OK,
So you are testing to see how SA scores artificial mail messages.
However SA is designed to evaluate real mail messages, not botched
fabrications of them, so I don't understand what you are trying to achieve.
You have (either deliberately or unknowingly) omitted the necessary
information tha
52 matches