Re: Operators Blacklist Survey

2017-08-15 Thread Shivram Krishnan
s this common? On Tue, Aug 15, 2017 at 12:24 PM, Dianne Skoll wrote: > On Tue, 15 Aug 2017 12:02:23 -0500 > Shivram Krishnan wrote: > > > Thanks for the response Bill. I have got a couple of responses from > > this group, which agree with what you are saying - they have their

Re: Operators Blacklist Survey

2017-08-15 Thread Shivram Krishnan
, Bill Cole < sausers-20150...@billmail.scconsult.com> wrote: > On 14 Aug 2017, at 18:00, Shivram Krishnan wrote: > > Hi, >> >> >> I am a graduate student at the University of Southern California and am >> currently researching on the impact of false positive

Operators Blacklist Survey

2017-08-14 Thread Shivram Krishnan
Hi, I am a graduate student at the University of Southern California and am currently researching the impact of false positives in blacklists. I am aware that SpamAssassin uses blacklists in its rule-based system to stop spam messages. But since it is a rule-based system, even if there are fal

Re: Corpus of Spam/Ham headers(Source IP) for research

2016-06-29 Thread Shivram Krishnan
. Also, getting the IPs with the last octet anonymized would help, as we are creating blacklists in terms of prefixes. On Wed, Jun 29, 2016 at 8:41 AM, Antony Stone < antony.st...@spamassassin.open.source.it> wrote: > On Wednesday 29 June 2016 at 17:35:28, Shivram Krishnan wrote:

Re: Corpus of Spam/Ham headers(Source IP) for research

2016-06-29 Thread Shivram Krishnan
lt.com> wrote: > On 29 Jun 2016, at 1:00, Shivram Krishnan wrote: > > Hello Bill, >> >> Thank you so much for your views. I agree that your customers would not >> like it if you share information. But Oliver suggested , I need only the >> source IP address

Re: Corpus of Spam/Ham headers(Source IP) for research

2016-06-29 Thread Shivram Krishnan
> On 29 Jun 2016, at 1:00, Shivram Krishnan wrote: > > Hello Bill, >> >> Thank you so much for your views. I agree that your customers would not >> like it if you share information. But Oliver suggested , I need only the >> source IP addresses of the Spam and Ham

Re: Corpus of Spam/Ham headers(Source IP) for research

2016-06-28 Thread Shivram Krishnan
? On Tue, Jun 28, 2016 at 9:04 PM, Bill Cole < sausers-20150...@billmail.scconsult.com> wrote: > On 28 Jun 2016, at 20:33, Shivram Krishnan wrote: > > Hey Guys, >> >> I am a researcher at the University of Southern California ( >> https://steel.isi.edu/ ), and I have b

Corpus of Spam/Ham headers(Source IP) for research

2016-06-28 Thread Shivram Krishnan
Hey Guys, I am a researcher at the University of Southern California ( https://steel.isi.edu/ ), and I have been working on making blacklists more effective by combining different blacklist sources and creating blacklists specific to a particular network. Though I have devised a mechanis

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
Agreed that I do not have experience. I am just playing my cards out here to get a corpus of mails. Thanks guys! On Tue, May 31, 2016 at 11:20 AM, Reindl Harald wrote: > > > On 31.05.2016 at 20:16, Antony Stone wrote: > >> On Tuesday 31 May 2016 at 20:11:14, Shivram Krishn

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
at 19:55, Shivram Krishnan wrote: > >> There will a point where the decision to drop the mail is made based on >> the headers. Cant we log it there? >> > > SA don't make any decisions of drop / reject > the glue does - spamass-milter, amavis or whatever >

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
Hello Reindl, There will be a point where the decision to drop the mail is made based on the headers. Can't we log it there? On Tue, May 31, 2016 at 10:30 AM, Reindl Harald wrote: > > > On 31.05.2016 at 19:25, Shivram Krishnan wrote: > >> Thanks guys. >> >> Wha

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
f mails. And I agree that there is no point in evaluating my study on after-the-event mails. To evaluate the performance of my study, I am using SA. On Tue, May 31, 2016 at 10:44 AM, Antony Stone < antony.st...@spamassassin.open.source.it> wrote: > On Tuesday 31 May 2016 at 15:47:56, Shivr

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
tlr wrote: > >> On May 30, 2016, at 11:06 PM, Shivram Krishnan >> wrote: >> >>> 2) I have set a threshold of -10 to see how spamassassin assigns a score >>> for every mail. >>> >> No. Do not do this. >> > > Instead, set this option in

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
I might be forced to do this: take the corpus from Mailinator, manually mark it as SPAM or HAM, and use sa-learn to train SpamAssassin. But this is what is confusing me. Doesn't SA use a lot more tags to determine if it is SPAM or HAM? Does this mean that sa-learn is not only for bayes but als

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
false positives we plan to use SA. On Tue, May 31, 2016 at 6:43 AM, Shivram Krishnan wrote: > The data set which i use for bayes consists of both ham and spam. ( > https://www.cs.cmu.edu/~./enron/) > > Lets consider a scenario, where I have a domain and I point it to a > mailserver

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
The data set which I use for bayes consists of both ham and spam ( https://www.cs.cmu.edu/~./enron/ ). Let's consider a scenario where I have a domain and I point it to a mailserver. It might take a while for me to generate 50,000 mails a day (Mailinator provides me this). I need to embed multipl

Re: Spamassassin not capturing obvious Spam

2016-05-31 Thread Shivram Krishnan
for me to decide if it is SPAM or not. What do you guys suggest I do in this case? Is there a better way to do it? On Tue, May 31, 2016 at 1:48 AM, Reindl Harald wrote: > > > On 31.05.2016 at 08:18, Shivram Krishnan wrote: > >> It is not on production. I am using t

Re: Spamassassin not capturing obvious Spam

2016-05-30 Thread Shivram Krishnan
It is not in production. I am using this to evaluate SpamAssassin. On Mon, May 30, 2016 at 10:38 PM, @lbutlr wrote: > On May 30, 2016, at 11:06 PM, Shivram Krishnan > wrote: > > 2) I have set a threshold of -10 to see how spamassassin assigns a score > for every mail. >

Re: Spamassassin not capturing obvious Spam

2016-05-30 Thread Shivram Krishnan
al 'Received:' headers which have '()' where > there should be at least the IP address of the incoming connection. > > This indicates that the message has either been tampered with or is from a > postfix system that somebody has messed up the configuration. > >

Spamassassin not capturing obvious Spam

2016-05-30 Thread Shivram Krishnan
Hey guys, I am testing SpamAssassin on a SPAM/HAM corpus of mails. SpamAssassin is not picking up obvious spam, as in this case: http://pastebin.com/MbNRNFWy . I have followed the guidelines on https://wiki.apache.org/spamassassin/ImproveAccuracy . Let me know how to catch these types of spam