Re: SPAM/Phish and Ham E-mail Dataset

2011-01-12 Thread David F. Skoll
On Wed, 12 Jan 2011 23:23:39 +0100 mouss wrote: [...] > you need to train with _your_ mail. do not train with somebody else's > mail. one of the defence args is that attackers can't guess your > setup. if every one of us uses the same corpus then it'll be easy for > an attacker to get around. Th

Re: SPAM/Phish and Ham E-mail Dataset

2011-01-12 Thread Marco Ribeiro
http://untroubled.org/spam/ --> for spam, updated daily. --Marco Túlio *Sola Scriptura, Sola Fide, Sola Gratia, Solus Christus, Soli Deo Glória* On Wed, Jan 12, 2011 at 8:23 PM, mouss wrote: > On 12/01/2011 23:02, Mahmoud Khonji wrote: > > I would highly appreciate if anyone is able to sen

Re: SPAM/Phish and Ham E-mail Dataset

2011-01-12 Thread mouss
On 12/01/2011 23:02, Mahmoud Khonji wrote: > I would highly appreciate if anyone is able to send me his SPAM/Ham email > collection. sigh. if you can't understand what "privacy" means, then you are part of the problem. > > I need it to train and test classifiers. you need to train with _yo

Re: Understanding TrustPath

2011-01-12 Thread mouss
On 11/01/2011 22:07, Mark Martinec wrote: >> Consider for a moment how hard it would be for an average spammer to >> spoof rDNS > > This has nothing to do with DNS. The trusted/internal/msa networks > only check an IP address as it stands in a Received header field; > they do not check nor de

SPAM/Phish and Ham E-mail Dataset

2011-01-12 Thread Mahmoud Khonji
I would highly appreciate if anyone is able to send me his SPAM/Ham email collection. I need it to train and test classifiers. The issue with the available corpora is that they are outdated. They generally date back to 2005, and a lot has changed since then -- We've got SPAMers with spell checkers at le