On 31.05.2016 at 17:13, Shivram Krishnan wrote:
I might be forced to do this: take the corpus from Mailinator, manually mark each message as spam or ham, and use sa-learn to train SpamAssassin. But this is what is confusing me. Doesn't SA use a lot more tags to determine whether a message is spam or ham? Does this mean that sa-learn is not only for Bayes but also for all the tags that get triggered in the mail?
sa-learn is *only* for Bayes, but since you have no clean, uncrippled messages from there, you can't expect the other rules to work properly, nor a Bayes database trained on that data to work properly for real email.
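For reference, a minimal sketch of what the hand-sorted training run discussed above could look like. It assumes the messages have already been sorted by hand into `corpus/spam` and `corpus/ham`; that directory layout and the helper names `build_sa_learn_command` and `train` are hypothetical and only for illustration. The only sa-learn options used are `--spam` and `--ham`, and as noted above this trains the Bayes database only, not the other rules.

```python
import subprocess
from pathlib import Path

VALID_LABELS = ("spam", "ham")


def build_sa_learn_command(label, corpus_dir):
    """Return the argv list that trains Bayes on one hand-sorted directory."""
    if label not in VALID_LABELS:
        raise ValueError(f"label must be one of {VALID_LABELS}, got {label!r}")
    return ["sa-learn", f"--{label}", str(corpus_dir)]


def train(corpus_root, run=subprocess.run):
    """Feed corpus_root/spam, then corpus_root/ham, to sa-learn.

    The layout (one subdirectory per label) is an assumption of this sketch.
    """
    for label in VALID_LABELS:
        command = build_sa_learn_command(label, Path(corpus_root) / label)
        run(command, check=True)  # raises CalledProcessError if sa-learn fails


# Usage, with a hypothetical corpus directory:
#     train("corpus")
```

The `run` parameter is there only so the commands can be inspected without sa-learn installed; in normal use the default `subprocess.run` is what executes them.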
On Tue, May 31, 2016 at 8:07 AM, Antony Stone <antony.st...@spamassassin.open.source.it> wrote:

    On Tuesday 31 May 2016 at 17:02:26, Reindl Harald wrote:

    > On 31.05.2016 at 16:59, Antony Stone wrote:
    > >
    > > I had read SA documentation such as
    > > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html
    >
    > that's all based on opinions - the only question is the quality of
    > training and i don't base my decisions and what i say on some opinions
    > on a website but on a ton of accounts at both involved companies sharing
    > a bayes database for inbound and outgoing mail

    That's fair enough, but I think someone just starting out with SA, or
    doing a research project, or simply not handling the large quantity of
    email that you do (and able to put in the effort of hand-tuning which you
    appear to do as well) has to get their starting point from somewhere, and
    the official project website is something most people would regard as
    "good advice".

    > well, with the defaults of auto-learning those opinions maybe are true

    In which case maybe it's useful for the original poster after all.