I might be forced to do this: take the corpus from Mailinator, manually mark each message as spam or ham, and use sa-learn to train SpamAssassin.
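A minimal sketch of what that training step could look like, assuming the hand-labelled messages are sorted into two directories named `spam` and `ham` under one corpus root. The directory layout and the helper names `build_sa_learn_commands` and `train` are hypothetical; only the `sa-learn --spam` and `sa-learn --ham` invocations come from the sa-learn documentation.

```python
import subprocess
from pathlib import Path


def build_sa_learn_commands(corpus_root):
    """Return one sa-learn argument list per label directory that exists.

    Assumed (hypothetical) layout: <corpus_root>/spam and <corpus_root>/ham,
    each holding one hand-labelled message per file.
    """
    root = Path(corpus_root)
    commands = []
    for label in ("spam", "ham"):
        folder = root / label
        if folder.is_dir():
            # sa-learn accepts a directory of messages after --spam / --ham
            commands.append(["sa-learn", f"--{label}", str(folder)])
    return commands


def train(corpus_root, dry_run=True):
    """Print the sa-learn commands, or run them when dry_run is False."""
    for argv in build_sa_learn_commands(corpus_root):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError if sa-learn exits non-zero
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "corpus" is a placeholder path; nothing is printed if it does not exist
    train("corpus", dry_run=True)
```

Keeping `dry_run=True` only prints the commands, which makes it easy to check the labels before anything is written to the Bayes database.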
But this is what confuses me: doesn't SA use a lot more tags than Bayes to determine whether a message is spam or ham? Does this mean that sa-learn is not only for Bayes but also for all the tags that get triggered in the mail?

On Tue, May 31, 2016 at 8:07 AM, Antony Stone <antony.st...@spamassassin.open.source.it> wrote:

> On Tuesday 31 May 2016 at 17:02:26, Reindl Harald wrote:
>
> > On 31.05.2016 at 16:59, Antony Stone wrote:
> > > I had read SA documentation such as
> > > https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html
> >
> > that's all based on opinions - the only question is the quality of
> > training and i don't base my decisions and what i say on some opinions
> > on a website but a ton of accounts on both involved companies sharing
> > bayes database for inbound and outgoing mail
>
> That's fair enough, but I think someone just starting out with SA, or doing a
> research project, or simply not handling the large quantity of email that you
> do (and able to put in the effort of hand-tuning which you appear to do as
> well) has to get their starting point from somewhere, and the official project
> website is something most people would regard as "good advice".
>
> > well, with the defaults of auto-learning that opinions maybe are true
>
> In which case maybe it's useful for the original poster after all.
>
> Antony.
>
> --
> "In fact I wanted to be John Cleese and it took me some time to realise that
> the job was already taken."
>
>  - Douglas Adams
>
> Please reply to the list;
> please *don't* CC me.