Thanks for your help :) Another question: is there any way to dump the raw tokens that SA learned from my corpus? All I can see are encoded strings.
regards,
Xueron

Matthias Fuhrmann wrote:
> On Mon, 6 Mar 2006, Xueron Nee wrote:
>
> > Hi, all:
> >
> > I am using sa-learn to train my bayes filter. And I collect many
> > known spams from our honey pot.
> >
> > I found that there are so many mails with the same content in
> > this spam corpus. Is it necessary to delete the repeated spams before
> > sa-learn study?
>
> no, you dont have to delete them, let sa do the trick :)
> you'll see, not all messages will be learned, so sa already knows
> about the message/pattern.
>
> regards,
> Matthias

--
Xueron Nee <[EMAIL PROTECTED]>