On Mon, 6 Mar 2006, Xueron Nee wrote:
> Hi, all:
>
> I am using sa-learn to train my bayes filter. And I collect many
> known spams from our honey pot.
>
> I found that there are so many mails with the same content in
> this spam corpus. Is it necessary to delete the repeated spams before
> sa-learn study?

No, you don't have to delete them; let SA do the trick :) You'll see that
not all messages get learned, because SA already knows about the
message/pattern.

Regards,
Matthias
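
To illustrate the idea in the reply (a learner that skips messages it already knows), here is a minimal toy sketch in Python. It is not SpamAssassin's code, and the thread does not say how sa-learn decides that a message is already known. The body-hash identity used here, along with the names `message_key` and `learn_corpus`, are assumptions made purely for illustration.

```python
import hashlib


def message_key(raw: str) -> str:
    """Hypothetical identity for a message: SHA-1 of its body.

    Assumption for this sketch only: the body is everything after the
    first blank line, and two messages with the same body are "the same".
    """
    normalized = raw.replace("\r\n", "\n")
    _, _, body = normalized.partition("\n\n")
    return hashlib.sha1(body.strip().encode("utf-8")).hexdigest()


def learn_corpus(messages, seen=None):
    """Return (learned, examined), skipping messages whose key was seen before."""
    seen = set() if seen is None else seen
    learned = 0
    examined = 0
    for raw in messages:
        examined += 1
        key = message_key(raw)
        if key in seen:
            continue  # already known: do not train on it again
        seen.add(key)
        learned += 1  # a real learner would update its token counts here
    return learned, examined


corpus = [
    "Subject: offer\n\nBuy cheap pills now",
    "Subject: offer again\n\nBuy cheap pills now",  # same body as the first
    "Subject: hello\n\nMeeting at noon",
]
print(learn_corpus(corpus))  # → (2, 3)
```

In this toy model, three messages are examined but only two are learned, which mirrors the behaviour described above: duplicates in the corpus are skipped rather than trained on twice, so removing them by hand beforehand is not needed.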