On Mon, 29 Nov 2004, [EMAIL PROTECTED] moaned:
> Unless the address has never been used by a real person, you should
> manually check each message to see whether it's spam. Personally, I
> never have the endurance to check more than about 500 messages at a
> shot. So I'd just cut it into files of a size I could manually verify
> without bleeding from the eyes, delete any hammy-looking stuff I find in
> each file as I go through it, and then save the verified files and use
> those for bayes training.

I've always validated things like that by mass-checking the mailbox and
manually checking only the stuff that scores close to the spam/ham
boundary, on the basis that SA is pretty much *never* wrong about
very-high-scoring things being spam. Well, maybe it is for particularly
atrocious newsletters or something, but my users don't get any such
abominations.

It still means a good few manual checks, but checking a hundred-odd mails
is a hell of a lot easier than checking tens of thousands.
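
For what it's worth, the triage step can be sketched roughly like this in
Python. It assumes you have already pulled (message id, score) pairs out of
whatever your mass-check run produced. The function name and the 3-point
margin are made up for illustration, and 5.0 is just the stock
required_score, so adjust to taste:

```python
def split_for_review(scored, threshold=5.0, margin=3.0):
    """Split (message_id, score) pairs into three buckets.

    threshold: the spam/ham cutoff (5.0 is SpamAssassin's default
               required_score; use your own if you changed it).
    margin:    hypothetical width of the "not sure" band on either
               side of the threshold; these get checked by hand.
    """
    sure_ham, borderline, sure_spam = [], [], []
    for msg_id, score in scored:
        if abs(score - threshold) <= margin:
            # Too close to the boundary to trust the score alone.
            borderline.append((msg_id, score))
        elif score > threshold:
            sure_spam.append((msg_id, score))
        else:
            sure_ham.append((msg_id, score))
    # Look at the closest calls first.
    borderline.sort(key=lambda pair: abs(pair[1] - threshold))
    return sure_ham, borderline, sure_spam


if __name__ == "__main__":
    sample = [("a", 0.1), ("b", 4.0), ("c", 6.5), ("d", 25.0)]
    ham, check, spam = split_for_review(sample)
    print("check by hand:", [msg_id for msg_id, _ in check])
```

Only the middle bucket needs eyeballing; how wide to make the margin
depends on how much you trust your rules on your own mail.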

-- 
`The sword we forged has turned upon us
 Only now, at the end of all things do we see
 The lamp-bearer dies; only the lamp burns on.'
