Though nobody seems to have said it exactly this way:  It seems
to be becoming very obvious that the people who say they have problems
with Bayes are those who support a diverse group of users (e.g. ISPs
and email providers), and those who find it works well, even with
autolearning, are those with either small numbers of users or users who
mostly fall into one specific category (e.g. medical, legal, technical, or
just about any homogeneous group).

        Despite the oft-repeated claim that spammers are dumb, not all are.
The "Bayes poison" we all see added to spam must work for some group, and
I would guess that it is exactly those who have diverse user bases
and have primarily "personal conversational" content in much of the email
running through their systems.

        For me, the few times I see Bayes give apparently wrong answers are
in email from friends and family, and never from clients or technical contacts.
(It certainly makes things worse that many members of my family have spent
their entire careers in marketing - they often get BAYES_80 scores when
writing me.)  This lends support to the notion that the added text does
indeed match some types of common communication.

        If my supposition is correct, the question then becomes:  Can using
personal (i.e. per user) Bayes overcome the problems which some users/sites
see?  I'm not sure how to test this - certainly I couldn't myself, but maybe
some of the other members of this list are able to and could try.  Even if it
does work, the resource load may be too high to be reasonable for many large
sites.
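
        To make the supposition concrete, here is a toy sketch of why a
per-user database might resist "poison" that a site-wide one does not.  It
is only an illustration under loud assumptions: it uses a plain naive Bayes
with add-one smoothing (not the combining method SpamAssassin's Bayes
actually uses), and the messages, the two users, and the function names are
all made up.  It is not a test on real mail.

```python
import math
from collections import Counter

def train(ham_msgs, spam_msgs):
    """Count token occurrences in ham and in spam (hypothetical helper)."""
    ham = Counter(w for m in ham_msgs for w in m.split())
    spam = Counter(w for m in spam_msgs for w in m.split())
    return ham, spam

def spam_probability(model, message):
    """Naive Bayes with add-one smoothing and equal priors (toy model)."""
    ham, spam = model
    vocab = len(set(ham) | set(spam))
    ham_total, spam_total = sum(ham.values()), sum(spam.values())
    log_odds = 0.0
    for w in message.split():
        p_spam = (spam[w] + 1) / (spam_total + vocab)
        p_ham = (ham[w] + 1) / (ham_total + vocab)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Made-up corpora: one "technical" user, one "personal conversational" user.
tech_ham = ["kernel patch build server log", "server log patch review build"]
personal_ham = ["dinner family weekend photos", "family photos weekend dinner love"]
spam = ["cheap pills offer buy now", "buy cheap offer now pills"]

# Spam padded with conversational words, sent to the technical user.
poisoned = "cheap pills offer dinner family weekend photos"

per_user_model = train(tech_ham, spam)                # technical user's own database
pooled_model = train(tech_ham + personal_ham, spam)   # one site-wide database

per_user_score = spam_probability(per_user_model, poisoned)
pooled_score = spam_probability(pooled_model, poisoned)

print(f"per-user: {per_user_score:.2f}  site-wide: {pooled_score:.2f}")
```

        In this toy the padding words are unknown to the technical user's
own database, so they are neutral there, while the site-wide database has
learned them as ham from the other user and the score drops.  Whether real
mail behaves this way is exactly the open question.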


        Paul Shupak
        [EMAIL PROTECTED]
