Dear readers,

I have been using SpamAssassin for about a month with a very good
recognition rate, but I have noticed that spam containing little or no text
is not detected by SpamAssassin, neither by the normal rules nor by the
Bayes filter. I think this is because there is not enough information for
the Bayes filter to reach a reliable decision.

This gave me the following idea: what about a special image database that
is maintained in a Bayes-like style? While learning spam with sa-learn, the
learning process could also extract each image from an email. We would then
build a compact, near-unique representation of each image, comparable to the
hash value of a string, and store this "hash value" as an entry in a
Bayes-like image database. When new email comes in, we compare all images in
that email with those in our database.
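
The idea above can be sketched roughly as follows. This is only a minimal
illustration in Python, not SpamAssassin code and not tied to sa-learn: the
names image_hashes, ImageDB and make_message are hypothetical, and a plain
SHA-256 over the decoded image bytes stands in for the "hash value". An exact
hash like this only matches byte-identical images, so a more tolerant image
fingerprint would be needed if spammers vary their images slightly.

```python
import hashlib
from collections import Counter
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default


def image_hashes(raw_message: bytes) -> list[str]:
    """Return one SHA-256 hex digest per image part found in the message."""
    msg = message_from_bytes(raw_message, policy=default)
    digests = []
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            payload = part.get_payload(decode=True)  # transfer-decoded bytes
            if payload:
                digests.append(hashlib.sha256(payload).hexdigest())
    return digests


class ImageDB:
    """Hypothetical store: how often each image hash was seen in spam and ham."""

    def __init__(self) -> None:
        self.spam = Counter()
        self.ham = Counter()

    def learn(self, raw_message: bytes, is_spam: bool) -> None:
        # Count each distinct image at most once per message.
        target = self.spam if is_spam else self.ham
        target.update(set(image_hashes(raw_message)))

    def spam_ratio(self, digest: str):
        """Fraction of sightings that were spam, or None if never seen."""
        s, h = self.spam[digest], self.ham[digest]
        return s / (s + h) if (s + h) else None

    def score(self, raw_message: bytes):
        """Highest spam ratio among the known images, or None if none are known."""
        ratios = [self.spam_ratio(d) for d in image_hashes(raw_message)]
        known = [r for r in ratios if r is not None]
        return max(known) if known else None


def make_message(image_bytes: bytes) -> bytes:
    """Build a tiny test email carrying one image attachment (demo helper)."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content("see attachment\n")
    msg.add_attachment(image_bytes, maintype="image", subtype="gif",
                       filename="a.gif")
    return msg.as_bytes()
```

One would call learn() for every message passed through training and score()
for every incoming message; a result of None means that none of the images has
been seen before, so this check would simply contribute nothing.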

I have no idea what resources these additional checks would consume, but
perhaps it would be a nice additional feature. I have looked into some ways
of deriving a hash value from an image; if anyone is interested, feel free
to contact me.

Best regards,
Manuel Schmitt




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
