Re: pdf spam solution idea

2007-06-27 Thread Dallas Engelken
arni wrote: Hi, it's come up several times now that people ask for a way to detect PDF spam directly from the PDF content, and not only through headers or other means (hashes, Bayes). I've found a solution that should be pretty easy to realise in a Fuzzy-OCR-like plugin. Here is what it should do

pdf spam solution idea

2007-06-27 Thread arni
Hi, it's come up several times now that people ask for a way to detect PDF spam directly from the PDF content, and not only through headers or other means (hashes, Bayes). I've found a solution that should be pretty easy to realise in a Fuzzy-OCR-like plugin. Here is what it should do: Use xpdf