Sorry this reply is a bit late, but the problem is a bug in FuzzyOCR. When a
message has multiple images, it ends up adding to the existing text file
instead of replacing it, so text found for one image can be counted again for
the next. The bug is in the open_on_specific_fd routine in Misc.pm:

$fname =~ s/> *// and $flags |= O_CREAT|O_WRONLY;

should be

$fname =~ s/> *// and $flags |= O_CREAT|O_WRONLY|O_TRUNC;

(and you have to add O_TRUNC to the import list at the top of the module
too).
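
For anyone who wants to see the effect outside FuzzyOCR, here is a minimal
standalone sketch. It is not the FuzzyOCR code: the write_with_flags helper,
the temp file and the sample strings are made up for illustration, and I'm
assuming the flags come from Fcntl as usual. It writes a long string and then
a short one to the same file, once without O_TRUNC and once with it.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_WRONLY O_TRUNC);   # O_TRUNC must be imported explicitly
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to $path with the given sysopen flags,
# then read the whole file back and return its contents.
sub write_with_flags {
    my ($path, $flags, $text) = @_;
    sysopen(my $out, $path, $flags, 0600) or die "sysopen $path: $!";
    print {$out} $text;
    close($out) or die "close $path: $!";

    open(my $in, '<', $path) or die "open $path: $!";
    local $/;                              # slurp mode
    my $content = <$in>;
    close($in);
    return $content;
}

my (undef, $path) = tempfile(UNLINK => 1);

# Without O_TRUNC: the short second write leaves the tail of the first behind.
write_with_flags($path, O_CREAT | O_WRONLY, "service available\n");
my $stale = write_with_flags($path, O_CREAT | O_WRONLY, "hello\n");

# With O_TRUNC: the file is emptied on open, so only the new text remains.
write_with_flags($path, O_CREAT | O_WRONLY | O_TRUNC, "service available\n");
my $clean = write_with_flags($path, O_CREAT | O_WRONLY | O_TRUNC, "hello\n");

print "without O_TRUNC: ", length($stale), " bytes\n";
print "with O_TRUNC:    ", length($clean), " bytes\n";
```

In the first case the file still contains part of the earlier text after the
new text, which is the kind of leftover that would get picked up again when
the next image is scanned.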

I logged this as ticket 555 on the FuzzyOCR website.

Even with that fixed, I'm not sure that FuzzyOCR is helping much. I've also
lowered the FUZZY_OCR_WRONG_EXTENSION score, as it was occasionally firing
multiple times on non-spam.

Dave



Bowie Bailey wrote:
> 
> I've had FuzzyOCR running for quite a while.  Today I found a false
> positive for it that is a bit strange.
> 
> The message has seven images.  FuzzyOCR claims to have found the word
> "service" in five of them (and counted it 10 times for a score of 6.5).
> However, I can only see the word in one of the images and only three of
> the seven images have any text at all.  Is there a problem here?
> 
> Is FuzzyOCR still useful?  It doesn't seem to hit a lot for me.
> 
>       %OFMAIL: 1.18
>       %OFSPAM: 3.41
>       %OFHAM:  0.26
> 
> --
> Bowie
> 
> 

-- 
View this message in context: 
http://www.nabble.com/FuzzyOCR-tp19672684p20581027.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
