On 20.02.23 16:25, Wolfgang Breyha wrote:
> Is there a way with SA4 to process inline images (SRC=data:image/...) as
> images (like attachments)? E.g. to feed them to ExtractText.
> I see that SA tries to call URI canonicalization, but that's it so far?
I use the config directives described in the documentation for
Mail::SpamAssassin::Plugin::ExtractText:
  extracttext_external tesseract {OMP_THREAD_LIMIT=1} /usr/bin/tesseract -c page_separator= {} -
  extracttext_use tesseract .jpg .png .bmp .tif .tiff image/(?:jpeg|png|x-ms-bmp|tiff)
and it looks like it works.
One just needs to have tesseract installed.
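
For illustration only, here is a minimal Python sketch of what "treating an
inline image like an attachment" could mean: pull each data: URI out of the
HTML and decode it to raw bytes that an OCR tool could then read. This is not
how SpamAssassin or ExtractText work internally; the function name
extract_inline_images and the regular expression are assumptions made up for
this example, and real-world HTML would need a proper parser.

```python
import base64
import re
from urllib.parse import unquote_to_bytes

# Hypothetical pattern: src="data:image/<subtype>[;params],<payload>"
DATA_IMG_RE = re.compile(
    r"""src\s*=\s*["']data:(image/[\w.+-]+)((?:;[^;,"']*)*),([^"']*)["']""",
    re.IGNORECASE,
)


def extract_inline_images(html):
    """Return a list of (mime_type, raw_bytes), one per inline data: image."""
    images = []
    for mime, params, payload in DATA_IMG_RE.findall(html):
        if params.lower().endswith(";base64"):
            # Base64 payloads may be line-wrapped; drop whitespace first.
            raw = base64.b64decode("".join(payload.split()))
        else:
            # Without ";base64" the payload is percent-encoded.
            raw = unquote_to_bytes(payload)
        images.append((mime.lower(), raw))
    return images
```

Each returned byte string could be written to a temporary file and handed to
tesseract in the same way the extracttext_external line above does for real
attachments.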
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
LSD will make your ECS screen display 16.7 million colors