On Thu, 22 Feb 2024, 07:32 Dror Musai wrote:
> Hi
>
> I'm using version 5.3 of Tesseract with the Hebrew language. I still do not
> understand why Adobe + Foxit cannot find words in the PDF after OCR.
>
PDF does not equal "text"! PDF is a *complex* format where, more often than
not, human-visible "t
Hi experts,
I've read that Tesseract is not good at image OCR, for images like internet
photos, but does well on PDF text. Is this true, or do I need to build some
complex training to guide it?
Sent from my iPhone
On Feb 14, 2024, at 12:28, Glenn C wrote:
Hi all,
I'm trying to build a meme text extractio
I only skimmed Ger's long reply, but didn't see a link to the issue, which
I think is the important bit of information:
https://github.com/tesseract-ocr/tesseract/issues/238
It's a long-standing (and complex) problem in which behavior varies across
different PDF viewers.
Tom
All,
I need some help extracting the text from this image. I'm using the
command line version of Tesseract from UBMannheim. I think version 5.2 is
installed. I've tried every PSM, and nothing seems to pull the text out. If I
crop off the minus sign, it works perfectly.
Any tips at all would be appreci
4 matches