https://github.com/tesseract-ocr/tesseract/pull/2294 by @bertsky adds
whitelist/blacklist functionality for Tesseract 4. It has not been merged
yet.
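Until something like that lands, two workarounds come to mind. Below is a minimal Python sketch of both, not a tested recipe. The first builds a command line that runs the legacy engine (`--oem 0`) with `tessedit_char_whitelist`, which assumes your traineddata still includes the legacy model. The second simply drops unwanted characters from the LSTM output after the fact, which is not real whitelisting because it does not steer recognition. The helper names, `page.png`, and `out` are made up for illustration.

```python
def tesseract_whitelist_args(image_path, output_base, whitelist):
    """Build a tesseract command line restricted to `whitelist`.

    Assumption: the legacy engine (--oem 0) is selected, and the installed
    traineddata contains the legacy model.
    """
    return [
        "tesseract", image_path, output_base,
        "--oem", "0",
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def filter_to_whitelist(text, whitelist):
    """Post-filter OCR output: keep whitelisted characters, spaces, newlines.

    This only removes characters after recognition; it does not make the
    engine prefer whitelisted characters.
    """
    allowed = set(whitelist) | {" ", "\n"}
    return "".join(ch for ch in text if ch in allowed)


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    print(" ".join(tesseract_whitelist_args("page.png", "out", "0123456789")))
```

The argument list can be handed to `subprocess.run` as is; the post-filter is only useful when stray characters are noise rather than misread digits.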
On Sat, Mar 23, 2019 at 2:58 PM Lorenzo Bolzani wrote:
> On Tue, Mar 19, 2019 at 06:03, Jonathan Muller <
> jmul...@pukogames.com> wrote:
On Tue, Mar 19, 2019 at 06:03, Jonathan Muller <
jmul...@pukogames.com> wrote:
> 5 - Create a whitelist based on the zone of probable characters (this one
> improves accuracy a lot !)
>
How do you do whitelisting with Tesseract 4.x? As far as I know, it is not yet
supported.
I do the
Thank you for your response. My experience with OCR is limited to
converting screenshots I take online; yours is far more extensive, I
think.
And thank you particularly for items 2 and 5. Slight skewing of the image
may better account for the distortions in size and/or aspect ratio that
I don't really agree with your statement. There are a lot of things we had
to consider in image processing before Tesseract finally gave us accurate
results. But it all makes sense. Here is our actual pipeline:
1 - Clean up the image: remove any artifacts from the camera or scanning device,
cut the pape
I would like some advice concerning the general use of tesseract, because
my experience with it tends to two extremes: either tesseract performs
flawlessly, with no prior modification of the image necessary except
cropping to the text and (most significant) enlarging the image by a factor
of 2