Thanks to both of you for replying. I'm using Charles Weld's NuGet package
(https://github.com/charlesw/tesseract/) so at the moment I think I am
stuck on version 4.1.1. I have to admit Tesseract is a bit of a black box
to me, and short of setting a few variables I am not I am at a bit of a
los
You provided no example, so just a hint: have a look at the Leptonica
function pixAutoPhotoinvert [1], which should help in such cases. I believe
the function is available from version 1.79.0.
[1] https://github.com/DanBloomberg/leptonica/blob/5aaf1c187deeef7f47288c6b0833a07021940da7/src/pageseg.c#L2370-L2391
Z
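
To illustrate the general idea behind that hint (find the dark regions and
photometrically invert them before handing the image to OCR), here is a
minimal sketch in plain Python. It is not Leptonica's algorithm and not part
of any Tesseract wrapper: the helper names region_mean and invert_if_dark,
the list-of-rows image layout, and the threshold of 128 are all assumptions
made for this example, and the rectangle is assumed to be known in advance
and to lie fully inside the image.

```python
from typing import List

# Hypothetical image layout for this sketch: rows of 8-bit grayscale
# values, where 0 is black and 255 is white.
Gray = List[List[int]]


def region_mean(image: Gray, x: int, y: int, w: int, h: int) -> float:
    """Average pixel value inside the rectangle (x, y, w, h).

    Assumes the rectangle lies fully inside the image.
    """
    total = sum(sum(row[x:x + w]) for row in image[y:y + h])
    return total / (w * h)


def invert_if_dark(image: Gray, x: int, y: int, w: int, h: int,
                   threshold: int = 128) -> bool:
    """Invert the rectangle in place when its mean is below threshold.

    A mostly dark region is taken to be white-on-black text.
    Returns True if the region was inverted, False otherwise.
    """
    if region_mean(image, x, y, w, h) >= threshold:
        return False
    for row in image[y:y + h]:
        for col in range(x, x + w):
            row[col] = 255 - row[col]
    return True


# Tiny example: the top row is normal, the lower two rows are inverted text.
page = [
    [255, 255, 255, 255],
    [0, 0, 255, 0],
    [0, 255, 0, 0],
]
changed = invert_if_dark(page, 0, 1, 4, 2)  # only the lower two rows
```

A fixed mean threshold is the simplest possible test; on real forms the
regions would have to be located first, which is the part a function such as
pixAutoPhotoinvert is meant to handle.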
Hi,
On 01/07/2021 18:39, 'Chris' via tesseract-ocr wrote:
> I am experimenting with Tesseract 4.1.1 using C# to extract text from black
> and white or greyscale TIF images of semi-structured forms that are 300
> dpi.
>
> The results are really promising except when some of the text is inverted
I am experimenting with Tesseract 4.1.1 using C# to extract text from black
and white or greyscale TIF images of semi-structured forms that are 300
dpi.
The results are really promising except when some of the text is inverted
(i.e. white on black). In these cases the results are poor. Can anyon