
Re: [tesseract-ocr] Improve text extraction

2022-07-22 Thread Lorenzo Bolzani

Hi Atef, I think your best option is to generate a lot of images as bad as this one and use them for training. So you take thousands of good images (with the corresponding text) and ruin/blur them in many different ways. In this way, for example, from 1000 good images you get 5000/1 bad ima
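The degradation idea in this reply can be sketched roughly as follows. This is only an illustration under assumptions, not Lorenzo's actual pipeline: images are modelled as plain lists of rows of 0-255 grayscale values, and the helper names (`box_blur`, `add_noise`, `degrade`, `augment`) as well as the blur radii and noise levels are made up for the example. A real setup would more likely use an imaging library and render the results to files for Tesseract training.

```python
import random


def box_blur(img, radius=1):
    """Blur a grayscale image (list of rows of 0-255 ints) with a square mean filter."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = count = 0
            # Average over the neighbourhood, ignoring pixels outside the image.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        count += 1
            row.append(total // count)
        out.append(row)
    return out


def add_noise(img, amount, rng):
    """Add uniform noise in [-amount, amount] to every pixel, clamped to 0-255."""
    return [
        [max(0, min(255, p + rng.randint(-amount, amount))) for p in row]
        for row in img
    ]


def degrade(img, rng):
    """Apply a randomly chosen blur and noise level (hypothetical settings)."""
    radius = rng.choice([0, 1, 2])
    blurred = box_blur(img, radius) if radius else [row[:] for row in img]
    return add_noise(blurred, rng.choice([0, 10, 25]), rng)


def augment(samples, variants_per_image=5, seed=0):
    """samples: list of (image, text). Returns degraded (image, text) pairs.

    The ground-truth text is reused unchanged for every degraded copy.
    """
    rng = random.Random(seed)
    return [
        (degrade(img, rng), text)
        for img, text in samples
        for _ in range(variants_per_image)
    ]
```

With five variants per image, 1000 good samples would yield 5000 degraded training pairs, each keeping the transcription of the clean original.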

[tesseract-ocr] Improve text extraction

2022-07-20 Thread Atef Chatty

Hi, I want to extract information from unclear images. I tried many filters but they don’t help. Here is an example. This is the input picture: Example.png. Why do I want to extract this information? I am working on a project to extract information from driver’s licenses. The extraction i

Re: [tesseract-ocr] Improve text extraction when some text is inverted

2021-07-02 Thread 'Chris' via tesseract-ocr

Thanks to both of you for replying. I'm using Charles Weld's NuGet package (https://github.com/charlesw/tesseract/) so at the moment I think I am stuck on version 4.1.1. I have to admit Tesseract is a bit of a black box to me, and short of setting a few variables I am at a bit of a los

Re: [tesseract-ocr] Improve text extraction when some text is inverted

2021-07-02 Thread Zdenko Podobny

You provided no example, so just a hint: have a look at the leptonica function pixAutoPhotoinvert [1], which should help in such cases. The function is available, IMO, from version 1.79.0. [1] https://github.com/DanBloomberg/leptonica/blob/5aaf1c187deeef7f47288c6b0833a07021940da7/src/pageseg.c#L2370-L2391 Z
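To make the hint concrete, here is a very rough sketch of the general idea of detecting and flipping inverted regions before OCR. It is not leptonica's pixAutoPhotoinvert and does not reproduce its algorithm; it is a simplified stand-in that assumes a region has already been cropped out and is given as a list of rows of 0-255 grayscale values. The function names and the threshold of 128 are assumptions chosen for the example.

```python
def invert(img):
    """Return the photometric inverse of a grayscale image (list of rows of 0-255 ints)."""
    return [[255 - p for p in row] for row in img]


def auto_invert_region(img, threshold=128):
    """Invert a region if most of its pixels are dark.

    Heuristic: text normally covers less area than its background, so a region
    that is mostly dark is assumed to be white-on-black and gets flipped.
    """
    pixels = [p for row in img for p in row]
    dark = sum(1 for p in pixels if p < threshold)
    return invert(img) if dark * 2 > len(pixels) else img
```

Applied per form field or text block, a check like this would turn white-on-black areas into black-on-white and leave normal regions untouched. With the NuGet wrapper mentioned in the thread, the equivalent step would have to happen in the image preprocessing before the page is passed to Tesseract.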

Re: [tesseract-ocr] Improve text extraction when some text is inverted

2021-07-02 Thread Merlijn B.W. Wajer

Hi,

On 01/07/2021 18:39, 'Chris' via tesseract-ocr wrote:
> I am experimenting with Tesseract 4.1.1 using C# to extract text from black
> and white or greyscale TIF images of semi structured forms that are 300
> dpi.
>
> The results are really promising except when some of the text is inverted

[tesseract-ocr] Improve text extraction when some text is inverted

2021-07-01 Thread 'Chris' via tesseract-ocr

I am experimenting with Tesseract 4.1.1 using C# to extract text from black and white or greyscale TIF images of semi-structured forms that are 300 dpi. The results are really promising except when some of the text is inverted (i.e. white on black). In these cases the results are poor. Can anyon


6 matches
