Hi Lorenzo,

The link you shared and your suggestions helped me a lot. The image 
detection now seems to work for at least 90% of my cases.

Just for information: I started with a top-hat transform, followed by the 
Sobel operator, and then applied Otsu thresholding. I then took the 
bounding coordinates of the two largest areas, which gives me a box (with 
some allowance) that covers only the text I need and removes the other 
noise. The next important step was to determine whether the image had to 
be inverted, for which I used the average colour. Finally, I used the 
suggested page segmentation mode (psm 6) to read my characters. There are 
probably better ways, but this works for the time being.

Thanks a lot. :)

-Arjun


On Thursday, 2 May 2019 19:31:39 UTC+2, Lorenzo Blz wrote:
>
> Hi,
> Use psm 6 (or 7). Also try to crop the image to a single line, if 
> possible. Black text on a white background works better. 
>
> You should be able to isolate text in this way:
>
>
> https://www.pyimagesearch.com/2017/07/17/credit-card-ocr-with-opencv-and-python/
>
>
> Lorenzo
>
> On Thu, 2 May 2019 at 16:15, Arjun Bk <abk...@gmail.com> wrote:
>
>> Hi, 
>>
>> Attached are two images that I cannot recognize using tesseract-ocr. 
>> My Python script uses pytesseract, which calls a standalone .exe file 
>> (tesseract v4.0.0.20181030). So far, I have not tried any training. If 
>> anybody can give me hints on that, I will try to learn it and apply it. 
>> I wanted to know if there are some simple image enhancement solutions 
>> that I have overlooked.
>>
>> I have already tried to do a manual threshold, resize and invert of the 
>> image using OpenCV. Additionally, I wanted to specify that I only wish to 
>> detect the digits in each case.
>>
>> Thanks.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesser...@googlegroups.com.
>> To post to this group, send email to tesser...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/043e6ecb-36c1-44de-ad8b-b91734b24b59%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

