Hi Lorenzo,

The link you shared and your suggestions helped me a lot. The image detection now seems to work for at least 90% of my cases.
Just for information: I started with a top-hat transform, followed by the Sobel operator, and then applied Otsu thresholding. I then took the coordinates enclosing the two largest areas, which gives me a box (with some allowance) that covers only the text I need and removes all other noise. The next important step was to determine whether the image had to be inverted, for which I used the average colour. Finally, I used the suggested page segmentation mode (psm 6) to read my characters.

There are probably better ways, but this works for the time being.

Thanks a lot. :)

-Arjun

On Thursday, 2 May 2019 19:31:39 UTC+2, Lorenzo Blz wrote:
>
> Hi,
> use psm 6 (or 7). Also try to crop to have a single line, if possible.
> Black text on white bg is better.
>
> You should be able to isolate text in this way:
>
> https://www.pyimagesearch.com/2017/07/17/credit-card-ocr-with-opencv-and-python/
>
> Lorenzo
>
> On Thu, 2 May 2019 at 16:15, Arjun Bk <abk...@gmail.com> wrote:
>
>> Hi,
>>
>> Attached here are two images that I cannot recognize using tesseract-ocr.
>> I am calling a standalone .exe file (tesseract v4.0.0.20181030) in my
>> python script to generate the pytesseract function. So far, I have not
>> tried training my program. If anybody hints on that, I will try to learn it
>> and apply. I wanted to know if there were some simple image enhancement
>> solutions that I overlooked.
>>
>> I have already tried to do a manual threshold, resize and invert of the
>> image using OpenCV. Additionally, I wanted to specify that I only wish to
>> detect the digits in each case.
>>
>> Thanks.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/7054772e-ceea-4233-9926-96eb96a5d3d4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.