Hi Zdenop,

That image was only the portion of the full image that Tesseract did not detect properly. The full image contains a lot of information, which is why I didn't share it. All of that information is important, and psm 11 works well for it. With psm 6 some lines are missed, so I can't use it.

I have tried psm 11 with oem 0, 1, 2 and 3, but none of them gives the result I want. psm 11 is still the best choice for me; the only issue is the numbers. Can you advise something on this?

Thanks
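For reference, the comparison across engine modes can be sketched roughly like this. It is a minimal sketch, not the exact script: `build_config`, `compare_engines` and `run_on_image` are hypothetical helper names and the image path is an assumption, while the crop box, `tessedit_do_invert=0` and `--psm 11` are taken from the call quoted further down in this thread. As far as I know, oem 0 and 2 only work when the installed traineddata includes the legacy model, so failures are recorded rather than raised.

```python
from typing import Callable, Dict, Iterable


def build_config(psm: int, oem: int, invert: bool = False) -> str:
    """Build a Tesseract config string in the same shape as the one used in this thread."""
    return f"-c tessedit_do_invert={int(invert)} --psm {psm} --oem {oem}"


def compare_engines(
    ocr: Callable[[str], str],
    psm: int = 11,
    oems: Iterable[int] = (0, 1, 2, 3),
) -> Dict[int, str]:
    """Run `ocr(config)` once per engine mode and collect the text per oem value."""
    results: Dict[int, str] = {}
    for oem in oems:
        try:
            # Collapse blank lines the same way the original call does.
            results[oem] = ocr(build_config(psm, oem)).replace("\n\n", "\n")
        except Exception as exc:  # e.g. legacy engine data not installed
            results[oem] = f"<failed: {exc}>"
    return results


def run_on_image(path: str) -> Dict[int, str]:
    """Hypothetical driver: OCR the amounts region of the form with each oem."""
    # Imported lazily so the helpers above work without Tesseract installed.
    import pytesseract
    from PIL import Image

    region = Image.open(path).crop((2000, 1570, 2500, 2000))
    return compare_engines(
        lambda cfg: pytesseract.image_to_string(region, lang="eng", config=cfg)
    )
```

Calling `run_on_image("form.png")` (the file name is a placeholder) returns a dict keyed by oem value, which makes it easy to see side by side which engine keeps the decimals attached to the amounts.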
On Wednesday, April 21, 2021 at 10:35:09 PM UTC+5:30 zdenop wrote:

> 1. You got the result for the image you provided.
> 2. I suggest you use another oem.
> 3. I know that invoice digitization tools use different parameters for parsing numbers.
>
> Zdenko
>
> On Wed, 21 Apr 2021 at 17:45, Kumar Rajwani <kumarraj...@gmail.com> wrote:
>
>> Hi Zdenop,
>> As I said, I know psm 6 works better on numbers, but it is not able to get all the text in the image, where psm 11 does better. That is the reason I want to go with psm 11, but I am getting wrong amounts; that's the only problem I am facing with psm 11. So can you tell me how I can achieve the same result as you with psm 11?
>> Thanks
>>
>> On Wednesday, April 21, 2021 at 8:34:20 PM UTC+5:30 zdenop wrote:
>>
>>> Try to use better config parameters, e.g.:
>>>
>>> $ tesseract download.png - --psm 6 --oem 0
>>>
>>> will produce:
>>>
>>> $ 250,941.00
>>> $ -75,282.00
>>> $ 175,659.00
>>> $ -15,072 00
>>> $ 2,860.00
>>> $ 0.00
>>> $ 163,447.00
>>>
>>> The legacy engine could be better for numbers.
>>>
>>> Zdenko
>>>
>>> On Wed, 21 Apr 2021 at 14:10, Kumar Rajwani <kumarraj...@gmail.com> wrote:
>>>
>>>> Hey,
>>>> I am using Tesseract to identify amounts in my forms. See the image below for a sample. I get the correct amounts with decimals in psm 6, but when I use psm 11 I get the following output. I have to use psm 11 because it identifies more text than psm 6 in my images.
>>>>
>>>> 250,941
>>>> 00
>>>> 00
>>>> -75,282
>>>> 175,659
>>>> 00
>>>> -15,072
>>>> 00
>>>> 2,860
>>>> 00
>>>> 00
>>>> 163,447
>>>> 00
>>>>
>>>> The code I am using:
>>>>
>>>> print(pytesseract.image_to_string(image.crop((2000, 1570, 2500, 2000)),
>>>>                                   lang="eng",
>>>>                                   config='-c tessedit_do_invert=0 --psm 11').replace("\n\n", "\n"))
>>>>
>>>> I want to ask whether there are any changes I can make to get the decimal point with psm 11.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4d793afb-b554-4322-83ef-4ff94accc85en%40googlegroups.com