I believe the key is the page segmentation mode (PSM). Try experimenting with it: https://www.pyimagesearch.com/2021/11/15/tesseract-page-segmentation-modes-psms-explained-how-to-improve-your-ocr-accuracy/
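One quick way to experiment is to run the tesseract command-line tool once per PSM and compare the output. This is only a rough sketch, not a recommendation of specific values: the helper names, the default list of modes, and the file name `sample.png` are my own assumptions, and it assumes the `tesseract` executable is on your PATH. Since you only need digits, it also passes `tessedit_char_whitelist`, which depending on your Tesseract version may or may not be honored by the LSTM engine.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, psm, whitelist="0123456789"):
    """Build a tesseract CLI call that prints recognized text to stdout."""
    # Tesseract documents page segmentation modes 0 through 13.
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    return [
        "tesseract", image_path, "stdout",
        "--psm", str(psm),
        # Restrict recognition to the given characters (digits by default).
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def try_psms(image_path, psms=(6, 7, 10, 11, 12)):
    """Run tesseract once per PSM and return {psm: recognized text}.

    The default modes are just a guess at ones worth trying for
    sparse, isolated characters.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    results = {}
    for psm in psms:
        proc = subprocess.run(
            build_tesseract_command(image_path, psm),
            capture_output=True,
            text=True,
        )
        results[psm] = proc.stdout.strip()
    return results
```

Calling `try_psms("sample.png")` returns one result per mode, so you can see at a glance which one (if any) picks up the digits. Once a mode works from the command line, the same page segmentation mode can be set through the C# wrapper.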

On Tuesday, December 28, 2021 at 11:45:54 PM UTC+3 thisism...@gmail.com wrote:

> Thanks, yes, I had looked at that.
> I began by expanding the image by 5x to get the characters to about 50
> pixels high (vs. about 8 initially).
> My initial tests generated a tessinput.tif that looked very good to my
> eye, but did not work for the OCR.
> I ended up also doing:
> - posterize to level 2 to reduce the colors
> - dilate, to reduce the thickness of the characters
> But this still was not working.
>
> I suspect the single characters and the lines between them are causing
> issues.
> I had tried several of the many settings in the config file, hoping to
> figure out ones that would work, but got nowhere and seemed to be shooting
> in the dark.
> As I am unfamiliar with these many settings and did not find details on
> the meaning of many of them, my question was hoping to find some ideas on
> which ones might be helpful.
>
> In the end I have defined rectangles for the position of each character,
> then copied all these rectangles to a new image, placing the characters in
> nice rows.
> This worked on the sample image I have, but I do not yet have additional
> samples to see if it works on them.
> I had hoped to avoid coding the detail for the exact position of the
> characters and that it might read them as is.
> Will see later, when more samples arrive, if this is a workable solution.
>
> On Tuesday, December 28, 2021 at 4:33:56 AM UTC-5 zdenop wrote:
>
>> Did you read the docs?
>> https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md
>>
>> Zdenko
>>
>>
>> On Tue, Dec 28, 2021 at 10:28 michael c <thisism...@gmail.com> wrote:
>>
>>> I am just starting to use the tesseract package and am having no luck
>>> getting it to recognize anything.
>>> My environment is C# using the package from NuGet.
>>> I am able to run it fine, but no text is recognized in my sample image.
>>> It does work on the provided 'phototest.tif'.
>>> I have fiddled with many parameters in the config file and none has
>>> resulted in any useful output.
>>> I only need to recognize digits, and the image will have the same
>>> consistent form as this one.
>>> Any hints on parameters I should look at to get this running?
>>>
>>> [image: 20211222_Capture_cut.PNG]
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/d12cad94-ad9c-4659-87bc-94a57b58a4e1n%40googlegroups.com