Have a look at https://github.com/tesseract-ocr/tesseract/issues/2342 and search for "tesseract OCR dot matrix"; there are several suggestions on how to improve OCR results, e.g. https://jeffreymorgan.io/articles/improve-dot-matrix-ocr-performance-tutorial/
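To illustrate the kind of preprocessing that tends to come up for dot-matrix text: the idea is to grow each dot until neighbouring dots touch, so the glyph looks like a solid stroke rather than a grid of separate blobs. Below is a minimal pure-Python sketch of that dilation step on a tiny binary grid. The function name `dilate`, the `radius` parameter and the sample grid are made up for illustration and are not taken from the linked pages; on a real image a library filter (for dark text on a light background, something like Pillow's `ImageFilter.MinFilter`) should have a similar effect, and the right radius depends on the dot spacing in your images.

```python
def dilate(grid, radius=1):
    """Grow every ink pixel (1) by `radius` pixels in all directions.

    `grid` is a list of rows of 0/1 values; a new grid is returned.
    """
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if grid[y][x]:
                # Mark the square neighbourhood around this ink pixel,
                # clipped to the image borders.
                for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[yy][xx] = 1
    return out


# Hypothetical sample: a vertical stroke made of three dots
# separated by one-pixel gaps, as in a dot-matrix digit.
dots = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

solid = dilate(dots, radius=1)
for row in solid:
    print("".join("#" if pixel else "." for pixel in row))
```

With radius 1 the three dots merge into one continuous bar. Too large a radius would also close the gaps between neighbouring digits, so it is worth trying a few values on real images.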
PS: It does not make sense to post custom traineddata; the problem could also be in how it was created/trained.

Zdenko

On Sun, 2 Jul 2023 at 11:27, Simon Plackett <plackett...@gmail.com> wrote:

> Hi,
>
> I have a display I would like to OCR - an example image is at the bottom.
> I have found a font ('5x7-dot-matrix') that, as far as I can tell, matches
> the number format exactly. I have created 40k files similar to
> pil_image_10445.png with their corresponding .gt.txt files and created a
> new traineddata file.
>
> My character set is limited to 0-9 and .
>
> I have tried using random sets of characters, and a more structured set
> nnnnn.nnn; the results from all of the traineddata files are poor.
>
> I have also tried turning the image to grayscale, cropping, enhancing the
> contrast etc., to no avail. I am lucky to get 1 digit recognised.
>
> Bizarrely, I get the same output no matter which input image file I use!!
> (Using the attached traineddata file and the attached image I get 6.) -
> which is the same output for all the files I have tried.
>
> I have stuck to either psm 7 or 13, as the others largely don't give any
> output.
>
> I would like some advice about whether continuing to increase the training
> data set will help, or any hints about trying to get better OCR success for
> these digits.
>
> I am using tesseract 5.3.0 and leptonica 1.83.0 on a Debian 11 machine.
> I built tesstrain as per the GitHub instructions.
>
> I am using:
>
> ./tesseract -l dot_gas_int --psm 7 ~/tesstrain/data/dot_gas_int-ground-truth/pil_image_10445.png stdout
>
> Apologies if I am doing something dumb, this is new to me and I am having
> a go :-)
>
> Thanks
> Simon
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/fa2232b3-852b-421e-939e-177971178faen%40googlegroups.com