I am in the same boat. I am using the latest version of Tesseract (5.3) on a Mac. The guide describes a way to add missing characters by fine tuning, but it is very difficult to follow and has many steps. I couldn't wrap my head around it and gave up after a couple of attempts.
The guide is "How to train Tesseract 4.00" on tessdoc <https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html>, and the section is "Fine Tuning for ± a few characters".

- Fine tuning from the existing .traineddata using the usual methods does not add the missing characters.
- I have tried different ways to fine tune: increasing and decreasing the number of iterations, increasing and decreasing the number of training lines, feeding many lines that contain the missing characters, and so on, all to no avail.

So, dear Zdenko, can you please tell us how to fine tune for new characters, in simple (layman's) terms?

On Thursday, August 17, 2023 at 11:55:47 AM UTC+3 zdenop wrote:

> Please provide details of what you are doing, including the Tesseract
> version, OS, and which tessdata you used.
>
> Make sure you read the Tesseract documentation, and please also say
> which suggested solution you used and which character is missing (not
> everybody is familiar with Telugu).
>
> Zdenko
>
> On Fri, Aug 11, 2023 at 19:07, ravi kumar <rev...@gmail.com> wrote:
>
>> Hi,
>> I am new to this program and not sure how or where to start fixing this.
>> I have attached an image used for testing Tesseract and an hOCR
>> file to trace the missing character; can someone interpret it and
>> guide me to the fix?
>>
>> TIA,
>> Ravi Kumar.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/cf266779-e08c-4d8c-b970-738d2ad48084n%40googlegroups.com