I am in the same boat. I am using the latest version of Tesseract (5.3) on 
a Mac. The guide describes a way to add missing characters by fine tuning, 
but it is very difficult to follow: it has many steps, and I could not 
wrap my head around it, so I gave up after a couple of attempts. 

The guide is "How to train Tesseract 4.00" on tessdoc (tesseract-ocr.github.io): 
<https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html> 
The relevant section is "Fine Tuning for ± a few characters".

- Fine tuning from the existing .traineddata with the usual methods does 
not add the missing characters. 
- I have tried different ways to fine tune: increasing and decreasing the 
number of iterations, increasing and decreasing the number of training 
lines, feeding in many lines containing the missing characters, etc., all 
to no avail. 
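
For anyone trying to reproduce this, one way to confirm which characters are actually absent from a model is to compare the training text against the model's unicharset. The following is only a minimal sketch under several assumptions: that the unicharset has been unpacked from the .traineddata (for example with combine_tessdata -u), that its first line is the entry count and each later line starts with the character, and that entries named NULL, Joined and |Broken|0|1 are special placeholders rather than real characters. The function names are hypothetical, not part of Tesseract.

```python
# Special unicharset entries assumed not to be real characters.
SPECIAL_ENTRIES = {"NULL", "Joined", "|Broken|0|1"}


def unicharset_codepoints(lines):
    """Collect every code point that appears in any unicharset entry.

    `lines` is the unicharset file split into lines. The first line is
    assumed to hold the entry count, so it is skipped.
    """
    known = set()
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        unichar = fields[0]
        if unichar in SPECIAL_ENTRIES:
            continue
        # An entry may span several code points (e.g. a consonant plus a
        # vowel sign), so each code point is recorded separately.
        known.update(unichar)
    return known


def missing_characters(text, known):
    """Return the sorted non-whitespace characters of `text` not in `known`."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in known})


if __name__ == "__main__":
    # Tiny made-up unicharset; the property columns are placeholders.
    sample = ["4", "NULL 0 Common 0", "a 3 Latin 1", "b 3 Latin 2", "క 1 Telugu 3"]
    known = unicharset_codepoints(sample)
    print(missing_characters("abc కఖ", known))
```

With a real model, the lines would come from the unpacked unicharset file and the text from the ground-truth transcriptions. Any character this reports is one the existing model cannot output, which is the case the "Fine Tuning for ± a few characters" section is meant to address.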

So, dear Zdenko, could you please explain how to fine tune for new 
characters, in simple (layman's) terms?


On Thursday, August 17, 2023 at 11:55:47 AM UTC+3 zdenop wrote:

> Please provide details of what you are doing, including the Tesseract 
> version, OS, and which tessdata you used.
>
> Make sure you read the Tesseract documentation, and please also say which 
> suggested solution you used and which character is missing (not 
> everybody is familiar with Telugu).
>
> Zdenko
>
>
> On Fri, 11 Aug 2023 at 19:07, ravi kumar <rev...@gmail.com> wrote:
>
>> Hi,
>> I am new to this program and not sure how or where to start fixing this. 
>> I have attached an image that was used for testing Tesseract, along with 
>> the hOCR file to trace the missing character. Can someone interpret them 
>> and guide me to the fix?
>>
>> TIA,
>> Ravi Kumar. 
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/cf266779-e08c-4d8c-b970-738d2ad48084n%40googlegroups.com
>>
>

