Hi Soma, to restrict recognition to digits you can set the parameter tessedit_char_whitelist to 0123456789. For usage, see this parameter overview: https://muthu.co/all-tesseract-ocr-options/
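As a rough sketch of how the whitelist can be passed, here is a small Python wrapper around the tesseract command line, using its -c VAR=VALUE option. The helper names (build_tesseract_cmd, recognise_digits) and the file name digits.png are made up for illustration, and "ara" is only taken from your description of the model you started from. If your images contain Arabic-Indic digits rather than 0-9, the whitelist string would need those glyphs instead.

```python
import os
import shutil
import subprocess


def build_tesseract_cmd(image_path, whitelist="0123456789", lang="ara"):
    """Build a tesseract CLI call that only allows characters in `whitelist`.

    Hypothetical helper for illustration; the defaults are assumptions.
    """
    return [
        "tesseract",
        image_path,
        "stdout",  # send the recognised text to standard output
        "-l", lang,
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def recognise_digits(image_path):
    """Run tesseract on one image and return the recognised text."""
    cmd = build_tesseract_cmd(image_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # "digits.png" is a placeholder file name, not something from this thread.
    if shutil.which("tesseract") and os.path.exists("digits.png"):
        print(recognise_digits("digits.png"))
```

Whether the LSTM engine honours the whitelist depends on the Tesseract version, so it is worth checking the output with your own build.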
Ale

On Tuesday, November 1, 2022 at 11:29:41 UTC+1, Soma Al wrote:
> I'm trying to fine-tune Tesseract to recognize digits only, but I'm not getting good results so far. I continued training from the Arabic model "ara", since the digits I'm trying to recognize are Arabic numbers.
> Training stops early at a 0.01 error rate, but the results on the test data are really bad.
>
> I'm using my own box/tif files and my own training text with Tesstrain.h
>
> Any recommendations on what I should do to get better results?