Since it is not really possible to combine my from-scratch traineddata with an existing one, I have decided to also train my traineddata model on numbers. I therefore wrote a script that synthetically generates ground-truth data with text2image. The script uses dozens of different fonts and creates numbers in the following formats:

X.XXX
X.XX
X,XX
X,XXX

I generated 10,000 files to train the numbers. Unfortunately, with the best model, numbers are recognized quite poorly (most of the time only "0.", "0" or "0," is recognized).

So I wanted to ask: is this not enough training (ground-truth) data for proper recognition when I train on several fonts?

Thanks in advance for your help.
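
For illustration, the text side of such a generator could look roughly like the sketch below. It is not the actual script from this post: the names `FORMATS`, `random_number` and `make_lines`, the fixed seed, and the uniform choice of digits and formats are all assumptions made for the example. It only produces the number strings; rendering them with text2image and the chosen fonts is left out.

```python
import random

# The four formats listed in the post; each X stands for one digit.
FORMATS = ["X.XXX", "X.XX", "X,XX", "X,XXX"]


def random_number(fmt, rng):
    """Replace every X in fmt with a uniformly random digit (0-9)."""
    return "".join(rng.choice("0123456789") if ch == "X" else ch for ch in fmt)


def make_lines(count, seed=0):
    """Return `count` number strings, each in a randomly picked format.

    The seed is a hypothetical default so that runs are repeatable.
    """
    rng = random.Random(seed)
    return [random_number(rng.choice(FORMATS), rng) for _ in range(count)]


if __name__ == "__main__":
    # Print a few sample ground-truth lines.
    for line in make_lines(5):
        print(line)
```

Each string would then go into its own one-line ground-truth text file before rendering. Drawing every digit uniformly means no single digit, such as a leading 0, dominates the generated data.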

