I have around 10 symbols (icons), for example ← or →, that I would like to add to the English model. I can generate as many images of these symbols as I want. However, they do not conform to any single font, and no one font contains all of them: symbol A might be missing from Font A but available in Font B, while symbol B is available in Font A but not in Font B.

I looked at the following section of the documentation:
https://github.com/tesseract-ocr/tessdoc/blob/main/tess4/TrainingTesseract-4.00.md#fine-tuning-for--a-few-characters

But I cannot simply add the symbols to the training text, because any symbol that does not exist in the chosen font is rendered as a question mark in the generated images.

So my questions are:

1. Which solution is best for me?
   A. Fine-tune the existing model
   B. Retrain a couple of layers
   C. Train from scratch and combine the result with the English language model
2. What kind of data would I need for that? Are there any tools that would help me generate it in my case?
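To make the second question more concrete, here is a rough sketch of the text side of the data I could produce. It assumes the line-image layout used by the tesstrain repository as I understand it, where each line image sits next to a `.gt.txt` file with the same base name. The word list, the file naming, and the function names are all hypothetical, and the matching images would still have to be composed separately by pasting my pre-rendered symbol images between ordinary rendered words.

```python
import random
from pathlib import Path

# Placeholder symbols and filler words; both lists are assumptions for illustration.
SYMBOLS = ["←", "→"]
WORDS = ["turn", "left", "right", "exit", "next", "page", "back", "go"]


def make_lines(symbols, words, per_symbol=5, seed=0):
    """Return text lines; each line holds one symbol among three filler words,
    so every symbol occurs exactly per_symbol times across all lines."""
    rng = random.Random(seed)
    lines = []
    for sym in symbols:
        for _ in range(per_symbol):
            tokens = rng.sample(words, 3)
            # Put the symbol at a random position so it is seen in varied context.
            tokens.insert(rng.randrange(len(tokens) + 1), sym)
            lines.append(" ".join(tokens))
    rng.shuffle(lines)
    return lines


def write_ground_truth(lines, out_dir):
    """Write one UTF-8 .gt.txt file per line and return the paths.
    The line image for line_0000.gt.txt would be saved as line_0000.png."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(lines):
        path = out / f"line_{i:04d}.gt.txt"
        path.write_text(text + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_ground_truth(make_lines(SYMBOLS, WORDS), "ground-truth")` would give me a folder of transcriptions to pair with images. Whether this kind of mixed data is what fine-tuning actually needs is exactly what I am unsure about.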