Hi, I need to extract handwritten Malayalam text. I think it's possible to fine-tune Tesseract 5 for handwritten Malayalam, but I couldn't find any document that explicitly states the data requirements for doing so (there may be one that I missed). According to ChatGPT, the estimated requirement is 4 lakh (400,000) text samples. How can I verify whether this figure is accurate? Additionally, based on the documentation, I believe training runs only on a CPU. How much time does training take? I couldn't find answers to these questions in the documentation. Where can I find information on aspects such as training time and data requirements?