[tesseract-ocr] Tesseract arabic numbers

2024-01-04 Thread Ahmed Khalid
I have a problem: I want to use Tesseract to read Arabic numbers, but its accuracy is low and it gives me incorrect readings.

[tesseract-ocr] Re: Article scanning: hocr output wrong after font training?

2024-01-04 Thread Tom Morris
I believe it's returning what it considers to be the best matching model (i.e. "lang"), but, if my experiments with eng+fra are any indication, the recognition isn't reliable. If it has trouble distinguishing two Romance languages using the same character set, I doubt it can be counted on to dist

[tesseract-ocr] Re: Tesseract arabic numbers

2024-01-04 Thread Tom Morris
On Thursday, January 4, 2024 at 12:03:15 PM UTC-5 ahmed54...@gmail.com wrote:
> I have a problem: I want to use Tesseract to read Arabic numbers, but its accuracy is low and it gives me incorrect readings.
You'll need to give more context about the program, version, language model, com

[tesseract-ocr] Preparation of a specific character-set traineddata

2024-01-04 Thread Karol Wójcik
Hi there. So far I've been using https://github.com/Shreeshrii/tessdata_shreetest/blob/master/digits_comma.traineddata, generally with very good results, much better than with eng-best or eng-fast from the standard tesseract repo. But, unfortunately, recently I came across some u