[tesseract-ocr] Re: Tesseract arabic numbers

2024-01-05 Thread Harsha Perera
Sri Lanka number plate we can deploy On Friday 5 January 2024 at 01:03:58 UTC+5:30 tfmo...@gmail.com wrote: > On Thursday, January 4, 2024 at 12:03:15 PM UTC-5 ahmed54...@gmail.com > wrote: > > I have a problem that I want to use Tesseract to read Arabic numbers but > it has low accuracy abou

[tesseract-ocr] Re: Article scanning: hocr output wrong after font training?

2024-01-05 Thread Scott Goci
Hmmm -- makes sense (although unfortunate). Would you offer any suggestions as to next steps I could take from here? E.g. it seems my options are: 1. I can go back and train the legacy engine (e.g. *--oem 0*) on the fonts as well (I've been using this guide: https://michaeljaylissner

[tesseract-ocr] Re: Article scanning: hocr output wrong after font training?

2024-01-05 Thread Tom Morris
On Friday, January 5, 2024 at 9:30:05 AM UTC-5 sco...@gmail.com wrote: Would you offer any suggestions as to next steps I could take from here? E.g. it seems my options are: 1. I can go back and train the legacy engine (e.g. *--oem 0*) on the fonts as well (I've been using this guide:

[tesseract-ocr] Re: Article scanning: hocr output wrong after font training?

2024-01-05 Thread Scott Goci
Hey Tom, Overall thanks for your guidance here, I appreciate our back and forth! RE: "[...] do you really *need* the italics?", I think there is actually a lot lost without font attributes (e.g. bold / italic / underline). Consider the following sentences / quotes: - "I never said she sto