Re: [tesseract-ocr] New release for tessdata_{fast,best}?

Greg Jay Wed, 27 Jan 2021 20:24:31 -0800


> On Jan 27, 2021, at 1:42 AM, Shree Devi Kumar <shreesh...@gmail.com> wrote:
> 
> >The Internet Archive has switched to using Tesseract for all our OCR,
> 
> I am so happy to hear this. It will be great to have the Indic languages that 
> were marked as non-ocrable so far be converted to text correctly on Internet 
> Archive.
> 
> Is there a page with instructions for doing this? Can a language be specified 
> when OCRing? E.g., results are often better with script/Devanagari than with 
> san for Sanskrit.
> 
> Regarding your question about tessdata, there have only been minor changes to 
> tessdata files but adding a tag is a good idea. I suggest you post this as a 
> feature request in the repo.


I hope someone adds support for the Grantha script, as there are many texts on 
Archive.org <http://archive.org/> in this script.

Greg

