[tesseract-ocr] Re: New release for tessdata_{fast,best}?

2021-02-23 Thread shree
>There is now a 4.1.0 release available for tessdata_fast, tessdata and tessdata_best. See https://github.com/tesseract-ocr/tessdata_fast/issues/26#issuecomment-780127901 @Merlijn Wajer archive.org has many books which use English with diacritics for Sanskrit (IAST). You could try the models

Re: [tesseract-ocr] Re: New release for tessdata_{fast,best}?

2021-02-19 Thread Tom Morris
Hi Merlijn, Apologies for the delayed reply. I'll definitely be in touch about the results of your OCR comparison study, but I'd encourage you to release it publicly. One good way to give back to the open source community that the Internet Archive takes advantage of is to share knowledge and

Re: [tesseract-ocr] Re: New release for tessdata_{fast,best}?

2021-02-01 Thread Merlijn B.W. Wajer
Hi Tom, On 30/01/2021 21:25, Tom Morris wrote: > On Wednesday, January 27, 2021 at 5:28:27 AM UTC-5 Merlijn Wajer wrote: > > > The Internet Archive has switched to using Tesseract for all our OCR, > > > That's great to hear! It's certainly been a long time coming. Nick White > & I tried to

[tesseract-ocr] Re: New release for tessdata_{fast,best}?

2021-01-30 Thread Tom Morris
On Wednesday, January 27, 2021 at 5:28:27 AM UTC-5 Merlijn Wajer wrote: > > The Internet Archive has switched to using Tesseract for all our OCR, That's great to hear! It's certainly been a long time coming. Nick White & I tried to get this to happen 7 years ago and even volunteered to help, bu

Re: [tesseract-ocr] New release for tessdata_{fast,best}?

2021-01-27 Thread Greg Jay
> On Jan 27, 2021, at 1:42 AM, Shree Devi Kumar wrote: > > >The Internet Archive has switched to using Tesseract for all our OCR, > > I am so happy to hear this. It will be great to have the Indic languages that > were marked as non-ocrable so far be converted to text correctly on Internet >

Re: [tesseract-ocr] New release for tessdata_{fast,best}?

2021-01-27 Thread Merlijn B.W. Wajer
Hi, On 27/01/2021 12:42, Shree Devi Kumar wrote: >> The Internet Archive has switched to using Tesseract for all our OCR, > > I am so happy to hear this. It will be great to have the Indic languages > that were marked as non-ocrable so far be converted to text correctly on > Internet Archive. Ri

Re: [tesseract-ocr] New release for tessdata_{fast,best}?

2021-01-27 Thread Shree Devi Kumar
>The Internet Archive has switched to using Tesseract for all our OCR, I am so happy to hear this. It will be great to have the Indic languages that were marked as non-ocrable so far be converted to text correctly on Internet Archive. Is there any page with instructions to do this? Can a language

[tesseract-ocr] New release for tessdata_{fast,best}?

2021-01-27 Thread Merlijn B.W. Wajer
Hi, With Tesseract now switching to regular (alpha) releases of 5.0.0, does it make sense to consider some versioning for language files as well? The Internet Archive has switched to using Tesseract for all our OCR, and I'm hoping that we can record exactly what version of language files was used

Re: New release

2011-08-02 Thread Dmitri Silaev
Unfortunately, Google is not inclined to disclose such information to the public. Warm regards, Dmitri Silaev www.CustomOCR.com On Fri, Jul 29, 2011 at 2:37 PM, Encolpe Degoute wrote: > Hello, > > Is it possible to have more frequent releases, or at least a roadmap for > next releases? > I trie

New release

2011-07-29 Thread Encolpe Degoute
Hello, Is it possible to have more frequent releases, or at least a roadmap for next releases? I tried to check release tags in the code repository but I didn't find any. Regards, Encolpe DEGOUTE