Hi, Merlijn. Thanks for your kind response.
Regarding autonomous mode, I have been looking for such a module for Android, but I have found nothing so far. I will keep looking.

> I am not sure what you're finding on google play store, but I have found
> there to be no limitation to the amount of languages that can be used
> during OCR. Keep in mind that using more languages will slow down the
> OCR process.

It's Text Fairy, an open source app:
https://play.google.com/store/apps/details?id=com.renard.ocr

Your response is really helpful.

Best,
Charles.

On Sunday, March 21, 2021 at 8:29:13 AM UTC+8 Merlijn Wajer wrote:

> Hi,
>
> On 19/03/2021 10:11, Charles Cho wrote:
> > Hello,
> > I'm working on a ocr android app based on tesseract.
> > I want to add feature that detects language automatically and recognize
> > at least 2 languages at once.
> > I have investigated on that for a while so I know that I have to specify
> > language for tesseract.
> > Then how can I implement auto detection of language?
>
> Not exactly a mobile use case, but you can read how the Internet Archive
> does this (I coined it "autonomous mode", where the software just
> figures out the scripts and languages):
>
> https://archive.org/services/docs/api/ocr.html#autonomous-mode
>
> And the code is available, here (I plan to split out the archive.org
> specific code from the python code that invokes Tesseract and performs
> heuristics like script detection):
>
> https://git.archive.org/www/tesseract/-/blob/master/main.py#L757
>
> the tl;dr is to first perform script detection, and use the detected
> script to OCR the page - then use language detection libraries to guess
> the languages on the page.
>
> > And tesseract on google play store can recognize 3 languages at once.
> > Is it maximum?
>
> I am not sure what you're finding on google play store, but I have found
> there to be no limitation to the amount of languages that can be used
> during OCR. Keep in mind that using more languages will slow down the
> OCR process.
>
> > Any help and advice would be really appreciated.
>
> Hope this helps.
>
> Cheers,
> Merlijn
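
For reference, the three steps in Merlijn's tl;dr (script detection, OCR with the detected script, then language guessing) could be wired together roughly as below. This is only a sketch of the control flow, not the Internet Archive code: autonomous_ocr, detect_script, ocr_with_model and guess_languages are hypothetical names, and the three callables stand in for whatever OCR binding and language-detection library is actually used. The "script/<Name>" model naming and the "+"-joined language string follow Tesseract's usual conventions; the cap of two languages is an assumption taken from the "at least 2 languages" requirement in the original question.

```python
from typing import Callable, List, Tuple


def autonomous_ocr(
    image: object,
    detect_script: Callable[[object], str],
    ocr_with_model: Callable[[object, str], str],
    guess_languages: Callable[[str], List[str]],
    max_languages: int = 2,
) -> Tuple[str, str]:
    """Run the three-step flow and return (text, language spec).

    All three callables are hypothetical placeholders supplied by the caller.
    """
    # Step 1: detect the writing system of the page, e.g. "Latin".
    script = detect_script(image)

    # Step 2: OCR the page with the script-level model, e.g. "script/Latin".
    text = ocr_with_model(image, "script/" + script)

    # Step 3: guess the languages from the recognized text and keep the
    # most likely ones (the callable is assumed to return them best-first).
    languages = guess_languages(text)[:max_languages]

    # Tesseract accepts several languages joined by "+", e.g. "eng+deu".
    lang_spec = "+".join(languages)
    return text, lang_spec
```

The returned language spec could then be passed to a second OCR run if language-specific models give better results than the script model. That second pass is my own assumption and is not described in the thread.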