date:20180409

Re: [tesseract-ocr] Tessercat 4.0 korean detecting chinese

2018-04-09 Thread Fanatico

It worked, thanks. Any reason for this chi_tra there?

On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
> Please remove the sub language line from config file, and use combine tessdata to overwrite it.
>
> Right now it seems to be using chi_tra also.
>
> On Mon 9 Apr, 2018, 11:48 AM Fana

Re: [tesseract-ocr] Tessercat 4.0 korean detecting chinese

2018-04-09 Thread ShreeDevi Kumar

Leftover from 3.04, my guess.

On Mon 9 Apr, 2018, 12:52 PM Fanatico, wrote:
> It worked, thanks.
>
> Any reason for this chi_tra there?
>
> On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
>> Please remove the sub language line from config file, and use combine tessdata to overwrite
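For anyone following along, the fix discussed in this thread can be sketched in Python. This is a rough sketch, not the exact steps the posters ran. It assumes the "sub language line" is a `tessedit_load_sublangs` entry in the config component, that `combine_tessdata` is on the PATH, and that its `-e` (extract) and `-o` (overwrite) modes behave as documented. The function names and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def strip_sublang_lines(config_text: str) -> str:
    """Drop every config line that sets tessedit_load_sublangs (assumed variable name)."""
    kept = [
        line
        for line in config_text.splitlines()
        if not line.strip().startswith("tessedit_load_sublangs")
    ]
    return "\n".join(kept) + "\n"


def rewrite_config(traineddata: Path, workdir: Path) -> None:
    """Extract the config component, strip the sub-language line, write it back."""
    # The file extension (.config) tells combine_tessdata which component is meant.
    config_path = workdir / f"{traineddata.stem}.config"
    subprocess.run(
        ["combine_tessdata", "-e", str(traineddata), str(config_path)], check=True
    )
    cleaned = strip_sublang_lines(config_path.read_text(encoding="utf-8"))
    config_path.write_text(cleaned, encoding="utf-8")
    # -o overwrites the matching component inside the traineddata file in place.
    subprocess.run(
        ["combine_tessdata", "-o", str(traineddata), str(config_path)], check=True
    )
```

Keeping a backup copy of the traineddata file before calling `rewrite_config` is advisable, since `-o` modifies it in place.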

Re: [tesseract-ocr] How to created training text as provided in langdata for any new language if i have just just have a wordlist.

2018-04-09 Thread Romil Mehla

Hi Shree, thanks for replying. For Tesseract *3.05.00*, I had already checked that link. There they mention: *"Make sure there are a minimum number of samples of each character. 10 is good, but 5 is OK for rare characters.* *There should be more samples of the more frequent characters - at least 20.
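One way to check a candidate training text against the thresholds quoted above (5, 10 and 20 samples) is to count how often each character occurs. The sketch below is only an illustration: the tier names and the function `coverage_tiers` are made up here, and the thresholds are taken from the quoted guideline rather than from any Tesseract tool.

```python
from collections import Counter
from typing import Dict, List


def coverage_tiers(
    text: str, rare_min: int = 5, good_min: int = 10, frequent_min: int = 20
) -> Dict[str, List[str]]:
    """Group non-whitespace characters by how many samples the text contains."""
    counts = Counter(ch for ch in text if not ch.isspace())
    tiers: Dict[str, List[str]] = {
        "too_few": [],   # fewer than rare_min samples
        "rare_ok": [],   # acceptable only for rare characters
        "good": [],      # meets the general minimum
        "frequent": [],  # meets the minimum suggested for frequent characters
    }
    for ch, n in sorted(counts.items()):
        if n < rare_min:
            tiers["too_few"].append(ch)
        elif n < good_min:
            tiers["rare_ok"].append(ch)
        elif n < frequent_min:
            tiers["good"].append(ch)
        else:
            tiers["frequent"].append(ch)
    return tiers
```

Running it on text generated from a wordlist shows which characters still need more samples before training.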

Re: [tesseract-ocr] Tessercat 4.0 korean detecting chinese

2018-04-09 Thread ShreeDevi Kumar

For Korean, please check whether adding the following lines to the config improves your results further.

#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1

ShreeDevi
भजन - कीर्तन - आरती @ http://bhaj
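To try the setting for a single run without editing the traineddata config, the variable can also be passed on the command line with `-c`. The sketch below only builds the argument list. The image path and output base are placeholders, and it assumes the `tesseract` binary and the `kor` traineddata are installed.

```python
import subprocess
from typing import List


def build_tesseract_command(
    image_path: str, output_base: str, lang: str = "kor"
) -> List[str]:
    """Build a tesseract invocation that sets preserve_interword_spaces for one run."""
    return [
        "tesseract",
        image_path,
        output_base,
        "-l",
        lang,
        "-c",
        "preserve_interword_spaces=1",
    ]


def run_ocr(image_path: str, output_base: str) -> None:
    """Run the command; the recognized text is written to <output_base>.txt."""
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
```

Comparing the output with and without the `-c` option on the same image is a quick way to see whether the setting helps.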

Re: [tesseract-ocr] How to created training text as provided in langdata for any new language if i have just just have a wordlist.

2018-04-09 Thread ShreeDevi Kumar

For Tesseract 3.05, random text will work; it is suggested to use combos similar to the English training text. It is unlikely you will get answers to your questions from the developers. You can search past issues/questions in the forum and on GitHub. 3.05 training does not take long, run a few experiments f

Re: [tesseract-ocr] How to created training text as provided in langdata for any new language if i have just just have a wordlist.

2018-04-09 Thread Romil Mehla

Thanks Shree, but if Tesseract is open source, why can't the developers answer doubts? If I were to train my model on random text, how can I arrive at a reliable accuracy for my model? My model's accuracy would then also be random. I want the reason for the conditions imposed on the training text, how much it

[tesseract-ocr] Extract Header and Footer text separately from document image

2018-04-09 Thread Mohit Jain

Is there a way to extract the header and footer content on a document page separately using Tesseract OCR? I tried the hOCR output but it doesn't seem to have any such tags associated with the output. Regards, Mohit -- You received this message because you are subscribed to the Google Groups
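No answer appears in this listing. Since the hOCR output carries no header or footer tags, as noted in the question, one possible workaround is a position heuristic: treat lines whose bounding box lies in the top or bottom margin of the page as header or footer. The sketch below assumes lines are marked `ocr_line`, words `ocrx_word`, and the page `ocr_page`, each with a `bbox x0 y0 x1 y1` entry in its `title` attribute. The 10% margin and all names are assumptions for illustration, not a Tesseract feature.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrLines(HTMLParser):
    """Collect (top, bottom, text) for every ocr_line and the page height."""

    def __init__(self):
        super().__init__()
        self.page_height = 0
        self.lines = []    # finished lines as (top, bottom, text)
        self._open = None  # line being collected: [top, bottom, words]

    def _flush(self):
        if self._open is not None:
            top, bottom, words = self._open
            self.lines.append((top, bottom, " ".join(words)))
            self._open = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        box = BBOX_RE.search(attr.get("title") or "")
        if box is None or "ocrx_word" in classes:
            return
        self._flush()  # any boxed non-word element ends the previous line
        if "ocr_page" in classes:
            self.page_height = int(box.group(4))
        elif "ocr_line" in classes:
            self._open = [int(box.group(2)), int(box.group(4)), []]

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self._open[2].append(data.strip())

    def close(self):
        super().close()
        self._flush()


def split_header_footer(hocr: str, margin: float = 0.1):
    """Return (header_lines, footer_lines) using a top/bottom margin heuristic."""
    parser = HocrLines()
    parser.feed(hocr)
    parser.close()
    header_limit = parser.page_height * margin
    footer_limit = parser.page_height * (1 - margin)
    header = [t for top, bottom, t in parser.lines if bottom <= header_limit]
    footer = [t for top, bottom, t in parser.lines if top >= footer_limit]
    return header, footer
```

The margin needs tuning per document layout, and a page whose body text runs close to the edges will be misclassified.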

Re: [tesseract-ocr] Tessercat 4.0 korean detecting chinese

2018-04-09 Thread Fanatico

The conf from kor did already have it:

#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
