It worked, thanks.
Any reason for chi_tra being there?
On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
>
> Please remove the sub language line from the config file, and use
> combine_tessdata to overwrite it.
>
> Right now it seems to be using chi_tra also.
>
> On Mon 9 Apr, 2018, 11:48 AM Fana
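A minimal sketch of the "remove the sub language line" step, for anyone finding this thread later. It assumes the config has already been unpacked to a plain text file (for example with `combine_tessdata -u`) and that the offending line uses the `tessedit_load_sublangs` parameter; the thread does not name the parameter, so treat that key and the function name as assumptions and check your own config.

```python
def strip_sublang_lines(config_text: str, key: str = "tessedit_load_sublangs") -> str:
    """Return config_text without the lines that set `key`.

    `key` defaults to tessedit_load_sublangs, which is an assumption about
    what the "sub language line" looks like in the unpacked config.
    """
    kept = []
    for line in config_text.splitlines():
        tokens = line.split()
        # A config line has the form "<name> <value>"; drop only matching names.
        if tokens and tokens[0] == key:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = "tessedit_load_sublangs chi_tra\npreserve_interword_spaces 1\n"
    print(strip_sublang_lines(sample), end="")
```

The edited file can then be written back into the traineddata with `combine_tessdata -o`, as suggested above.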
Leftover from 3.04, my guess.
On Mon 9 Apr, 2018, 12:52 PM Fanatico, wrote:
> It worked, thanks.
>
> Any reason for chi_tra being there?
>
>
> On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
>>
>> Please remove the sub language line from the config file, and use
>> combine_tessdata to overwrite
Hi Shree, thanks for replying.
For tesseract *3.05.00*:
I had already checked that link. There they mention:
*"Make sure there are a minimum number of samples of each character. 10 is
good, but 5 is OK for rare characters.*
*There should be more samples of the more frequent characters - at least
20."*
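One way to check a training text against the guideline quoted above is to count how often each character occurs. The sketch below is only an illustration: the function name and the default threshold of 5 (the "rare characters" figure from the quote) are my own choices, not part of any Tesseract tool.

```python
from collections import Counter
from typing import Dict


def undersampled_chars(text: str, min_count: int = 5) -> Dict[str, int]:
    """Map each non-whitespace character seen fewer than min_count times to its count."""
    counts = Counter(ch for ch in text if not ch.isspace())
    return {ch: n for ch, n in sorted(counts.items()) if n < min_count}


if __name__ == "__main__":
    training_text = "aaaaa bbb aaaaa c"
    # Characters listed here need more samples in the training text.
    print(undersampled_chars(training_text))
```

Running it again with `min_count=20` on the characters you consider frequent covers the second half of the guideline.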
For Korean, please check whether adding the following lines to the config
improves your results further.
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
ShreeDevi
भजन - कीर्तन - आरती @ http://bhaj
For tesseract 3.05, random text will work, but it is suggested to use
combos similar to the English training text.
It is unlikely you will get answers to your questions from the developers.
You can search past issues/questions in the forum and on GitHub.
3.05 training does not take long, run a few experiments f
Thanks Shree, but if tesseract is open source, why can't the developers
answer questions? If I train my model on random text, how can I arrive at a
reliable accuracy figure for it? The model's accuracy will be random too.
I want to know the reason for the conditions imposed on the training text, how much it
Is there a way to extract the header and footer content on a document page
separately using Tesseract OCR? I tried the hOCR output but it doesn't seem
to have any such tags associated with the output.
Regards,
Mohit
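hOCR has no header or footer tags, but every `ocr_line` carries a `bbox` in its `title` attribute, so one rough workaround is to classify lines by vertical position on the page. The sketch below is a heuristic only: the class and function names are hypothetical, and it assumes headers and footers fall entirely within the top and bottom 10% of the page (the `margin` parameter), which will not hold for every layout.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrLineCollector(HTMLParser):
    """Collect (y0, y1, words) for each ocr_line, plus the page height."""

    def __init__(self):
        super().__init__()
        self.page_height = None
        self.lines = []   # each entry: [y0, y1, list of text fragments]
        self._depth = 0   # span nesting depth while inside an ocr_line

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        if "ocr_page" in classes and match:
            # Page bbox is "0 0 width height", so the last value is the height.
            self.page_height = int(match.group(4))
        if self._depth:
            if tag == "span":
                self._depth += 1
        elif tag == "span" and "ocr_line" in classes and match:
            self.lines.append([int(match.group(2)), int(match.group(4)), []])
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth and tag == "span":
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.lines[-1][2].append(data.strip())


def split_header_footer(hocr: str, margin: float = 0.1) -> dict:
    """Group line texts into header/body/footer by vertical position."""
    parser = HocrLineCollector()
    parser.feed(hocr)
    if not parser.page_height:
        raise ValueError("no ocr_page bbox found in hOCR input")
    top = parser.page_height * margin
    bottom = parser.page_height * (1 - margin)
    zones = {"header": [], "body": [], "footer": []}
    for y0, y1, words in parser.lines:
        text = " ".join(words)
        if y1 <= top:
            zones["header"].append(text)
        elif y0 >= bottom:
            zones["footer"].append(text)
        else:
            zones["body"].append(text)
    return zones
```

Feed it the contents of the `.hocr` file produced by Tesseract and tune `margin` per document type; comparing candidate lines across pages (repeated text at the same position) would be a more robust test than position alone.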
--
You received this message because you are subscribed to the Google Groups
The config from kor already had it:
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1