After some research on Korean, I found that the language does use Chinese 
characters, so it is correct to set Chinese as a sublanguage. The problem is 
that kor.training_text doesn't contain any Chinese characters, so the 
training covers only Korean and ignores the Chinese. As a result, if I run 
tesseract on an image that has both Korean and Chinese, it will recognize 
some Korean characters as Chinese and some Chinese characters as Korean.
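
For anyone who wants to check a training text themselves, here is a rough 
sketch of how the script mix could be counted. The function name 
count_scripts and the sample string are made up for illustration, and the 
check only looks at the basic Hangul Syllables and CJK Unified Ideographs 
blocks, so it is an approximation rather than a full script classifier.

```python
def count_scripts(text):
    """Count Hangul syllables and CJK ideographs in a string.

    Only the two main Unicode blocks are checked (an assumption made to
    keep the sketch short); extension blocks and Jamo are ignored.
    """
    counts = {"hangul": 0, "han": 0}
    for ch in text:
        cp = ord(ch)
        if 0xAC00 <= cp <= 0xD7A3:      # Hangul Syllables block
            counts["hangul"] += 1
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs block
            counts["han"] += 1
    return counts


# Hypothetical sample: three Hangul syllables and two Chinese characters.
sample = "한국어 漢字 test"
print(count_scripts(sample))  # {'hangul': 3, 'han': 2}
```

To run it on the real file, read kor.training_text as UTF-8 and pass its 
contents to the function. A "han" count of zero would match what I 
described above.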

On Monday, 9 April 2018 05:15:57 UTC-3, shree wrote:
>
> Leftover from 3.04, my guess.
>
> On Mon 9 Apr, 2018, 12:52 PM Fanatico, <fanati...@gmail.com> wrote:
>
>> It worked, thanks.
>>
>> Any reason for this chi_tra there?
>>
>>
>> On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
>>>
>>> Please remove the sublanguage line from the config file, and use 
>>> combine_tessdata to overwrite it.
>>>
>>> Right now it seems to be using chi_tra also.
>>>
>>> On Mon 9 Apr, 2018, 11:48 AM Fanatico, <fanati...@gmail.com> wrote:
>>>
>>>> I used a traineddata that I created by removing the top layer from 
>>>> the kor.traineddata from "tessdata_best". After this I replaced this 
>>>> traineddata with the one from "tessdata_best" and got the same problem.
>>>>
>>>> Yes, it includes chi_tra as a sublanguage:
>>>> tessedit_load_sublangs chi_tra
>>>>
>>>> lstm-unicharset only has Korean characters.
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to tesseract-oc...@googlegroups.com.
>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/tesseract-ocr/0d50ee2b-b5d4-4c73-a45b-d5245403ad04%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>
>

