Hi Shree.

I use tessdata_fast.


On Tuesday, July 24, 2018 at 13:44:40 UTC+9, shree wrote:
>
> Which tessdata repository are you using for your trained data files?
>
> tessdata
> tessdata_best
> tessdata_fast
>
>
>
> On Tue 24 Jul, 2018, 9:01 AM Atsuyoshi Suzuki, <atuyosi....@gmail.com> wrote:
>
>> Hi.
>>
>> I tried the new Tesseract and the trained data for Japanese (both
>> jpn.traineddata and Japanese.traineddata).
>>
>> The recognition result with jpn.traineddata is very good.
>>
>> Japanese.traineddata also gives a good result, but unnecessary spaces are
>> inserted between words or characters.
>>
>>
>>
>> Is this behavior expected? Japanese text has no spaces between
>> words.
>>
>> If this behavior is expected, what kind of usage is Japanese.traineddata
>> intended for?
>>
>>
>>
>> jpn.traineddata (very good, as I expected):
>>
>> --- start ---
>> $ tesseract -l jpn  test_jpn_04.jpg stdout
>> Warning. Invalid resolution 0 dpi. Using 70 instead.
>> Estimating resolution as 168
>> OCR 機能を提供する Web API はいくつか存在しますが、用途によってカスタマイズすることが
>> できません。Tesseract は多数の言語に対応し、Linux、macOS、Windows で動作します。
>>
>> --- end ---
>>
>>
>> Japanese.traineddata:
>>
>> --- start ---
>> $ tesseract -l Japanese  test_jpn_04.jpg stdout
>> Warning. Invalid resolution 0 dpi. Using 70 instead.
>> Estimating resolution as 168
>> OCR 機能 を 提供 する Web API は いく つか 存在 し ます が 、 用 途 に よっ て カス タマ イズ する こと が
>> で きま せん 。Tesseract は 多数 の 言語 に 対応 し 、Linux、macOS、Windows で 動作 し ます 。
>>
>> --- end ---
>>
>>
>> The result is the same on Ubuntu (beta.1) and macOS
>> (4.0.0-beta.2-586-g607e).
>>
>>
>>
>> Thanks.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-oc...@googlegroups.com.
>> To post to this group, send email to tesser...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/ccfcb61b-3afa-4ecc-b6ac-ae3aebc55465%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
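
As a stopgap for the extra spaces in the Japanese.traineddata output quoted
above, one option is to post-process the recognized text. The following is a
minimal Python sketch, not a Tesseract feature: the function name
strip_japanese_spaces and the chosen Unicode ranges are illustrative
assumptions. It removes spaces only when both neighbors are Japanese
characters, so the spaces around Latin words such as "Web API" are kept.

```python
import re

# Assumed character ranges (illustrative, not exhaustive):
# CJK symbols and punctuation, hiragana, katakana,
# CJK unified ideographs, and halfwidth/fullwidth forms.
_JA = r"\u3000-\u303f\u3040-\u309f\u30a0-\u30ff\u4e00-\u9fff\uff00-\uffef"

# A run of spaces or tabs with a Japanese character on both sides.
_SPACE_BETWEEN_JA = re.compile(rf"(?<=[{_JA}])[ \t]+(?=[{_JA}])")


def strip_japanese_spaces(text: str) -> str:
    """Remove spaces that sit between two Japanese characters."""
    return _SPACE_BETWEEN_JA.sub("", text)


if __name__ == "__main__":
    sample = "OCR 機能 を 提供 する Web API は いく つか 存在 し ます が 、"
    print(strip_japanese_spaces(sample))
```

The text could come from the stdout of the tesseract command shown in the
quoted message. Because this is a purely textual heuristic, it would also
remove any space between Japanese characters that was really present in the
source image.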

