Which tessdata repository are you using for your trained data files?

tessdata
tessdata_best
tessdata_fast



On Tue 24 Jul, 2018, 9:01 AM Atsuyoshi Suzuki, <atuyosi.unloc...@gmail.com>
wrote:

> Hi.
>
> I tried the new Tesseract with the traineddata files for Japanese (both
> jpn.traineddata and Japanese.traineddata).
>
> The recognition result with jpn.traineddata is very good.
>
> Japanese.traineddata also gives a good result, but unnecessary spaces are
> inserted between words or characters.
>
> Is this behavior expected? Japanese does not put spaces between words.
>
> If this behavior is expected, what kind of usage is Japanese.traineddata
> intended for?
>
>
>
> jpn.traineddata (very good, and what I expected):
>
> --- start ---
> $ tesseract -l jpn  test_jpn_04.jpg stdout
> Warning. Invalid resolution 0 dpi. Using 70 instead.
> Estimating resolution as 168
> OCR 機能を提供する Web API はいくつか存在しますが、用途によってカスタマイズすることが
> できません。Tesseract は多数の言語に対応し、Linux、macOS、Windows で動作します。
>
> --- end ---
>
>
> Japanese.traineddata:
>
> --- start ---
> $ tesseract -l Japanese  test_jpn_04.jpg stdout
> Warning. Invalid resolution 0 dpi. Using 70 instead.
> Estimating resolution as 168
> OCR 機能 を 提供 する Web API は いく つか 存在 し ます が 、 用 途 に よっ て カス タマ イズ する こと が
> で きま せん 。Tesseract は 多数 の 言語 に 対応 し 、Linux、macOS、Windows で 動作 し ます 。
>
> --- end ---
>
>
> The result is the same on Ubuntu (beta.1) and macOS
> (4.0.0-beta.2-586-g607e).
>
>
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/ccfcb61b-3afa-4ecc-b6ac-ae3aebc55465%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/ccfcb61b-3afa-4ecc-b6ac-ae3aebc55465%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
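
Until it is clear which traineddata files are involved, the extra spaces in the Japanese.traineddata output quoted above could be removed in post-processing. The following is only a rough sketch, not anything Tesseract provides: the function name strip_japanese_spaces and the choice of Unicode ranges are my own assumptions. It drops whitespace only when there is a Japanese character on both sides, so the spaces around Latin words such as "Web API" are left alone.

```python
import re

# Assumed character ranges (not taken from Tesseract): CJK punctuation,
# hiragana, katakana, common CJK ideographs, and fullwidth forms.
_JA = r"\u3000-\u303F\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF\uFF00-\uFFEF"

# A run of spaces or tabs with a Japanese character immediately on both sides.
_SPACE_BETWEEN_JA = re.compile(rf"(?<=[{_JA}])[ \t]+(?=[{_JA}])")


def strip_japanese_spaces(text: str) -> str:
    """Remove spaces that sit between two Japanese characters.

    Spaces next to Latin letters or digits are kept, because the
    lookbehind and lookahead both require a Japanese character.
    """
    return _SPACE_BETWEEN_JA.sub("", text)


if __name__ == "__main__":
    line = "OCR 機能 を 提供 する Web API は いく つか 存在 し ます が 、"
    print(strip_japanese_spaces(line))
```

This works around the symptom only. It would also remove any space that was really present between two Japanese characters in the source image, so it is not a substitute for finding out why the two models behave differently.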
