Please see
https://github.com/tesseract-ocr/tessdata_fast#example---jpn-and--japanese
for Ray's comment regarding the 'script' traineddata.
preserve_interword_spaces 1
was added via jpn.config to the jpn.traineddata file and to those of other
CJK languages to fix this issue - see
https://github.com/tesseract
I had this error when I was mixing best models with non-best models.
I would try running the following again
combine_tessdata -e base_model/eng.traineddata base_model/eng.lstm
to generate the eng.lstm from the "_best" model (the ones from
/usr/share/tessdata are not the "_best" models).
Then if the error is s
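The extraction step above can be scripted. This is a minimal sketch, assuming combine_tessdata is on the PATH; the helper names extract_lstm_command and extract_lstm are hypothetical, and the paths are the ones from the message. The traineddata file has to be the one from tessdata_best for the advice above to apply.

```python
import subprocess


def extract_lstm_command(traineddata_path, lstm_path):
    """Build the argv list for `combine_tessdata -e <traineddata> <lstm>`."""
    return ["combine_tessdata", "-e", traineddata_path, lstm_path]


def extract_lstm(traineddata_path, lstm_path):
    """Hypothetical helper: run the extraction, raising if the tool fails."""
    subprocess.run(extract_lstm_command(traineddata_path, lstm_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call extract_lstm() to actually run it.
    print(" ".join(extract_lstm_command("base_model/eng.traineddata",
                                        "base_model/eng.lstm")))
```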
Thank you Shree.
I got the same result with jpn and Japanese when using
'-c preserve_interword_spaces=1'.
$ tesseract -l Japanese -c preserve_interword_spaces=1 test_jpn_04.jpg stdout
The unnecessary space problem is solved. Thanks.
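For anyone calling tesseract from a script, the same invocation can be assembled programmatically. This is only a sketch: tesseract_command is a hypothetical helper, and it places the image and output base before the options, which is the order shown in tesseract's usage text, rather than the order typed above.

```python
def tesseract_command(image, lang, variables=None, output_base="stdout"):
    """Build an argv list: tesseract IMAGE OUTPUTBASE -l LANG [-c NAME=VALUE ...]."""
    cmd = ["tesseract", image, output_base, "-l", lang]
    # Each config variable becomes its own "-c name=value" pair.
    for name, value in sorted((variables or {}).items()):
        cmd += ["-c", "{}={}".format(name, value)]
    return cmd


if __name__ == "__main__":
    print(" ".join(tesseract_command("test_jpn_04.jpg", "Japanese",
                                     {"preserve_interword_spaces": 1})))
```

The resulting list can be passed to subprocess.run without going through a shell.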
On Tuesday, July 24, 2018 at 16:28:22 UTC+9, shree wrote:
>
> Please see
> https://github.com/t
I have been looking through the documentation but cannot seem to find
anything that explains how the rms is calculated. I am a bit new to this
sort of work, so I am not quite sure where to look. Can anyone point me in
the right direction?
--
You received this message because you are subscrib
I'm using OCR-D, which uses 4.0.0-beta.1.
On Tuesday, July 24, 2018 at 12:05:22 AM UTC-5, shree wrote:
>
> Which version of tesseract are you using?
>
> Please post output of
>
> tesseract -v
>
> On Tue 24 Jul, 2018, 2:26 AM Emiliano Isaza Villamizar wrote:
>
>> Hello everyone,
>>
>>
>> I'm trying
I am using Japanese.traineddata, which gives good results.
On Tue, Jul 24, 2018 at 2:59 PM, Atsuyoshi Suzuki <
atuyosi.unloc...@gmail.com> wrote:
> Thank you Shree.
>
>
> I got same result jpn and Japanese with '-c preserve_interword_spaces=1'.
>
> $ tesseract -l Japanese -c preserve_interword_spa
I'm using OCR-D. I compiled it again after changing the .traineddata in the
original file, but it hasn't worked. I still get the same error.
Iteration 0: ALIGNED TRUTH : Zhejiang Huamei Holding Co Ltd
Iteration 0: BEST OCR TEXT : ₩Z₩h₩e₩j₩i₩a₩n₩ ₩₩u₩a₩m₩e ₩₩o₩₩d₩i₩n₩ ₩C₩o ₩L₩₩d
File
/home/tulipan1637/D
Any luck, guys?
>
>
>
> Thanks
>
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send em
>> --continue_from /home/tulipan1637/Documents/Emiliano/OCR/OCRtraining/ocrd-train/tessdata/eng.lstm
>> --old_traineddata /home/tulipan1637/Documents/Emiliano/OCR/OCRtraining/ocrd-train/tessdata/eng.traineddata
>
Use eng.traineddata from tessdata_best
It happens whenever a word contains this tilde: the character is not
recognized and the word changes. The same happens with the letter "ñ".
It worked. Maybe I was using another eng.traineddata. Thank you for your
time, Shree and Lorenzo.
kind regards,
Emiliano
On Tuesday, July 24, 2018 at 11:40:34 AM UTC-5, shree wrote:
>
>>> --continue_from
>>> /home/tulipan1637/Documents/Emiliano/OCR/OCRtraining/ocrd-train/tessdata/eng.l
This may be a silly question, but I assume that when you call tesseract
you are using the -l spa option?
On Tuesday, July 24, 2018 at 12:20:11 PM UTC-5, ricardo valadez wrote:
>
> It happens to the moment in which a word contains this tilde, it is not
> recognized and the word changes,
If anyone is following this thread and is using OCR-D: I had to change the
start of the .py file by adding these lines because I kept getting a
Unicode error:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
On Tuesday, July 24, 2018 at 4:41:45 PM UTC-5, Emiliano Isaza Villamizar
If anyone is following this thread and is using OCR-D: I had to modify the
.py file because I kept getting a Unicode error. Just add these lines to
the file:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
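Note that reload(sys) and sys.setdefaultencoding exist only in Python 2. As a possible alternative, here is a minimal sketch that passes the encoding explicitly when reading and writing text. It assumes the Unicode error came from implicit ASCII conversion during file I/O, which the message does not confirm; read_text, write_text and roundtrip_demo are hypothetical names, not part of OCR-D.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path):
    """Read a file as UTF-8 text (io.open works on Python 2 and 3)."""
    with io.open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def write_text(path, text):
    """Write text as UTF-8; on Python 2, `text` must be a unicode object."""
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def roundtrip_demo():
    """Write a non-ASCII string to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".gt.txt")
    os.close(fd)
    try:
        write_text(path, u"ma\u00f1ana")
        return read_text(path)
    finally:
        os.remove(path)
```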
On Tuesday, July 24, 2018 at 4:41:45 PM UTC-5, Emiliano Isaza Villamizar
wrote:
I am new to tesseract as well. Where in the tesseract world does the rms
value come up? As a general rule in engineering, the rms value is 0.707
times the peak value if you are working with amps or volts and dealing with
sinusoids. If
the waveform is not sinusoidal, the rms value is equal to the averag
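The 0.707 figure for sinusoids can be checked numerically. The sketch below computes the root mean square of samples taken over one full period of a sine wave; it illustrates the general definition only and makes no claim about how tesseract's training output derives its rms figure. The function names are hypothetical.

```python
import math


def rms(samples):
    """Root mean square: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def sine_samples(peak, n=1000):
    """n evenly spaced samples over exactly one period of a sine wave."""
    return [peak * math.sin(2 * math.pi * k / n) for k in range(n)]


if __name__ == "__main__":
    peak = 1.0
    # The ratio rms/peak should be close to 1/sqrt(2), about 0.707.
    print(rms(sine_samples(peak)) / peak)
```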
Maybe it is silly, but I'm new to tesseract ... I'll call it that way, thank
you.
On Tuesday, July 24, 2018 at 16:42:55 (UTC-5), John Lee Ward wrote:
>
> This may be a silly question, but I assume that when you call tesseract
> that you are using the -l spa option?
>
>
>
> On Tuesday, July 2