Are there any tutorials on this? I can't find any documentation regarding
this. Tesstrain doesn't take jpn_vert as a language so I do not know what
to do.
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
I have a script to train tesseract and I ran it on Arch Linux, Debian, and
even a docker container and they all produce the same errors. I checked to
make sure the script is correct as well.
Bug 1:
This happens when tesstrain runs text2image. The max pages parameter does
not work at all. It en
On Thursday, January 7, 2021 at 11:01:55 AM UTC-6 shree wrote:
> Old versions of tesstrain.sh used to limit training to 3 pages. Looks like
> you may have an old version in the path somewhere.
>
> On Thu, Jan 7, 2021 at 10:17 PM Kamui 7 wrote:
>
>> I have a script to train tesseract
> seems to have samples of both languages.
>
> On Thu, Jan 7, 2021, 22:40 Kamui 7 wrote:
>
>> I did a find command in the root directory and searched for the tesstrain
>> script. It could only find the script that I pulled from the latest
>> tesseract git repo. My tra
pages. You need to use a larger text if you want more pages.
>
> Also check that your fonts support both English and Japanese as the text
> seems to have samples of both languages.
>
> On Thu, Jan 7, 2021, 22:40 Kamui 7 wrote:
>
>> I did a find command in the root directory
could be if the characters in training text are not in the
> unicharset.
>
> On Fri, Jan 8, 2021, 00:46 Kamui 7 wrote:
>
>> Looks like that fixed bug #1. Now it is able to successfully create 400
>> pages. Do you have any ideas as to why the other 2 errors are occurring?
>> own unicharset file.
>> On Friday, January 8, 2021 at 12:58:27 AM UTC-6 shree wrote:
>>
>>> Are any of these vertical fonts?
>>>
>>> Encoding errors could be if the characters in training text are not in
>>> the unicharset.
>>
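One rough way to test the suggestion above (that encoding errors come from characters in the training text that are not in the unicharset) is to compare the two directly. The following is only a sketch under stated assumptions: it assumes the unicharset is a UTF-8 text file whose first line is the entry count and whose remaining lines each start with the unichar followed by a space, and it compares single code points only, so multi-code-point unichars are not handled. The function names and the sample data are hypothetical, and the property columns in the sample are placeholders, not real values.

```python
def parse_unicharset(content):
    """Collect the unichar strings from the text of a unicharset file.

    Assumption: line 1 is the entry count, and every later line begins
    with the unichar itself, followed by a space and its properties.
    """
    unichars = set()
    for line in content.splitlines()[1:]:
        if line.strip():
            unichars.add(line.split(" ", 1)[0])
    return unichars


def missing_characters(training_text, unichars):
    """Return the sorted non-whitespace characters absent from unichars.

    This is a per-code-point check, so it is only an approximation when
    the unicharset contains multi-code-point entries.
    """
    return sorted(
        {ch for ch in training_text if not ch.isspace() and ch not in unichars}
    )


# Made-up sample data for illustration only (properties are placeholders).
sample_unicharset = "3\nNULL 0\na 3\n日 1\n"
sample_text = "a日本 b"

missing = missing_characters(sample_text, parse_unicharset(sample_unicharset))
print(missing)
```

With real files, the contents could be read with `pathlib.Path(...).read_text(encoding="utf-8")` and passed to the same two functions. Any character that shows up in the result is a candidate cause for an encoding error in that line of training text.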
training text, because those are the samples
> that will be used for training.
>
> Why do you want to use a different unicharset?
>
>
> On Tue, Jan 12, 2021, 23:47 Kamui 7 wrote:
>
>>
>>
>> Great! The PR that you submitted fixed issue #3. All that's left