Older versions of tesstrain.sh limited training to 3 pages. It looks like
you may have an old version somewhere in your PATH.
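
One way to check for a stale copy is to list every tesstrain.sh the shell can
see, in lookup order. The snippet below is only a sketch: find_on_path is a
hypothetical helper name, not part of the Tesseract training tools, and it
assumes a POSIX-style shell (sh or bash).

```shell
# Hypothetical helper: print every executable file called "$1" found in the
# directories listed in PATH, in the order the shell searches them.
# The first line printed is the copy that runs when you type the bare name.
find_on_path() {
    name="$1"
    old_ifs="$IFS"
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -z "$dir" ] && dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS="$old_ifs"
}

find_on_path tesstrain.sh
```

If more than one path is printed, the first one is the copy being used, and
the others may be leftovers from an older install. In bash, the builtin
"type -a tesstrain.sh" gives similar information.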

On Thu, Jan 7, 2021 at 10:17 PM Kamui 7 <qntmmag...@gmail.com> wrote:

> I have a script to train Tesseract. I ran it on Arch Linux, Debian, and
> even in a Docker container, and they all produce the same errors. I also
> checked to make sure the script is correct.
>
> Bug 1:
> This happens when tesstrain runs text2image. The max pages parameter does
> not work at all. It ends up only rendering 4 pages regardless of what I
> pass in for the maxpages parameter. I even tried hardcoding it into the
> tesstrain_utils.sh file and it still does the same thing.
>
> Bug 2:
> After it finishes producing those 4 pages, I fine-tune it with lstmtraining
> and the resulting output is full of "Encoding of string failed!" errors.
>
> Bug 3:
> Along with those encoding errors, it also outputs the following text:
>
> "Image too small to scale!! (2x48 vs min width of 3)
> Line cannot be recognized!!
> Image not trainable"
>
> I will upload my script along with the Dockerfile if anyone wants to take
> a look.
>
>
> https://drive.google.com/file/d/1FkW1q1cXwOxY6Yi1A1cMzInbtJa9L01M/view?usp=sharing
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/7a9415d6-4d0c-4333-98c0-2628720661ebn%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/7a9415d6-4d0c-4333-98c0-2628720661ebn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>


-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

