make training TESSDATA=./usr/local/share/tessdata
unicharset_extractor --output_unicharset "data/foo/unicharset" --norm_mode 2 "data/foo/all-gt"
Failed to read data from: data/foo/all-gt....


This indicates that you have already run a training that failed.
Clean your training and start it once again. Pay attention to why
"data/foo/all-gt" is not created (there will be an error message).
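
A minimal sketch of a check you could run before restarting, assuming the layout seen in the quoted training output: line images as `.tif` with a matching `.gt.txt` in `data/foo-ground-truth`, and intermediate files in `data/foo`. The function name `check_training_inputs` is hypothetical, and treating an empty `all-gt` or `all-lstmf` as a leftover from a failed run is my assumption, not something tesstrain itself reports.

```python
from pathlib import Path


def check_training_inputs(model_name, data_dir="data"):
    """Return a list of problems found in the training inputs for one model.

    Hypothetical helper, not part of tesstrain. Assumes .tif line images
    with .gt.txt transcriptions in <data_dir>/<model_name>-ground-truth.
    """
    gt_dir = Path(data_dir) / f"{model_name}-ground-truth"
    out_dir = Path(data_dir) / model_name
    problems = []

    # Every line image needs a non-empty transcription next to it.
    images = sorted(gt_dir.glob("*.tif"))
    if not images:
        problems.append(f"no line images found in {gt_dir}")
    for image in images:
        gt_file = image.with_name(image.stem + ".gt.txt")
        if not gt_file.is_file() or gt_file.stat().st_size == 0:
            problems.append(f"missing or empty transcription: {gt_file}")

    # Assumption: an empty intermediate file is left over from a failed run.
    for name in ("all-gt", "all-lstmf"):
        leftover = out_dir / name
        if leftover.is_file() and leftover.stat().st_size == 0:
            problems.append(f"empty leftover from an earlier run: {leftover}")

    return problems


if __name__ == "__main__":
    for problem in check_training_inputs("foo"):
        print(problem)
```

If this prints nothing, the inputs at least have the expected shape, and the first error message in the fresh training output is the next place to look.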

Zdenko


On Wed, 26 Apr 2023 at 2:07, Madhav Pandey <mad.develope...@gmail.com> wrote:

> @zdenop
>
> This is the entire training output:
>
> ```
> make training TESSDATA=./usr/local/share/tessdata
> unicharset_extractor --output_unicharset "data/foo/unicharset" --norm_mode 2 "data/foo/all-gt"
> Failed to read data from: data/foo/all-gt
> Wrote unicharset file data/foo/unicharset
> PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/foo-ground-truth/alexis_ruhe01_1852_0087_027.tif" -t "data/foo-ground-truth/alexis_ruhe01_1852_0087_027.gt.txt" > "data/foo-ground-truth/alexis_ruhe01_1852_0087_027.box"
> set -x; \
>         tesseract "data/foo-ground-truth/alexis_ruhe01_1852_0087_027.tif" data/foo-ground-truth/alexis_ruhe01_1852_0087_027 --psm 13 lstm.train
> + tesseract data/foo-ground-truth/alexis_ruhe01_1852_0087_027.tif data/foo-ground-truth/alexis_ruhe01_1852_0087_027 --psm 13 lstm.train
> PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/foo-ground-truth/alexis_ruhe01_1852_0018_022.tif" -t "data/foo-ground-truth/alexis_ruhe01_1852_0018_022.gt.txt" > "data/foo-ground-truth/alexis_ruhe01_1852_0018_022.box"
> set -x; \
>         tesseract "data/foo-ground-truth/alexis_ruhe01_1852_0018_022.tif" data/foo-ground-truth/alexis_ruhe01_1852_0018_022 --psm 13 lstm.train
> + tesseract data/foo-ground-truth/alexis_ruhe01_1852_0018_022.tif data/foo-ground-truth/alexis_ruhe01_1852_0018_022 --psm 13 lstm.train
> PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/foo-ground-truth/alexis_ruhe01_1852_0035_019.tif" -t "data/foo-ground-truth/alexis_ruhe01_1852_0035_019.gt.txt" > "data/foo-ground-truth/alexis_ruhe01_1852_0035_019.box"
> set -x; \
>         tesseract "data/foo-ground-truth/alexis_ruhe01_1852_0035_019.tif" data/foo-ground-truth/alexis_ruhe01_1852_0035_019 --psm 13 lstm.train
> + tesseract data/foo-ground-truth/alexis_ruhe01_1852_0035_019.tif data/foo-ground-truth/alexis_ruhe01_1852_0035_019 --psm 13 lstm.train
> python3 shuffle.py 0 "data/foo/all-lstmf"
> Traceback (most recent call last):
>   File "/Users/m/Code/git/tesstrain/shuffle.py", line 24, in <module>
>     fd0 = open(sys.argv[2], 'r')
> FileNotFoundError: [Errno 2] No such file or directory: 'data/foo/all-lstmf'
> make: *** [data/foo/all-lstmf] Error 1
> ```
>
> For this run, I have just 3 text files and 3 tif files.
>
> I followed the macOS installation section of this page:
> https://tesseract-ocr.github.io/tessdoc/Compiling.html#macos and
> installed everything that is mentioned there.
>
> Do I have to install anything else before running the training?
>
> On Tuesday, 25 April 2023 at 00:27:28 UTC-6 zdenop wrote:
>
>> Did you install all the necessary dependencies?
>> Did you check and fix all errors (before this error) in the training output?
>>
>> Zdenko
>>
>>
>> On Tue, 25 Apr 2023 at 8:21, Madhav Pandey <mad.dev...@gmail.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> I am relatively new to tesseract and OCR as a whole.
>>>
>>> I have been trying to set up model training locally using the guide
>>> https://github.com/tesseract-ocr/tesstrain/blob/main/README.md
>>>
>>> I have copied the sample training data into the `data/foo` directory, but
>>> when I run `make training`, I always end up getting this error:
>>>
>>> ```
>>> Failed to read data from: data/foo/all-gt
>>> Wrote unicharset file data/foo/unicharset
>>> python3 shuffle.py 0 "data/foo/all-lstmf"
>>> Traceback (most recent call last):
>>>   File "shuffle.py", line 24, in <module>
>>>     fd0 = open(sys.argv[2], 'r')
>>> FileNotFoundError: [Errno 2] No such file or directory: 'data/foo/all-lstmf'
>>> make: *** [data/foo/all-lstmf] Error 1
>>> ```
>>>
>>> Can someone please help resolve this error?
>>>
>>> Thank you!
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/249216fc-70e5-4e40-a630-d4202fd24a36n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/249216fc-70e5-4e40-a630-d4202fd24a36n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>

