[tesseract-ocr] Tesstrain.sh is not creating TrainedData

[email protected] Sat, 24 Apr 2021 07:50:13 -0700

Hi,

I am running the following command to create trained data:
tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --fontlist "FreeMono" --noextract_font_properties \
  --langdata_dir /home/administrator/Downloads/tesseract-4.0.0/langdata \
  --tessdata_dir /home/administrator/Downloads/tesseract-4.0.0/tessdata \
  --output_dir /home/administrator/images/output_folder_1/


After this, it prints:
=== Starting training for language 'eng'
[Sat Apr 24 20:15:19 IST 2021] /usr/local/bin/text2image 
--fonts_dir=/usr/share/fonts --font=FreeMono 
--outputbase=/tmp/font_tmp.e9Fi4vFUQQ/sample_text.txt 
--text=/tmp/font_tmp.e9Fi4vFUQQ/sample_text.txt 
--fontconfig_tmpdir=/tmp/font_tmp.e9Fi4vFUQQ
Rendered page 0 to file /tmp/font_tmp.e9Fi4vFUQQ/sample_text.txt.tif

=== Phase I: Generating training images ===
Rendering using FreeMono
[Sat Apr 24 20:15:23 IST 2021] /usr/local/bin/text2image 
--fontconfig_tmpdir=/tmp/font_tmp.e9Fi4vFUQQ --fonts_dir=/usr/share/fonts 
--strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 
--exposure=0 --outputbase=/tmp/eng-2021-04-24.Y8g/eng.FreeMono.exp0 
--max_pages=0 --font=FreeMono 
--text=/home/administrator/Downloads/tesseract-4.0.0/langdata/eng/eng.training_text
Rendered page 0 to file /tmp/eng-2021-04-24.Y8g/eng.FreeMono.exp0.tif
Rendered page 1 to file /tmp/eng-2021-04-24.Y8g/eng.FreeMono.exp0.tif

=== Phase UP: Generating unicharset and unichar properties files ===
[Sat Apr 24 20:15:25 IST 2021] /usr/local/bin/unicharset_extractor 
--output_unicharset /tmp/eng-2021-04-24.Y8g/eng.unicharset --norm_mode 1
Usage: /usr/local/bin/unicharset_extractor [--output_unicharset filename] 
[--norm_mode mode] box_or_text_file [...]
Where mode means:
 1=combine graphemes (use for Latin and other simple scripts)
 2=split graphemes (use for Indic/Khmer/Myanmar)
 3=pure unicode (use for Arabic/Hebrew/Thai/Tibetan)
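
Note that the unicharset_extractor call in the log above is given no box or text file arguments, only the two options, and its usage text is what gets printed in place of any further progress. A minimal sketch for checking whether Phase I left any .box files in the training directory follows. The helper name count_box_files is hypothetical, and the directory is only assumed to be the one shown in the log; it may already have been cleaned up.

```shell
# Hypothetical helper: count the .box files directly inside a directory.
# Prints 0 when none are found (or when the directory does not exist).
count_box_files() {
    dir="$1"
    # -maxdepth 1 keeps the search to the directory itself, not subfolders.
    find "$dir" -maxdepth 1 -type f -name '*.box' 2>/dev/null | wc -l | tr -d ' '
}
```

For example, `count_box_files /tmp/eng-2021-04-24.Y8g` printing 0 would suggest that no box files were available to pass to unicharset_extractor.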

As per the documentation, the output should end with:
Created starter traineddata for LSTM training of language 'eng'
Run 'lstmtraining' command to continue LSTM training for language 'eng'

Please help.

Regards,
Pooja

