Hi everyone,

I am trying to create the training data for the OCR-A font by following the 
guides I found online, but I ran into several problems.
These are the steps I followed:
- With jTessBoxEditor I created the files (.box, font_properties, .tif).
- I ran the following command to create the .tr file:
  tesseract lang_name.ocraextended.exp0.tif lang_name.ocraextended.exp0 --psm 6 nobatch box.train
  It produced this output:

Page 1
APPLY_BOXES:
   Boxes read from boxfile:     230
   Found 230 good blobs.
Generated training data for 230 words
Page 2
APPLY_BOXES:
   Boxes read from boxfile:     230
   Found 230 good blobs.
Generated training data for 230 words
Page 3
APPLY_BOXES:
   Boxes read from boxfile:     130
   Found 130 good blobs.
Generated training data for 130 words


- I ran the command:
  unicharset_extractor lang_name.ocraextended.exp0.box
- I ran the command:
  mftraining -F lang_name.font_properties -U unicharset -O lang_name.unicharset lang_name.ocraextended.exp0.tr
  This last command gave me the following output:

Reading mftraining ...
Failed to open tr file: mftraining
Reading  lang_name  .ocraextended.exp0.tr ...
Flat shape table summary: Number of shapes = 36 max unichars = 1 number 
with multiple unichars = 0
N == sizeof(Cluster->Mean):Error:Assert failed:in file 
../../../src/classify/cluster.cpp, line 2527

Then I ran the command "cntraining lang_name.ocraextended.exp0.tr", but the 
output files I expected are not created.
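
To rule out typing mistakes on my side, the sequence of commands above can 
be sketched as a small shell helper that derives every file name from one 
prefix. This is only a sketch: the function names train_cmds and 
missing_inputs are my own (hypothetical), and I am assuming that 
lang_name.font_properties is the font_properties file from the first step. 
It only prints the commands and checks for input files; it runs nothing.

```shell
#!/bin/sh
# Print the four training commands from this post for one font.
# All file names are built from a single prefix (LANG.FONT.expN), so the
# same spelling is used in every argument.
# Usage: train_cmds LANG FONT EXP   e.g. train_cmds lang_name ocraextended 0
train_cmds() {
    base="$1.$2.exp$3"
    echo "tesseract $base.tif $base --psm 6 nobatch box.train"
    echo "unicharset_extractor $base.box"
    echo "mftraining -F $1.font_properties -U unicharset -O $1.unicharset $base.tr"
    echo "cntraining $base.tr"
}

# Print the name of every expected input file that does not exist in the
# current directory (assumed names: .tif, .box and LANG.font_properties).
missing_inputs() {
    base="$1.$2.exp$3"
    for f in "$base.tif" "$base.box" "$1.font_properties"; do
        [ -e "$f" ] || echo "$f"
    done
}
```

Calling "missing_inputs lang_name ocraextended 0" in the training directory 
lists any input file that is absent, and "train_cmds lang_name ocraextended 0" 
prints the commands so they can be reviewed and then run one at a time.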

Can someone kindly help me, please?

Thanks
