If you do not include a word-dawg or freq-dawg, then the only big file is 
inttemp. 
For 34000 characters I am surprised to see it at a size of around 100k.
However, your 6000 samples represent only 10 digits, so it is quite possible.
As for the poor performance, I think the size of your samples is very 
detrimental: characters are usually 20 to 40 pixels high and 20 to 50 pixels 
wide (50 only for 'm' or 'w'). 
Too much precision is not good.
 
All the other files are usually rather small (pffmtable, normproto, 
font_properties, shapetable, unicharset, unicharambigs)
and combined are less than 100k.
 
In this respect your traineddata seems normal.
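
If you want to check this on your own file, here is a small sketch that lists the files in a directory by size, largest first. It assumes you have first unpacked the traineddata with combine_tessdata -u; the name "hand" and the directory "parts" are placeholders of mine, not something from your setup.

```shell
#!/bin/sh
# Sketch: print "<bytes> <name>" for each regular file in a directory,
# largest first, so the dominant component (expected: inttemp) is on top.
component_sizes() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # wc -c may pad its output with spaces on some systems; strip them.
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$(basename "$f")"
    done | sort -rn
}

# Hypothetical usage (file and directory names are assumptions):
#   mkdir parts
#   combine_tessdata -u hand.traineddata parts/hand.
#   component_sizes parts
```

If inttemp is at the top and everything else is small, that matches what I describe above.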
 
Besides that, you could use wildcards:
 
   shapeclustering *.tr
   mftraining *.tr
   cntraining *.tr
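
Spelled out a bit more, the wildcard approach could look like the sketch below. The -F, -U and -O options are the ones I remember from the Tesseract 3.0x training instructions, so please check them against your installed version. The language name "hand" and the assumption that font_properties and unicharset sit next to the .tr files are mine. With DRY_RUN=1 the commands are only printed, not run.

```shell
#!/bin/sh
# Sketch: run the three training steps over every .tr file in a directory.
# The function body is a subshell, so the cd does not affect the caller.
run_training() (
    dir=$1
    lang=${2:-hand}              # placeholder language name (assumption)
    cd "$dir" || exit 1
    set -- *.tr                  # expand the wildcard once, reuse it below
    [ -e "$1" ] || { echo "no .tr files in $dir" >&2; exit 1; }
    run=${DRY_RUN:+echo}         # "echo" in dry-run mode, empty otherwise
    $run shapeclustering -F font_properties -U unicharset "$@" &&
    $run mftraining -F font_properties -U unicharset -O "$lang.unicharset" "$@" &&
    $run cntraining "$@"
)

# Hypothetical usage:
#   DRY_RUN=1 run_training ./training hand
```

Expanding *.tr once and passing the same list to all three tools keeps the steps consistent with each other.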
 
 
On Tuesday, 25 February 2014 17:51:39 UTC+1, Frederico Ferro Schuh wrote:

> Hello all, 
>
> I'm training Tesseract to recognize handwritten digits, and I have 
> provided it about 6000 samples of each digit, in 10 different box files, 
> one for each digit. Each box file is a 2152x2152 TIF file. However, the 
> resulting traineddata file I get after completing the training procedure is 
> only 137 kb.
> I went through the process again, providing smaller sample files (1000 
> samples of each digit), and ended up with the same traineddata size of 137 
> kb.
> Is this size reasonable or am I doing something wrong?
> I assume something is wrong because my results are pretty bad so far.
>
> I've attached the sample image I am using for the digit 0.
>
> Thanks in advance,
> Fred
>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
