I have this problem too. I used jTessBoxEditor to train Tesseract. My TIFF file had 34,000 words, and I built the trained data from a 50-page TIFF file, but the output trained file was only 1.5 MB and it didn't detect any words! Does jTessBoxEditor have a problem?

On 2/25/14, Bernard Polarski <[email protected]> wrote:
> How do you produce your traineddata?
>
> On Tuesday, 25 February 2014 at 17:51:39 UTC+1, Frederico Ferro Schuh wrote:
>>
>> Hello all,
>>
>> I'm training Tesseract to recognize handwritten digits, and I have
>> provided it about 6000 samples of each digit, in 10 different box files,
>> one for each digit. Each box file is a 2152x2152 TIF file. However, the
>> resulting traineddata file I get after completing the training procedure
>> is only 137 kb.
>> I went through the process again, providing smaller sample files (1000
>> samples of each digit), and ended up with the same traineddata size of
>> 137 kb.
>> Is this size reasonable or am I doing something wrong?
>> I assume something is wrong because my results are pretty bad so far.
>>
>> I've attached the sample image I am using for the digit 0.
>>
>> Thanks in advance,
>> Fred
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.

