Hi there,

OK, I figured it out myself. Here are the steps:

1. Create 01.tr with tesseract 01.tif 01 nobatch box.train
2. Create 02.tr with tesseract 02.tif 02 nobatch box.train
3. Create unicharset with: unicharset_extractor 01.box 02.box
4. Just copy it (maybe it is not necessary): cp unicharset 02.unicharset
5. echo 01 0 0 0 0 0 > font_properties
6. echo 02 0 0 0 0 0 >> font_properties
7. mftraining -F font_properties -U unicharset 01.tr 02.tr

So you see: step 6 was missing. Note the >> (append) in that step: it means
font_properties must end up with two lines, one for 01 and one for 02.
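
For more than two images, the font_properties part of the steps above can be sketched as a small loop. This is only a sketch under assumptions: the variable names (bases, tr_files) are made up for illustration, the five zeros are simply the values used in steps 5 and 6, and the mftraining command is printed for inspection instead of being run.

```shell
#!/bin/sh
# Basenames of the training images (01.tif and 02.tif in this thread).
# "bases" and "tr_files" are illustrative names, not part of Tesseract.
bases="01 02"

# Start from an empty file, then append one line per basename,
# so every .tr file has a matching font_properties entry.
: > font_properties
tr_files=""
for b in $bases; do
    # Same five zero fields as in steps 5 and 6 above.
    echo "$b 0 0 0 0 0" >> font_properties
    tr_files="$tr_files $b.tr"
done

# Print (do not run) the resulting training command.
echo "mftraining -F font_properties -U unicharset$tr_files"
```

With the two basenames above, font_properties gets exactly the two lines from steps 5 and 6, and the printed command is the same as step 7.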

So Jimmi: now it is your turn :-)

Talk soon

Holm



On May 26, 2:23 pm, zdenko podobny <zde...@gmail.com> wrote:
> On Thu, May 26, 2011 at 2:02 PM, Sarel van der Merwe
> <sfvdme...@gmail.com> wrote:
>
> > Hi,
>
> > Do you know where I can find the version 3 manual or reference guide
> > for Tesseract?
>
> > The one I know of is in the download section (tessdoc-html-3.0.0-preview1.tar.gz) ;-)
>
> Maybe Jimmi will update it for 3.01 :-)
> Some good information can be found in the Tesseract forums.
> All links are on the main project page. Surprisingly ;-)
>
> Zdenko
>
> Thanks
>
>
>
> > Sarel
>
> > On Thu, May 26, 2011 at 1:33 PM, zdenko podobny <zde...@gmail.com> wrote:
> > > Hi,
> > > The problem is that you are using the latest version but not reading
> > > the latest manual [1]. If I understood that German manual correctly
> > > (via Google Translate), it is for version 3.00, so it does not follow
> > > the changes in version 3.01.
> > > Another "problem": 3.01 is not released yet. It is meant for developers
> > > and experienced testers, for testing and bug reporting. IMHO 3.01
> > > training is not fully documented.
>
> > > Zdenko
> > > [1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
> > > On Thu, May 26, 2011 at 10:59 AM, Holm Dressler
> > > <velovity1...@googlemail.com> wrote:
>
> > >> Hi there,
>
> > >> I am using Tesseract 3.01 under Linux.
>
> > >> I can successfully create traineddata from a single *.tif file, but
> > >> combining several tif / box files gives me an error.
> > >> Here are my steps:
>
> > >> Let's say I want to create a traineddata from two tif files: 01.tif
> > >> and 02.tif
>
> > >> 1. tesseract 01.tif 01 batch.nochop makebox
> > >> 2. tesseract 02.tif 02 batch.nochop makebox
> > >> 3. I check the two box files using jTessBoxEditor
> > >> 4. tesseract 01.tif 01 nobatch box.train
> > >> 5. tesseract 02.tif 02 nobatch box.train
> > >> 6. As described under
> > >> http://wiki.ubuntuusers.de/tesseract-ocr/tesseract-ocr_trainieren
> > >> (sorry, it is in German, but the commands are the same) I create the
> > >> *.tr files:
> > >> 7. mftraining 01.tr 02.tr
>
> > >> But this results in error: Reading 01.tr ...01 has no defined
> > >> properties.
> > >> !"Missing font_properties entry is a fatal error!":Error:Assert
> > >> failed:in file mftraining.cpp, line 287
> > >> Segmentation fault
>
> > >> Also trying to create unicharset with
>
> > >> unicharset_extractor 01.box 02.box
>
> > >> works successfully, but mftraining -U ./unicharset 01.tr 02.tr fails
> > >> with the same error.
>
> > >> Does anybody have an idea what I am doing wrong?
>
> > >> Searching the group, e.g. for the word "combine", did not turn up
> > >> a fitting solution either.
>
> > >> Thanks for any advice,
>
> > >> Holm from Germany
>
> > >> --
> > >> You received this message because you are subscribed to the Google
> > >> Groups "tesseract-ocr" group.
> > >> To post to this group, send email to tesseract-ocr@googlegroups.com
> > >> To unsubscribe from this group, send email to
> > >> tesseract-ocr+unsubscr...@googlegroups.com
> > >> For more options, visit this group at
> > >> http://groups.google.com/group/tesseract-ocr?hl=en

