>I get a file named output_checkpoint with 200MB. I renamed it to
ccy.traineddata and put it in the tessdata folder. *Is this how it's
supposed to be done*?

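No. The checkpoint is not a traineddata file, so renaming it will not
work. It has to be converted with lstmtraining --stop_training. Please see
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#combining-the-output-files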
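
As a rough sketch of the step described on that wiki page, the conversion
could be scripted like this. The helper name and the three file names are
hypothetical placeholders, not paths from this thread; the starter
traineddata is assumed to be the one produced by combine_lang_model, not the
renamed checkpoint.

```python
def stop_training_command(checkpoint, starter_traineddata, output_traineddata):
    """Build the lstmtraining call that converts a checkpoint to traineddata.

    Only the argument list is built here; nothing is executed.
    """
    return [
        "lstmtraining",
        "--stop_training",
        # Checkpoint written during training (e.g. <model_output>_checkpoint).
        "--continue_from", str(checkpoint),
        # Starter traineddata that was used to start training.
        "--traineddata", str(starter_traineddata),
        # With --stop_training, this is the final traineddata file to write.
        "--model_output", str(output_traineddata),
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = stop_training_command(
        "output_checkpoint", "starter/ccy.traineddata", "ccy.traineddata"
    )
    print(" ".join(cmd))
```

The printed command can be run by hand, or the list can be passed to
subprocess.run(cmd, check=True) once the paths are adjusted.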

>*Is there a way to check if a traineddata file is valid*?

Yes. combine_tessdata can list the components of a traineddata file:
https://github.com/tesseract-ocr/tesseract/blob/master/doc/combine_tessdata.1.asc

-d .traineddata FILE...: Lists directory of components from the
.traineddata file.

For example:

combine_tessdata -d tessdata/eng.traineddata
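
If you want to run that check from a script, a minimal sketch could look
like the following. The function name list_components is my own, and it
assumes combine_tessdata is on the PATH. I am not certain the exit code
reflects validity in every version, so read the printed listing instead of
relying on the return code alone.

```python
import shutil
import subprocess
from pathlib import Path


def list_components(traineddata, tool="combine_tessdata"):
    """Run `combine_tessdata -d FILE` and return (returncode, combined output)."""
    path = Path(traineddata)
    if not path.is_file():
        # Catch a wrong path before blaming the file contents.
        raise FileNotFoundError(f"no such file: {path}")
    exe = shutil.which(tool)
    if exe is None:
        raise RuntimeError(f"{tool} not found on PATH")
    result = subprocess.run(
        [exe, "-d", str(path)], capture_output=True, text=True
    )
    # The listing may go to stdout or stderr, so return both.
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    try:
        code, listing = list_components("tessdata/eng.traineddata")
        print(code)
        print(listing)
    except (FileNotFoundError, RuntimeError) as err:
        print(err)
```

A usable traineddata file should list its components; a renamed checkpoint
should not.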

On Tue, Sep 10, 2019 at 7:40 PM Nuno Feliciano <nfelici...@gmail.com> wrote:

>
> Thanks for the quick reply. The first time I got the error was after the
> training process, so I took a step back to reproduce the error.
>
> When I train the model
> lstmtraining
> --traineddata D:/software/Tesseract-OCR-4.0/tessdata/ccy.traineddata
> -U D:/software/Tesseract-OCR/tessdate/Latin.unicharset
> --train_listfile D:/software/Tesseract-OCR/training/list.train
> --net_spec
>  "[1,40,0,1 Ct5,5,64 Mp3,3 Lfys128 Lbx256 Lbx256 O1c1]"
>  --model_output D:/software/Tesseract-OCR/training/model/output
>
>  I get a file named output_checkpoint with 200MB. I renamed it to
> ccy.traineddata and put it in the tessdata folder. *Is this how it's
> supposed to be done*?
> Then, when I now execute the OCR, I get
> Error opening data file
> D:\software\Tesseract-OCR-4.0\tessdata/ccy.traineddata
> Please make sure the TESSDATA_PREFIX environment variable is set to your
> "tessdata" directory.
> Failed loading language 'ccy'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
> The file exists, and I can open it in a text editor.
>
> *Is there a way to check if a traineddata file is valid*?
>
> Thanks,
> Nuno
>
> segunda-feira, 9 de Setembro de 2019 às 17:09:39 UTC+1, shree escreveu:
>>
>> combine_lang_model only creates the starter traineddata. It is used as
>> part of the LSTM training process. It cannot be used for recognition.
>>
>> Training from scratch requires running the lstmtraining command.
>>
>> On Mon, Sep 9, 2019, 21:36 Nuno Feliciano <nfeli...@gmail.com> wrote:
>>
>>>
>>>
>>>
>>>
>>> Hi,
>>>
>>> I am trying to make a model from scratch.
>>> I created a language using
>>> combine_lang_model --input_unicharset
>>> D:\software\Tesseract-OCR-4.0\tessdata\Latin.unicharset --script_dir
>>> D:\software\Tesseract-OCR-4.0\tessdata --output_dir
>>> D:\software\Tesseract-OCR-4.0\training\output *--lang ccy*
>>> Then I put the generated ccy.traineddata file in the tessdata folder and
>>> execute
>>> tesseract --tessdata-dir D:\software\Tesseract-OCR-4.0\tessdata -l ccy
>>> <file> stdout, which gives me
>>> *Failed loading language 'ccy'*
>>> Tesseract couldn't load any languages!
>>> Could not initialize tesseract.
>>>
>>> tesseract --list-langs gives me
>>> ccy
>>> eng
>>> osd
>>> ...
>>>
>>> I got Latin.unicharset from
>>> https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm/master/Latin.unicharset
>>>
>>> Can anyone help?
>>>
>>> Thanks,
>>> Nuno Feliciano
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesser...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>


-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
