[tesseract-ocr] Re: lstmtraining query

Samruddhi Dhake Thu, 16 Sep 2021 02:28:54 -0700

Hi,
One more question to add here: after running the 2nd command mentioned 
above, I get an assert in the file lstmtrainer.h, but I can't find this 
file in src/training in my Tesseract-OCR folder.
Can you please help me with this too?


Regards,
Samruddhi
On Thursday, September 16, 2021 at 2:54:25 PM UTC+5:30 Samruddhi Dhake 
wrote:

> Hello,
>
> *lstmtraining --model_output="D:\Test\output" 
> --continue_from="D:\Test\Dim_test.lstmf" 
> --train_listfile="D:\Test\eng.training_files.txt"  
> --traineddata="D:\Test\eng\eng.traineddata" --debug_interval -1 
> -max_iterations 10*
>
> After running the above command, I got:
>
> Warning: given outputs 111 not equal to unicharset of 110.
> Num outputs,weights in Series:
>   1,36,0,1:1, 0
> Num outputs,weights in Series:
>   C3,3:9, 0
>   Ft16:16, 160
> Total weights = 160
>   [C3,3Ft16]:16, 160
>   Mp3,3:16, 0
>   Lfys48:48, 12480
>   Lfx96:96, 55680
>   Lrx96:96, 74112
>   Lfx256:256, 361472
>   Fc110:110, 28270
> Total weights = 532174
> Built network:[1,36,0,1[C3,3Ft16]Mp3,3Lfys48Lfx96Lrx96Lfx256Fc110] from 
> request [1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]
> Training parameters:
>   Debug interval = -1, weights = 0.1, learning rate = 0.001, momentum=0.5
> null char=109
> Loaded 2/2 lines (1-2) of document D:\Test\Dim_test.lstmf
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 0: GROUND  TRUTH : +1.5
> Iteration 0: ALIGNED TRUTH : ++11..55
> Iteration 0: BEST OCR TEXT : _B_f_t_t_t_t_f
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.855%, delta=27.586%, train=450%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 1: GROUND  TRUTH : +1.5
> Iteration 1: ALIGNED TRUTH : ++11..55
> Iteration 1: BEST OCR TEXT : _______
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.839%, delta=27.586%, train=362.5%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 2: GROUND  TRUTH : +1.5
> Iteration 2: ALIGNED TRUTH : ++11..55
> Iteration 2: BEST OCR TEXT : _+++++++
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.821%, delta=27.586%, train=325%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 3: GROUND  TRUTH : +1.5
> Iteration 3: ALIGNED TRUTH : ++11..55
> Iteration 3: BEST OCR TEXT : +++++++
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.798%, delta=25.862%, train=300%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 4: GROUND  TRUTH : +1.5
> Iteration 4: ALIGNED TRUTH : ++11..55
> Iteration 4: BEST OCR TEXT : +++++++
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.765%, delta=24.138%, train=285%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 5: GROUND  TRUTH : +1.5
> Iteration 5: ALIGNED TRUTH : +11..55
> Iteration 5: BEST OCR TEXT : ++++...
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.704%, delta=22.414%, train=266.667%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 6: GROUND  TRUTH : +1.5
> Iteration 6: ALIGNED TRUTH : +1..555
> Iteration 6: BEST OCR TEXT : ++.
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.519%, delta=19.704%, train=239.286%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 7: GROUND  TRUTH : +1.5
> Iteration 7: ALIGNED TRUTH : +1..5555
> Iteration 7: BEST OCR TEXT : +.
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.308%, delta=17.672%, train=215.625%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 8: GROUND  TRUTH : +1.5
> Iteration 8: ALIGNED TRUTH : +1...555
> Iteration 8: BEST OCR TEXT : .
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=5.101%, delta=16.092%, train=200%(100%), skip ratio=100%
> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 
> 20 30 2e 30
> Can't encode transcription: '├ÿ423.1 0.0' in language ''
> Iteration 9: GROUND  TRUTH : +1.5
> Iteration 9: ALIGNED TRUTH : +1...55
> Iteration 9: BEST OCR TEXT : .......5
> File D:\Test\Dim_test.lstmf line 1 :
> Mean rms=4.888%, delta=14.828%, train=200%(100%), skip ratio=100%
> At iteration 10/10/20, Mean rms=4.888%, delta=14.828%, char train=200%, 
> word train=100%, skip ratio=100%,  New worst char error = 200 wrote 
> checkpoint.
>
> Finished! Error rate = 100
>
> Later, I ran this command:
> lstmtraining --stop_training --continue_from="D:\Test\output_checkpoint" 
> --traineddata="D:\Test\eng\eng.traineddata 
> --model_output="D:\Test\abc.traineddata"
>
> I am getting:
> mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file 
> ../../../../../src/training/lstmtrainer.h, line 96
>
>
> My question is: does lstmtraining first generate the output_checkpoint 
> file, and then, when that checkpoint is given to lstmtraining 
> --stop_training, does it generate the .traineddata file?
> Thank you.
>
> Regards,
> Samruddhi
>
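
The per-layer weight counts in the quoted log can be checked by hand. The sketch below is only an illustration: it assumes the usual textbook parameter counts (a dense layer has one weight per input plus a bias for each output; an LSTM has four gates, each seeing the inputs, the recurrent outputs and a bias). These formulas are not taken from the Tesseract source, and the helper names are made up for this example. Under those assumptions the numbers reproduce the log, including the final layer being built with 110 outputs rather than the 111 requested.

```python
def dense_weights(n_in: int, n_out: int) -> int:
    """Fully connected layer: one weight per input plus one bias, per output."""
    return n_out * (n_in + 1)


def lstm_weights(n_in: int, n_out: int) -> int:
    """LSTM layer: 4 gates, each seeing inputs, recurrent outputs and a bias."""
    return 4 * n_out * (n_in + n_out + 1)


# (layer name from the log, computed weight count, count reported in the log)
layers = [
    ("Ft16", dense_weights(3 * 3, 16), 160),    # 3x3 window feeding 16 units
    ("Lfys48", lstm_weights(16, 48), 12480),
    ("Lfx96", lstm_weights(48, 96), 55680),
    ("Lrx96", lstm_weights(96, 96), 74112),
    ("Lfx256", lstm_weights(96, 256), 361472),
    ("Fc110", dense_weights(256, 110), 28270),  # 110 outputs, as in the log
]

for name, computed, logged in layers:
    print(f"{name:8s} computed={computed:7d} logged={logged:7d} "
          f"match={computed == logged}")

total = sum(computed for _, computed, _ in layers)
print("total =", total)
```

The computed total is 532174, the same as the "Total weights" line in the log.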

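
The "Failure bytes" line in the quoted log can be decoded to see which character the trainer could not encode. This is a minimal sketch, assuming the bytes are UTF-8 and that the console rendered them with code page 437 (neither is stated in the thread); the leading "ffffff" on the first two values is presumably sign extension from printing signed chars, so only the low byte is kept.

```python
# Bytes copied from the "Failure bytes" line of the log.
logged = "ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30"

# Keep only the low 8 bits of each value (drops the assumed sign extension).
raw = bytes(int(token, 16) & 0xFF for token in logged.split())

as_utf8 = raw.decode("utf-8")    # the text as it was probably intended
as_cp437 = raw.decode("cp437")   # how a CP437 console would display it

print(repr(as_utf8))
print(repr(as_cp437))
print(f"first character: U+{ord(as_utf8[0]):04X}")
```

Read as UTF-8, the transcription starts with U+00D8 (Ø) followed by "423.1 0.0". Read as code page 437, the same bytes give the garbled form shown in the "Can't encode transcription" lines. If that reading is right, one possible explanation for the repeated encoding failure is that Ø is not in the unicharset of the eng.traineddata being used.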