[tesseract-ocr] Re: lstmtraining query

Samruddhi Dhake Wed, 22 Sep 2021 21:37:46 -0700

Hi,
Can anyone help me resolve the issues above?

Regards,
Samruddhi


On Thursday, September 16, 2021 at 2:58:50 PM UTC+5:30 Samruddhi Dhake 
wrote:

>
> Hi,
> One more question: after running the 2nd command mentioned above, I am 
> getting an assert in file lstmtrainer.h, but I can't find this file in 
> src/training in my Tesseract-OCR folder.
> Can you please help me with this too?
>
> Regards,
> Samruddhi
> On Thursday, September 16, 2021 at 2:54:25 PM UTC+5:30 Samruddhi Dhake 
> wrote:
>
>> Hello,
>>
>> lstmtraining --model_output="D:\Test\output" 
>> --continue_from="D:\Test\Dim_test.lstmf" 
>> --train_listfile="D:\Test\eng.training_files.txt"  
>> --traineddata="D:\Test\eng\eng.traineddata" --debug_interval -1 
>> -max_iterations 10
>>
>> After running the above command, I got:
>>
>> Warning: given outputs 111 not equal to unicharset of 110.
>> Num outputs,weights in Series:
>>   1,36,0,1:1, 0
>> Num outputs,weights in Series:
>>   C3,3:9, 0
>>   Ft16:16, 160
>> Total weights = 160
>>   [C3,3Ft16]:16, 160
>>   Mp3,3:16, 0
>>   Lfys48:48, 12480
>>   Lfx96:96, 55680
>>   Lrx96:96, 74112
>>   Lfx256:256, 361472
>>   Fc110:110, 28270
>> Total weights = 532174
>> Built network:[1,36,0,1[C3,3Ft16]Mp3,3Lfys48Lfx96Lrx96Lfx256Fc110] from 
>> request [1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]
>> Training parameters:
>>   Debug interval = -1, weights = 0.1, learning rate = 0.001, momentum=0.5
>> null char=109
>> Loaded 2/2 lines (1-2) of document D:\Test\Dim_test.lstmf
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 0: GROUND  TRUTH : +1.5
>> Iteration 0: ALIGNED TRUTH : ++11..55
>> Iteration 0: BEST OCR TEXT : _B_f_t_t_t_t_f
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.855%, delta=27.586%, train=450%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 1: GROUND  TRUTH : +1.5
>> Iteration 1: ALIGNED TRUTH : ++11..55
>> Iteration 1: BEST OCR TEXT : _______
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.839%, delta=27.586%, train=362.5%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 2: GROUND  TRUTH : +1.5
>> Iteration 2: ALIGNED TRUTH : ++11..55
>> Iteration 2: BEST OCR TEXT : _+++++++
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.821%, delta=27.586%, train=325%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 3: GROUND  TRUTH : +1.5
>> Iteration 3: ALIGNED TRUTH : ++11..55
>> Iteration 3: BEST OCR TEXT : +++++++
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.798%, delta=25.862%, train=300%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 4: GROUND  TRUTH : +1.5
>> Iteration 4: ALIGNED TRUTH : ++11..55
>> Iteration 4: BEST OCR TEXT : +++++++
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.765%, delta=24.138%, train=285%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 5: GROUND  TRUTH : +1.5
>> Iteration 5: ALIGNED TRUTH : +11..55
>> Iteration 5: BEST OCR TEXT : ++++...
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.704%, delta=22.414%, train=266.667%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 6: GROUND  TRUTH : +1.5
>> Iteration 6: ALIGNED TRUTH : +1..555
>> Iteration 6: BEST OCR TEXT : ++.
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.519%, delta=19.704%, train=239.286%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 7: GROUND  TRUTH : +1.5
>> Iteration 7: ALIGNED TRUTH : +1..5555
>> Iteration 7: BEST OCR TEXT : +.
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.308%, delta=17.672%, train=215.625%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 8: GROUND  TRUTH : +1.5
>> Iteration 8: ALIGNED TRUTH : +1...555
>> Iteration 8: BEST OCR TEXT : .
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=5.101%, delta=16.092%, train=200%(100%), skip ratio=100%
>> Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 
>> 31 20 30 2e 30
>> Can't encode transcription: '├ÿ423.1 0.0' in language ''
>> Iteration 9: GROUND  TRUTH : +1.5
>> Iteration 9: ALIGNED TRUTH : +1...55
>> Iteration 9: BEST OCR TEXT : .......5
>> File D:\Test\Dim_test.lstmf line 1 :
>> Mean rms=4.888%, delta=14.828%, train=200%(100%), skip ratio=100%
>> At iteration 10/10/20, Mean rms=4.888%, delta=14.828%, char train=200%, 
>> word train=100%, skip ratio=100%,  New worst char error = 200 wrote 
>> checkpoint.
>>
>> Finished! Error rate = 100
>>
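The "Failure bytes" in the log above can be decoded to see which character the trainer could not encode. This is a minimal sketch, assuming the dump is UTF-8 text with each byte printed as a sign-extended hex value (which is what the ffffff prefixes suggest); the helper name failure_bytes_to_raw is hypothetical and not part of Tesseract.

```python
def failure_bytes_to_raw(dump: str) -> bytes:
    """Convert the hex dump printed after 'Failure bytes:' back into raw bytes."""
    # Tokens such as ffffffc3 look like sign-extended chars, so keep
    # only the low 8 bits of each value (assumption, see lead-in).
    return bytes(int(token, 16) & 0xFF for token in dump.split())


# Hex dump copied from the log above.
dump = "ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30"
raw = failure_bytes_to_raw(dump)

# Interpreted as UTF-8, this is the actual ground-truth transcription.
text = raw.decode("utf-8")

# Interpreted as code page 437, this is what the console displayed.
console_view = raw.decode("cp437")

print(ascii(text))
print(ascii(console_view))
print("first character: U+%04X" % ord(text[0]))
```

The first character decodes to U+00D8 (Ø), and the '├ÿ' in the log is the same two bytes shown in code page 437. The "Can't encode transcription" message therefore likely means that Ø is not in the unicharset of the eng.traineddata being used, so that line is skipped on every iteration.
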
>> Later, running the command
>> lstmtraining --stop_training --continue_from="D:\Test\output_checkpoint" 
>> --traineddata="D:\Test\eng\eng.traineddata 
>> --model_output="D:\Test\abc.traineddata"
>>
>> I am getting:
>> mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file 
>> ../../../../../src/training/lstmtrainer.h, line 96
>>
>>
>> My question is: does lstmtraining first generate the output_checkpoint 
>> file, and does passing that to lstmtraining --stop_training then 
>> generate the .traineddata file?
>> Thank you.
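
The two-step flow asked about here (train to a checkpoint, then convert the checkpoint with --stop_training) can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, the paths are the ones quoted in this thread, and the script only builds the argument list; it does not run lstmtraining.

```python
def checkpoint_path(model_output: str) -> str:
    # Training with --model_output=<prefix> writes <prefix>_checkpoint,
    # which matches the output_checkpoint file named in this thread.
    return model_output + "_checkpoint"


def stop_training_args(model_output: str, base_traineddata: str,
                       final_traineddata: str) -> list:
    """Build the arguments that convert a checkpoint into a .traineddata file."""
    return [
        "lstmtraining",
        "--stop_training",
        "--continue_from=" + checkpoint_path(model_output),
        "--traineddata=" + base_traineddata,
        "--model_output=" + final_traineddata,
    ]


cmd = stop_training_args(
    r"D:\Test\output",               # prefix used in the training run
    r"D:\Test\eng\eng.traineddata",  # starter traineddata from the thread
    r"D:\Test\abc.traineddata",      # final model to write
)

# Passing the list to subprocess.run(cmd, check=True) would avoid shell
# quoting entirely; here the arguments are only printed for inspection.
for arg in cmd:
    print(arg)
```

Building the arguments as a list sidesteps quoting mistakes. As quoted above, the --traineddata value in the second command has no closing quote; if it was typed that way, the shell may have passed a malformed path, which would be consistent with the assert on mgr_.Init(traineddata_path.c_str()).
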
>>
>> Regards,
>> Samruddhi
>>
>

