[tesseract-ocr] Watching the learning iteration is a better method than watching the BCER

Des Bw Wed, 18 Oct 2023 00:10:04 -0700

I am writing down a small observation here for beginners like me 
(I would love to be corrected if I am wrong). 
I am fine-tuning an existing best model by cutting off its top layer and 
retraining it. I have about 400,000 lines of text, and I generated the box 
and image files using text2image.


As I train the model, the BCER drops very low very fast. It took 
less than two epochs for the BCER to reach 0.001. That might sound like a 
good thing to an inexperienced user like me. But when I try the output model, 
its accuracy is nowhere near as good as the default best model. So I had to 
lower the target_error parameter (to 0.0001) and keep on training, 
and the model is getting better and better. 
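
To make the comparison with the default best model concrete, one option is 
to score both models on a few lines that were not part of the training set. 
The sketch below is a generic character error rate based on edit distance. 
It is not Tesseract's internal BCER computation, and the function names are 
made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete a character from a
                            curr[j - 1] + 1,     # insert a character into a
                            prev[j - 1] + cost)) # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(pairs):
    """pairs: (ground_truth, ocr_output) tuples from held-out lines.

    Returns total edit distance divided by total ground-truth length.
    """
    pairs = list(pairs)
    errors = sum(edit_distance(gt, out) for gt, out in pairs)
    total = sum(len(gt) for gt, _ in pairs)
    return errors / total if total else 0.0
```

Running the same held-out lines through the fine-tuned model and the default 
model and comparing the two rates gives a number that does not depend on the 
training log.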

So it looks like watching the learning iteration, which is the 
first of the three numbers in the iteration count 
(https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html#iterations-and-checkpoints), 
is a better approach than watching the BCER. If the learning iteration 
keeps growing, the model is still learning, and you need to keep 
training regardless of the BCER. 
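
For anyone who wants to track this from a saved training log, here is a 
minimal sketch. It assumes log lines shaped like the example on the linked 
page ("At iteration 14615/695400/698614, ... char train=1.882%, ..."), where 
the first number is the learning iteration. Some builds appear to print 
"BCER train=" instead of "char train=", so the pattern accepts both. The 
function names and the `window` parameter are my own, not part of Tesseract.

```python
import re

# Assumed log line shape (taken from the tessdoc example, not verified
# against every Tesseract version):
#   At iteration <learning>/<training>/<sample>, ... char train=<x>%, ...
LINE_RE = re.compile(
    r"At iteration (\d+)/(\d+)/(\d+),.*?(?:BCER|char) train=([\d.]+)%"
)


def parse_progress(log_text):
    """Return (learning_iteration, training_iteration, error_percent) per log line."""
    rows = []
    for m in LINE_RE.finditer(log_text):
        rows.append((int(m.group(1)), int(m.group(2)), float(m.group(4))))
    return rows


def still_learning(rows, window=3):
    """True if the learning iteration grew across the last `window` log lines."""
    if len(rows) < 2:
        return True  # not enough data to say the model has stalled
    recent = rows[-window:]
    return recent[-1][0] > recent[0][0]


if __name__ == "__main__":
    # Hypothetical log excerpt; the numbers are invented for illustration.
    sample = (
        "At iteration 100/1000/1000, mean rms=0.9%, delta=1.1%, "
        "BCER train=0.400%, BWER train=1.2%, skip ratio=0.0%\n"
        "At iteration 150/2000/2000, mean rms=0.8%, delta=0.9%, "
        "BCER train=0.350%, BWER train=1.0%, skip ratio=0.0%\n"
    )
    rows = parse_progress(sample)
    print(rows)
    print("still learning:", still_learning(rows))
```

The check deliberately ignores the error column: it only asks whether the 
first number is still moving, which is the signal described above.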
