Hello, I fine-tuned the eng.traineddata model because I wanted it to recognize the Greek lambda symbol. I got the ocrlambda.traineddata file and I want to evaluate it using lstmeval.
When I evaluate a checkpoint file with the --traineddata parameter set to eng.traineddata, I get terrible results, with this error on every iteration where the lambda appears:

************************
Encoding of string failed! Failure bytes: ce bb 4d 4e 44
Can't encode transcription: 'LY O kcXλMND' in language ''
Truth:8Az7V I vUOs
OCR :i g8 A z. 7 V I vlU O0 s.l
Line BCER=1.000000, BWER=0.666667
*************************

But when I run it with --traineddata set to ocrlambda.traineddata, I get really good results. In the documentation, however, I saw that the --traineddata parameter should be set to the file that was given to the trainer.

Is this an error, or am I misunderstanding or doing something wrong?

Thanks a lot
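In case it helps to read the error, here is a minimal Python sketch (not part of Tesseract; the helper name `decode_failure_bytes` is made up for illustration) that decodes the reported failure bytes, assuming they are UTF-8:

```python
# Decode the "Failure bytes" from the lstmeval error message.
# Assumption: the bytes are a UTF-8 encoded fragment of the transcription.
failure_bytes_hex = "ce bb 4d 4e 44"  # copied from the error output


def decode_failure_bytes(hex_string: str) -> str:
    """Turn a space-separated hex dump into text, assuming UTF-8."""
    return bytes.fromhex(hex_string).decode("utf-8")


decoded = decode_failure_bytes(failure_bytes_hex)
print(decoded)  # prints "λMND"
```

The decoded text starts with the lambda (ce bb) and matches the tail of the transcription 'LY O kcXλMND', so the encoding seems to fail exactly where the lambda appears.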

