I am also training Tesseract for Amharic.
I am fairly sure you are using Windows; I had exactly the same problem
there. I think it is related to Unicode handling, but I was not able to
solve the issue. I have now installed Ubuntu alongside Windows, and
everything works fine.
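
For what it is worth, the "Failure bytes" in the error quoted below look
like UTF-8 bytes printed as sign-extended values (ffffffe1 instead of e1).
That reading is my assumption. Here is a minimal Python sketch that decodes
them under that assumption; the helper name decode_failure_bytes is made up
for illustration and is not part of Tesseract:

```python
def decode_failure_bytes(dump: str) -> str:
    """Decode a 'Failure bytes' dump, assuming each token is a
    sign-extended UTF-8 byte (e.g. 'ffffffe1' stands for 0xE1)."""
    # Keep only the low 8 bits of every hex token.
    raw = bytes(int(token, 16) & 0xFF for token in dump.split())
    return raw.decode("utf-8")


if __name__ == "__main__":
    text = decode_failure_bytes("ffffffe1 ffffff8d ffffffad")
    # Print each decoded character with its code point.
    for ch in text:
        print(ch, f"U+{ord(ch):04X}")
```

If the assumption holds, the three bytes form a single character, U+136D,
which is not the 'ህ' (U+1205) shown in the message, so the character that
fails to encode may be a different one in the same ground-truth line.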

On Tuesday, September 26, 2023 at 12:25:40 PM UTC+3 genet.g...@gmail.com 
wrote:

> I am new to Tesseract and have tried to train a Tesseract model for the
> Amharic language.
>
> The training never stops, and the output starts like this:
> Can't encode transcription: 'ህ' in language ''
> Encoding of string failed! Failure bytes: ffffffe1 ffffff8d ffffffad
>
>
> Is anybody aware of this problem, and how can I fine-tune
> amh.traineddata? I followed this tutorial: GitHub -
> livezingy/tesstrain-win: Train Tesseract LSTM with make on Windows
> <https://github.com/livezingy/tesstrain-win/tree/master>
>
