I am also training for Amharic. I am pretty sure you are using Windows. I had exactly the same problem there. I think it is related to Unicode handling, but I was not able to solve the issue. I have now installed Ubuntu alongside Windows, and everything works fine.
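In case it helps narrow things down, here is a minimal Python sketch for decoding the "Failure bytes" from the quoted error. I am assuming the values are UTF-8 bytes printed as sign-extended hex (so only the low 8 bits of each value matter); `decode_failure_bytes` is just a helper name I made up, not part of Tesseract.

```python
import unicodedata


def decode_failure_bytes(text: str) -> str:
    """Decode a 'Failure bytes' string such as 'ffffffe1 ffffff8d ffffffad'.

    Assumption: each token is one UTF-8 byte printed as a sign-extended
    hex integer, so we keep only the low 8 bits of every token.
    """
    raw = bytes(int(token, 16) & 0xFF for token in text.split())
    return raw.decode("utf-8")


decoded = decode_failure_bytes("ffffffe1 ffffff8d ffffffad")
for ch in decoded:
    # Print the code point and its Unicode name (or a fallback if unnamed).
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name>"))
```

Under that assumption the bytes decode to U+136D, which is in the Ethiopic block but is not the same character as the 'ህ' (U+1205) shown in the message. So it may be worth checking whether that character appears in your ground-truth text but is missing from the unicharset you are training with. That is only a guess on my side; I have not confirmed it is the cause on Windows.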
On Tuesday, September 26, 2023 at 12:25:40 PM UTC+3 genet.g...@gmail.com wrote:

> I am new to tesseract and I have tried to train a Tesseract model for
> Amharic language
>
> and it never stops when it starts like this
>
> Can't encode transcription: 'ህ' in language ''
> Encoding of string failed! Failure bytes: ffffffe1 ffffff8d ffffffad
>
> anybody aware of this problem and how can I fine tune amh.traineddata? I
> have followed this tutorial GitHub - livezingy/tesstrain-win: Train
> Tesseract LSTM with make on Windows
> <https://github.com/livezingy/tesstrain-win/tree/master>