Then there must be a mismatch between the unicharset you are using and the training text. For example, check whether the copyright symbol is in your unicharset: the failure bytes in your log start with c2 a9, which is the UTF-8 encoding of ©.
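To see which characters are involved, here is a rough Python sketch, not an official tool. It assumes the log prints each byte as a sign-extended hex value (hence the ffffff prefix), and that the unicharset file has the entry count on its first line followed by one entry per line whose first space-separated field is the unichar. The function names are made up for illustration. It compares single code points only, so entries that are multi-character clusters, or text that Tesseract normalizes before encoding, may show up as false alarms.

```python
def decode_failure_bytes(dump: str) -> str:
    """Decode a dump like 'ffffffc2 ffffffa9 20' by keeping the low byte of each token."""
    raw = bytes(int(token, 16) & 0xFF for token in dump.split())
    return raw.decode("utf-8", errors="replace")


def load_unichars(lines) -> set:
    """Collect the first field of every entry line (assumed unicharset layout)."""
    entries = iter(lines)
    next(entries, None)  # skip the first line, assumed to hold the entry count
    unichars = set()
    for line in entries:
        line = line.rstrip("\n")
        if line:
            unichars.add(line.split(" ", 1)[0])
    return unichars


def missing_chars(text: str, unichars: set) -> list:
    """Return the non-whitespace code points of text that have no unicharset entry."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in unichars})


if __name__ == "__main__":
    # Tiny in-memory demo; the property fields after each unichar are placeholders.
    demo_unicharset = ["3\n", "NULL 0\n", "a 3\n", "\u0628 1\n"]
    text = decode_failure_bytes("ffffffc2 ffffffa9 20 ffffffd8 ffffffa8")
    for ch in missing_chars(text, load_unichars(demo_unicharset)):
        print(f"U+{ord(ch):04X} {ch!r} is not in the unicharset")
```

To run it on real data, pass the opened unicharset file to load_unichars and the contents of your training text to missing_chars. Any character it reports either has to be added to the unicharset or removed from the training text.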
On Sat, Jun 30, 2018 at 4:48 PM john <[email protected]> wrote:

> I saw that link. This error occurred many times; how can I prevent it?
>
> On Saturday, June 30, 2018 at 3:17:26 PM UTC+4:30, shree wrote:
>>
>> see
>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#error-messages-from-training
>>
>> On Sat, Jun 30, 2018 at 3:23 PM john <[email protected]> wrote:
>>
>>> Encoding of string failed! Failure bytes: ffffffc2 ffffffa9 20 ffffffd8
>>> ffffffa8 ffffffd8 ffffffa7 ffffffd8 ffffffae ffffffd8 ffffffaa ffffffd9
>>> ffffff86 ffffffd8 ffffffa7 20 ffffffd9 ffffff84 ffffffd8 ffffffa7 ffffffd8
>>> ffffffa4 ffffffd8 ffffffb3 20 ffffffdb ffffff8c ffffffd9 ffffff86 ffffffd8
>>> ffffffa7 ffffffd8 ffffffb1 ffffffdb ffffff8c ffffffd8 ffffffa7 20 ffffffd8
>>> ffffffa7 ffffffd8 ffffffa8 20 ffffffd8 ffffffaa ffffffd8 ffffffa8 ffffffd8
>>> ffffffab ffffffd9 ffffff87 20 ffffffd8 ffffffaf ffffffd8 ffffffa7 ffffffd9
>>> ffffff81 ffffffd8 ffffffaa ffffffd8 ffffffb3 ffffffd8 ffffffa7 20 ffffffd9
>>> ffffff86 ffffffdb ffffff8c ffffffd9 ffffff86 ffffffda ffffff86 ffffffd9
>>> ffffff85 ffffffd9 ffffff87 20 ffffffd9 ffffff82 ffffffd9 ffffff84 ffffffd8
>>> ffffffb7 ffffffd9 ffffff85
>>> Can't encode transcription: '۱۹ 2006© باختنا لاؤس یناریا اب تبثه دافتسا
>>> نینچمه قلطم' in language ''
>>> ^C
>>>
>>> When I fine-tune the network for the fas language, I see the error above.
>>> What is wrong with the training?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.

