I have uploaded a new version of the traineddata file at
https://github.com/Shreeshrii/tessdata_shreetest/blob/master/iast-layer-18003.traineddata
Attached is the OCR output it produces for pages 13-24 of the dark PDF.
I am still training a different variation.
On Wed, Jun 27, 2018 at 6:46 PM Shree Devi
Also check that there is no tab or other unprintable character in your
training text.
Which version of Tesseract are you using? Show the output of
tesseract -v
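One way to run the check for tabs and unprintable characters is a short script like the sketch below. It is only an illustration, not part of the Tesseract tooling: the function name `find_suspect_chars` is hypothetical, and the choice of Unicode categories to flag (control, format, private use, unassigned) is an assumption. Some scripts legitimately use format characters such as the zero-width non-joiner, so a hit is something to review, not automatically an error.

```python
import unicodedata

# Unicode general categories treated as "suspect" in this sketch (an assumption):
# Cc = control (includes tab), Cf = format, Co = private use, Cn = unassigned.
SUSPECT_CATEGORIES = {"Cc", "Cf", "Co", "Cn"}


def find_suspect_chars(text):
    """Return a list of (line_number, column, code_point) for suspect characters."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) in SUSPECT_CATEGORIES:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


# In-memory example; for a real file, read it with open(path, encoding="utf-8").
sample = "ab\tc\nok"
for line_no, col, code_point in find_suspect_chars(sample):
    print(f"line {line_no}, column {col}: {code_point}")
```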
On Sat, Jun 30, 2018 at 8:04 PM Shree Devi Kumar
wrote:
Then there must be a mismatch between the unicharset you are using and the
training text. For example, check whether the copyright symbol is in your unicharset.
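A rough way to look for such a mismatch is to compare the characters in the training text against the entries in the unicharset file. The sketch below is an assumption-laden illustration, not a Tesseract utility: it assumes the unicharset is a UTF-8 text file whose first line is the entry count, that each following line starts with the symbol followed by a space, and that the entry `NULL` stands for the space character. The function names are hypothetical.

```python
def unicharset_symbols(unicharset_text):
    """Collect the symbol at the start of each unicharset entry line."""
    symbols = set()
    # Assumption: the first line holds the number of entries, so skip it.
    for line in unicharset_text.splitlines()[1:]:
        if not line.strip():
            continue
        symbol = line.split(" ", 1)[0]
        # Assumption: "NULL" is how the space character is written.
        symbols.add(" " if symbol == "NULL" else symbol)
    return symbols


def missing_chars(unicharset_text, training_text):
    """Return training-text characters that appear in no unicharset entry."""
    # Entries may be multi-character, so compare at the single-character level.
    known = set("".join(unicharset_symbols(unicharset_text)))
    return sorted(
        {ch for ch in training_text if not ch.isspace() and ch not in known}
    )


# Tiny in-memory example; the property columns after each symbol are elided.
example_unicharset = "3\nNULL 0\na 3\nb 3\n"
print(missing_chars(example_unicharset, "ab \u00a9"))
```

Any character this reports, such as the copyright symbol, is a candidate to either add to the unicharset or remove from the training text.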
On Sat, Jun 30, 2018 at 4:48 PM john wrote:
I saw that link. This error occurred many times. How can I prevent it?
On Saturday, June 30, 2018 at 3:17:26 PM UTC+4:30, shree wrote:
see
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#error-messages-from-training
On Sat, Jun 30, 2018 at 3:23 PM john wrote:
Encoding of string failed! Failure bytes: ffc2 ffa9 20 ffd8
ffa8 ffd8 ffa7 ffd8 ffae ffd8 ffaa ffd9
ff86 ffd8 ffa7 20 ffd9 ff84 ffd8 ffa7 ffd8
ffa4 ffd8 ffb3 20 ffdb ff8c ffd9 ff86 ffd8
f
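The failure bytes can be turned back into readable text to see which characters were rejected. The sketch below assumes the `ff` prefix on most values is a sign-extension artifact of how the bytes were printed, so only the low 8 bits of each value are kept, and that the result is UTF-8. The function name `decode_failure_bytes` is hypothetical. Under that assumption the first two bytes, c2 a9, are the UTF-8 encoding of U+00A9, the copyright sign. The dump in the archive is cut off, so only the complete values are used here and any incomplete trailing sequence is shown as a replacement character.

```python
def decode_failure_bytes(dump):
    """Decode a whitespace-separated hex dump, keeping the low byte of each value."""
    # Assumption: values like "ffc2" are sign-extended bytes, so mask to 8 bits.
    raw = bytes(int(token, 16) & 0xFF for token in dump.split())
    # errors="replace" keeps going if the dump ends in the middle of a character.
    return raw.decode("utf-8", errors="replace")


# The complete values from the error message above (the dump itself is truncated).
dump = """
ffc2 ffa9 20 ffd8 ffa8 ffd8 ffa7 ffd8 ffae ffd8 ffaa ffd9
ff86 ffd8 ffa7 20 ffd9 ff84 ffd8 ffa7 ffd8
ffa4 ffd8 ffb3 20 ffdb ff8c ffd9 ff86 ffd8
"""
print(decode_failure_bytes(dump))
```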
Hello,
thank you for your answer.
I found the answer in LibreOffice: in the File > Open dialog, set the file
type filter to text with a selectable encoding, then choose UTF-8.
Kind regards,
Martin
On 29.06.2018 at 19:45, Zdenko Podobny wrote:
This is not a Tesseract problem:
https://ask.libreoffice.org/en/question/97993/why-doesnt-lo-w