[tesseract-ocr] Accuracy for Korean with Tesseract LSTM is only around 50%. Is this expected, or am I missing something?

2017-12-15 Thread Subrato Namata
Hi, I have been running Tesseract 4.0 alpha in LSTM mode for English, French, and German, and I'm getting close to 90% accuracy for each of them. With Korean, however, I'm getting 50% accuracy at most. Is this expected? I'm using the best trained data for Korean. And bel

[tesseract-ocr] Re: Invalid Digit recognition

2017-12-15 Thread james . quittenton
BTW, this is the image as I amended it. On Friday, December 15, 2017 at 6:54:10 AM UTC, richardc...@gmail.com wrote: > > Hi, I've been using the bytedeco javacpp wrap

[tesseract-ocr] Re: Invalid Digit recognition

2017-12-15 Thread james . quittenton
The first thing that strikes me is the numbers are slightly skewed. It also seems like quite a tall thin font and maybe tesseract isn't trained on it. I took your 463 image and straightened it, then resized it to make it about 25% shorter. Tesseract then reads it fine for me. If you know you alw
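
For anyone wanting to try the same preprocessing in code, here is a minimal sketch of the "straighten, then make it about 25% shorter" step. It assumes Pillow is installed and that you already know the skew angle; the names `shortened_size`, `straighten_and_shorten`, and `skew_degrees` are hypothetical and not from the original post, and the poster may well have done this by hand in an image editor.

```python
def shortened_size(width, height, shrink=0.25):
    """Return (width, new_height) with the height reduced by `shrink`."""
    if not 0 <= shrink < 1:
        raise ValueError("shrink must be in [0, 1)")
    # Keep the width, scale only the height; never go below 1 pixel.
    return width, max(1, round(height * (1 - shrink)))


def straighten_and_shorten(src_path, dst_path, skew_degrees, shrink=0.25):
    """Rotate the image by `skew_degrees` and squash its height.

    Assumption: the skew angle is supplied by the caller (measured by eye
    or by a separate deskew step); this sketch does not estimate it.
    """
    from PIL import Image  # Pillow, imported lazily so the helper above has no dependency

    with Image.open(src_path) as img:
        gray = img.convert("L")
        # Positive angles rotate counter-clockwise; fill exposed corners with white.
        straight = gray.rotate(skew_degrees, expand=True, fillcolor=255)
        target = shortened_size(*straight.size, shrink)
        straight.resize(target, Image.LANCZOS).save(dst_path)
```

Something like `straighten_and_shorten("463.png", "463_fixed.png", skew_degrees=-3)` would then produce a file to pass to Tesseract; the file names and the angle here are placeholders, so adjust them for your own image.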

Re: [tesseract-ocr] Re: Trying to add chars to tesseract 4.0

2017-12-15 Thread ShreeDevi Kumar
>> Thanks, I have read that the new tesseract-ocr 4.0 doesn't use the wordlist anymore. Is it meant for the older version? Is that right? The new 4.0alpha version does not REQUIRE the wordlist, but it uses it if available, and accuracy improves with the wordlist. So, basically, 4.0alpha will work without wor

Re: [tesseract-ocr] Re: Trying to add chars to tesseract 4.0

2017-12-15 Thread Fahad Al-Saidi
Thanks, I have read that the new tesseract-ocr 4.0 doesn't use the wordlist anymore. Is it meant for the older version? Is that right? On Fri, Dec 15, 2017 at 12:13 PM, shree wrote: > > > On Friday, December 8, 2017 at 5:46:01 PM UTC+5:30, Fahad Al-Saidi wrote: >> >> >> I have the same problem, why not the new

[tesseract-ocr] Re: Trying to add chars to tesseract 4.0

2017-12-15 Thread shree
On Friday, December 8, 2017 at 5:46:01 PM UTC+5:30, Fahad Al-Saidi wrote: > > > I have the same problem: why doesn't the new fine-tuned traineddata include > the old wordlist? It is supposed to do so. I followed the instructions in the > wiki but I got the same issue. Any help? > If you want the wordl