[tesseract-ocr] Similar pictures, different results

2018-05-15 Thread yang3781590
There are two similar pictures; the only difference between them is the size of the white border. One is recognized correctly (3.png), but the other is not (4.png). I don't know why. Can you help me? I used jTessBoxEditor to view the boxes, and it shows that Tesseract has boxed the right region.
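Since the two images differ only in the width of the white border, one way to test whether the border is what matters is to give both images the same margin before running OCR. The following is a minimal sketch in plain Python, assuming the image has already been loaded as a list of grayscale pixel rows (0 = black, 255 = white). The function name pad_with_white and the choice of border width are hypothetical, and in practice an image library would do the loading and saving.

```python
def pad_with_white(pixels, border, white=255):
    """Return a copy of a grayscale image (list of rows) with a white margin.

    `pixels` is assumed to be a rectangular list of rows of integer
    gray values; `border` is the margin width in pixels on every side.
    """
    if border < 0:
        raise ValueError("border must be non-negative")
    width = len(pixels[0]) if pixels else 0
    new_width = width + 2 * border

    # Blank rows above the original content.
    padded = [[white] * new_width for _ in range(border)]
    # Original rows with white columns added on the left and right.
    for row in pixels:
        padded.append([white] * border + list(row) + [white] * border)
    # Blank rows below the original content.
    padded.extend([white] * new_width for _ in range(border))
    return padded


if __name__ == "__main__":
    tiny = [[0, 0], [0, 0]]          # a 2x2 all-black test image
    for row in pad_with_white(tiny, 1):
        print(row)
```

Padding 3.png and 4.png to the same margin and comparing the results would show whether the border size alone explains the difference; it does not by itself explain why Tesseract behaves that way.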

[tesseract-ocr] How to train by tesseract 4.00

2018-06-03 Thread yang3781590
I have read that in version 4.00 a box only needs to cover a text line instead of individual characters. So I made a box file like this: 若存在,试求出实数λ的值; 0 0 256 48 0 Now I want to ask how to train with it. Or is it the same as in version 3? 【tesseract chi_my.font.exp0.tif chi_
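For comparison, here is a rough sketch of how a line-level box file could be generated. It rests on an assumption that should be checked against the Tesseract 4 training documentation: that each character still gets its own row (all rows sharing the text line's bounding box), and that the end of the line is marked by a row starting with a tab character. The helper name textline_box_lines and the one-pixel-wide terminator box are hypothetical choices for illustration.

```python
def textline_box_lines(text, left, bottom, right, top, page=0):
    """Build box-file rows for one text line.

    Assumption (not confirmed by this thread): one row per character,
    every row carrying the whole line's box, followed by a tab row
    that marks the end of the text line.
    """
    rows = [f"{ch} {left} {bottom} {right} {top} {page}" for ch in text]
    # Hypothetical end-of-line marker: a tab with a thin box at the right edge.
    rows.append(f"\t {right} {bottom} {right + 1} {top} {page}")
    return rows


if __name__ == "__main__":
    # The text line and coordinates quoted in the message above.
    for row in textline_box_lines("若存在,试求出实数λ的值;", 0, 0, 256, 48):
        print(row)
```

This differs from the single-row file shown in the message, where the whole sentence sits in front of one set of coordinates.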

[tesseract-ocr] Unicharset_extractor meet ICU ERROR

2018-06-03 Thread yang3781590
Environment
- Tesseract Version: <4.00>
- Platform:
Current Behavior:
C:\Users\Jerry\Desktop\新建文件夹>unicharset_extractor chi_my.font.exp0.box
Extracting unicharset from box file chi_my.font.exp0.box
ICU ERROR: U_FILE_ACCESS_ERROR
But I found that this can be solved by using [tesseract-ocr-se