[tesseract-ocr] Tesseract Open Source OCR Engine v3.04.01 with Leptonica read_params_file: Can't open 7 read_params_file: Can't open c read_params_file: Can't open tessedit_char_whitelist=0123456789.%

2019-05-03 Thread maneesha . intern
All the issues I see around this subject are old, but I still had many issues setting up Tesseract on an AWS EC2 instance. $TESSDATA_PREFIX is set to /usr/local/share and eng.traineddata is present in the tessdata folder (/usr/local/share/tessdata) along with the config files. Can someone pl

[tesseract-ocr] Tesseract-ocr 3.05 vb6 integration

2019-05-03 Thread Giuseppe Romano
Hi, my name is Giuseppe and I have this problem: I use tesseract-ocr 3.05 through shell integration, but it is really slow. I haven't found a way to use it through the DLL. Is this possible? Unfortunately, the project must be developed on the VB6 platform. Thanks -- You received this message because you are subscribe

Re: [tesseract-ocr] Fine tuning existing model

2019-05-03 Thread Lorenzo Bolzani
See answer inline. On Fri, 3 May 2019 at 03:48, Tairen Chen wrote: > > 1. I define the "--max_iterations 2" but the training stops at > 5700, like below: > " At iteration 351/5700/5700, Mean rms=0.117%, delta=0%, char > train=0%, word train=0%, skip ratio=0%, wr

Re: [tesseract-ocr] Fine tuning existing model

2019-05-03 Thread Shree Devi Kumar
>There are three model sizes: best, normal and fast. Each of these can also be converted to an integer model. Only `best` can be converted to integer and in fact the LSTM models in `tessdata` are the integer versions of best along with the base/legacy models. `fast` models have been trained with

Re: [tesseract-ocr] Fine tuning existing model

2019-05-03 Thread Lorenzo Bolzani
Shree, thanks for the clarification. On Fri, 3 May 2019 at 11:59, Shree Devi Kumar < shreesh...@gmail.com> wrote: > >There are three model sizes: best, normal and fast. Each of these can > also be converted to an integer model. > > Only `best` can be converted to integer and in fa

Re: [tesseract-ocr] Re: configure: error: Required OpenCL library not found!

2019-05-03 Thread jbdata31
I need to compile, and by the way I use *OpenCL*. My steps on Ubuntu:
    sudo apt-get install opencl-headers
    sudo apt install ocl-icd-opencl-dev
    export LDFLAGS=-L/usr/lib/x86_64-linux-gnu
    ./configure --enable-debug --enable-opencl
... Configuration is done. You can now build and install tesseract by ru

Re: [tesseract-ocr] Tesseract Open Source OCR Engine v3.04.01 with Leptonica read_params_file: Can't open 7 read_params_file: Can't open c read_params_file: Can't open tessedit_char_whitelist=01234567

2019-05-03 Thread Zdenko Podobny
First of all: 3.04 is old. If you want to use Tesseract 3.x, use 3.05.02. There were a lot of bug fixes (not related to your problem). Next: if you see "read_params_file" in the error message, it means (98% of the time ;-)) that your command for running tesseract is wrong. And you posted everything else but not how yo
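The poster's actual command is not shown, so the following is only a sketch of how a command line for this kind of task might be assembled. It relies on one version difference: Tesseract 3.x spells the page segmentation option `-psm`, while 4.x and later spell it `--psm`. Tokens the CLI does not recognise as options may end up being treated as config file names, which would be consistent with the "Can't open" messages, but that reading is an assumption. The helper name `build_tesseract_command` and the file names are hypothetical.

```python
def build_tesseract_command(image, output_base, psm, whitelist, major_version):
    """Return an argv list for the tesseract CLI. Nothing is executed here."""
    # Tesseract 3.x uses "-psm"; 4.x and later use "--psm".
    psm_flag = "-psm" if major_version < 4 else "--psm"
    return [
        "tesseract", image, output_base,
        psm_flag, str(psm),
        # "-c name=value" sets a single config variable from the command line.
        "-c", "tessedit_char_whitelist=" + whitelist,
    ]


# Hypothetical usage for a 3.x install; pass the list to subprocess.run
# so the shell never sees (or splits) the "%" and "." characters.
command = build_tesseract_command("digits.png", "out", 7, "0123456789.%", 3)
print(command)
```

Building the command as a list rather than a single string avoids quoting problems, one common way for option values to be split into stray arguments.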

Re: [tesseract-ocr] Tesseract-ocr 3.05 vb6 integration

2019-05-03 Thread Zdenko Podobny
What do you mean by "i haven't find the way to use it by dll"? "is this possible?" Yes, it is. tesseract.exe uses tesseract40.dll, so you can use it like any other library. Zdenko On Fri, 3 May 2019 at 9:16, Giuseppe Romano wrote: > Hi, > my name is Giuseppe, i have this problem. I use tesseract-o

Re: [tesseract-ocr] Fails to recognize seemingly simple text

2019-05-03 Thread Arjun Bk
Hi Lorenzo, The link you shared was very helpful and your valuable suggestions helped me a lot. Now the image detection seems to work for at least 90% of my cases. Just for information, I started with a top-hat transform followed by the Sobel operator and then applied the Otsu thresholding. Th
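The post names three preprocessing steps (top-hat transform, Sobel operator, Otsu thresholding) without showing code. As a rough illustration of the last step only, here is a pure-Python sketch of Otsu's method on a flat list of 8-bit grey values. The function name `otsu_threshold` is hypothetical, the top-hat and Sobel steps are not shown, and an image library would normally do this far faster.

```python
def otsu_threshold(pixels):
    """Return the grey level t that maximises the between-class variance.

    `pixels` is a flat iterable of integers in 0..255. Values <= t form
    one class and values > t form the other.
    """
    hist = [0] * 256
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of grey values of those pixels
    best_t = 0
    best_score = -1.0
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue    # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break       # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = weight_low * weight_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score = score
            best_t = t
    return best_t


# Two well-separated grey levels: the threshold falls between them.
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
binary = [1 if p > t else 0 for p in sample]
```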

Re: [tesseract-ocr] Tesseract-ocr 3.05 vb6 integration

2019-05-03 Thread Giuseppe Romano
Thanks for the answer, but my Tesseract version is 3.05 and tesseract40.dll unfortunately does not exist in the installation folder. Tesseract.exe uses libtesseract-3.dll and I don't have the API entry points. Can you help me with the API declarations? Thanks On Fri, 3 May 2019 at 16:09, Zdenko Po

Re: [tesseract-ocr] Tesseract-ocr 3.05 vb6 integration

2019-05-03 Thread Zdenko Podobny
It is just an older version, but the basics are the same: https://github.com/tesseract-ocr/tesseract/blob/3.05/api/baseapi.h Or the C API: https://github.com/tesseract-ocr/tesseract/blob/3.05/api/capi.h Zdenko On Fri, 3 May 2019 at 16:57, Giuseppe Romano wrote: > Thanks for the answer but my tesseract ver
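The thread asks for VB6 declarations, which are not given here. As a language-neutral illustration of what "declaring the entry points" of the C API involves, the sketch below attaches argument and return types to a handful of functions from capi.h using Python's ctypes. The helper name `declare_capi` is hypothetical, the DLL name in the usage comment is the one mentioned in the thread, and the signatures should be checked against the capi.h of the installed version.

```python
import ctypes


def declare_capi(lib):
    """Attach ctypes signatures for a few capi.h functions to `lib`.

    The TessBaseAPI handle is treated as an opaque pointer (c_void_p).
    """
    lib.TessVersion.argtypes = []
    lib.TessVersion.restype = ctypes.c_char_p

    lib.TessBaseAPICreate.argtypes = []
    lib.TessBaseAPICreate.restype = ctypes.c_void_p

    # int TessBaseAPIInit3(handle, datapath, language); 0 means success.
    lib.TessBaseAPIInit3.argtypes = [
        ctypes.c_void_p, ctypes.c_char_p, ctypes.c_char_p,
    ]
    lib.TessBaseAPIInit3.restype = ctypes.c_int

    lib.TessBaseAPIEnd.argtypes = [ctypes.c_void_p]
    lib.TessBaseAPIEnd.restype = None

    lib.TessBaseAPIDelete.argtypes = [ctypes.c_void_p]
    lib.TessBaseAPIDelete.restype = None
    return lib


# Usage sketch, assuming the DLL can be found on the search path
# (left commented out because it needs an installed Tesseract):
# lib = declare_capi(ctypes.CDLL("libtesseract-3.dll"))
# print(lib.TessVersion().decode())
```

For VB6 specifically, `Declare` statements expect the stdcall calling convention, so it is worth checking which convention the DLL's exports use before calling them directly.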

Re: [tesseract-ocr] Fine tuning existing model

2019-05-03 Thread Tairen Chen
Hi, Lorenzo, Thank you very much for your reply. It really gives me more clues about the training. All the best, Tairen On Friday, May 3, 2019 at 2:30:12 AM UTC-7, Lorenzo Blz wrote: > > See answer inline. > > On Fri, 3 May 2019 at 03:48, Tairen Chen > wrote

Re: [tesseract-ocr] Fine tuning existing model

2019-05-03 Thread Tairen Chen
Thank you for your further explanation, Shree!! On Friday, May 3, 2019 at 2:59:12 AM UTC-7, shree wrote: > > >There are three model sizes: best, normal and fast. Each of these can > also be converted to an integer model. > > Only `best` can be converted to integer and in fact the LSTM models in

Re: [tesseract-ocr] How to increase tesseract model accuracy

2019-05-03 Thread tcs49
How did you add a blacklist? On Monday, April 29, 2019 at 11:32:14 PM UTC-4, Jonathan wrote: > > If you know you won't have numbers, what worked for me is blacklisting > numbers. Otherwise you will have to improve the image quality (like > resizing to bigger size and sharping the edges) > > On M

[tesseract-ocr] Tesseract - Fails on sharp image, but works on blurred

2019-05-03 Thread Kamal Muradov
The OCR gets more accurate with each blur... I've tried adding margins and resizing and the results are the same. Anyone else have this issue?

Re: [tesseract-ocr] Tesseract-ocr 3.05 vb6 integration

2019-05-03 Thread Giuseppe Romano
Thanks, I will try to use this in my VB6 project. Best regards, Giuseppe On Fri, 3 May 2019 at 17:04, Zdenko Podobny wrote: > it is just older version, but basics are the same: > https://github.com/tesseract-ocr/tesseract/blob/3.05/api/baseapi.h > > Or C-API: > https://github.com/tesser

[tesseract-ocr] After fine tunning training, how do i run on the new model?

2019-05-03 Thread thiyamjennil
I performed fine-tuning by adding some extra data to the existing model ben.traindata and named the result "ds_10k.traindata". How do I now run OCR with this model? By default, using ben.traindata, I just follow the command provided by tesseract, but now I want to know whether I nee
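No reply to this question appears in this listing. One common approach, sketched here under assumptions, is to point the CLI at the directory holding the new model with `--tessdata-dir` and select it with `-l`, using the file name without its extension. The CLI looks for files named `<lang>.traineddata`, so a file literally named "ds_10k.traindata" would presumably need renaming first. The helper name `build_custom_model_command` and the paths are hypothetical.

```python
import os


def build_custom_model_command(image, output_base, traineddata_path):
    """Return an argv list that runs tesseract with a custom model file.

    Nothing is executed here; the list is meant for subprocess.run.
    """
    tessdata_dir, filename = os.path.split(traineddata_path)
    lang, extension = os.path.splitext(filename)
    if extension != ".traineddata":
        raise ValueError("expected a file named <lang>.traineddata")
    return [
        "tesseract", image, output_base,
        "--tessdata-dir", tessdata_dir,   # directory containing the model
        "-l", lang,                       # model name without extension
    ]


# Hypothetical paths, for illustration only.
command = build_custom_model_command(
    "page.png", "out", "/models/ds_10k.traineddata"
)
print(command)
```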