[tesseract-ocr] Re: Received error unexpected surface format -1 following tesseract training (LSTM) tutorial.

2019-09-16 Thread David Maung
I believe I have found the issue. The TrainingTesseract 4.0 document is unclear on which directory to run the commands from. Specifically, under *Building the Training Tools*, it asks you to run the following commands:
    make
    make training
    sudo make training-install
I performed this task where

[tesseract-ocr] Unclear error message when running tesstrain.sh

2019-09-16 Thread David Maung
Hello, I attempted to run the following command:
    src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
      --noextract_font_properties --langdata_dir ~/tesstutorial/langdata \
      --tessdata_dir ~/tesstutorial/tesseract/tessdata --output_dir ~/tesstutorial/engtrain
(which is

[tesseract-ocr] Re: individual word image buffers

2019-09-16 Thread David Maung
Hello, I don't know about dumping image buffers while debugging; however, there are 2 command line options and the API that might help you or point you in the right direction. https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage The TSV and HOCR options allow you to output the re
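As a rough sketch of how the TSV output mentioned above could be used to locate individual words: the snippet below parses TSV text of the kind written by `tesseract image.png out tsv` and returns one bounding box per recognized word. The sample rows and the `word_boxes` helper are made up for illustration; cropping each box out of the source image is left to whichever imaging library is in use.

```python
import csv
import io

# Hypothetical sample of Tesseract TSV output; the column names follow the
# header row that the `tsv` output config writes.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t30\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t110\t92\t80\t30\t91\tworld\n"
)


def word_boxes(tsv_text):
    """Return (text, left, top, right, bottom) for every word-level row."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    boxes = []
    for row in reader:
        # Level 5 rows describe single words; skip pages, blocks, paragraphs,
        # lines, and rows without any text.
        if row["level"] != "5" or not (row["text"] or "").strip():
            continue
        left, top = int(row["left"]), int(row["top"])
        right = left + int(row["width"])
        bottom = top + int(row["height"])
        boxes.append((row["text"], left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    for box in word_boxes(SAMPLE_TSV):
        print(box)
```

Each tuple can then be handed to an image-cropping call to obtain the per-word image region.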

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
So I visited the pull request page of tesseract-ocr, copied the URL, added a # in front of it, and appended it to the gitpod.io URL. This allowed me to auto-install tesseract-ocr. Afterwards I ran those two commands in the link from @Lorenzo

[tesseract-ocr] Docker Image for Tesseract 4.1

2019-09-16 Thread Hongguo An
Hi, is there a Docker image for Tesseract 4.1? I tried tesseractshadow/tesseract4re, but it has the older version 4.0. When I tried to upgrade based on that image, it installed 4.1 but I can't run it: apt install tesseract-ocr debconf: dela

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
Come on, guys, you might think this should be easy for me, but it's not. I admit, I fail to see the simplicity. ;-) -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email t

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Timothy Snyder
Have you tried calling the tesseract executable from the command line yet? Can we confirm that you've successfully downloaded and compiled Tesseract? On Monday, September 16, 2019 at 5:13:20 PM UTC-4, Clint William Theron wrote: > > com'on guys, you might think this should be easy for me but it'
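The check suggested here can also be scripted. A minimal sketch, assuming only that the executable is named `tesseract` and accepts `--version`; the helper name `tesseract_version` is made up for this example:

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line of `<executable> --version`, or None if the
    executable cannot be found on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    version = tesseract_version()
    if version is None:
        print("tesseract was not found on PATH; it may not be installed or built yet")
    else:
        print("found:", version)
```

A `None` result means the shell cannot find the binary at all, which separates "not installed or not compiled" from problems that only appear once OCR actually runs.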

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
The question of this post is not really applicable anymore. For now I just want to do OCR with tesseract on a node.js server which I have on gitpod.io service. I said what I have done till now... ps. What questions go through your mind while reading this that I omitted to answer that's preventi

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
*Have you tried calling the tesseract executable from the command line yet?* *Like so: * [image: Untitled.png] *Can we confirm that you've successfully downloaded and compiled Tesseract?* It's downloaded but not sure about compile...I'll research how to do that now.

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Timothy Snyder
If you downloaded Tesseract's source code from GitHub (which I think you did), you will have to follow the compilation steps for Linux on this page https://github.com/tesseract-ocr/tesseract/wiki/Compiling#linux On Mon, Sep 16, 2019 at 5:48 PM Clint William Theron < theronclintwill...@gmail.com>

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
Should I run the following command?
    apt-get install automake ca-certificates g++ git libtool libleptonica-dev make pkg-config
I just want to make sure. Up until now all I did was:
1. Registered on gitpod.io
2. Concatenated gitpod.io with # https://github.com/tesseract-ocr/tesseract/pulls to downl

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
Consider the following screenshot... [image: Untitled.png] This is a sudo problem on gitpod.io. Is that correct? If yes, is there another way I can do this?

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
I also tried this:
    gitpod /workspace/tesseract $ sudo apt install libtesseract-dev
    sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
    gitpod /workspace/tesseract $ apt install tesseract-ocr
    E: Could not op

Re: [tesseract-ocr] Getting started with tesseract-ocr in a web app.

2019-09-16 Thread Clint William Theron
The above commands (with output) I found here

[tesseract-ocr] Open source (BSD) MICR dataset for Tesseract v4 + evaluation app

2019-09-16 Thread Mamadou
Hello, We've open sourced (BSD 3-Clause License) our MICR dataset and *.traineddata for Tesseract v4. This was developed as an internal R&D project and never went to production, as we ended up using TensorFlow. Even as a PoC it's already more accurate than many commercial products. The repo conta

[tesseract-ocr] why finetune with adding few characters can work?

2019-09-16 Thread Du Kotomi
Hi, I know Tesseract 4.0 uses an LSTM network trained on a classification task. Therefore, the size of the output vector in the last layer is equal to the number of classes; e.g., if we have 100 characters to recognize, the vector size is 100. My question is: when we add a few characters for fine-tuning, let's say 3 charac
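One general way such fine-tuning can work, sketched here under assumptions and not taken from Tesseract's source: keep the trained output-layer weights for the classes that already exist and add freshly initialized rows only for the new characters, so the output grows from 100 to 103 entries while everything learned so far is preserved. The helper `expand_output_layer`, the hidden size of 8, and the initialization range are hypothetical.

```python
import random


def expand_output_layer(weights, biases, num_new, seed=0):
    """Return copies of an output layer with `num_new` extra classes.

    `weights` holds one row per class (each row has one value per hidden
    unit) and `biases` holds one value per class. Existing rows are copied
    unchanged; rows for the new classes get small random values.
    """
    rng = random.Random(seed)
    hidden_size = len(weights[0])
    new_weights = [row[:] for row in weights]
    new_biases = list(biases)
    for _ in range(num_new):
        new_weights.append([rng.uniform(-0.1, 0.1) for _ in range(hidden_size)])
        new_biases.append(0.0)
    return new_weights, new_biases


# Stand-in for a trained layer: 100 classes on top of 8 hidden units.
old_w = [[0.01 * (i + j) for j in range(8)] for i in range(100)]
old_b = [0.0] * 100

new_w, new_b = expand_output_layer(old_w, old_b, num_new=3)

if __name__ == "__main__":
    print(len(old_w), "classes ->", len(new_w), "classes")
```

Under this scheme only the three new rows start from scratch, which is why a comparatively small amount of additional training data can be enough.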

Re: [tesseract-ocr] Trained data for E13B font

2019-09-16 Thread Mamadou
Hello, Thanks again for sharing your E-13B traineddata, it was helpful. We’ve managed to get good accuracy for E-13B with Tesseract but failed with CMC-7. So, we ended up using TensorFlow for both fonts. I’m curious to know what level of accuracy you’ve reached. You can check our accuracy for

Re: [tesseract-ocr] Open source (BSD) MICR dataset for Tesseract v4 + evaluation app

2019-09-16 Thread René Hansen
Very cool. Thank you for open sourcing this! /René On Tue, 17 Sep 2019 at 07:38, Mamadou wrote: > Hello, > > We've open sourced (BSD 3-Clause License) our MICR dataset and > *.traineddata for Tesseract v4. > > This was developed as an internal R&D project and never went to production > as we e