That didn't solve the issue.
On Monday, February 4, 2019 at 6:34:25 PM UTC+5:30, shree wrote:
>
> https://github.com/tesseract-ocr/tessdata_best
>
> https://github.com/tesseract-ocr/tessdata
>
> On Mon, Feb 4, 2019 at 6:29 PM wrote:
>
>> Where can I find tessdata_best or tessdata?
>>
Still I am
Hi,
I recently had success using Tesseract OCR to convert a PNG file into
text.
Scenario: I take a screenshot (PNG) of a mobile app and use Tesseract
to convert the PNG file into text.
Question: When I convert the PNG file into text, can I also get the
coordinates (X, Y) of the
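One possible approach, sketched under assumptions: Tesseract 4 can write a TSV file (for example `tesseract screenshot.png out tsv`) whose rows carry `left`, `top`, `width`, `height`, `conf` and `text` columns. The helper name `word_boxes`, the file name in the command, and the sample row below are made up for illustration and are not from the original message.

```python
import csv
import io


def word_boxes(tsv_text):
    """Return (text, left, top, width, height) for each word row in Tesseract TSV output."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    boxes = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows describe words; lower levels describe pages, blocks and lines.
        if row["level"] == "5" and text:
            boxes.append(
                (text, int(row["left"]), int(row["top"]),
                 int(row["width"]), int(row["height"]))
            )
    return boxes


# Hypothetical sample in the TSV column layout (not real OCR output).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t720\t1280\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t40\t100\t120\t32\t96\tSettings\n"
)

if __name__ == "__main__":
    for box in word_boxes(SAMPLE_TSV):
        print(box)
```

Here `left` and `top` are the X and Y of the word's top-left corner in image pixels, so the centre of a word (for tapping it in a mobile app, say) would be `left + width / 2`, `top + height / 2`.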
To use ocrd you need to prepare image files and txt files with the same
name but different extensions.
For example:
sample1.png
sample1.gt.txt
The gt.txt is a simple text file containing the correct text, 145, for
example.
The images must be cropped with no border or just a couple of pixels. Text
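A quick way to check the naming rule described above, as a minimal sketch: it assumes the images are `.png` files and that every image needs a `.gt.txt` file with the same base name in the same folder. The function name `missing_ground_truth` and the temporary demo folder are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_ground_truth(folder):
    """List image files in `folder` that lack a matching .gt.txt file."""
    missing = []
    for image in sorted(Path(folder).glob("*.png")):
        # sample1.png is expected to pair with sample1.gt.txt
        gt_file = image.with_name(image.stem + ".gt.txt")
        if not gt_file.is_file():
            missing.append(image.name)
    return missing


if __name__ == "__main__":
    # Demo on a throwaway folder: sample2.png has no ground truth file.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("sample1.png", "sample1.gt.txt", "sample2.png"):
            Path(tmp, name).touch()
        print(missing_ground_truth(tmp))
```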
Can someone help me with creating custom training data?
How is it done in Tesseract? Any tutorial or step-by-step guide would be
helpful.
Thank you.
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop re
Oh boy, where to start! So first of all, you are not alone in not finding any
information. I am currently a week ahead of you, so I'm gonna share what I
found out.
Let's start with training_files.txt. What's inside?
/home/kh/tesstutorial/engtrain/eng.Arial.exp0.lstmf
/home/kh/tesstutorial/engtrain/eng.
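For illustration, a list file like this can be rebuilt from a folder of `.lstmf` files. This is only a sketch: it assumes the file holds one `.lstmf` path per line, and the function name `write_listfile` and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def write_listfile(lstmf_dir, listfile):
    """Write every .lstmf path found in `lstmf_dir` to `listfile`, one per line."""
    paths = sorted(str(p.resolve()) for p in Path(lstmf_dir).glob("*.lstmf"))
    Path(listfile).write_text("".join(path + "\n" for path in paths))
    return paths


if __name__ == "__main__":
    # Demo on a throwaway folder with two empty placeholder files.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("eng.Arial.exp0.lstmf", "eng.Impact_Condensed.exp0.lstmf"):
            Path(tmp, name).touch()
        listfile = Path(tmp, "eng.training_files.txt")
        write_listfile(tmp, listfile)
        print(listfile.read_text(), end="")
```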
Where can I find tessdata_best or tessdata?
I am still not able to get the result if I remove --oem 2 or use --oem 1.
On Monday, February 4, 2019 at 4:45:04 PM UTC+5:30, sant...@artivatic.ai
wrote:
>
>
> I am using 'tesseract' command line to extract the information in this
> image.
>
> Tess
Where can I find tessdata_best or tessdata?
I am still not able to get the result if I remove --oem 2 or use --oem 1.
On Monday, February 4, 2019 at 5:52:59 PM UTC+5:30, shree wrote:
>
> ubuntu@tesseract-ocr:~/TEST$ tesseract tmpy6s8p6m1.jpg stdout --psm 6
> --tessdata-dir ../tessdata_fast
>
https://github.com/tesseract-ocr/tessdata_best
https://github.com/tesseract-ocr/tessdata
On Mon, Feb 4, 2019 at 6:29 PM wrote:
> Where can I find tessdata_best or tessdata?
>
> I am still not able to get the result if I remove --oem 2 or use --oem 1.
>
> On Monday, February 4, 2019 at 4:45:0
Thanks! If this were in the documentation it would be super awesome. But
don't worry, you don't have to do anything: just answer my upcoming questions
and I will write it. I will also need a review of my final draft, just to
make sure my wording and the facts I managed to dig up are correct.
2019.
> kh@DSAD-6 /usr/share/tessdata
$ combine_tessdata -e ./eng.traineddata ~/tesstutorial/engoutput/eng.lstm
Extracting tessdata components from ./eng.traineddata
Wrote /home/kh/tesstutorial/engoutput/eng.lstm
You need the traineddata from tessdata_best repo for use with training.
On Mon, Feb 4
I'm using Cygwin (64-bit, on Windows 10) to compile Tesseract. I ran the
following commands and got the following error:
>
> kh@DSAD-6 /usr/share/tessdata
>
> $ tesstrain.sh --fonts_dir /usr/share/fonts --fontlist "Arial" "Impact
>> Condensed" --lang eng --linedata_only --noextract_font_properties
>>
Try your commands with --oem 1 or with the default. It works fine.
TESSDATA_PREFIX=/home/ubuntu/tessdata_best
$ tesseract -v
tesseract 4.0.0-272-g005f
leptonica-1.76.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff
4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.3.0
$
ubuntu@tesseract-ocr:~/TEST$ tesseract tmpy6s8p6m1.jpg stdout --psm 6
--tessdata-dir ../tessdata_fast
Warning: Invalid resolution 0 dpi. Using 70 instead.
1 GAAXCS8821M1Z8
ubuntu@tesseract-ocr:~/TEST$
ubuntu@tesseract-ocr:~/TEST$ tesseract tmpy6s8p6m1.jpg stdout --psm 6
--tessdata-dir ../tessdata_b
I really appreciate your help! I will try to work out what you have sent.
Please send me your contact details (email). Thanks again!
On Monday, February 4, 2019 at 1:12:36 PM UTC+5:30, Kristóf Horváth wrote:
>
> So I have the same issue as you, no clue how Tesseract works because of
> bad documentation, but
Also, what if there are huge gaps between words in a line in the picture? If I
set a bounding box for the whole line, can Tesseract learn from that?
I am using 'tesseract' command line to extract the information in this
image.
Tesseract 4.0.0-115-ge3a3
command used
tesseract tmpy6s8p6m1.jpg stdout --oem 2 --psm 6
result
19AAXCS8821M1Z8
tesseract 4.0.0-274-gc999
command used
tesseract tmpy6s8p6m1.jpg stdout --oem 2 --psm 6
result
I checked that too. I cannot understand how I should give input to
Tesseract, because it is not a book. I'm trying to do OCR for survey plans.
If possible, please send your working OCRD folder, so that I can have a
look and modify it. Please accept my invitation, so that I can a
Hello, after reading the article about training Tesseract 4 (
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00), I
found it very confusing.
My goal is to train an existing model with new tiff/box pairs. After hours of
googling how to generate box files, I found all this:
1) https://
Hello,
I'm completely new to Tesseract; the first version I'm using is 4.0.0. Sorry
for the noob question, but I really didn't find an answer despite quite a long
search.
It's about these two options: eval_listfile and train_listfile. What
exactly should be in these files?
Is there in train_listfile.txt
19 matches