Tesseract 3 without windows installation

2011-05-03 Thread Andrew
Is there any way to get Tesseract 3 to work in Windows without doing a Windows installation? Can I just copy the tesseract folder and maybe set an environment variable in Windows? Thanks for your help! Andrew -- You received this message because you are subscribed to the Google Groups

[tesseract-ocr] PyTesseract not recognizing decimal points

2020-10-05 Thread Andrew
As per my question on StackOverflow: PyTesseract not recognizing decimals I'm using PyTesseract to recognise text in table cells. When it comes to recognising drug doses with decimal points, the OCR fails to rec

Re: [tesseract-ocr] PyTesseract not recognizing decimal points

2020-10-19 Thread Andrew
Fixed! Thank you, your suggestion worked. On Tuesday, October 6, 2020 at 6:36:39 PM UTC+10:30 shree wrote: > Have you tried cropping the image to remove the arrowhead to see if that > improves the result? > > On Tue, Oct 6, 2020 at 9:42 AM Andrew wrote: > >> As per my ques

[tesseract-ocr] A Simple grayscale image cannot be OCR'd

2022-12-10 Thread Andrew
I have a processed image that seems pretty simple: 1) The image is grayscale 2) The image is 300 dpi 3) The font is Arial 20 pt (72 dpi) The image can be found here: https://i.imgur.com/8fXlqZY.png Tesseract (via tesseract.js) is unable to OCR this image. I have read the https://github.com/tess

[tesseract-ocr] does tesseract command support confidence value

2018-07-07 Thread Andrew Wang
Hi, does the tesseract command support a confidence value? Thanks! Andrew
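For readers with the same question: recent Tesseract releases can write a TSV file (for example `tesseract input.png out tsv`) whose `conf` column holds a per-word confidence. The following is a minimal sketch of reading that column, assuming the standard 12-column TSV header. The `word_confidences` helper and the sample rows are illustrative and do not come from the thread.

```python
import csv
import io


def word_confidences(tsv_text):
    """Return (text, confidence) pairs for word-level rows of Tesseract TSV output."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        # Level 5 rows are words; structural rows carry conf == -1 and no text.
        if row["level"] == "5" and text and conf >= 0:
            words.append((text, conf))
    return words


# Hypothetical sample in the TSV layout, not real OCR output.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t24\t96.5\tHello\n"
    "5\t1\t1\t1\t1\t2\t109\t92\t70\t24\t91\tworld\n"
)

if __name__ == "__main__":
    for text, conf in word_confidences(SAMPLE_TSV):
        print(f"{text}\t{conf}")
```

The same confidence also appears as `x_wconf` in hOCR output, so the choice between the two formats is mostly about which is easier to parse downstream.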

[tesseract-ocr] Using different images for OCR and display

2022-04-22 Thread Andrew M.
I'm using the latest version of Tesseract (5.0), and I'm trying to determine whether or not I can insert some preprocessing steps that will -not- affect the form of the final image. For example, I might start out with an image such as this . There are dif

Re: [tesseract-ocr] Using different images for OCR and display

2022-05-19 Thread Andrew M.
ct is taking b&w thresholded images (BLACK TEXT on > WHITE BACKGROUND, mind you! ;-) ) and producing raw text from that; > everything else you need before and after that step should be custom > tailored to your specific needs and quality levels by using additional > tooling and p

[tesseract-ocr] tesseract.js differing results

2024-04-27 Thread Andrew Haberbosch
I'm trying to use the tesseract.js npm package, and I'm getting worse results than a Linux command-line version of Tesseract when I'm using the same options and the same file. Am I missing something obvious?

cant quite get tesseract to build on osx 10.8

2013-04-24 Thread Andrew Graham
Thanks to todd bealmears guide here - http://t0dd.io/2012/05/02/using-python-tesseract-on-os-x-lion.html I got far with installing Tesseract on OS X. However, when I build I now get some nasty output: python setup.py build > > include path=/usr/local/include > > Current Version : 0.7 > > running b

Tesseract linking with Xcode 4.6.3

2013-07-17 Thread Andrew Hassevoort
Hello, I'm having some issues with linking when I attempt to use Tesseract with OpenCV on Xcode 4.6.3. I have the C++ standard library set to libstdc++, and this works for all of my OpenCV code. I also have my header search path set to /usr/local/include/**, and this successfully works with my

user-patterns details and examples

2014-01-20 Thread Andrew McGrath
Hey Everyone, This is my first post :-) Thanks for working on and maintaining this excellent tool! I'm trying to refine the accuracy of the results we're getting back from Tesseract and seem to have encountered a lack of documentation around the user-patterns file. My belief is that I should

Re: OCR char restriction

2014-01-20 Thread Andrew McGrath
Hey Sam, Did you ever get this working sufficiently? I'm using a user-pattern file containing the following: (\d\d\d) \d\d\d-\d\d\d\d www.\n\*.ca\n\* www.\n\*.com\n\* CHANGE DUE $\d\*.\d\d My hope is to detect phone numbers in the format of "(123) 123-1234", website address that are .ca and .co
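One way to sanity-check what a user-patterns line is meant to accept, independently of Tesseract, is to translate it into a regular expression and try it on sample strings. The sketch below assumes the pattern classes documented in Tesseract's trie source (`\c` letter, `\d` digit, `\n` letter or digit, `\p` punctuation, `\a` lowercase, `\A` uppercase, `\*` repeat the previous class). The `pattern_to_regex` helper is hypothetical, and the regex classes are only an approximation of how Tesseract's unicharset classifies characters.

```python
import re

# Approximate regex equivalents of Tesseract's user-pattern classes (assumption).
CLASSES = {
    "c": r"[^\W\d_]",  # alphabetic character
    "d": r"\d",        # digit
    "n": r"[^\W_]",    # letter or digit
    "p": r"[^\w\s]",   # punctuation
    "a": r"[a-z]",     # lowercase letter
    "A": r"[A-Z]",     # uppercase letter
}


def pattern_to_regex(pattern):
    """Translate one user-patterns line into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "*":
                if not parts:
                    raise ValueError("\\* must follow another element")
                parts[-1] += "*"  # repeat the previous element zero or more times
            elif nxt in CLASSES:
                parts.append(CLASSES[nxt])
            else:
                parts.append(re.escape(nxt))  # unknown escape: treat as a literal
            i += 2
        else:
            parts.append(re.escape(ch))  # everything else is a literal character
            i += 1
    return re.compile("".join(parts))


if __name__ == "__main__":
    phone = pattern_to_regex(r"(\d\d\d) \d\d\d-\d\d\d\d")
    print(bool(phone.fullmatch("(123) 123-1234")))
```

This only shows which strings a pattern describes. Whether Tesseract actually applies the pattern during recognition depends on how the file is loaded and on the engine in use, which the sketch does not address.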

Re: User pattern

2014-01-20 Thread Andrew McGrath
Did you ever resolve this issue? I'm having the same problem :-( On Tuesday, June 18, 2013 11:33:37 PM UTC-4, duongkha wrote: > > Any one can help me? > > On Monday, June 17, 2013 10:44:21 AM UTC+7, duongkha wrote: >> >> Hello all, >> >> Could anyone advise me how I can test the Tesseract User Pat

[tesseract-ocr] SetVariable - classify_save_adapted_templates - problem

2014-05-28 Thread Andrew Reed
Hello! I have a problem with the classify_save_adapted_templates option within SetVariable. I set this to "1" but when running tesseract I get the following error: Saving adapted templates to .a ...Class->NumConfigs == this->fontset_table_.get( Class->font_set_id).size:Error:Assert failed:in f

[tesseract-ocr] Re: produce delimited output using hOCR or by preserving original document spacing

2014-10-07 Thread Andrew Defries
Hello, For an R solution to importing text for text mining, use the package tm. Check out lines 6-11 in the following repo: https://github.com/andrewdefries/CorpusReaders/blob/master/CorpusReader/server.R Using tm you can import text and perform some operations: MyCorpus<-tm_map(MyCorpus, tolowe

[tesseract-ocr] Re: Language file for MICR font

2014-10-30 Thread Andrew Litvinov
For me too. The one shared by Hunter doesn't work. (Ubuntu 14.04, tesseract version 3.03) On Monday, June 9, 2014 10:32:59 PM UTC+3, Anurag Kalra wrote: > > Ok, the MICR training data shared by Quan is now working for me.

[tesseract-ocr] Bank statement hOCR issues

2015-12-06 Thread Andrew Lentvorski
I'm trying to chew through an OCR for some bank statements, and I'm having difficulty with the hOCR. I could use some overall advice as well as specific issues. 1) The insertion of tags like without a corresponding bbox attribute is really irritating when trying to programmatically extract t
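For the programmatic extraction described here, one option is to key on the `ocrx_word` spans, which carry a `bbox` in their `title` attribute, and fold the text of any nested tag into the enclosing word. The sketch below uses only the Python standard library. The sample markup, including the nested `<strong>` element, is an assumption for illustration, since the tag named in the post was lost from this snippet.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class WordBoxParser(HTMLParser):
    """Collect (text, bbox) for each ocrx_word element, ignoring inner markup."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0      # nesting depth inside the current word, 0 = outside
        self._bbox = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # A tag nested inside a word: track nesting, keep the word's bbox.
            self._depth += 1
            return
        attr = dict(attrs)
        if "ocrx_word" in (attr.get("class") or "").split():
            match = BBOX_RE.search(attr.get("title") or "")
            self._bbox = tuple(int(v) for v in match.groups()) if match else None
            self._chunks = []
            self._depth = 1

    def handle_endtag(self, tag):
        if not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.words.append(("".join(self._chunks).strip(), self._bbox))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_words(hocr):
    parser = WordBoxParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment, not taken from a real bank statement.
SAMPLE_HOCR = (
    "<span class='ocr_line' title='bbox 36 92 580 116'>"
    "<span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 96'>Balance</span> "
    "<span class='ocrx_word' title='bbox 109 92 179 116; x_wconf 91'>"
    "<strong>1,204.50</strong></span>"
    "</span>"
)

if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE_HOCR):
        print(text, bbox)
```

The depth counter assumes every nested tag is closed, so an unclosed void element such as a bare `<br>` inside a word would need extra handling.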

[tesseract-ocr] Tesseract 4 with LSTM, and random combination of letters/digits

2017-10-23 Thread Andrew J
I'm having to OCR images that are random combinations of letters/digits, e.g. AB1312, NSR2342, 22328A40 I've created a training text that creates data that's somewhat similar, and trained Tesseract 4 against it using one font. I noticed that unless the segments are short (~12 characters), accur

[tesseract-ocr] Re: fine tune Tesseract

2017-10-24 Thread Andrew J
You'll need something like this to "finish" the training:
    training/lstmtraining --stop_training \
        --continue_from ./trained/base_checkpoint \
        --traineddata ./trained/eng/eng.traineddata \
        --model_output ./trained/engoutput/eng.traineddata
On Tuesday, October 24, 2017 at 10:52:02 AM UTC-4,

[tesseract-ocr] Re: Tesseract 4 with LSTM, and random combination of letters/digits

2017-10-26 Thread Andrew J
Bump! On Tuesday, October 24, 2017 at 2:46:42 AM UTC-4, Andrew J wrote: > > I'm having to OCR images that are random combinations of letters/digits, > e.g. AB1312, NSR2342, 22328A40 > > I've created a training text that creates data that's somewhat similar, > a

[tesseract-ocr] Re: Tesseract 4 with LSTM, and random combination of letters/digits

2017-10-30 Thread Andrew J
Bump again. TLDR: when is the LSTM used in Tesseract 4? Should my approach be to always use Tesseract with OEM=0 when dealing with random strings of text? On Tuesday, October 24, 2017 at 2:46:42 AM UTC-4, Andrew J wrote: > > I'm having to OCR images that are random combinations
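The engine is selected on the command line with `--oem` (0 for the legacy engine, 1 for LSTM only, 2 for both, 3 for the default), so one way to explore the question is to run the same image under each mode and compare. The sketch below only builds the argument list. The `build_tesseract_command` helper, the file names, and the choice of `--psm 7` (treat the image as a single text line) are assumptions for illustration. Note that `--oem 0` needs a traineddata file that still contains the legacy model.

```python
VALID_OEM = (0, 1, 2, 3)  # 0 legacy, 1 LSTM only, 2 legacy + LSTM, 3 default


def build_tesseract_command(image_path, output_base, oem=0, psm=7, lang="eng"):
    """Build the argv list for one tesseract run with an explicit engine mode."""
    if oem not in VALID_OEM:
        raise ValueError(f"oem must be one of {VALID_OEM}, got {oem!r}")
    return [
        "tesseract", image_path, output_base,
        "--oem", str(oem),
        "--psm", str(psm),
        "-l", lang,
    ]


if __name__ == "__main__":
    # Print one command per engine mode; pass each list to subprocess.run to execute.
    for mode in (0, 1):
        print(" ".join(build_tesseract_command("code.png", f"out_oem{mode}", oem=mode)))
```

Comparing the outputs for the same set of images gives a direct measure of which engine handles the random letter/digit strings better for a given model.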

[tesseract-ocr] Tesseract and Hadoop streaming

2014-10-19 Thread Andrew Defries PhD
Can someone please give me advice on setting up Hadoop streaming with Tesseract running with command-line options?

[tesseract-ocr] Re: Works perfectly...except skips several lines

2016-12-02 Thread Andrew J Freyer
I can confirm I am experiencing the same issue described above. Entire lines in (what should be) very readable images are skipped consistently.