[tesseract-ocr] help for training

2024-11-21 Thread Mark Blueberry
Hello everyone. I want to train Tesseract to recognize characters from drawings; I work on Windows 10. I just need to get the model so that it can better spread the letters. If someone has come across something like this, please write down what resources you used, maybe there will be more detailed in

[tesseract-ocr] Re: Why is tesseract adding noise to a very quiet image, also missing some text, and adding other

2024-11-16 Thread Mark Bussey
It turns out that using PageSegMode.SingleBlock helped. On Wednesday, November 13, 2024 at 8:22:48 AM UTC-6 Mark Bussey wrote: > First attempt at using tesseract ocr. I am using Tesseract-ocr v3.02 (I > know it's old, but didn't see the need to include neural network code) to

Re: [tesseract-ocr] Re: Post OCR Verification and Editing

2024-04-10 Thread Mark Pellegrino
source code. If you could point me in the right direction it would be greatly appreciated. Thanks again for your hard work on this, I'll certainly be in touch with more questions about Scribe. Mark On Sunday 31 March 2024 at 04:18:09 UTC-4 Jeremiah wrote: > There currently is no

Re: [tesseract-ocr] Re: Post OCR Verification and Editing

2024-03-08 Thread Mark Pellegrino
ll the best, On Fri, Mar 8, 2024 at 7:03 AM Merlijn B.W. Wajer wrote: > Hi Mark, > > On 07/03/2024 20:53, Mark Pellegrino wrote: > > I found more info here: > > > https://github.com/tesseract-ocr/tesseract/issues/1769#issuecomment-509490277 > > > > Glyph

Re: [tesseract-ocr] Re: Post OCR Verification and Editing

2024-03-08 Thread Mark Pellegrino
ext layer by yourself with custom font, have a > look at PyMuPDF: > >- https://github.com/pymupdf/PyMuPDF/discussions/775 (Adding text >layer to a scanned PDF) >- https://github.com/pymupdf/PyMuPDF/discussions/2464 (invisible text >layer) > > > Zdenko &

[tesseract-ocr] Re: Post OCR Verification and Editing

2024-03-07 Thread Mark Pellegrino
urce image with the hOCR? Does anyone have a simple workflow for editing/correcting Tesseract OCR documents that they can share? Thanks again, On Thursday 7 March 2024 at 14:17:28 UTC-5 Mark Pellegrino wrote: > Hello, > I'm trying to check PDFs made with Tesseract 5.2 for correc

[tesseract-ocr] Post OCR Verification and Editing

2024-03-07 Thread Mark Pellegrino
Hello, I'm trying to check PDFs made with Tesseract 5.2 for correctness using an OCR editor but am unable to open them in either Abbyy or Acrobat. If I try to open a Tesseract PDF with Abbyy FineReader/OCR Editor, the software just hangs and crashes. I can open Tesseract PDFs with Acrobat Pro,

[tesseract-ocr] Sharpen Image Text

2021-08-28 Thread Mark L
Will tesseract be able to sharpen this image text? I would like to be able to sharpen the text so I can extract it and convert the image to text (with tesseract). Thanks! -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from thi

[tesseract-ocr] GetBestLSTMSymbolChoices

2019-03-08 Thread Mark Polak
, but any language would work) which demonstrates using this function. Thanks, Mark

Re: [tesseract-ocr] Re: increase the quality of image so that it extracts proper text from it.

2018-09-27 Thread Mark Phillips
Try --psm 11 or 12 Mark > On Sep 27, 2018, at 3:05 AM, hebiyaoziz...@gmail.com wrote: > > <32935.jpg> > > I have the same problem > > On Wednesday, August 8, 2018 at 1:29:09 AM UTC+8, May wrote: >> >> Could you share the image that you used to process? >> >> On
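Editor's note: the suggestion above can be tried from a script. The following is a minimal sketch, assuming a Tesseract 4.x or later command line where the option is spelled `--psm` (3.x releases used `-psm`), and assuming the `tesseract` executable is on PATH. The helper names `psm_command` and `ocr_with_modes` are hypothetical and used only for illustration. Modes 11 and 12 are the "sparse text" and "sparse text with OSD" page segmentation modes.

```python
import shutil
import subprocess

# Page segmentation modes suggested in the reply above.
SPARSE_PSM_MODES = {11: "sparse text", 12: "sparse text with OSD"}


def psm_command(image_path, psm):
    """Build a tesseract command that prints recognized text to stdout."""
    # "stdout" as the output base tells tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def ocr_with_modes(image_path, modes=SPARSE_PSM_MODES):
    """Run tesseract once per mode and return {psm: recognized_text}."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    results = {}
    for psm in modes:
        done = subprocess.run(
            psm_command(image_path, psm),
            capture_output=True,
            text=True,
            check=True,
        )
        results[psm] = done.stdout
    return results
```

Calling `ocr_with_modes("32935.jpg")` would return the text for each mode so the two results can be compared side by side.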

[tesseract-ocr] Re: Install and run tesseract 4.0 on MAC OSX step by step

2018-09-26 Thread Mark Phillips
Not sure if this was right or not, but given that the earlier errors were "warnings" I continued onward and got these errors with Scrollview.jar - [Wed Sep 26-19:17:29][MEPMBP2017][(👨💻)markphillips](~/Documents/Development/Tesseract/tesseract/java) =>>SCROLLVIEW_PATH=~/Documents/Development/Tesserac

[tesseract-ocr] Re: Install and run tesseract 4.0 on MAC OSX step by step

2018-09-26 Thread Mark Phillips
I guess the build failed from before... [Wed Sep 26-19:27:21][MEPMBP2017][(👨💻)markphillips](~/Documents/Development/Tesseract/tesseract) =>>text2image --list_available_fonts --fonts_dir=/Library/Fonts dyld: Library not loaded: /usr/local/opt/icu4c/lib/libicui18n.62.dylib Referenced from: /u

[tesseract-ocr] Re: Invalid Digit recognition

2018-01-10 Thread mark
Hi, I just stumbled on this forum while looking for answers as to why the Tesseract Demo on the site would fail with my images (using a very similar approach of single digits in images, etc.). I found that scaling the image height by 50% worked a charm, thanks!! Never thought to do that!! Also cropp
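Editor's note: a minimal sketch of the rescaling step mentioned above. The post only says the image height was scaled by 50%, so this sketch assumes both dimensions are scaled by the same factor to keep the aspect ratio, and it assumes the Pillow library is installed. The function names `scaled_size` and `rescale_image` are hypothetical.

```python
def scaled_size(width, height, factor=0.5):
    """Return the (width, height) after scaling both dimensions by factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Never return a zero-sized dimension for very small inputs.
    return (max(1, round(width * factor)), max(1, round(height * factor)))


def rescale_image(src_path, dst_path, factor=0.5):
    """Write a rescaled copy of src_path to dst_path using Pillow."""
    from PIL import Image  # imported lazily so scaled_size works without Pillow

    with Image.open(src_path) as img:
        resized = img.resize(scaled_size(img.width, img.height, factor), Image.LANCZOS)
        resized.save(dst_path)
```

The rescaled file can then be passed to Tesseract in place of the original; whether 50% is the right factor depends on the image and would need to be checked case by case.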

[tesseract-ocr] No output file

2017-11-02 Thread Mark Ambrazhevich
Hi, when I use tesseract test.png stdout I get text in cmd, but when I use tesseract test.png out I get no out.txt file in the directory. I also searched for it, but there is no out.txt on the computer.
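Editor's note: `tesseract test.png out` appends `.txt` to the output base, so the file is expected at `out.txt` relative to the directory the command runs in. One way to rule out confusion about that directory is to pass an absolute output base and check the result explicitly. This is a minimal sketch under that assumption, not a diagnosis of the problem above; the helper names `expected_output_path` and `run_ocr` are hypothetical, and `tesseract` is assumed to be on PATH.

```python
import subprocess
from pathlib import Path


def expected_output_path(output_base):
    """Return the absolute .txt path tesseract should write for output_base."""
    base = Path(output_base).resolve()
    # tesseract appends ".txt" to the output base for plain-text output.
    return base.with_name(base.name + ".txt")


def run_ocr(image_path, output_base):
    """Run tesseract with an absolute output base and return the text file path."""
    target = expected_output_path(output_base)
    base_arg = str(target)[: -len(".txt")]
    subprocess.run(["tesseract", str(Path(image_path).resolve()), base_arg], check=True)
    if not target.exists():
        raise FileNotFoundError(f"tesseract finished but {target} is missing")
    return target
```

`run_ocr("test.png", "out")` either returns the full path of the written file or raises, which makes it clear whether the file was created at all.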

[tesseract-ocr] Can tesseract read shiny metal surfaces?

2016-07-21 Thread Mark Greally
com/-AqH3mBd1irQ/V5CwYkeiTSI/AG8/VoQQG0fTzUEwa4km8nKQIv14IQBTXLa_ACLcB/s1600/Text.jpg> Is it possible for tesseract to read such images? Thanks, Mark.

[tesseract-ocr] Can user-words file have punctuation? Should it have or not?

2015-11-23 Thread Mark
Hello, I use Tesseract 3.04 on Ubuntu 12.04. I know the words of all the papers I need to scan and I'm gonna put all of them in the user-words file but many of them are included in brackets "()" or have "-" and I also have words like these: Anti-HAV (Angoron) AMH/MIS Aντισ.Εναντι (B-HCG) C1 +C
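Editor's note: a user-words file is a plain text list with one word per line. The sketch below only shows how such a file could be generated from a word list; it does not settle whether entries with brackets or hyphens are accepted by Tesseract, which is the open question in the post. The function names and the file name in the usage note are hypothetical.

```python
from pathlib import Path


def format_user_words(words):
    """Return file contents with one unique, non-empty word per line."""
    seen = set()
    lines = []
    for word in words:
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            lines.append(word)
    return "".join(line + "\n" for line in lines)


def write_user_words(words, path):
    """Write the word list to path as UTF-8 text."""
    Path(path).write_text(format_user_words(words), encoding="utf-8")
```

For example, `write_user_words(["Anti-HAV", "(Angoron)", "AMH/MIS"], "my.user-words")` writes three lines, keeping the punctuation exactly as given so that both variants (with and without brackets) can be tested.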

[tesseract-ocr] Re: Tesseract OCR x86 issue

2015-08-11 Thread Mark Seidner
Hi, I maintain the latest release of Tesseract 3.04 in a commercial product, please see www.topocr.com We build a 32-bit binary for Windows using VS2005, we had to do a small amount of hacking to get it to compile, but it runs just fine! On Sunday, August 9, 2015 at 5:19:23 AM UTC-5, Kenz Ken

[tesseract-ocr] Re: differences between version 3.03 and 3.04

2015-07-13 Thread Mark Seidner
uld not be necessary, and I'll try that later, but there were just two small problems to "fix" which didn't take long, and then everything worked fine even with an old compiler like that! -Mark-- On Saturday, July 11, 2015 at 12:14:55 AM UTC-5, Mark Seidner wrote: > >

[tesseract-ocr] Re: differences between version 3.03 and 3.04

2015-07-12 Thread Mark Seidner
according to my testing that release 3.04 has not provided any accuracy increase. Are there any other reasons to switch over to the 3.04 release? On Saturday, July 11, 2015 at 12:14:55 AM UTC-5, Mark Seidner wrote: > > Hi everyone, >I downloaded the latest 3.04 code from git and did a

[tesseract-ocr] Re: differences between version 3.03 and 3.04

2015-07-11 Thread Mark Seidner
ing useful I'll post here On Saturday, July 11, 2015 at 12:14:55 AM UTC-5, Mark Seidner wrote: > > Hi everyone, >I downloaded the latest 3.04 code from git and did a build on Windows, > when I tested on some English files with OEM_TESSERACT_CUBE_COMBINED, there > was no

[tesseract-ocr] differences between version 3.03 and 3.04

2015-07-10 Thread Mark Seidner
Hi everyone, I downloaded the latest 3.04 code from git and did a build on Windows. When I tested on some English files with OEM_TESSERACT_CUBE_COMBINED, there was no difference in accuracy between Tesseract 3.03 and Tesseract 3.04. I haven't tried OEM_TESSERACT_ONLY yet, to see if there's a

[tesseract-ocr] Automatic Number Plate Recognition

2014-11-20 Thread Mark Beylis
Hello, I am making use of Tesseract OCR to perform number plate recognition on vehicles. I am making use of jTessBoxEditor v1.1 to check my box and tif files. At the moment each iteration of my training consists of using about 250 - 300 number plates. I have read in many places that one should tr

[tesseract-ocr] Tesseract missing quite obvious word

2014-10-09 Thread Mark Zealey
rey, the words around it are also usually shown with joined blobs - can anyone recommend a config option to fix this, I've tried messing with all manner of different config options but can't seem to make any difference. The file is here: http://mark.zealey.org/tessmissingword.jp

[tesseract-ocr] Tesseract much more effective when run from terminal than from API

2014-08-23 Thread Mark Wang
e it seems to work much worse. Could anyone help me with why this might be? Thanks! Mark

Re: lib and all in win 64bit

2013-09-22 Thread Mark Hebberd
would love them please, nothumph...@gmail.com On Thursday, December 20, 2012 3:44:13 PM UTC+13, beigon...@gmail.com wrote: > > I have built the 64bit libs and dlls in Windows. I want to share with > others. If you want, send me emails.

Re: Extracting text from Gas Sign

2012-09-02 Thread Mark Stephens
- > Zdenko > > On 01.09.2012 22:17, Mark Stephens wrote: > > Perhaps it was a poor assumption but I would have thought it would be > > relatively easy to extract the text from a gas sign. I've tried several > > different psm settings as well as different

Re: Errors using tessnet 2

2012-07-25 Thread mark kuz
On Wednesday, March 2, 2011 at 8:16:41 AM UTC+4, dmerala wrote: > > Hey everyone, > I'm trying to use tesseract and tessnet2 to OCR some stuff. I go to > this page: > > http://www.pixel-technology.com/freeware/tessnet2/ > > and do the steps under "Quick Tessnet2 usage." My project builds

Re: Tesseract testing within Japanese Windows environment?

2009-08-18 Thread Mark Nalevanko
l characters. Thanks, Mark On Aug 10, 5:18 pm, Ray Smith wrote: > There's no specific reason why it shouldn't work. If you can get details of > the failure, (stack trace etc) then I can investigate. When 3.00 becomes a > proper release, there will be a Japanese language file,

Tesseract testing within Japanese Windows environment?

2009-08-07 Thread Mark Nalevanko
seract may be returning when run in this environment so to report a more formal issue, but thought I'd go ahead and post to see if anyone had comments on whether they think Tesseract should/should not work under these conditions and if not, is it easily fixable