from:"Ben Bongalon"

Re: [tesseract-ocr] advice for OCR'ing 9-pin dot matrix BASIC code

2021-01-05 Thread Ben Bongalon

licly > so going forward I can help the next person. > > Keith > > > Original message > From: Ben Bongalon > Date: 1/5/21 11:56 PM (GMT-05:00) > To: Keith M > Cc: tesseract-ocr@googlegroups.com > Subject: Re: [tesseract-ocr] advice for OCR'i

Re: [tesseract-ocr] advice for OCR'ing 9-pin dot matrix BASIC code

2021-01-05 Thread Ben Bongalon

of document, DPI/resolution, font, or anything. I > know I sound like a broken record. Current numbers include stats like > 44% of the 100-page document is 95% or better confidence. Now those > lines could still be wrong, but they look pretty decent in a quick scan. >

Re: [tesseract-ocr] advice for OCR'ing 9-pin dot matrix BASIC code

2021-01-05 Thread Ben Bongalon

Hi Keith, Interesting project. Having looked at the sample OCR results that Alex posted, I think the poor recognition from Tesseract is more likely due to the underlying language model used (I'm assuming you used 'eng'?). For example, the "test1" OCR results correctly transcribe the variables

[tesseract-ocr] How to generate .lstmf file with non-randomized lines

2021-01-04 Thread Ben Bongalon

Hello and Happy New Year, I am training Tesseract 4 to recognize special characters in a Philippine bilingual dictionary (specifically Hanunoo -> English). Following the "Fine Tuning" tutorial but using Spanish as the starting model, I am getting good recognition accuracy on some characters such a

4 matches
