Re: [tesseract-ocr] How to restrict OCR character set.

2019-03-30 Thread Martin Emmerson
> https://github.com/Shreeshrii/tessdata_shreetest/commit/0108263ad0c4c9bd11e0c8190a81fb36e2e4e56a > > On Sat, Mar 30, 2019 at 1:47 AM Martin Emmerson wrote: > >> Yikes! Thanks for the reply, but I could barely follow the discus

Re: [tesseract-ocr] How to restrict OCR character set.

2019-03-29 Thread Martin Emmerson
...I'm not). Thanks anyway; I'll try to figure out some external workarounds. On Thursday, March 28, 2019 at 11:03:59 PM UTC-7, shree wrote: > See https://github.com/tesseract-ocr/tesseract/pull/2294 > > On Fri, 29 Mar 2019, 11:17 Martin Emmerson wrote: > >> Is there a

[tesseract-ocr] How to restrict OCR character set.

2019-03-28 Thread Martin Emmerson
Is there a way to restrict the character set that tesseract-ocr will attempt to identify? I'm scanning USA-based receipts which have a fairly simple set of monospaced characters but, for example, often '1' will get misidentified as '|', and a whole host of other simple substitution errors. If
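
For anyone landing on this thread with the same question, here is a rough sketch of two approaches, not a definitive answer. The first passes a whitelist through Tesseract's `tessedit_char_whitelist` config variable using the `-c` command-line option; whether it is honored depends on the Tesseract version and engine in use. The second is an external workaround that rewrites known confusions after OCR. The character set, the confusion table, the file name, and all function names are assumptions made up for illustration.

```python
import shutil
import subprocess

# Assumed set of characters that appear on a typical receipt; adjust to taste.
RECEIPT_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ$.,:/-#%*"

# Hypothetical confusion table: OCR output character -> intended character.
CONFUSIONS = str.maketrans({"|": "1"})


def build_tesseract_command(image_path, whitelist=RECEIPT_CHARS, psm=6):
    """Build an argv list that asks Tesseract to restrict its output characters."""
    return [
        "tesseract", image_path, "stdout",   # "stdout" prints the text instead of writing a file
        "--psm", str(psm),                   # page segmentation mode (6 = uniform block of text)
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def replace_confusions(text):
    """External workaround: rewrite characters that are known misreads."""
    return text.translate(CONFUSIONS)


def run_ocr(image_path):
    """Run Tesseract with the whitelist, then apply the substitution table."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path),
        capture_output=True, text=True, check=True,
    )
    return replace_confusions(result.stdout)
```

Calling `run_ocr("receipt.png")` (a placeholder file name) would run both steps. The substitution table is only safe when the replaced character never legitimately appears in the input, which is the assumption made here for '|' on receipts.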