Re: [tesseract-ocr] Tesseract OCR LCD digits doesn't work

2022-06-26 Thread Zdenko Podobny
Check your tesseract version (tesseract -v). Here is mine:

    tesseract 5.1.0-70-g0df5
     leptonica-1.83.0 (Jun 24 2022, 17:48:50) [MSC v.1929 LIB Release x64]
      libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.4.0 : zlib 1.2.12 : libwebp 1.2.2 : libopenjp2 2.5.0
     Found AVX2

Re: [tesseract-ocr] Extracting alphanumeric identifiers (ISINs)

2022-06-26 Thread Zdenko Podobny
Hello Stefan, recognizing such codes (i.e. not words) is difficult because some characters are easily confused (e.g. zero with capital O, 1 with lowercase l). I had a discussion with a commercial provider of data extraction from invoices (based on commercial OCR engines), and their claim was that you always ne
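
One way to catch the character confusions described above is to use the check digit that every ISIN carries. The following is only a minimal sketch, not something proposed in this thread: the helper names (isin_checksum_ok, candidate_repairs) and the CONFUSABLE table are hypothetical, and the table covers only the O/0 and I/l/1 confusions. It expands letters to numbers (A=10 ... Z=35), runs the Luhn check over the resulting digit string, and tries swapping look-alike characters in an OCR result.

```python
import re
from itertools import product

# An ISIN is 12 characters: 2 letters, 9 alphanumerics, 1 numeric check digit.
ISIN_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")

# Hypothetical table of look-alike characters (assumption, extend as needed).
CONFUSABLE = {"O": "O0", "0": "0O", "I": "I1", "1": "1I", "l": "1I"}


def isin_checksum_ok(code: str) -> bool:
    """Return True if code has ISIN shape and passes the Luhn check."""
    if not ISIN_PATTERN.match(code):
        return False
    # Expand each character to its base-36 value: digits stay, A=10 ... Z=35.
    digits = "".join(str(int(ch, 36)) for ch in code)
    total = 0
    # Luhn: from the right, double every second digit and sum the digits.
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def candidate_repairs(raw: str) -> list[str]:
    """Try all look-alike substitutions and keep those that validate."""
    options = [CONFUSABLE.get(ch, ch) for ch in raw]
    candidates = ("".join(p) for p in product(*options))
    return [c for c in candidates if isin_checksum_ok(c)]


if __name__ == "__main__":
    print(isin_checksum_ok("US0378331005"))   # a well-formed ISIN
    print(isin_checksum_ok("US0378331O05"))   # same code with O read for 0
    print(candidate_repairs("US0378331O05"))
```

More than one substitution can pass the checksum, so the repaired candidates are suggestions to review rather than a guaranteed correction.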