Any ideas?
Mert T wrote on Thursday, 8 February 2024 at 17:16:16 UTC+1:
> Hello,
>
> I'm new to Tesseract and have the problem that the text recognition has
> many errors. What I'm doing is scanning a prescription in German, and I
> want to show only certain areas.
> So I created certain ar
Re "X" checkbox:
Since this is (I assume) a standardized form, those checkboxes are at
known, fixed positions.
Couple of thoughts:
1: Assuming that everyone "crosses" a checkbox is a mistake. Some
people, depending on circumstances, "blacken" the box in other ways, all
legal and to be expe
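
Because the boxes sit at fixed positions and may be crossed or blackened, one option is to skip OCR for them and just measure how much ink is inside each box. Below is a minimal sketch of that idea in plain Python. The names fill_ratio and is_marked, the 0..255 grayscale convention, and both thresholds are my own assumptions for illustration, not anything from Tesseract. The box should cover only the inside of the checkbox (without its printed border) and lie fully inside the image.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white (assumed)
Box = Tuple[int, int, int, int]  # left, top, width, height of the box interior


def fill_ratio(image: Image, box: Box, dark_below: int = 128) -> float:
    """Return the fraction of pixels inside `box` darker than `dark_below`."""
    left, top, width, height = box
    dark = sum(
        1
        for row in image[top:top + height]
        for pixel in row[left:left + width]
        if pixel < dark_below
    )
    return dark / (width * height)


def is_marked(image: Image, box: Box, min_ratio: float = 0.15) -> bool:
    """Treat a box as marked when enough of it is dark.

    The 0.15 threshold is a guess; tune it on real scans so that a thin
    cross, a tick and a fully blackened box all pass, and dirt does not.
    """
    return fill_ratio(image, box) >= min_ratio
```

A cross, a tick and a filled box all raise the ratio, so the same test covers every way of marking the box.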
On Thu, 15 Feb 2024, 17:06 Ger Hobbelt wrote:
> Re "X" checkbox:
>
>
More shorthand examples in your "input language":
Tabl. = tablet (pill)
tägl = täglich (German: daily dosage)
I mention these extra examples (visible in the scanned images) as I find
generally people have a hard time wrap
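
One way to handle such shorthand is a small lookup table applied to the OCR output after recognition. The sketch below assumes this; the table holds only the two examples listed above, and the names ABBREVIATIONS and expand_shorthand are made up for illustration.

```python
import re

# Shorthand seen in the scans, keyed in lower case without the trailing dot.
# Extend this table with whatever else shows up in your own forms.
ABBREVIATIONS = {
    "tabl": "tablet",
    "tägl": "täglich",
}


def expand_shorthand(text: str, table: dict = ABBREVIATIONS) -> str:
    """Replace known shorthand words in `text` with their long form."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        # Look the word up case-insensitively, ignoring an abbreviation dot.
        return table.get(token.lower().rstrip("."), token)

    # A run of letters (umlauts included), optionally followed by one dot.
    return re.sub(r"[^\W\d_]+\.?", replace, text)
```

For example, expand_shorthand("1 Tabl. tägl") gives "1 tablet täglich". Words that are not in the table pass through unchanged. Note that the abbreviation dot is dropped along with the shorthand, which may not be what you want at the end of a sentence.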
Re Tesseract output for "mittag" etc. in your sample: the first port of call
for cleaning up dot matrix printer output for OCR, i.e. dedicated image
preprocessing, would be googling
leptonica image morphology, open close expand dilate dot matrix
or some such.
While I would go with using leptonica for that,
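
To show what the "close" operation in that search phrase does to dot matrix print, here is a plain-Python sketch of a morphological close (dilate, then erode) on a tiny binary bitmap. This is not leptonica's API, just the idea; the function names, the square neighbourhood and the radius are assumptions for illustration.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = ink, 0 = background


def _morph(bitmap: Bitmap, radius: int, want_any: bool) -> Bitmap:
    """Apply a square (2*radius+1) neighbourhood test to every pixel.

    Neighbourhoods are clipped at the image border, so pixels outside the
    bitmap are simply ignored.
    """
    height, width = len(bitmap), len(bitmap[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                bitmap[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(int(any(window) if want_any else all(window)))
        out.append(row)
    return out


def dilate(bitmap: Bitmap, radius: int = 1) -> Bitmap:
    """Grow ink: a pixel becomes ink if any neighbour is ink."""
    return _morph(bitmap, radius, want_any=True)


def erode(bitmap: Bitmap, radius: int = 1) -> Bitmap:
    """Shrink ink: a pixel stays ink only if all neighbours are ink."""
    return _morph(bitmap, radius, want_any=False)


def close(bitmap: Bitmap, radius: int = 1) -> Bitmap:
    """Dilate then erode: fills gaps smaller than the neighbourhood."""
    return erode(dilate(bitmap, radius), radius)
```

Dilating first joins dots that are up to one pixel apart, and the following erode thins the result back to roughly the original stroke width. For real scans the radius has to match the dot spacing at your scan resolution.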