[tesseract-ocr] Re: Digits reading optimalisation.

Владимир Калачихин Sat, 30 Jan 2021 08:03:17 -0800

Heh. It's an old issue.
For 100% accuracy you would need a digit-only language model, but there is 
no such thing.
Besides, even a trivial perceptron shows good results on digit recognition.
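
To make the perceptron remark concrete, here is a minimal sketch of a multiclass perceptron in plain Python. Everything in it is an assumption made for illustration: the 3x5 bitmaps in GLYPHS are hypothetical toy data (not the photos from this thread), and the names features, predict and train are invented here. It only shows that the training rule fits ten clean glyphs; it says nothing about accuracy on real images.

```python
# Hypothetical toy data: 3x5 bitmaps of the digits 0-9, row by row.
GLYPHS = [
    "111101101101111",  # 0
    "010110010010111",  # 1
    "111001111100111",  # 2
    "111001111001111",  # 3
    "101101111001001",  # 4
    "111100111001111",  # 5
    "111100111101111",  # 6
    "111001001001001",  # 7
    "111101111101111",  # 8
    "111101111001111",  # 9
]


def features(bits):
    """Turn a bitmap string into a 0/1 vector with a trailing bias input."""
    return [int(b) for b in bits] + [1]


def predict(weights, x):
    """Return the class whose weight vector gives the highest score."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train(samples, n_classes=10, max_epochs=100_000):
    """Multiclass perceptron: on a mistake, reward the true class and
    penalise the wrongly predicted one. Stops after an error-free pass."""
    weights = [[0] * len(samples[0][0]) for _ in range(n_classes)]
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            guess = predict(weights, x)
            if guess != label:
                mistakes += 1
                for i, v in enumerate(x):
                    weights[label][i] += v
                    weights[guess][i] -= v
        if mistakes == 0:
            break
    return weights


SAMPLES = [(features(g), digit) for digit, g in enumerate(GLYPHS)]
WEIGHTS = train(SAMPLES)
```

After training, predict(WEIGHTS, features(GLYPHS[5])) should return 5. Real input would first need each digit segmented and scaled to a fixed-size grid, which this sketch does not attempt.
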
On Saturday, 30 January 2021 at 18:41:13 UTC+3, Benek wrote:


> Hello! I'm trying to read some digits. I thought it was a rather simple 
> task, yet I still can't get satisfactory results. So my first question is: 
> is it possible to get 100% accuracy when reading standardized input, 
> or will there always be some errors?
>
> Here are some sample inputs that I wanted to read:
> The digits that are being misread are:
> in photo t3: 5.1 is read as 9.1
> in photo t4: 10.2 is read as 610.2
>
> I'm using tesseract/4.1.1 with this config:
>
>     oem: 3,
>     psm: 11,
>     tessedit_char_whitelist: "0123456789.",
>     load_system_dawg: false,
>     load_freq_dawg: false,
>
> The images have 2700x2100 resolution.
>
> The 999 on the left are markers that I added to be able to recognize which 
> line belongs to which output text and they are always read correctly.
> I tried experimenting with some different image preprocessing techniques 
> such as blurring, median filtering, and resizing the image.
>
> Do you have any other tips that could lead to better reading accuracy?
> Thanks in advance for any help!
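
For anyone wanting to reproduce the setup, the quoted configuration maps roughly onto the tesseract command line as sketched below. This is an assumption about how the options were passed (the original post does not say which wrapper was used); the helper names build_tesseract_args and run_tesseract and the file name t3.png are hypothetical. Only the option values come from the post.

```python
import subprocess

# Values copied from the configuration quoted above.
CONFIG = {
    "oem": 3,
    "psm": 11,
    "variables": {
        "tessedit_char_whitelist": "0123456789.",
        "load_system_dawg": "false",
        "load_freq_dawg": "false",
    },
}


def build_tesseract_args(image_path, config=CONFIG):
    """Build the argv list for the tesseract CLI, printing text to stdout."""
    args = [
        "tesseract", image_path, "stdout",
        "--oem", str(config["oem"]),
        "--psm", str(config["psm"]),
    ]
    # Each config variable is passed as a separate "-c name=value" pair.
    for name, value in config["variables"].items():
        args += ["-c", f"{name}={value}"]
    return args


def run_tesseract(image_path):
    """Run tesseract (must be installed and on PATH) and return its text."""
    result = subprocess.run(
        build_tesseract_args(image_path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling run_tesseract("t3.png") would then return the recognized text for that photo. Building the argument list separately makes it easy to print and compare the exact options between experiments.
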
