It looks like the "fast" models are on par with or better than the "best"
ones, and more robust.

Or is there a difference in the 20-40 range that is not visible in the
chart at this resolution?


Thanks, Lorenzo



On Tue, Feb 21, 2023 at 22:22 wil...@gmail.com <will...@gmail.com>
wrote:

> Sorry it took a while.  Take a look here
> <https://willus.com/blog.shtml?tesseract_accuracy>.
>
> On Sunday, February 27, 2022 at 9:08:32 AM UTC-8 zdenop wrote:
>
>> Hello Willus,
>>
>> Can you also test tesseract 5? Can you share your input data for testing
>> or script for evaluation, how you generate output charts?
>>
>> Zdenko
>> On Monday, December 31, 2018 at 23:23:39 UTC+1, wil...@gmail.com
>> wrote:
>>
>>> So I did some more experimenting and convinced myself that the "xres"
>>> and "yres" values in the PIX structure passed to Tesseract have virtually
>>> no impact on the results unless the resolution is so poor as to make the
>>> error rate very high.  Using that information, I re-ran my tests in a more
>>> systematic way on both Tesseract 4 (with the "TessBest" English training
>>> data file--14.7 MiB) and Tesseract 3.05 (with CUBE).  The results below
>>> show the average error rate for the six fonts and then excluding
>>> Bookman-Demi and Helvetica-Narrow since they're a little out of the
>>> ordinary.  The error rate is plotted against the height of a capital letter
>>> in pixels, as before.  A couple of things to note:
>>> 1. Tess v4.0.0 does far better at the lower resolutions (fewer pixels in
>>> a capital letter).
>>> 2. Tess v4.0.0 is more consistent across a broader font selection than
>>> Tess v3.05.  This is very good to see.
>>> 3. However, if I exclude Bookman-Demi and Helvetica-Narrow, Tess v3.05
>>> does better for the higher resolutions (40-140 pixel heights).  Tess v4.0.0
>>> definitely has a consistent issue with high-res fonts which should be
>>> addressed, as I stated in my earlier posts.
>>>
>>> 6-font average:
>>> [image: tess_accuracy_6fonts.png]
>>>
>>> Without Bookman-Demi and Helvetica-Narrow:
>>> [image: tess_accuracy_4fonts.png]
>>>
>>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/647bdd8a-28bc-4111-bb36-bc8560a78d18n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/647bdd8a-28bc-4111-bb36-bc8560a78d18n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
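
The quoted message reports an average error rate per capital-letter height, once over all six fonts and once without Bookman-Demi and Helvetica-Narrow, but it does not say how the error rate itself was computed. As a rough sketch only, assuming a character error rate based on Levenshtein edit distance (a common choice, not necessarily the one used for those charts), the calculation could look like this. The function names and the sample values in the usage note are hypothetical.

```python
from typing import Iterable, Mapping


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between the reference text and the OCR output."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance divided by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_error_rate(rates: Mapping[str, float],
                    exclude: Iterable[str] = ()) -> float:
    """Average per-font error rates, optionally leaving some fonts out."""
    excluded = set(exclude)
    kept = [rate for font, rate in rates.items() if font not in excluded]
    if not kept:
        raise ValueError("no fonts left after exclusion")
    return sum(kept) / len(kept)
```

With per-font rates collected at one letter height in a dict such as `rates`, `mean_error_rate(rates)` would give the all-font average and `mean_error_rate(rates, exclude=("Bookman-Demi", "Helvetica-Narrow"))` the reduced average; repeating this per letter height yields the two curves being compared.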

