Looks like the "fast" models are better or on par with the "best" ones and more robust.
Or is there a difference in the 20-40 range that is not visible from the chart at this resolution?

Thanks,
Lorenzo

On Tue, Feb 21, 2023 at 22:22 wil...@gmail.com <will...@gmail.com> wrote:

> Sorry it took a while. Take a look here
> <https://willus.com/blog.shtml?tesseract_accuracy>.
>
> On Sunday, February 27, 2022 at 9:08:32 AM UTC-8 zdenop wrote:
>
>> Hello Willus,
>>
>> Can you also test Tesseract 5? Can you share your input data for testing,
>> or the script you use for evaluation and for generating the output charts?
>>
>> Zdenko
>> Date: Monday, December 31, 2018, time: 23:23:39 UTC+1, sender:
>> wil...@gmail.com
>>
>>> So I did some more experimenting and convinced myself that the "xres"
>>> and "yres" values in the PIX structure passed to Tesseract have virtually
>>> no impact on the results unless the resolution is so poor as to make the
>>> error rate very high. Using that information, I re-ran my tests in a more
>>> systematic way on both Tesseract 4 (with the "TessBest" English training
>>> data file, 14.7 MiB) and Tesseract 3.05 (with CUBE). The results below
>>> show the average error rate for all six fonts, and then the average
>>> excluding Bookman-Demi and Helvetica-Narrow, since they're a little out
>>> of the ordinary. The error rate is plotted against the height of a capital
>>> letter in pixels, as before. A couple of things to note:
>>> 1. Tess v4.0.0 does far better at the lower resolutions (fewer pixels in
>>> a capital letter).
>>> 2. Tess v4.0.0 is more consistent across a broader font selection than
>>> Tess v3.05. This is very good to see.
>>> 3. However, if I exclude Bookman-Demi and Helvetica-Narrow, Tess v3.05
>>> does better for the higher resolutions (40-140 pixel heights). Tess v4.0.0
>>> definitely has a consistent issue with high-res fonts which should be
>>> addressed, as I stated in my earlier posts.
>>>
>>> 6-font average:
>>> [image: tess_accuracy_6fonts.png]
>>>
>>> Without Bookman-Demi and Helvetica-Narrow:
>>> [image: tess_accuracy_4fonts.png]
>>>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/647bdd8a-28bc-4111-bb36-bc8560a78d18n%40googlegroups.com
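
For anyone who wants to reproduce this kind of comparison: the thread does not say how the error rate was computed, so the following is only a minimal sketch under stated assumptions. It assumes a character-level error rate (Levenshtein edit distance divided by the length of the reference text) and a plain mean over fonts, with the option to leave out Bookman-Demi and Helvetica-Narrow as in the second chart. The function names, the "FontA"/"FontB" entries, and the numbers are all hypothetical and are not taken from the original tests.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: inserts, deletes and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, ocr_output: str) -> float:
    """Assumed metric: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


def average_error(rates_by_font: dict[str, float],
                  exclude: tuple[str, ...] = ()) -> float:
    """Mean error rate over fonts, skipping any font named in `exclude`."""
    kept = [rate for font, rate in rates_by_font.items() if font not in exclude]
    if not kept:
        raise ValueError("no fonts left after exclusion")
    return sum(kept) / len(kept)


# Placeholder numbers for illustration only; these are NOT results from the thread.
example = {"Bookman-Demi": 0.08, "Helvetica-Narrow": 0.06,
           "FontA": 0.02, "FontB": 0.04}
all_fonts = average_error(example)
ordinary_only = average_error(example,
                              exclude=("Bookman-Demi", "Helvetica-Narrow"))
```

Running the two averages once per capital-letter height and once per engine or model would give the two curves per chart that are being compared in this thread.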