On Monday, April 21, 2025 at 12:03:33 PM UTC-4 mcarlo...@gmail.com wrote:
> Honestly, I am getting as many errors as with the standard model, or even more. I am trying to automatically transcribe documents such as the one attached (a simple excerpt from a longer file; see also e.g. https://royalsocietypublishing.org/doi/epdf/10.1098/rstl.1720.0013). *Any idea if there are more suitable models for this kind of 18th-century document?* (It looks like an 18th-century Caslon font, which uses the long S <https://en.wikipedia.org/wiki/Long_s> quite often.)

You might want to look at some of the work that was done by the Early Modern OCR Project: https://emop.tamu.edu/

Tom

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To view this discussion visit https://groups.google.com/d/msgid/tesseract-ocr/3d81a21c-1245-4e6e-9dc9-c8ca02a10a2cn%40googlegroups.com.