Quote - matthew christy <[email protected]> (Wed 18 Dec 2013 10:21:14 PM CET):
> > I haven't had to train an italic font yet. Would the printing sorts
> > have been slanted for some italic fonts? I suspect so (but don't
> > know; someone should look it up), which would result in the slight
> > overlap you see. If that is the case, I wonder if Tesseract takes it
> > into account? Arguably it should, but as far as I know it just deals
> > with regular rectangles. There is certainly some extra cleverness it
> > does to deal with italics... I suspect small overlaps of the kind
> > that you'll see with italic fonts are essentially just ignored. I
> > don't know whether that's also true in the training process. It will
> > be interesting to see how the new training tools to be released deal
> > with italics.
>
> I've always assumed that they are slanted, and I will ask our book history
> expert next time I see him. Some italic fonts are more slanted than others
> and can have a good deal of overlap. I've also kind of wondered if
> specifying that a font is italic during training is how to indicate that it
> should use slanted boxes while OCR'ing, or something like that.

I've never seen a mention of slanted printing sorts. Perhaps they were
just kerned, cf.
http://en.wikipedia.org/wiki/Kerning
Best regards
Janusz
--
Prof. dr hab. Janusz S. Bień - Uniwersytet Warszawski (Katedra
Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en