Hi Brian, nice to hear from you.

> We began using Aletheia because it was the only tool we were aware of at the
> time which allows us to binarize an image, clean up artifacts, and bound not
> only characters but words, lines, paragraphs, columns, pages, etc for
> font-training purposes. The student workers who we pay to do much of this
> work have varying levels of comfort/expertise with computers, so Aletheia
> also proved to be the most GUI driven, user-friendly tool out there.

When you say it bounds words, lines, and paragraphs for font-training
purposes, can you explain what you mean? I haven't used Aletheia, so it
isn't obvious to me. Do you mean that the interface is organised by word,
so that people correcting the box files can (for example) see that "babe"
has been misrecognised as "bard", click near the word, and type "babe"?

I can see that this could be a faster approach to correcting things. I
don't think our current box editors are very focused on this sort of
"proofreading" model, and perhaps they should be more so.

Looking forward to hearing more from you,

Nick

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

