I have the same experience.

Zdenko

On Sat, Dec 7, 2013 at 2:42 AM, Shree Devi Kumar <[email protected]> wrote:
> Matthew,
> I had tried registering for Aletheia a few months ago. No response so
> far.
> Shree
>
> Shree Devi Kumar
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
>
> On Sat, Dec 7, 2013 at 2:57 AM, matthew christy <[email protected]> wrote:
>
>> Hi Janusz,
>>
>> You're right, Aletheia is not open-source. My mistake on a poor choice of
>> words. However, it is free to use after registering, which is also free.
>> The only restriction on its use that I'm sure about is in a commercial
>> product. I'll see if I can get a comment on that from someone at PRImA.
>>
>> Thanks,
>> Matt
>>
>>
>> On Friday, December 6, 2013 2:10:56 PM UTC-6, matthew christy wrote:
>>>
>>> Hi All,
>>>
>>> The Initiative for Digital Humanities, Media, and Culture (IDHMC) at
>>> Texas A&M University, as part of its Early Modern OCR Project
>>> (eMOP <http://emop.tamu.edu/>), has created a new tool, called
>>> Franken+, that provides a way to create font training for the
>>> Tesseract OCR engine using page images. This is in contrast to
>>> Tesseract's documented method
>>> <http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3> of
>>> font training, which involves using a word processing program with a
>>> modern font. Franken+ has now been released for beta testing and we
>>> invite anyone who's interested to give it a try and to please provide
>>> feedback.
>>>
>>> Franken+ works in conjunction with PRImA's open source Aletheia tool
>>> <http://www.primaresearch.org/tools.php> and allows users to easily
>>> and quickly identify one or more idealized forms of each glyph found
>>> on a set of page images. These identified forms are then used to
>>> generate a set of Franken-page images matching the page
>>> characteristics documented in Tesseract's training instructions, but
>>> with a font used in an actual early modern printed document.
>>> Franken+ allows you to create Tesseract box files, but will also
>>> guide you through the entire Tesseract training process, producing a
>>> .traineddata file, and even allow you to identify and OCR documents
>>> using that training. In addition, Franken+ makes it easy to combine
>>> training from multiple fonts into one training set.
>>>
>>> For eMOP we are using Franken+ to create training for Tesseract from
>>> page images of early modern printed works, but we also think it can
>>> be used just as effectively to train Tesseract using images of any
>>> kind of font that's not readily available via a word processor. For
>>> example, I've seen posts in this group about wanting to train
>>> Tesseract to read the signs on the front of buses.
>>>
>>> You can find out more about Franken+ at http://emop.tamu.edu/node/54
>>> and http://dh-emopweb.tamu.edu/Franken+/. The code is also available
>>> open source at
>>> https://github.com/idhmc-tamu/eMOP/tree/master/Franken%2B.
>>>
>>> Thanks,
>>> Matt Christy
>>>
>> --
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.

