Quote/Cytat - Shree Devi Kumar <[email protected]> (Sat 07 Dec 2013 02:42:11 AM CET):

Matthew,
I had tried registering for Aletheia a few months ago. No response so far.
Shree

Somehow I'm not surprised.

I'm familiar with the program because I had to work with it as a partner in the IMPACT project. It is the only program that supports the PAGE format. If this format is suitable for you, then of course it is the only choice.

We prefer to work with hOCR, so the data created with Aletheia were immediately converted to hOCR, cf. pageparser at https://bitbucket.org/jwilk/marasca-wbl.

BTW, our ultimate goal is to create so-called DjVu corpora, cf.

http://poliqarp.wbl.klf.uw.edu.pl

We intend to replace the dirty OCR created with FineReader by the output of a trained Tesseract, so we are looking for a good training tool.

Best regards

Janusz



--
Prof. dr hab. Janusz S. Bień - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/

--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
