Quote/Cytat - Shree Devi Kumar <[email protected]> (Sat 07 Dec 2013
02:42:11 AM CET):
Matthew,
I had tried registering for Aletheia a few months ago. No response so far.
Shree
Somehow I'm not surprised.
I'm familiar with the program, as I had to work with it as a partner in
the IMPACT project. It is the only program which supports the PAGE
format. If this format is suitable for you, then of course it is the
only choice.
We prefer to work with hOCR, so the data created with Aletheia were
immediately converted to hOCR, cf. pageparser at
https://bitbucket.org/jwilk/marasca-wbl.
BTW, our ultimate goal is to create so-called DjVu corpora, cf.
http://poliqarp.wbl.klf.uw.edu.pl
We intend to replace the dirty OCR created with FineReader by the
output of a trained Tesseract, so we are looking for a good training tool.
Best regards
Janusz
--
Prof. dr hab. Janusz S. Bień - Uniwersytet Warszawski (Katedra
Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
--
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en