On Thu, 4 Mar 2010 Jakub Wilk <jw...@debian.org> wrote:
[...]
> ocrodjvu indeed crashes, but on the garbage-in-garbage-out principle. If
> you run ocrodjvu with the --debug option, you'll see that resulting hOCR
> files don't contain anything legible. In fact, hOCR for page 2 contains
> also some control characters, which completely break HTML parsing,
> leading to a crash.
>
> I cannot do much about this, except making the error message more
> helpful.

You can skip the faulty page and continue processing.

Best regards

JSB

-- 
                     ,
dr hab. Janusz S. Bien, prof. UW - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/