On Thu, 4 Mar 2010, Jakub Wilk <jw...@debian.org> wrote:

[...]

> ocrodjvu indeed crashes, but on the garbage-in-garbage-out principle. If 
> you run ocrodjvu with the --debug option, you'll see that resulting hOCR 
> files don't contain anything legible. In fact, hOCR for page 2 contains 
> also some control characters, which completely break HTML parsing, 
> leading to a crash.
>
> I cannot do much about this, except making the error message more 
> helpful.

You can skip the faulty page and continue processing.
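
A minimal sketch of what I mean, in Python. The names parse_hocr, PageError and process_pages are hypothetical stand-ins for illustration, not ocrodjvu's actual code, and the control-character check is only an assumption about how the broken hOCR could be detected:

```python
import re

# Control characters other than tab, LF and CR; an assumed definition of
# "characters that break HTML parsing" for the purpose of this sketch.
_CONTROL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


class PageError(Exception):
    """Raised when the hOCR of a single page cannot be parsed."""


def parse_hocr(text):
    # Hypothetical stand-in for the real hOCR parser: reject pages that
    # contain control characters, otherwise return the text unchanged
    # apart from surrounding whitespace.
    if _CONTROL_CHARS.search(text):
        raise PageError('control characters in hOCR')
    return text.strip()


def process_pages(pages):
    # Parse every page; a failure on one page is recorded and the loop
    # continues with the next page instead of aborting the whole run.
    results = {}
    skipped = []
    for number, hocr in enumerate(pages, start=1):
        try:
            results[number] = parse_hocr(hocr)
        except PageError as exc:
            skipped.append((number, str(exc)))
    return results, skipped
```

The list of skipped pages could then be reported at the end, so the user knows which pages need to be looked at by hand.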

Best regards

JSB

-- 
dr hab. Janusz S. Bien, prof. UW -  Uniwersytet Warszawski (Katedra Lingwistyki 
Formalnej)
Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/


