> I could repeat this with several documents and resolutions. When I ran
> tesseract manually on the .tif file, I indeed saw non UTF-8 characters in
> the produced html.
Could you give me some examples, please? Just some example hOCR output that shows the problem would do.
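
In case it helps to isolate the offending bytes, here is a minimal Python sketch that scans a file and reports every byte run that does not decode as UTF-8, with a little surrounding context. It is only an illustration: the function name `find_invalid_utf8` and the `context` parameter are my own, not part of tesseract, and it assumes the hOCR files are small enough to read into memory.

```python
import sys


def find_invalid_utf8(data: bytes, context: int = 20):
    """Return (offset, bad_bytes, surrounding_bytes) for each undecodable run.

    Hypothetical helper for illustration; not part of tesseract or hOCR tooling.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # everything from pos onwards decodes cleanly
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice that was decoded
            start = pos + err.start
            end = pos + err.end
            snippet = data[max(0, start - context):end + context]
            problems.append((start, data[start:end], snippet))
            pos = end  # continue scanning after the bad bytes
    return problems


if __name__ == "__main__":
    # Usage: python find_bad_utf8.py output.html [more files...]
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for offset, bad, snippet in find_invalid_utf8(fh.read()):
                print(f"{path}: offset {offset}: {bad!r} in {snippet!r}")
```

Running it on the html that tesseract produced and pasting the printed lines into a reply should be enough to see which characters are affected.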

