Hi, On 26/09/2022 18:45, Max Rehberg wrote:
I would like to do that as well. Is it possible?

D wrote on Tuesday, 8 December 2020 at 19:19:30 UTC+1:
Hey guys, I produce a .hocr file with Google Cloud Vision and gcv2hocr. I would like to know if there is an easy way to call Tesseract's PDF file creation, because it is better than the solutions found on GitHub. My goal is to create a PDF from the .hocr file and the image file. Happy for any kind of help!
Sorry for the delay in my reply, but I created exactly this a few years ago: https://github.com/internetarchive/archive-pdf-tools
Use recode_pdf to create an (optionally compressed) PDF from a hOCR file and a set of images. You might have to combine the per-page hOCR files into a single file first, using https://github.com/internetarchive/archive-hocr-tools
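
As a rough sketch of what such a call might look like from Python: the helper below only assembles the recode_pdf command line and prints it. The function name, file names, and DPI value are made up for illustration, and the flags (`--from-imagestack`, `--hocr-file`, `--dpi`, `-o`) are written from memory of the archive-pdf-tools README, so please check them against `recode_pdf --help` for your installed version.

```python
import shlex
import subprocess


def build_recode_pdf_cmd(image_glob, hocr_file, out_pdf, dpi=300):
    """Return the argv list for a recode_pdf call (hypothetical helper).

    Flag names are assumptions based on the archive-pdf-tools README;
    verify them with `recode_pdf --help`.
    """
    return [
        "recode_pdf",
        "--from-imagestack", image_glob,  # glob pattern matching the page images
        "--hocr-file", hocr_file,         # single (combined) hOCR file
        "--dpi", str(dpi),                # resolution of the input images
        "-o", out_pdf,                    # path of the PDF to write
    ]


# Placeholder paths; replace with your own files.
cmd = build_recode_pdf_cmd("pages/*.png", "combined.hocr", "out.pdf")
print(shlex.join(cmd))

# To actually create the PDF (requires archive-pdf-tools to be installed):
# subprocess.run(cmd, check=True)
```

The glob is passed as a single argument rather than expanded by the shell, which is how I remember the README example quoting it. If your hOCR output is one file per page, combine the pages first with archive-hocr-tools and pass the combined file here.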
Cheers, Merlijn