Hey all, and thanks for assisting. I'm currently working on a pipeline that takes in PDFs, converts them to images, feeds them to Tesseract, and outputs a combined PDF at the end with a readable text layer.
I've reached the Tesseract step and I'm stuck on the API. I want to give Tesseract an image held in memory, such as a Leptonica Pix. This already works for getting the recognized text back as a string, but I can't find any API method that takes the image already set on the Tesseract instance and renders PDF output from it. The PDF-related methods all seem to want a file path rather than the image that has been set. Is there an API for this, or a workaround? Thanks!