[tesseract-ocr] Take in image from memory, get PDF output

Michael Kadziela Wed, 03 Aug 2022 20:59:13 -0700

Hey all, and thanks in advance for the help.

I'm currently working on a pipeline that takes in PDFs, converts them to 
images, feeds them to Tesseract, and outputs a combined PDF at the end with 
a readable text layer.

I've reached the Tesseract step and I'm stuck on the API. Essentially, I 
want to give Tesseract an image from memory, such as a Leptonica Pix. This 
already works for getting the recognized text as a string, but I can't find 
any API method that takes the image already set on the Tesseract instance 
and renders PDF output from it. The PDF-related methods all seem to want a 
file path rather than the image set on the instance.
Is there an API for this, or a workaround?

Thanks! 
