Hello.

I have an input document, input.pdf. It comes straight from a scanner, 
so I do some preprocessing to improve OCR accuracy (i.e., unpaper, 
black/white conversion, increased contrast), which yields preprocessed.png.

When using the command

tesseract preprocessed.png output pdf

I receive a PDF that has the OCR'ed text embedded. Great! However: 
can I tell Tesseract to use the original document input.pdf (i.e., the 
one without preprocessing) as the background of the generated PDF while 
still performing OCR on the preprocessed input?

Thanks,
Jonas
