Hi Kim,

One solution would be to use the pdfimages utility from Poppler to extract all the images from the PDF into a directory. You would then place the corresponding hOCR files in the same directory and run the hocr-pdf utility from hocr-tools.
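
As a rough sketch, the two steps could be scripted in Python along these lines. This is only an illustration under some assumptions: both tools are on your PATH, the page images are stored in the PDF as JPEGs (hocr-pdf expects .jpg images, and pdfimages only writes .jpg for images that are already JPEG-encoded), and each hOCR file has the same base name as its image, for example page-000.jpg and page-000.hocr. The function names and the "page" prefix are made up for the example.

```python
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, image_dir, prefix="page"):
    """Build the pdfimages call; output files are named <prefix>-NNN.<ext>."""
    # -j writes JPEG-encoded images as .jpg; other encodings come out as
    # PPM/PBM and would need converting before hocr-pdf can use them.
    return ["pdfimages", "-j", str(pdf_path), str(Path(image_dir) / prefix)]


def hocr_pdf_command(image_dir):
    """Build the hocr-pdf call; the tool writes the new PDF to stdout."""
    return ["hocr-pdf", str(image_dir)]


def missing_hocr(image_dir):
    """Return the names of .jpg files that have no matching .hocr file."""
    return sorted(
        image.name
        for image in Path(image_dir).glob("*.jpg")
        if not image.with_suffix(".hocr").exists()
    )


def extract_images(pdf_path, image_dir):
    """Step 1: dump the images from the PDF into image_dir."""
    Path(image_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(pdfimages_command(pdf_path, image_dir), check=True)


def build_searchable_pdf(image_dir, output_pdf):
    """Step 2: run after the hOCR files have been copied into image_dir."""
    missing = missing_hocr(image_dir)
    if missing:
        raise FileNotFoundError(f"no hOCR file for: {', '.join(missing)}")
    # Redirect hocr-pdf's stdout into the output file.
    with open(output_pdf, "wb") as out:
        subprocess.run(hocr_pdf_command(image_dir), stdout=out, check=True)
```

You would call extract_images("input.pdf", "pages") first, copy the hOCR files into pages/, and then call build_searchable_pdf("pages", "searchable.pdf"). The check for missing hOCR files is just a convenience so that a naming mismatch shows up before hocr-pdf runs.
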
Both software packages are readily available on many Linux systems.

https://poppler.freedesktop.org/
https://github.com/tmbdev/hocr-tools

Thanks,
Rasan
NYU Digital Library

On Wed, May 6, 2020 at 2:42 PM Kimberly Kennedy <kimberlymkenn...@gmail.com> wrote:

> I have an unusual situation. I've created a PDF that I want to be text
> searchable. However, I would like to use OCR data from a different source
> than that document. Is it possible to add a text file as the OCR layer to
> an existing PDF?
>
> Any ideas would be appreciated!
>
> Thanks,
>
> Kim
>
>
> Kimberly Kennedy
> Digital Production Coordinator
> Northeastern University Library
> ki.kenn...@northeastern.edu
>