Erica:

I've used Paperwork (https://openpaper.work/en/) in the past with good results. It's open source and runs on Linux and Windows. If you'd be interested in running a web application, you might give some of these options a look (https://github.com/kba/awesome-ocr#ocr-gui) or even look into a document management web application (https://github.com/awesome-selfhosted/awesome-selfhosted#document-management), though the latter might be overkill for your use case.

Finally, if you're running a Mac somewhere and have money to spend, I cannot overstate how much I love DevonThink (https://www.devontechnologies.com/apps/devonthink) which has a server version and uses ABBYY on the backend. My quick test this morning suggests it doesn't have the issue you're describing.

best,

ak

--
ander kierig
Web Application Developer
University of Minnesota Libraries
[lib.umn.edu](https://www.lib.umn.edu)
they/them

On 2022-08-05 at 18:12 (-0500) Erica FINDLEY wrote:

All,

ABBYY has been a favorite program of mine for transforming batches of TIFF
files into a PDF and extracting the text.

However, I have recently run into this known issue
<https://support.abbyy.com/hc/en-us/articles/360013874239-Each-page-is-duplicated-with-the-thumbnail-image-while-converting-TIFF-to-PDF-in-FineReader>
even though each TIFF file is the same resolution.

I opened a support ticket with ABBYY and their proposed resolution is for me to convert to another format (jpg) then to pdf. I do not like this for two reasons: 1) it is time and resource consuming to do two transformations, and 2) there is some image quality loss when doing this.


This leaves me with two questions:

1. Has anyone been able to find a better workaround for this issue?

2. Does anyone have recommendations for another GUI-based OCR program? My quick research is pointing to Tesseract, but since I work with volunteers I'd prefer a GUI-based solution.

Thanks!

*Erica Findley (she/her)*
*Systems & Metadata Librarian*
*x80591*
Multnomah County Library
Isom Operations Center: Thu 8 am - 5 pm, Fri 1:30 pm - 5:30 pm
Teleworking: Mon - Wed 8 am - 5 pm, Fri 8 am - 12 pm
multcolib.org <http://www.multcolib.org/>
My pronouns are she/her/hers
