Erica:
I've used Paperwork (https://openpaper.work/en/) in the past with good
results. It's open source and runs on Linux and Windows. If you'd be
interested in running a web application you might give some of these
options a look (https://github.com/kba/awesome-ocr#ocr-gui) or maybe
even look into a document management web application
(https://github.com/awesome-selfhosted/awesome-selfhosted#document-management)
though the latter might be overkill for your use case.
Finally, if you're running a Mac somewhere and have money to spend, I
cannot overstate how much I love DevonThink
(https://www.devontechnologies.com/apps/devonthink) which has a server
version and uses ABBYY on the backend. My quick test this morning
suggests it doesn't have the issue you're describing.
best,
ak
--
ander kierig
Web Application Developer
University of Minnesota Libraries
[lib.umn.edu](https://www.lib.umn.edu)
they/them
On 2022-08-05 at 18:12 (-0500) Erica FINDLEY wrote:
All,
ABBYY has been a favorite program of mine for transforming batches of
TIFF files into a PDF and extracting the text.
However, I have recently run into this known issue
<https://support.abbyy.com/hc/en-us/articles/360013874239-Each-page-is-duplicated-with-the-thumbnail-image-while-converting-TIFF-to-PDF-in-FineReader>
even though each TIFF file is the same resolution.
I opened a support ticket with ABBYY and their proposed resolution is
for me to convert to another format (JPG) and then to PDF. I do not like
this for two reasons: 1) it is time- and resource-consuming to do two
transformations, and 2) there is some image quality loss when doing this.
This leaves me with two questions:
1. Has anyone been able to find a better workaround for this issue?
2. Does anyone have recommendations for another GUI-based OCR program?
My quick research is pointing to Tesseract, but since I work with
volunteers I'd prefer a GUI-based solution.
Thanks!
*Erica Findley (she/her)*
*Systems & Metadata Librarian*
*x80591*
Multnomah County Library
Isom Operations Center: Thu 8 am - 5 pm, Fri 1:30 pm - 5:30 pm
Teleworking: Mon - Wed 8 am - 5 pm, Fri 8 am - 12 pm
multcolib.org <http://www.multcolib.org/>
My pronouns are she/her/hers