Bug#1017872: RFA: ocrmypdf -- add an OCR text layer to PDF files

Sean Whitton Sun, 21 Aug 2022 14:57:49 -0700

Package: wnpp
Severity: normal
X-Debbugs-Cc: [email protected], [email protected]
Control: affects -1 src:ocrmypdf


I request an adopter for the ocrmypdf package.  I don't use it as often
as I did (hardly ever in the past couple of years), and anyway it would
be better for a Python programmer to maintain it.

The package description is:
 OCRmyPDF generates a searchable PDF/A file from a regular PDF
 containing only images, allowing it to be searched.
 .
 It uses the Tesseract OCR engine and so supports all the languages
 that Tesseract does.
 .
 Some other main features:
 .
   * Places OCR text accurately below the image to ease copy / paste
   * Keeps the exact resolution of the original embedded images
   * When possible, inserts OCR information as a lossless operation
     without rendering vector information
   * Keeps file size about the same
   * If requested, deskews and/or cleans the image before performing OCR
   * Validates input and output files
   * Provides debug mode to enable easy verification of the OCR results
   * Processes pages in parallel when more than one CPU core is
     available
   * Battle-tested on thousands of PDFs, with a test suite and continuous
     integration.

-- 
Sean Whitton
