That scan is quite low resolution so it is hard to say how well any OCR will work. I'd expect better than garbage, but a lot of errors.
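For a rough sense of why resolution matters, here is a minimal sketch of how many pixels a line of text gets at a given DPI. It relies only on the fact that one typographic point is 1/72 inch. The function name and the 10 pt font size are my own assumptions for illustration, and it does not model how tesseract or ocrmypdf use the DPI value internally.

```python
# Rough sketch: approximate text height in pixels at a given scan resolution.
# One typographic point is 1/72 inch, so pixels = points * dpi / 72.
POINTS_PER_INCH = 72


def glyph_height_px(font_size_pt: float, dpi: float) -> float:
    """Approximate em height in pixels for a font size at a given DPI."""
    return font_size_pt * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # 10 pt is an assumed body-text size; the DPI values are the ones
    # mentioned in this thread.
    for dpi in (64, 72, 96, 200):
        height = glyph_height_px(10, dpi)
        print(f"{dpi:>3} DPI: 10 pt text is about {height:.1f} px tall")
```

If the DPI passed on the command line is lower than the real one, the same pixels are taken to be larger type; if it is higher, they are taken to be smaller type, which is where the noise-versus-glyph decision gets harder.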
The DPI is quite significant for checking whether a group of pixels is noise or a glyph. It implies the minimum font size. 72 or 96 is a good guess for screenshots (or 200 for a retina screen).

One possibility is that ocrmypdf fails to encode Cyrillic under the current settings and available system fonts. If you have problems with all Cyrillic images (even high quality scans), you could try adding the arguments --pdf-renderer=tesseract --output-type=pdf. That seems to work better for non-Latin languages.

If you install the latest version instead of the Ubuntu version, you could use the --sidecar argument to see what text is being found, to discern whether the issue is the PDF encoding or the image itself.

Aside: The "just print" feature would not have been helpful here even if it worked.

On Sun, 4 Jun 2017 at 05:11 david braun <1687...@bugs.launchpad.net> wrote:

> Sorry for the delay.
> I'm trying to translate the text in the attached to english. I have loaded
> the tesseract RUS language and executing
> $ ocrmypdf -l rus --image-dpi 64 111684498_large_2.jpg 111684498_large_2.pdf
> completes with the following messages
> INFO - Input file is not a PDF, checking if it is an image...
> INFO - Input file is an image
> INFO - Input image has no ICC profile, assuming sRGB
> INFO - Image seems valid. Try converting to PDF...
> INFO - Successfully converted to PDF, processing...
> WARNING - 1: [tesseract] unsure about page orientation
> INFO - Output file is a PDF/A-2B (as expected)
> But Google translate produces garbage.
> I was hoping to see what was being done by ocrmypdf to see if I could
> figure out what might be the cause.
>
> BTW - I chose the DPI randomly - how significant is this parameter?
>
>
> On Fri, May 26, 2017 at 12:51 AM, James R Barlow <1687...@bugs.launchpad.net> wrote:
>
> > The code makes decisions at runtime based on the input file, so an argument
> > to skip executing all intermediates doesn't give an accurate picture of
> > what will happen.
> > There is a --flowchart argument that produces a SVG file
> > showing the processing path which helps development a lot, but it's
> > probably not helpful to anyone else.
> >
> > What sort of use did you have for it?
> >
> > On Thu, May 25, 2017 at 17:56 david braun <1687...@bugs.launchpad.net>
> > wrote:
> >
> > > That's unfortunate! Any reason why you removed the options?
> > >
> > > --
> > > You received this bug notification because you are subscribed to Ubuntu.
> > > https://bugs.launchpad.net/bugs/1687308
> > >
> > > Title:
> > >   ocrmypdf program and man page disagree about options
> > >
> > > Status in ocrmypdf package in Ubuntu:
> > >   Incomplete
> > >
> > > Bug description:
> > >   The man page for ocrmypdf claimes there is a "--just-print" option but
> > >   the program rejects this. Also the man page claims the "-n" does the
> > >   same. It doesn't. The option is accepted but nothing obvious happens.
> > >
> > >   ProblemType: Bug
> > >   DistroRelease: Ubuntu 17.04
> > >   Package: ocrmypdf 4.3.5-2
> > >   ProcVersionSignature: Ubuntu 4.10.0-20.22-generic 4.10.8
> > >   Uname: Linux 4.10.0-20-generic x86_64
> > >   ApportVersion: 2.20.4-0ubuntu4
> > >   Architecture: amd64
> > >   CurrentDesktop: Unity:Unity7
> > >   Date: Sun Apr 30 13:55:46 2017
> > >   EcryptfsInUse: Yes
> > >   InstallationDate: Installed on 2015-05-31 (699 days ago)
> > >   InstallationMedia: Ubuntu 14.04.2 LTS "Trusty Tahr" - Release amd64 (20150218.1)
> > >   PackageArchitecture: all
> > >   ProcEnviron:
> > >    LANGUAGE=en_US
> > >    PATH=(custom, no user)
> > >    XDG_RUNTIME_DIR=<set>
> > >    LANG=en_US.UTF-8
> > >    SHELL=/bin/bash
> > >   SourcePackage: ocrmypdf
> > >   UpgradeStatus: Upgraded to zesty on 2017-04-28 (1 days ago)
> > >
> > > To manage notifications about this bug go to:
> > > https://bugs.launchpad.net/ubuntu/+source/ocrmypdf/+bug/1687308/+subscriptions
>
> ** Attachment added: "111684498_large_2.jpg"
>    https://bugs.launchpad.net/bugs/1687308/+attachment/4888804/+files/111684498_large_2.jpg
>
> ** Attachment added: "111684498_large_2.pdf"
>    https://bugs.launchpad.net/bugs/1687308/+attachment/4888805/+files/111684498_large_2.pdf

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1687308

Title:
  ocrmypdf program and man page disagree about options

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocrmypdf/+bug/1687308/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs