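Hi. The tesseract-ocr-ita package ships language data from tessdata_fast <https://github.com/tesseract-ocr/tessdata_fast>, which does not support the legacy recognizer. Because --oem 2 requests the legacy and LSTM engines together, loading 'ita' fails with the packaged data. The language pack you downloaded from tessdata <https://github.com/tesseract-ocr/tessdata> contains both the legacy and the LSTM models, which is why the command works after you copy it in.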
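
If you want to keep the packaged data, one option is to request an engine mode it can serve, i.e. --oem 1 (LSTM only). Below is a minimal sketch of that choice. The helper name oem_for_model is hypothetical and not part of tesseract, and it assumes you already know which repository a given .traineddata file came from. It only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: pick an --oem value that the installed language data can serve.
# oem_for_model is a hypothetical helper, not something tesseract provides.
# Assumption: the caller knows which repository the .traineddata came from.
oem_for_model() {
    case "$1" in
        tessdata_fast) echo 1 ;;   # LSTM only: no legacy recognizer in this data
        tessdata)      echo 2 ;;   # legacy + LSTM are both available
        *)             echo "unknown model source: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command for the Debian-packaged Italian model.
echo "tesseract list.txt mypage -l ita --oem $(oem_for_model tessdata_fast)"
```

With the model from the tessdata repository installed instead, the same helper would select --oem 2, matching the command in the original report.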

On Wed, 18 Sep 2019 at 22:45, Davide Viti <zino...@gmail.com> wrote:
> Package: tesseract-ocr-ita
> Version: 1:4.00~git30-7274cfa-1
> Severity: important
>
> Dear Maintainer,
>
> the following command:
>
> tesseract list.txt mypage -l ita --oem 2
>
> fails with the following error:
>
> Failed loading language 'ita'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
>
> A little bit of googling got me to [1]
> As suggested, I've tried the following:
>
> wget https://github.com/tesseract-ocr/tessdata/raw/4.00/ita.traineddata
>
> and copied it to /usr/share/tesseract-ocr/4.00/tessdata
>
> it now works.
>
> Regards,
> Davide
>
> [1] https://www.mail-archive.com/tesseract-ocr@googlegroups.com/msg15127.html
> [2] https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
>
>
>
> -- System Information:
> Debian Release: 10.1
> APT prefers stable-updates
> APT policy: (500, 'stable-updates'), (500, 'stable')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 4.19.0-6-amd64 (SMP w/8 CPU cores)
> Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
> Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8),
> LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /usr/bin/dash
> Init: systemd (via /run/systemd/system)
> LSM: AppArmor: enabled
>
> tesseract-ocr-ita depends on no packages.
>
> Versions of packages tesseract-ocr-ita recommends:
> ii tesseract-ocr 4.0.0-2
>
> tesseract-ocr-ita suggests no packages.
>
> -- no debconf information
>