Hello,

I am getting a similar error:

pytesseract.pytesseract.TesseractError: (1, "read_params_file: Can't open 
tessedit_char_blacklist=,;: Error: Tesseract (legacy) engine requested, but 
components are not present in 
external/tesstrain/data/eng_pcb/eng_pcb.traineddata!! Failed loading 
language 'eng_pcb' Tesseract couldn't load any languages! Could not 
initialize tesseract.")
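
Two parts of this message can be read separately. `read_params_file: Can't open tessedit_char_blacklist=,;:` suggests that the setting reached Tesseract as a config file name, which is what happens when a variable is passed without `-c`. The legacy-engine error is what Tesseract reports when an OCR engine mode that needs the legacy engine is requested but the traineddata holds only an LSTM model, which is the case for tessdata_best and for models fine-tuned from it. A minimal sketch of a config string that avoids both, assuming the call goes through pytesseract's `config` argument on a POSIX system; `build_tesseract_config` is a hypothetical helper, not part of pytesseract:

```python
import shlex


def build_tesseract_config(tessdata_dir, blacklist, oem=1):
    """Build a Tesseract option string (hypothetical helper).

    --oem 1 selects the LSTM engine only, so no legacy components are needed.
    -c NAME=VALUE sets a variable; without -c the token is read as a config file name.
    """
    args = [
        "--tessdata-dir", tessdata_dir,
        "--oem", str(oem),
        "-c", f"tessedit_char_blacklist={blacklist}",
    ]
    # shlex.join quotes tokens containing shell-special characters such as ';'
    return shlex.join(args)


# Directory taken from the path in the error message above
config = build_tesseract_config("external/tesstrain/data/eng_pcb", ",;:")
print(config)
```

The resulting string would then be passed as `pytesseract.image_to_string(image, lang="eng_pcb", config=config)`.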

Output of tesseract -v:
tesseract 4.1.1
 leptonica-1.82.0
  libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 
4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
 Found AVX512BW
 Found AVX512F
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.6.0 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8 liblz4/1.9.3 
libzstd/1.4.8

I am using the best (float) tessdata files from:
https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata
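
Note that a `blob` URL on GitHub serves an HTML page rather than the model file itself, so it is worth ruling out that a downloaded traineddata file is actually HTML. A minimal sketch, with hypothetical helper names `blob_to_raw` and `looks_like_html`; the 512-byte sniff is only a heuristic, not a validation of the traineddata format:

```python
def blob_to_raw(url):
    """Turn a GitHub 'blob' page URL into the matching 'raw' file URL."""
    return url.replace("/blob/", "/raw/", 1)


def looks_like_html(path):
    """Return True if the file starts like an HTML page instead of binary data."""
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


url = "https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata"
print(blob_to_raw(url))
```

If `looks_like_html("eng.traineddata")` returns True, the file would need to be downloaded again from the raw URL.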

I also tried some of the suggestions in
https://github.com/ocrmypdf/OCRmyPDF/issues/209

I am looking for the source of the issue. Could someone who understands 
the cause help, so that I can work further?

On Tuesday, January 19, 2021 at 5:30:46 PM UTC+1 Shree Devi Kumar wrote:

> >*wget https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata 
> <https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata>*
>
> That is not correct. You need to get the `raw` file.
>
> https://github.com/tesseract-ocr/tessdata_best/raw/master/eng.traineddata
>
> On Tue, Jan 19, 2021 at 9:49 PM Roparzh Hemon <roparz...@gmail.com> wrote:
>
>>
>> I downloaded it as you suggested, and as the terminal output below shows, 
>> the file is now present at the correct place :
>>
>> $ file /home/mbalambala/tesseract/tessdata/eng.traineddata
>> /home/mbalambala/tesseract/tessdata/eng.traineddata : HTML document, 
>> UTF-8 Unicode text, with very long lines
>>
>> $ echo $TESSDATA_PREFIX
>> /home/mbalambala/tesseract/tessdata
>>
>> but the error message stays exactly the same :
>>
>> $ tesseract Downloads/p1.pdf p1
>> Error opening data file 
>> /home/mbalambala/tesseract/tessdata/eng.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your 
>> "tessdata" directory.
>> Failed loading language 'eng'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>>
>> Whatever the real problem is, the error message is not detecting it.
>>
>> On Sunday, January 17, 2021 at 10:37:22 AM UTC+1 ... wrote:
>>
>>> Run the following command in order to get the eng.traineddata file 
>>> within the tessdata directory: *wget 
>>> https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata 
>>> <https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata>*
>>>
>>
>>  
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/47e8b734-5de9-4624-8872-ed91ac8775b4n%40googlegroups.com
>> .
>>
>
>
> -- 
>
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/c0a86f51-b876-40ba-8d46-afdc3eccc96dn%40googlegroups.com.
