No, you are not using the best (float) tessdata file from:
https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata
There is no eng_pcb.traineddata there (read your error message).


Zdenko
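
A minimal Python sketch of reading that error message programmatically, assuming the message text quoted below. The helper name `summarize_tesseract_error` is hypothetical and not part of pytesseract; it only extracts the language and the traineddata path that Tesseract reported.

```python
import re


def summarize_tesseract_error(message: str) -> dict:
    """Pull the requested language and traineddata path out of an error string."""
    # The error text may be wrapped across lines; collapse whitespace first.
    flat = " ".join(message.split())
    lang = re.search(r"Failed loading language '([^']+)'", flat)
    path = re.search(r"components are not present in (\S+?\.traineddata)", flat)
    return {
        "language": lang.group(1) if lang else None,
        "traineddata": path.group(1) if path else None,
        "legacy_requested": "(legacy) engine requested" in flat,
    }


# Error text copied from the quoted message in this thread.
error = (
    "read_params_file: Can't open tessedit_char_blacklist=,;: "
    "Error: Tesseract (legacy) engine requested, but components are not "
    "present in external/tesstrain/data/eng_pcb/eng_pcb.traineddata!! "
    "Failed loading language 'eng_pcb'"
)
info = summarize_tesseract_error(error)
print(info)
```

For the message in this thread, the sketch reports the language `eng_pcb` and a traineddata file under `external/tesstrain/data/`, not `eng.traineddata` from tessdata_best.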


On Mon, 22 Apr 2024 at 17:40, Surya VaraPrasad Alla <asvp.0...@gmail.com>
wrote:

> Hello,
>
> I get a similar error:
>
> pytesseract.pytesseract.TesseractError: (1, "read_params_file: Can't open
> tessedit_char_blacklist=,;: Error: Tesseract (legacy) engine requested, but
> components are not present in
> external/tesstrain/data/eng_pcb/eng_pcb.traineddata!! Failed loading
> language 'eng_pcb' Tesseract couldn't load any languages! Could not
> initialize tesseract.")
>
> tesseract --version:
> tesseract -v
> tesseract 4.1.1
>  leptonica-1.82.0
>   libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 :
> libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
>  Found AVX512BW
>  Found AVX512F
>  Found AVX2
>  Found AVX
>  Found FMA
>  Found SSE
>  Found libarchive 3.6.0 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8
> liblz4/1.9.3 libzstd/1.4.8
>
> I am using the best (float) tessdata files from:
> https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata
>
> I also tried some of the suggestions in
> https://github.com/ocrmypdf/OCRmyPDF/issues/209
>
> I am looking for the source of the issue. Could someone who understands
> the cause help, so that I can work further?
> On Tuesday, January 19, 2021 at 5:30:46 PM UTC+1 Shree Devi Kumar wrote:
>
>> > wget https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata
>>
>> That is not correct. You need to get the `raw` file.
>>
>> https://github.com/tesseract-ocr/tessdata_best/raw/master/eng.traineddata
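
The difference between the two kinds of link can be sketched in Python. This assumes GitHub's usual URL layout, where a `/blob/` URL serves an HTML page and the matching `/raw/` URL serves the file itself; `blob_to_raw` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw(url: str) -> str:
    """Rewrite a github.com '/blob/' page URL into the matching '/raw/' URL."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Assumed path layout: /<owner>/<repo>/blob/<branch>/<file...>
    if parts.netloc == "github.com" and len(segments) > 3 and segments[3] == "blob":
        segments[3] = "raw"
    # URLs that do not match the assumed layout are returned unchanged.
    return urlunsplit(parts._replace(path="/".join(segments)))


page = "https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata"
print(blob_to_raw(page))
```

The resulting URL is the one to hand to `wget` (or any other downloader) so that the binary model is saved instead of the web page.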
>>
>>
>> On Tue, Jan 19, 2021 at 9:49 PM Roparzh Hemon <roparz...@gmail.com>
>> wrote:
>>
>>>
>>> I downloaded it as you suggested, and as the terminal output below
>>> shows, the file is now present at the correct place :
>>>
>>> $ file /home/mbalambala/tesseract/tessdata/eng.traineddata
>>> /home/mbalambala/tesseract/tessdata/eng.traineddata : HTML document,
>>> UTF-8 Unicode text, with very long lines
>>>
>>> $ echo $TESSDATA_PREFIX
>>> /home/mbalambala/tesseract/tessdata
>>>
>>> but the error message stays exactly the same :
>>>
>>> $ tesseract Downloads/p1.pdf p1
>>> Error opening data file
>>> /home/mbalambala/tesseract/tessdata/eng.traineddata
>>> Please make sure the TESSDATA_PREFIX environment variable is set to your
>>> "tessdata" directory.
>>> Failed loading language 'eng'
>>> Tesseract couldn't load any languages!
>>> Could not initialize tesseract.
>>>
>>>
>>> Whatever the real problem is, the error message is not detecting it.
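
The `file` output above already reports an HTML document rather than binary model data. A minimal Python sketch of the same check follows, assuming that a saved web page starts with an HTML doctype or `<html` tag; `looks_like_html` is a hypothetical helper and does not validate the traineddata format itself.

```python
import os
import tempfile


def looks_like_html(path: str, probe_bytes: int = 512) -> bool:
    """Return True if the file starts like an HTML page instead of binary data."""
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes)
    # Ignore leading whitespace and letter case before comparing.
    return head.lstrip().lower().startswith((b"<!doctype html", b"<html"))


# Demonstration with a stand-in file; the content is made up for the example.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "eng.traineddata")
    with open(fake, "wb") as fh:
        fh.write(b"\n<!DOCTYPE html>\n<html lang=\"en\"><head></head></html>")
    print(looks_like_html(fake))
```

A file that passes this check was saved from a web page and needs to be downloaded again from the raw link.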
>>>
>>> On Sunday, January 17, 2021 at 10:37:22 AM UTC+1 ... wrote:
>>>
>>>> Run the following command in order to get the eng.traineddata file
>>>> within the tessdata directory:
>>>> wget https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata
>>>>
>>>
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/47e8b734-5de9-4624-8872-ed91ac8775b4n%40googlegroups.com
>>> .
>>>
>>
>>
>> --
>>
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>

