Hello,

I am not able to reproduce the error. It comes from here [1], where
pytesseract tries to clean up its temporary files.
You should report it to the pytesseract project, as there is no option to
skip this code.
Maybe you can try to modify this part of the pytesseract code [2]:

finally:
    cleanup(f.name)

to

finally:
    f.close()
    cleanup(f.name)
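
As a self-contained illustration of that ordering, here is a minimal sketch.
It is not pytesseract's actual code: temp_base and this cleanup are
hypothetical stand-ins, and I am assuming the temporary file is created with
delete=False. On Windows an open file generally cannot be removed, so the
handle is closed first:

```python
import os
import tempfile
from contextlib import contextmanager
from glob import iglob


def cleanup(temp_name):
    """Remove every file whose path starts with temp_name (hypothetical helper)."""
    for filename in iglob(temp_name + '*' if temp_name else temp_name):
        try:
            os.remove(filename)
        except FileNotFoundError:
            pass  # already gone, nothing to do


@contextmanager
def temp_base(prefix='tess_'):
    """Yield a temp file path; close the handle before deleting the files."""
    f = tempfile.NamedTemporaryFile(prefix=prefix, delete=False)
    try:
        yield f.name
    finally:
        f.close()        # release the handle first (matters on Windows)
        cleanup(f.name)  # then remove the base file and its siblings


if __name__ == '__main__':
    with temp_base() as base:
        # Stand-in for the output file tesseract would write next to the base.
        with open(base + '.txt', 'w') as out:
            out.write('ocr output placeholder')
    print(os.path.exists(base), os.path.exists(base + '.txt'))
```

Both files should be gone after the with block. If the remove call still
fails with WinError 5 in this sketch, the cause is probably outside
pytesseract.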


[1]
https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L131
[2]
https://github.com/madmaze/pytesseract/blob/7fef19ff176bd9f837753dc4c0ebc76b16267775/src/pytesseract.py#L176

Zdenko


On Sun, 1 Mar 2020 at 14:11, Supharerk Thawillarp <raynus.blue...@gmail.com>
wrote:

> OK, it gave me WinError 5 again.
>
>
> PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
> C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
> Image shape: (1150, 835, 3)
> Traceback (most recent call last):
>   File ".\test_tess.py", line 19, in <module>
>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>     }[output_type]()
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>     return output_file.read().decode('utf-8').strip()
>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>     next(self.gen)
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>     cleanup(f.name)
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>     raise e
>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>     remove(filename)
> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'
>
>
>
>
>
> On Sunday, 1 March 2020 at 03:05:39 UTC+7, zdenop wrote:
>>
>> 1. Make sure you have the latest version of tesseract.
>> Then try this script and provide the exact, full error message:
>>
>> import tempfile
>>
>> import cv2
>> import pytesseract
>> from PIL import Image
>> from pytesseract import Output
>>
>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>
>> img = cv2.imread('images/invoice-sample.jpg')
>>
>> # check temp file
>> temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
>> print(temp_file.name)
>> image = Image.fromarray(img)
>> image.save(temp_file.name + '.png', format='png', **image.info)
>> temp_file.close()
>>
>> if img.any():
>>     print("Image shape:", img.shape)
>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>     n_boxes = len(data_dict['level'])
>>     for i in range(n_boxes):
>>         (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i],
>>                         data_dict['width'][i], data_dict['height'][i])
>>         cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
>>     cv2.imshow('img', img)
>>     cv2.waitKey(0)
>> else:
>>     print("Can not open input file")
>>
>>
>>
>>
>> Zdenko
>>
>>
>> On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <raynus...@gmail.com>
>> wrote:
>>
>>> Sure
>>>
>>> >>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>> >>> pytesseract.get_tesseract_version()
>>> LooseVersion ('5.0.0-alpha.20200223')
>>>
>>>
>>> On Sunday, 1 March 2020 at 00:21:26 UTC+7, zdenop wrote:
>>>>
>>>> This means there is a problem with pytesseract/Python permissions.
>>>>
>>>> Can you post the output of pytesseract.get_tesseract_version()?
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <raynus...@gmail.com>
>>>> wrote:
>>>>
>>>>> No, tesseract ran successfully, with the output generated in a text file.
>>>>>
>>>>> (base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
>>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
>>>>>
>>>>>
>>>>>
>>>>> However, WinError 5 arises again when running from Python (with
>>>>> pipenv):
>>>>> (base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
>>>>> Traceback (most recent call last):
>>>>>   File ".\app2.py", line 10, in <module>
>>>>>     d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>>>     }[output_type]()
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>>>     return output_file.read().decode('utf-8').strip()
>>>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>>>     next(self.gen)
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>>>     cleanup(f.name)
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>>>     raise e
>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>>>     remove(filename)
>>>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Saturday, 29 February 2020 at 16:19:41 UTC+7, zdenop wrote:
>>>>>>
>>>>>> Can you replicate the problem with command-line ("pure") tesseract? E.g.:
>>>>>> 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>>
>>>>>> On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <raynus...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> I'm new to tesseract and am trying to follow a tutorial on Windows 10
>>>>>>> using the code below:
>>>>>>>
>>>>>>> import cv2
>>>>>>> import pytesseract
>>>>>>> from pytesseract import Output
>>>>>>>
>>>>>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>>>>
>>>>>>> img = cv2.imread('images/invoice-sample.jpg')
>>>>>>> d = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>>>>> print(d.keys())
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The problem is that I keep getting "PermissionError: [WinError 5]
>>>>>>> Access is denied" when calling image_to_data and image_to_string on
>>>>>>> Windows 10.
>>>>>>>
>>>>>>> The only advice I found on Stack Overflow is to set tesseract_cmd, PATH
>>>>>>> and TESSDATA_PREFIX, which did not work for me. Even running from an
>>>>>>> administrator cmd prompt did not work.
>>>>>>>
>>>>>>> After spending a couple of hours, I found that setting the permissions
>>>>>>> for tesseract.exe (right-click, select Properties, and go to the
>>>>>>> Security tab) by checking "Full control" and "Modify", as shown below,
>>>>>>> makes it work.
>>>>>>>
>>>>>>> Hope this will help some people struggling with the same problem.
>>>>>>>
>>>>>>>
>>>>>>> [image: 1582917756731.jpg][image: 1582917788913.jpg]
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to tesser...@googlegroups.com.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>

