Re: [tesseract-ocr] Re: Reading image from Rubber

محمود محمد Thu, 19 Dec 2024 22:14:45 -0800

OK thanks

On Friday, 20 December 2024 at 10:08 AM, Taresh Chaudhari <
tareshchaudh...@gmail.com> wrote:


> Hi,
> Sure, can we connect tomorrow around 11:30 am IST on Google Meet? My ID
> is "tareshchaudh...@gmail.com".
>
>
> On Wednesday, 11 December 2024 at 18:53:17 UTC+5:30 mahmoud...@gmail.com
> wrote:
>
>> Hello, I would like to work with you on generating a simple traineddata
>> file with jTessBoxEditor for Tesseract and testing it. Can you let me know
>> a time to discuss the steps? Thanks
>>
>> On Tuesday, 26 November 2024 at 5:01 PM, Taresh Chaudhari <
>> tareshc...@gmail.com> wrote:
>>
>>> Thanks, Mahmoud, for sharing. I applied these techniques, but the
>>> results are still not good and I am still trying to solve this problem.
>>> Let me see how it proceeds.
>>>
>>> On Tuesday, 26 November 2024 at 00:31:29 UTC+5:30 mahmoud...@gmail.com
>>> wrote:
>>>
>>>> To improve the accuracy of text extraction, you can preprocess the
>>>> image before passing it to the OCR engine. Preprocessing techniques such
>>>> as converting the image to grayscale, enhancing contrast, or applying
>>>> filters can help reduce noise and improve readability. Additionally,
>>>> adjusting the pytesseract settings, for example the --psm (page
>>>> segmentation mode) value, may also improve the results.
>>>>
>>>> Here’s an updated version of your code with some preprocessing steps:
>>>> import pytesseract
>>>> from PIL import Image, ImageEnhance, ImageFilter
>>>>
>>>> pytesseract.pytesseract.tesseract_cmd = (
>>>>     'C:\\Users\\M562765\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe'
>>>> )
>>>>
>>>> # Path to your image
>>>> image_path = 'C:/Users/M562765/Downloads/Unable-images/Unable/crop1.jpg'
>>>>
>>>> def extract_text_from_image(image_path):
>>>>     # Open the image
>>>>     img = Image.open(image_path)
>>>>
>>>>     # Convert the image to grayscale to improve text-background contrast
>>>>     img = img.convert('L')  # Convert image to grayscale
>>>>     img = ImageEnhance.Contrast(img).enhance(2)  # Increase contrast
>>>>     img = img.filter(ImageFilter.SHARPEN)  # Sharpen the image
>>>>
>>>>     # Use pytesseract to extract text (PSM 6 assumes a block of text)
>>>>     extracted_text = pytesseract.image_to_string(img, config='--psm 6')
>>>>     return extracted_text.strip()
>>>>
>>>> # Extract and print text
>>>> text = extract_text_from_image(image_path)
>>>> print(f"Text extracted from {image_path}: {text}")
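One more preprocessing step that is sometimes worth trying after the grayscale conversion above is binarization. The following is only a sketch: it uses Otsu's method, which is not something either poster mentioned, and the helper names `otsu_threshold` and `binarize` are made up for illustration. Tesseract generally tends to do better with dark text on a light background, so the `invert` flag is there for images where the text is lighter than its surroundings. On low-contrast surfaces such as rubber this may not help at all.

```python
def otsu_threshold(histogram):
    """Return the Otsu threshold (0-255) for a 256-bin grayscale histogram."""
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    weight_dark = 0      # number of pixels at or below the candidate level
    sum_dark = 0         # sum of intensities of those pixels
    best_variance = 0.0
    best_level = 0
    for level, count in enumerate(histogram):
        weight_dark += count
        if weight_dark == 0:
            continue     # no pixels in the dark class yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break        # every pixel is already in the dark class
        sum_dark += level * count
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance (up to a constant factor)
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance = variance
            best_level = level
    return best_level


def binarize(img, invert=False):
    """Binarize a Pillow image; hypothetical helper, not part of pytesseract."""
    gray = img.convert("L")
    level = otsu_threshold(gray.histogram())
    if invert:
        # Text lighter than background: flip so the text ends up dark
        return gray.point(lambda p: 0 if p > level else 255)
    return gray.point(lambda p: 255 if p > level else 0)
```

Inside `extract_text_from_image`, this could be called as `img = binarize(img)` (or `binarize(img, invert=True)`) just before `pytesseract.image_to_string`.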
>>>>
>>>> On Monday, 25 November 2024 at 4:12 PM, Taresh Chaudhari <
>>>> tareshc...@gmail.com> wrote:
>>>>
>>>>> Attaching an image for reference.
>>>>>
>>>>> On Monday, 25 November 2024 at 15:52:27 UTC+5:30 Taresh Chaudhari
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I am trying to read the characters from the image, which has
>>>>>> characters with black color in the background. I am attaching the code
>>>>>> I used for extraction; currently it gives only partial output. Can you
>>>>>> help guide me on how to make it accurate?
>>>>>>
>>>>>>
>>>>>> import pytesseract
>>>>>> from PIL import Image
>>>>>> pytesseract.pytesseract.tesseract_cmd = (
>>>>>>     'C:\\Users\\M562765\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe'
>>>>>> )
>>>>>> # Paths to your images
>>>>>> image_paths = [
>>>>>>    'C:/Users/M562765/Downloads/Unable-images/Unable/crop1.jpg']
>>>>>>
>>>>>> # Function to process an image and extract text
>>>>>> def extract_text_from_image(image_path):
>>>>>>     # Open the image
>>>>>>     img = Image.open(image_path)
>>>>>>
>>>>>>     # Use pytesseract to perform OCR
>>>>>>     # PSM 6 assumes a block of text
>>>>>>     extracted_text = pytesseract.image_to_string(img, config='--psm 6')
>>>>>>     return extracted_text.strip()
>>>>>>
>>>>>> # Process all images and print results
>>>>>> for img_path in image_paths:
>>>>>>     text = extract_text_from_image(img_path)
>>>>>>     print(f"Text extracted from {img_path}: {text}")
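The script above always uses `--psm 6`. Since the reply suggests that changing the `--psm` value may help, one way to compare modes side by side is sketched below. The helper name `try_psm_modes` and the default list of modes are assumptions made for illustration; the OCR function is passed in as a parameter so that `pytesseract.image_to_string` can be supplied by the caller.

```python
def try_psm_modes(img, ocr, modes=(6, 7, 8, 11, 13)):
    """Run OCR once per page segmentation mode and collect the results.

    `ocr` is any callable with the shape ocr(img, config=...), for example
    pytesseract.image_to_string. The default modes are an illustrative
    choice: 6 = uniform block, 7 = single line, 8 = single word,
    11 = sparse text, 13 = raw line.
    """
    results = {}
    for psm in modes:
        results[psm] = ocr(img, config=f"--psm {psm}").strip()
    return results
```

With the imports from the script above, `try_psm_modes(Image.open(img_path), pytesseract.image_to_string)` returns a dictionary mapping each mode to its extracted text, which can then be printed and compared by eye.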
>>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To view this discussion visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/83985355-a349-4ed7-a2a9-c938fda1a5f4n%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/83985355-a349-4ed7-a2a9-c938fda1a5f4n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>

