post:

   1. The original image (without preprocessing)
   2. The image actually used for OCR (preprocessed)
   3. The output from the tesseract executable (not from a tesseract wrapper),
   together with the parameters/options used

Otherwise, nobody can reproduce the problem, and therefore nobody can
suggest a solution.
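
For pytesseract users, one way to collect item 3 is to call the tesseract
executable directly and capture the exact command line along with its output.
The following is only a minimal sketch: the helper names
(build_tesseract_command, report), the default language "eng" and the default
page segmentation mode 6 are assumptions for illustration, not values taken
from the original question. Use whatever options your own run uses.

```python
import shlex
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=6, extra_args=()):
    """Return the argv list for a direct tesseract CLI call (no wrapper).

    The defaults (lang="eng", psm=6) are illustrative assumptions.
    """
    # Using "stdout" as the output base makes tesseract print the recognized
    # text instead of writing it to a file.
    return [
        "tesseract", str(image_path), "stdout",
        "-l", lang,
        "--psm", str(psm),
        *extra_args,
    ]


def report(image_path, **kwargs):
    """Run the tesseract executable and return a text report for a post."""
    cmd = build_tesseract_command(image_path, **kwargs)
    version = subprocess.run(
        ["tesseract", "--version"], capture_output=True, text=True
    )
    result = subprocess.run(cmd, capture_output=True, text=True)
    return "\n".join([
        "$ " + shlex.join(cmd),  # the exact command, including all options
        (version.stdout or version.stderr).strip(),
        "--- output ---",
        result.stdout,
        "--- stderr ---",
        result.stderr,
    ])
```

With tesseract on the PATH, print(report("processed.png", psm=6)) produces
text that can be pasted into a post next to the two images; "processed.png"
is a placeholder file name.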

Zdenko


On Sun, 31 Dec 2023 at 10:53, Jason Shepherd <jmanshepher...@gmail.com>
wrote:

> I'm using pytesseract and tesseract v5.3.3 to read some text from some
> images and I sometimes get these weird phantom characters. I've tried to
> do some image preprocessing like increasing the image size, erosion,
> thresholding, etc., but nothing seems to get rid of this random character
> that's spawning from nothing. Attached are two image examples (left side
> is processed, right is original with rect bounding boxes drawn). The blue
> rectangle to the right of "KB PNG" is a '_' being detected even though that
> space is completely blank. Any ideas on getting rid of this?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/8800b99f-b92d-4dbf-83b8-d1d3da9c2bf4n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/8800b99f-b92d-4dbf-83b8-d1d3da9c2bf4n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

