Hi,

A combination of image deskewing and image inversion should work.
Use PSM 11 (sparse text) on the processed image; it will detect most of the characters.

Thanks,
Nikhil

On Tue, Nov 1, 2022, 6:16 AM kwmz...@gmail.com <kwmz2...@gmail.com> wrote:

> So I think you could also try using some morphological transformation; I
> feel like dilation could help a lot too.
>
> On Monday, October 31, 2022 at 2:01:36 PM UTC-4 nabil-ak wrote:
>
>> https://imagetotext.info/
>>
>> If I use a website like this one, it can extract the text perfectly, and
>> they also use Tesseract.
>> There has to be some preprocessing/setting that makes Tesseract detect
>> the text perfectly.
>>
>> nabil-ak wrote on Monday, October 31, 2022 at 13:55:56 UTC+1:
>>
>>> *I also tried rotation, but it's still not working.*
>>>
>>> *These are the preprocessing steps that I used:*
>>>
>>> import cv2
>>> import pytesseract
>>> import numpy as np
>>> from scipy import ndimage
>>>
>>> img = cv2.imread('voucher.png')
>>>
>>> img = cv2.bitwise_not(img)
>>>
>>> img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
>>>
>>> img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
>>>
>>> kernel = np.ones((1, 1), np.uint8)
>>> img = cv2.dilate(img, kernel, iterations=1)
>>> img = cv2.erode(img, kernel, iterations=1)
>>>
>>> img = ndimage.rotate(img, -20)
>>>
>>> cv2.imwrite("changed.png", img)
>>>
>>> pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
>>> print(pytesseract.image_to_string(img))
>>>
>>>
>>> abey...@gmail.com wrote on Monday, October 31, 2022 at 10:46:59 UTC+1:
>>>
>>>> Did you try all the preprocessing steps? Rotation / deskewing? I
>>>> think Tesseract finds it difficult to read skewed images.
>>>>
>>>> On Mon, Oct 31, 2022 at 1:17 PM nabil-ak <akir...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I want to detect a code (combination of characters) on an image in
>>>>> python.
>>>>>
>>>>> I already tried EasyOCR <https://github.com/JaidedAI/EasyOCR> and
>>>>> Tesseract
>>>>> Open Source OCR Engine <https://github.com/tesseract-ocr/tesseract> but
>>>>> neither could detect the characters.
>>>>>
>>>>> I also tried to preprocess the picture by inverting the white font to
>>>>> a black font and painting the background white to make it easier for
>>>>> the engine to detect the characters.
>>>>>
>>>>> *What am I doing wrong?*
>>>>>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2e764275-4992-495f-b0ac-ffe668254231n%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/2e764275-4992-495f-b0ac-ffe668254231n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .