With the following preprocessing steps, tesseract's output improves for
images I 02, I 03, I 04, I 05, I 06, I 08, I 10, I 11 and I 12, but not
for the other images. (The snippets assume import cv2 and import numpy as np.)

1. Grayscale image:
   img_gray = cv2.imread(img_path, 0)
2. Erosion:
   img_eroded = cv2.erode(img_gray, np.ones((4, 4), np.uint8), iterations=1)
3. Rescaling:
   rescaling_img = cv2.resize(img_eroded, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

On Saturday, 20 August 2022 at 10:50, Lisa Ki <[email protected]> wrote:

> Hi guys, I am trying to extract text from some simple clips and it just
> keeps reading capital I into number 1. Does anyone have any suggestions?
>
> I have only added borders to the original images as code below:
>
> i = Image.open(ifp).convert('RGB')
> colour = [255, 255, 255]
> top, bottom, left, right = [150]*4
> i_with_border = cv2.copyMakeBorder(np.array(i), top, bottom, left, right,
>     cv2.BORDER_CONSTANT, value=colour)
> ocr_result = pytesseract.image_to_string(i_with_border)
>
> results:
> 101.
>
> 102.
>
> 103.
>
> 104.
>
> 105.
>
> 106.
>
> 107.
>
> 108.
>
> 109.
>
> 110.
>
> I'11.
>
> 112.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAPzZBjxB6pO2oiF%3Dvn5pwfgPWV3bvWKFUVm4ktmCdG91tUdyQg%40mail.gmail.com.

