I am not sure what you mean by "I have tried setting the Region of Interest (ROI)", but when I cropped the regions and preprocessed them as described in the documentation, I got the correct results:

tesseract frame_1-ROI1_preprocessed.png - --psm 7
GOH SCE YUAN

tesseract frame_1-ROI2_preprocessed.png - --psm 4
0197782267 073351668 0197732267

Zdenko

On Wed, 28 Jun 2023 at 4:26, Lee Kar Yee <leekarye...@gmail.com> wrote:

> Hi,
>
> Apologies. Kindly refer to the following.
>
> With the following code, I managed to draw rectangles on the regions
> that are processed by Tesseract OCR.
>
> import cv2
> import pytesseract
>
> pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
>
> video = r"C:\Users\User\Downloads\FATHER\test.mp4"
> cap = cv2.VideoCapture(video)
> frame_count = 0
>
> while cap.isOpened() and frame_count < 2:
>     ret, frame = cap.read()
>
>     if not ret:
>         break
>
>     # Perform OCR on the entire frame without dictionaries
>     text = pytesseract.image_to_string(frame, config='--psm 1 -l eng --oem 1')
>
>     print(text)
>
>     # Get the bounding box coordinates of the detected text regions
>     boxes = pytesseract.image_to_boxes(frame, config='--psm 1 -l eng --oem 1')
>
>     # Draw bounding box rectangles on the frame
>     for box in boxes.splitlines():
>         _, x, y, w, h, _ = box.split(' ')
>         x, y, w, h = int(x), int(y), int(w), int(h)
>         # Draw rectangles on the frame
>         cv2.rectangle(frame, (x, y), (w, h), (0, 0, 255), 1)
>
>     # Save the frame as an image
>     cv2.imwrite(f"frame_{frame_count}.jpg", frame)
>
>     frame_count += 1
>
> cap.release()
> cv2.destroyAllWindows()
>
> And the results are as below.
>
> ntes F-Farm Annlicatinns Service Reiest 9 Oar
>
> individual Name IDK
>
> GOH SCE YUAN 600
>
>
> nten F-Farm Annlicatinns Service Request «9 [or
>
> Individual Name IDK
>
> GOH SCE YUAN 600
>
> Kindly refer to objective.jpg for what I actually intend to capture.
>
> Thanks,
>
> Lee
>
> On Tuesday, June 27, 2023 at 6:25:50 PM UTC+8 zdenop wrote:
>
>> without an example image nobody can help you.
>>
>> Zdenko
>>
>>
>> On Tue, 27 Jun 2023 at 12:01, Lee Kar Yee <leeka...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am new to Tesseract OCR. I am trying to extract letters
>>> and numbers from images.
>>> These images are frames converted from an mp4 video and saved as JPG.
>>>
>>> With page segmentation mode 3, it works wonders at
>>> extracting letters, but it fails to extract numbers.
>>>
>>> I have tried setting the Region of Interest (ROI), but it still
>>> fails.
>>>
>>> Any thoughts or directions you can point me to so that I can improve it?
>>>
>>> Thanks,
>>>
>>> Lee
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/67b453f2-7781-44a5-be05-05676d3ee5fan%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/67b453f2-7781-44a5-be05-05676d3ee5fan%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
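
A side note on the rectangle-drawing loop in the quoted code: pytesseract.image_to_boxes returns lines in Tesseract's box-file layout (character, left, bottom, right, top, page), with y measured upward from the bottom edge of the image, whereas cv2.rectangle expects points measured from the top-left corner. Passing the values through unchanged will likely draw the boxes vertically mirrored. The following is only a minimal sketch of the conversion, assuming that layout; box_to_rect is a hypothetical helper name, not part of pytesseract, and the sample line is made up for illustration.

```python
def box_to_rect(box_line, image_height):
    """Convert one image_to_boxes line to top-left-origin corner points.

    box_to_rect is a hypothetical helper, not part of pytesseract.
    Assumes the box-file layout "char left bottom right top page",
    where y is measured upward from the bottom edge of the image.
    """
    char, left, bottom, right, top, _page = box_line.split()
    left, bottom, right, top = int(left), int(bottom), int(right), int(top)
    # Flip the y axis: a large "top" value is close to the upper image edge.
    top_left = (left, image_height - top)
    bottom_right = (right, image_height - bottom)
    return char, top_left, bottom_right


if __name__ == "__main__":
    # Made-up sample line for a 100-pixel-high image, not real Tesseract output.
    print(box_to_rect("G 10 20 30 50 0", 100))
```

Inside the loop from the quoted code, this could be used as `_, pt1, pt2 = box_to_rect(box, frame.shape[0])` followed by `cv2.rectangle(frame, pt1, pt2, (0, 0, 255), 1)`, since `frame.shape[0]` is the frame height in pixels.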