I tried this method as well. I even tried pre-processing the image to give Tesseract a better idea of what's going on, but it's still not working.
On Thursday, July 20, 2023 at 2:18:13 AM UTC-4 tomwi...@gmail.com wrote:

> The code you provided uses Tesseract OCR with a custom configuration
> (-l eng+equ) to recognize English and mathematical equations (equ) in the
> image. However, there is a possible issue with the code:
> pytesseract.image_to_string() is normally given an image in PIL (Python
> Imaging Library) format, while cv2.imread() returns OpenCV format (a NumPy
> array in BGR channel order).
>
> To rule this out, you can convert the image from OpenCV format to PIL
> format before passing it to Tesseract. You can use the
> PIL.Image.fromarray() function to perform this conversion.
>
> Here's the updated code:
>
>     import pytesseract
>     import cv2
>     from PIL import Image
>
>     custom_config = r'-l eng+equ'
>
>     img = cv2.imread("tessa.png")
>
>     # OpenCV loads images as BGR; PIL expects RGB, so swap the channels
>     rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
>
>     # Convert the image from OpenCV format (NumPy array) to PIL format
>     pil_image = Image.fromarray(rgb)
>
>     # Perform OCR using Tesseract and extract text from the image
>     text = pytesseract.image_to_string(pil_image, config=custom_config)
>     print(text)
>
> Make sure to replace "tessa.png" with the actual path to your image file.
>
> With this code, Tesseract OCR will attempt to recognize both English text
> and mathematical equations present in the image. The custom_config
> parameter with the value -l eng+equ instructs Tesseract to use the
> English and mathematical equation language data for recognition.
>
> Please note that while Tesseract is a powerful OCR engine, recognizing
> complex mathematical expressions accurately might be challenging. If you
> encounter issues with accuracy, consider using specialized OCR libraries or
> APIs that are designed specifically for math recognition.
> Source: ChatGPT
>
> On Wednesday, July 19, 2023 at 04:55:05 UTC+7, kwmz...@gmail.com wrote:
>
>> Hi everyone,
>>
>> I'm trying to use Tesseract to detect both the English part and the
>> mathematical part of the image below, and it doesn't seem to work.
>>
>> [image: tessa.png]
>>
>> The code I'm using is:
>>
>>     import pytesseract
>>     import cv2
>>
>>     custom_config = r'-l eng+equ'
>>
>>     img = cv2.imread("tessa.png")
>>     text = pytesseract.image_to_string(img, config=custom_config)
>>     print(text)
>>
>> The output being produced (see below) lacks the mathematical part,
>> even though I've used eng+equ.
>>
>> [image: Screenshot 2023-07-18 at 5.54.13 PM.png]
>>
>> Did anyone find a workaround for this, or must I retrain Tesseract?
>>
>> Regards,
>> Nash

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/da499789-b564-4af8-8e6e-aba76ae6e0e8n%40googlegroups.com.