You are correct. I was able to resolve this by using these two page segmentation modes:

- PSM 6 (single uniform block of text)
- PSM 4 (single column of text of variable sizes)
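For anyone finding this thread later, here is a minimal sketch of how a page segmentation mode can be passed through pytesseract. The helper names psm_config and ocr_with_psm are hypothetical, made up for illustration. The only library call assumed is pytesseract.image_to_string with its config argument, which is forwarded to the Tesseract command line.

```python
def psm_config(psm: int) -> str:
    """Build the extra Tesseract arguments that select a page segmentation mode."""
    # Tesseract 4 and 5 list modes 0 through 13 under --help-psm.
    if not 0 <= psm <= 13:
        raise ValueError(f"unsupported page segmentation mode: {psm}")
    return f"--psm {psm}"


def ocr_with_psm(image_path: str, psm: int) -> str:
    """Run OCR on one image with the given PSM and return the recognized text."""
    # Imported here so psm_config stays usable without the OCR dependencies.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        # config is passed through to the tesseract command line as-is.
        return pytesseract.image_to_string(image, config=psm_config(psm))
```

Calling ocr_with_psm on the same image with 6 and then with 4 (for example with a placeholder path such as "legal_description.png") makes it easy to compare how each mode splits the lines. pytesseract.image_to_data takes the same config argument, so the same string should work there too.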
I use Tesseract with Python and ran into this issue with both the pytesseract.image_to_data and pytesseract.image_to_string functions, using Tesseract version 5.2.

Thanks

On Wednesday, October 23, 2024 at 10:42:25 AM UTC-4 tfmo...@gmail.com wrote:

> On Wednesday, October 23, 2024 at 1:13:05 AM UTC-4 mattjo...@gmail.com wrote:
>
> > I am having an issue with Tesseract splitting text lines incorrectly for the attached file of a metes and bounds legal description. It returns this:
> >
> > [...]
> >
> > Any ideas on how to fix this?
>
> It would be helpful if you included the version you are using, language model, the command line, etc.
>
> The most likely fix is to use a different page segmentation mode on the command line.
>
> Tom