Generally: read and follow https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md
Basically: pre-process the image: remove non-text elements, or OCR only the text areas (search the internet for "text detection").

Zdenko

On Thu, Oct 21, 2021 at 23:34, Schuyler Reinken <xarly...@gmail.com> wrote:

> I'm using the English tessdata_best on Linux
>
> On Thursday, October 21, 2021 at 5:32:17 PM UTC-4 Schuyler Reinken wrote:
>
>> I am using tesseract 4.1.1 and the results on this image are as follows:
>> -----------------------------------------------------
>> roan
>> nian
>> Er
>> Preferred i)
>> PRODUCED & wa
>> SPRINGGATES
>> FARMS AND VINEYARD
>> Le
>> 1
>> Tome Son a Woon
>> Hui Sov vet Aoinii
>> BEVERAGES UF
>> a i od oR De pa 1
>> primi ett
>> ‘OPERATE MACHNERY, AND MAY CAUSE
>> 375 mL 7% ALC BY VOL REATH PROBES. COMANSSUFTES
>> Jon 2 To 5 GIP \Y » ) SIR VW, T=" Wa COO pn a TEES gemma
>>
>> -------------------------------------------------------------------------------------------------------------
>> On Friday, October 15, 2021 at 10:30:10 AM UTC-4 Schuyler Reinken wrote:
>>
>>>
>>> Hello! I am having trouble using Tesseract to read inconsistently spaced
>>> text.
>>>
>>> It tends to miss entire lines of text in the government warning in the image
>>> attached. I don't need to read the blue angled text, only the stuff on the
>>> white sidebar. Is there a way to improve its reading of this sort of image?
>>> [image: SPRING GATE VINEYARD_a.jpg]
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/123a18f9-c281-4063-b197-45a9a35e6090n%40googlegroups.com
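To make the "OCR only the text areas" suggestion concrete, here is a minimal sketch in Python. It is not a method from the Tesseract docs, only one possible approach under stated assumptions: the white sidebar is the widest run of bright pixel columns, the thresholds (200 and 128) are guesses that will need tuning for the real label, and Pillow and pytesseract are installed. The helper names `bright_column_span`, `binarize` and `ocr_sidebar` are hypothetical.

```python
def bright_column_span(gray, min_mean=200):
    """Return (left, right) of the widest run of columns whose mean
    brightness is >= min_mean (right is exclusive), or None if there is none.
    `gray` is a list of rows, each a list of 0-255 grayscale values."""
    height, width = len(gray), len(gray[0])
    best, start = None, None
    for x in range(width + 1):
        # The extra iteration (x == width) closes a run touching the right edge.
        bright = x < width and sum(row[x] for row in gray) / height >= min_mean
        if bright and start is None:
            start = x
        elif not bright and start is not None:
            if best is None or x - start > best[1] - best[0]:
                best = (start, x)
            start = None
    return best


def binarize(gray, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]


def ocr_sidebar(path, min_mean=200, threshold=128):
    """Crop the image to its brightest vertical band and OCR only that band.
    Assumes Pillow and pytesseract are installed (imported lazily here)."""
    from PIL import Image
    import pytesseract

    img = Image.open(path).convert("L")  # 8-bit grayscale
    width, height = img.size
    flat = list(img.tobytes())           # one byte per pixel, row-major
    gray = [flat[y * width:(y + 1) * width] for y in range(height)]

    span = bright_column_span(gray, min_mean)
    if span is None:
        raise ValueError("no bright band found; try lowering min_mean")
    left, right = span

    band = img.crop((left, 0, right, height))
    band = band.point(lambda p: 255 if p > threshold else 0)  # simple global threshold
    # --psm 6 treats the crop as a single uniform block of text (an assumption).
    return pytesseract.image_to_string(band, lang="eng", config="--psm 6")
```

Usage would be something like `print(ocr_sidebar("SPRING GATE VINEYARD_a.jpg"))`. If the sidebar is not cleanly brighter than the rest of the label, a fixed manual crop or a proper text detector is probably a better starting point than this column heuristic.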