Re: [tesseract-ocr] HRe: tesseract 4.1.1 slow in aws instance centos7

2022-11-29 Thread Giuseppe Coniglio
On my Oracle Linux Server 8.6, the latest version available is tesseract 4.1.1. In my Spring Boot microservices, pom.xml declares net.sourceforge.tess4j:tess4j:4.3.1 and org.apache.pdfbox:pdfbox:2.0.22. It

[tesseract-ocr] Re: Extract Text From A Scanned PDF Using OCR In Java: low elaboration in a Oracle Linux Server 8.6

2022-11-29 Thread Giuseppe Coniglio
The code is at https://medium.com/gft-engineering/creating-an-ocr-microservice-using-tesseract-pdfbox-and-docker-155beb7f2623 Have a nice day. On Monday, 28 November 2022 at 15:50:10 UTC+1, Giuseppe Coniglio wrote: > Hi all, > I have implemented a Spring Boot microservice which uses tess

Re: [tesseract-ocr] Can't get bib#'s from tshirt JPG: should be simple.

2022-11-29 Thread Zdenko Podobny
Tesseract is an OCR engine. You need to search for "text detection from natural scenes", e.g.: https://scholar.google.com/scholar?q=text+detection+in+natural+scenes&as_sdt=0&as_vis=1&oi=scholart https://www.sciencedirect.com/science/article/pii/S1877050922001867 https://d1wqtxts1xzle7.cloudfront.ne