Re: [tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-09-05 Thread Mateusz Matela
See my first answer: I ran an experiment, and training went exactly the same with both approaches (a separate box per character, or the same line box for all characters).
Mateusz
On Thursday, 5 September 2024 at 17:41:50 UTC+2, Danny wrote:
Hi Zdenko, Thanks for the response. However, ocrd-
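For readers unfamiliar with the "same line box for all characters" approach mentioned above, here is a minimal Python sketch of what such a box file could look like. It is an illustration only, not code from the thread: the function name `line_box_entries` is hypothetical, and the layout (every glyph carrying the full-line box `0 0 width height page`, followed by a tab entry marking the end of the text line) is an assumption modeled on the line-box files that tesstrain generates from line images and ground truth text.

```python
import unicodedata


def line_box_entries(text, width, height, page=0):
    """Build box-file lines in which every glyph shares the full-line box.

    Hypothetical helper: `width` and `height` are the pixel dimensions of
    the line image, and the output layout is an assumption (see lead-in).
    """
    text = unicodedata.normalize("NFC", text.strip())

    # Group combining marks with the preceding character so that each
    # box entry describes one visible glyph.
    glyphs = []
    for char in text:
        if glyphs and unicodedata.combining(char):
            glyphs[-1] += char
        else:
            glyphs.append(char)

    box = f"0 0 {width} {height} {page}"
    entries = [f"{glyph} {box}" for glyph in glyphs]
    if entries:
        # A tab entry marks the end of the text line.
        entries.append(f"\t {box}")
    return entries


if __name__ == "__main__":
    for entry in line_box_entries("ab", 100, 20):
        print(entry)
```

With this layout the per-character coordinates carry no extra information beyond the line box, which is consistent with the observation that both approaches trained identically in the experiment.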

Re: [tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-09-05 Thread Zdenko Podobny
What about reading the tesstrain README and using the example data to understand the training process better?
Zdenko
On Thu, 5 Sep 2024 at 17:41, 'Danny' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:
> Hi Zdenko,
> Thanks for the response. However, ocrd-testset.zip contains training
> i

Re: [tesseract-ocr] Tesseract 5 with dnf

2024-09-05 Thread Zdenko Podobny
No. We do not distribute binary packages. Volunteers create and maintain them.
Zdenko
On Thu, 5 Sep 2024 at 20:56, Chris Crutts (agentc313) wrote:
> on my Oracle Linux 8.10 distribution, doing
>
> $ sudo dnf install tesseract
>
> installs tesseract version 4.1.1-2.el8 and leptonica version 1.76.

[tesseract-ocr] Tesseract 5 with dnf

2024-09-05 Thread Chris Crutts (agentc313)
On my Oracle Linux 8.10 distribution, running
$ sudo dnf install tesseract
installs tesseract version 4.1.1-2.el8 and leptonica version 1.76.0-2.el8.
As of today, 9/5/2024, the newest version is Release 5.4.1 (tesseract-ocr/tesseract on github.com).

Re: [tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-09-05 Thread 'Danny' via tesseract-ocr
Hi Zdenko,
Thanks for the response. However, ocrd-testset.zip contains training images and ground truth text without boxes. True, the images contain a full line of text:
[image: alexis_ruhe01_1852_0099_012.png]
But there are no box files in the training set. I'd like to confirm if the LSTM t

Re: [tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-09-05 Thread Zdenko Podobny
Have a look at the provided example, ocrd-testset.zip.
Zdenko
On Tue, 3 Sep 2024 at 16:04, 'Danny' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote:
> @zdenop wrote:
> | Tesseract LSTM engine (tesseract >=v4) training scr