Re: [tesseract-ocr] Re: ErrorInInitializerError - zip file closed out of tess4j.util.LoadLibs.getTesseractLibName

2022-09-23 Thread Ralph Cook
How is this supposed to help me? I have a program using the Tesseract library to do OCR and then process the resulting text; I don't need a GUI front end. On Thu, Sep 22, 2022 at 10:09 PM Quan Nguyen wrote: > You may want to try VietOCR, a Java desktop app that uses Tess4J. It works > with Java

[tesseract-ocr] Question : can I force Tesseract to follow an existing layout?

2022-09-23 Thread Vincent Sarbach-Pulicani
Hello, I'm working on historical newspapers from the interwar period written in 3 different languages: Corsican, French and Italian. After many tries, Tesseract seems to be the best OCR for me, but the layout analysis of a newspaper is complex. However, using the API of Gallica (French national li

[tesseract-ocr] scripts vs single languages

2022-09-23 Thread Sylwia Kowalska
Hello! I need to OCR some poor-quality documents which contain different alphabets, e.g. German/Polish/English. I got a hint to use script/Latin instead of single languages (I mean -l script/Latin vs -l eng+pol+deu). Why is it better? How do scripts work? Are those models for whole alphabets?

Re: [tesseract-ocr] Question : can I force Tesseract to follow an existing layout?

2022-09-23 Thread Zdenko Podobny
Tesseract supports uzn files [1] with psm 4. Search the forum for more details. [1] https://github.com/OpenGreekAndLatin/greek-dev/wiki/uzn-format Zdenko On Fri, 23 Sep 2022 at 17:20, Vincent Sarbach-Pulicani wrote: > Hello, > I'm working on historical newspapers from the interwar period written in 3 >
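
For readers following this thread, here is a minimal sketch of how the uzn approach might be set up from Python. It is not taken from the thread. It assumes the zone file sits next to the image with the same base name and a `.uzn` extension, and that each line has the form `left top width height label`, as described on the linked wiki page. The helper names `write_uzn` and `tesseract_command`, the example coordinates, and the language codes are hypothetical placeholders.

```python
from pathlib import Path


def write_uzn(image_path, zones):
    """Write one 'left top width height label' line per zone.

    Assumption: Tesseract looks for <image base name>.uzn beside the image.
    Labels should not contain whitespace.
    """
    uzn_path = Path(image_path).with_suffix(".uzn")
    lines = [f"{x} {y} {w} {h} {label}" for (x, y, w, h, label) in zones]
    uzn_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return uzn_path


def tesseract_command(image_path, output_base, languages):
    """Build (but do not run) a tesseract command line using --psm 4."""
    return [
        "tesseract", str(image_path), str(output_base),
        "--psm", "4",
        "-l", "+".join(languages),
    ]


if __name__ == "__main__":
    # Hypothetical zones, e.g. converted from an external layout description.
    zones = [(100, 150, 800, 1200, "Column1"), (950, 150, 800, 1200, "Column2")]
    print(write_uzn("page_001.png", zones))
    print(" ".join(tesseract_command("page_001.png", "page_001", ["fra", "ita"])))
```

The command list can be passed to `subprocess.run` once Tesseract and the needed language data are installed. Whether the zones are honoured should be checked against the installed Tesseract version.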

Re: [tesseract-ocr] Question : can I force Tesseract to follow an existing layout?

2022-09-23 Thread Vincent Sarbach-Pulicani
Ok, I'll check that, thanks again. On Fri, 23 Sep 2022 at 18:44, Zdenko Podobny wrote: > Tesseract supports uzn files [1] with psm 4. Search the forum for more details. > > [1] https://github.com/OpenGreekAndLatin/greek-dev/wiki/uzn-format > > > Zdenko > > > On Fri, 23 Sep 2022 at 17:20, Vincent Sarbach-Pul

Re: [tesseract-ocr] AdaptiveClassifierIsEmpty read-access violation

2022-09-23 Thread Darren Morby
I said that the problem was in AdaptiveClassifierIsEmpty because Windows dumped the state of the process when the read-access violation occurred, and AdaptiveClassifierIsEmpty was the currently-executing function at the top of the call stack. This was deep within a call to the public function