[tesseract-ocr] Re: Trying to use --oem 0 but now cannot load languages

2019-07-14 Thread Kyle Foley
I solved this problem, but when I reverted to an old Tesseract the accuracy went down from 99% to a shocking 75%. I can't believe this is happening. Why would anyone remove an entirely useful feature from their software? Do I really have to spend 10 hours learning how to train this thing to

[tesseract-ocr] Trying to use --oem 0 but now cannot load languages

2019-07-14 Thread Kyle Foley
I'm trying to set tessedit_char_whitelist, but it does not work in Tesseract 4. According to a comment from amitdo (https://github.com/tesseract-ocr/tesseract/issues/751#issuecomment-423521780), I need to use --oem 0. I put in the following syntax: str4 = pytesseract.image_to_string(Image.open(str3)
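
For readers following this thread, here is a minimal sketch of how the legacy engine and a character whitelist might be passed through pytesseract. The helper names (build_tesseract_config, ocr_with_whitelist) and the digit-only whitelist are illustrative assumptions, not taken from the original message. Note also that --oem 0 only works with traineddata files that contain the legacy model; files from tessdata_fast or tessdata_best are LSTM-only, which is one plausible cause of a "cannot load languages" failure.

```python
import shlex


def build_tesseract_config(whitelist, oem=0, psm=None):
    """Build the extra-arguments string that pytesseract forwards to tesseract.

    oem=0 selects the legacy engine; the whitelist is set with -c.
    (Hypothetical helper, named here for illustration only.)
    """
    parts = ["--oem", str(oem)]
    if psm is not None:
        parts += ["--psm", str(psm)]
    parts += ["-c", "tessedit_char_whitelist=" + whitelist]
    # Quote each part so characters such as spaces survive argument splitting.
    return " ".join(shlex.quote(part) for part in parts)


def ocr_with_whitelist(image_path, whitelist, lang="eng"):
    """Run OCR restricted to `whitelist` (requires pytesseract and Pillow)."""
    # Imported lazily so the config helper above works without these packages.
    import pytesseract
    from PIL import Image

    config = build_tesseract_config(whitelist)
    return pytesseract.image_to_string(Image.open(image_path), lang=lang, config=config)
```

A call such as ocr_with_whitelist("page.png", "0123456789") would then restrict recognition to digits, assuming the installed eng.traineddata includes the legacy model.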

Re: [tesseract-ocr] how to train tesseract to detect superscripts and subscripts

2019-07-14 Thread Kyle Foley
Actually, on second thought, I am going to have to learn how to use the train feature anyway, so I might as well learn it now. Still, I want to know first how many images I need to train it with. Do you know the answer to this? How many images per new character would I need before I get relia

Re: [tesseract-ocr] how to train tesseract to detect superscripts and subscripts

2019-07-14 Thread shree
You can try training from scratch. Use training text and a font similar to what you need to recognize. Alternatively, try ocrd-train with line images with ground truth.
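
As a rough sketch of what "line images with ground truth" means in practice: the check below assumes the ocrd-train convention of one .tif or .png line image per text line, with the transcription stored next to it under the same base name plus .gt.txt. The function name and the directory layout are assumptions for illustration; consult the ocrd-train README for the exact requirements of your version.

```python
from pathlib import Path

# Image extensions assumed to be accepted as line images.
IMAGE_SUFFIXES = {".tif", ".png"}


def find_unpaired_lines(ground_truth_dir):
    """Return names of line images that lack a non-empty .gt.txt transcription."""
    root = Path(ground_truth_dir)
    unpaired = []
    for image in sorted(root.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # line_001.tif is expected to pair with line_001.gt.txt
        transcript = image.with_name(image.stem + ".gt.txt")
        if not transcript.is_file() or not transcript.read_text(encoding="utf-8").strip():
            unpaired.append(image.name)
    return unpaired
```

Running this over the ground-truth folder before training helps catch images that would otherwise be skipped or would contribute empty labels.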

[tesseract-ocr] Re: Java GUI frontend for Tesseract OCR engine

2019-07-14 Thread Serious Hacker
Hello Gaara Sabaku, I am not sure if you are still active; however, I am facing an issue with Tesseract which says: ERROR [Tesseract] Need to install JAI Image I/O package. https://java.net/projects/jai-imageio/ java.lang.RuntimeException: Need to install JAI Image I/O package. I have the re

Re: [tesseract-ocr] how to train tesseract to detect superscripts and subscripts

2019-07-14 Thread fady taher
Dear shree, I am having a problem training the model. When I added more samples ... the result got worse. Is there a best practice for adding training data to train the model? Regards On Thu, Jul 11, 2019 at 3:15 PM fady taher wrote: > so ... I added "Cr⁶⁺" 66 times but am getting "Cr³+" instead .

Re: [tesseract-ocr] segmentation fault when go app is run in docker container with tesseract installation

2019-07-14 Thread Zdenko Podobny
This is absolutely not sufficient information. Also, it seems you are using Tesseract 3, which is quite outdated and no longer supported. Zdenko On Sun, 14 Jul 2019 at 9:39, Chanda Nikhil kumar wrote: > Hey team, > > I am facing segmentation fault (core dumped) error when trying to run ap

[tesseract-ocr] segmentation fault when go app is run in docker container with tesseract installation

2019-07-14 Thread Chanda Nikhil kumar
Hey team, I am facing a segmentation fault (core dumped) error when trying to run the app binary, which uses gosseract with Tesseract and Leptonica installed. The following is the warning shown when debugging with gdb: libtesseract.so.3 is what is throwing the segmentation fault. Can som