Re: [tesseract-ocr] Box file layout for training tesseract4

2019-04-09 Thread mohitolp .
Thank you very much, that helped a lot :D

Timothy Snyder wrote on Fri, 1 March 2019, 16:43:
> Sorry for the delay. You have access now. I need to set the link to public!
>
> On Mon, Feb 25, 2019 at 8:10 AM mohito wrote:
>> Hi,
>>
>> would you be so kind to make this link public or give me p

Re: [tesseract-ocr] Making custom traineddata

2019-04-09 Thread shree
See https://github.com/Shreeshrii/tessdata_ocrb

Retrained to add the missing X using 3 fonts at 3 exposures and a larger training text compared to the previous version. Both float/best and integer/fast versions are provided.

- Download best version

Re: [tesseract-ocr] Making custom traineddata

2019-04-09 Thread shree
Correction: fast version is *ocrb_int (not ocrb-int).* -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com. To post to th

[tesseract-ocr] Re: Training Tesseract 4 from Scratch

2019-04-09 Thread Shobhit Kapil
Hi Shree,

Could you please share your valuable feedback on the points below...

Hi,

Before starting this training process I would like to know a bit about the process:

1. I have files that are not very clear and contain different sorts of noise. Will the training be helpful in such sc

[tesseract-ocr] Re: Tesseract on VS

2019-04-09 Thread Shobhit Kapil
Thanks for your reply. I am also using Tesseract on VS and facing a few challenges: extracting the data from the images is not as good as expected, for the reasons below:

1. A few of the images are not very clean and contain different sorts of noise, and characters are misread, like Z as 2, B a

Re: [tesseract-ocr] Questions about recognize Chinese characters

2019-04-09 Thread Aaron Shieh
I get '焊接' with the following:

    tesseract 67.png o -l chi_tra --oem 0 --psm 7

I'm using the tesseract 4.1.0 64-bit build on Windows 10, with traineddata from https://github.com/tesseract-ocr/tessdata

Re: [tesseract-ocr] Questions about recognize Chinese characters

2019-04-09 Thread Shree Devi Kumar
I think you will get better results with --oem 1. The legacy models are better only in limited cases. For complex scripts the LSTM engine and models are better, as far as I can tell.

On Wed, 10 Apr 2019, 10:23 Aaron Shieh wrote:
> I get '焊接' with the following:
> tesseract 67.png o -l chi_tra
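
The two invocations discussed in this thread differ only in the value passed to --oem (0 for the legacy engine, 1 for LSTM). A minimal sketch in Python that builds both command lines side by side, without running them; the helper name build_tesseract_cmd is hypothetical, and the file name, output base, language, and page segmentation mode are simply taken from the command quoted above:

```python
import shlex


def build_tesseract_cmd(image, output_base, lang, oem, psm):
    """Return the argv list for a tesseract CLI call (hypothetical helper).

    Nothing is executed here; the list can be passed to subprocess.run
    on a machine where tesseract and the traineddata are installed.
    """
    return [
        "tesseract", image, output_base,
        "-l", lang,
        "--oem", str(oem),  # 0 = legacy engine, 1 = LSTM engine
        "--psm", str(psm),  # 7 = treat the image as a single text line
    ]


# Same image and language as in the thread; only the engine mode changes.
legacy = build_tesseract_cmd("67.png", "o", "chi_tra", oem=0, psm=7)
lstm = build_tesseract_cmd("67.png", "o", "chi_tra", oem=1, psm=7)

print(shlex.join(legacy))
print(shlex.join(lstm))
```

Running each command on the same image and comparing the two output files is one way to check which engine reads a given sample better.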