You can look at this repo: https://github.com/Shreeshrii/tessdata_ocrb
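
On the question quoted below about *continue_from* and *traineddata*: a rough sketch, following the fine-tuning example in the 4.0 training wiki rather than the script in the repo. As I understand it, --continue_from takes the LSTM model extracted from an existing traineddata file (or an earlier checkpoint), and --traineddata takes the starter traineddata itself. All paths, the language code and the iteration count here are placeholders I made up, so adjust them to your setup. The script only prints the commands unless you set RUN=1.

```shell
#!/bin/sh
# Assumed, hypothetical locations; none of these come from the thread.
LANG_CODE="eng"                                         # language of the base model
BASE_TRAINEDDATA="tessdata_best/${LANG_CODE}.traineddata"  # existing model to fine-tune
WORK_DIR="finetune_out"                                 # where extracted model and checkpoints go
LSTM_MODEL="${WORK_DIR}/${LANG_CODE}.lstm"              # LSTM network extracted from the base model
LIST_FILE="train/${LANG_CODE}.training_files.txt"       # list of .lstmf training files

# Dry run by default: print each command instead of executing it.
RUN="${RUN:-0}"
run() {
  if [ "$RUN" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

run mkdir -p "$WORK_DIR"

# 1. Extract the LSTM network from the existing traineddata.
run combine_tessdata -e "$BASE_TRAINEDDATA" "$LSTM_MODEL"

# 2. Fine-tune: --continue_from is the extracted network,
#    --traineddata is the traineddata file it was extracted from.
run lstmtraining \
  --model_output "${WORK_DIR}/ocrb" \
  --continue_from "$LSTM_MODEL" \
  --traineddata "$BASE_TRAINEDDATA" \
  --train_listfile "$LIST_FILE" \
  --max_iterations 400
```

Check the printed commands first, then rerun with RUN=1 once the paths point at real files.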
Use finetune-ocrb.sh

On Wed, Oct 24, 2018 at 10:29 PM Shree Devi Kumar <shreesh...@gmail.com> wrote:

> See the wiki page on training 4.0 and follow the tutorial.
>
> On Wed, 24 Oct 2018, 08:09 , <vivek.vija...@teknowmics.co.in> wrote:
>
>> training/lstmtraining --model_output /path/to/output [--max_image_MB 6000] \
>>   --continue_from /path/to/existing/model \
>>   --traineddata /path/to/original/traineddata \
>>   [--perfect_sample_delay 0] [--debug_interval 0] \
>>   [--max_iterations 0] [--target_error_rate 0.01] \
>>   --train_listfile /path/to/list/of/filenames.txt
>>
>> In this command, what should be passed to the argument *continue_from* and
>> *traineddata*? I'm a bit confused.
>>
>>
>> On Wednesday, 24 October 2018 17:11:16 UTC+5:30, Soumik Ranjan Dasgupta
>> wrote:
>>>
>>> Please see tesseract/src/training/language_specific.sh
>>> You need to add the fonts under the respective category after
>>> installation.
>>>
>>> On Wed, Oct 24, 2018, 5:04 PM <vivek....@teknowmics.co.in> wrote:
>>>
>>>> 'Add the same in font_properties and language_specific.sh' ? Can you
>>>> please elaborate? Thank you
>>>>
>>>> On Wednesday, 17 October 2018 20:18:26 UTC+5:30, Soumik Ranjan Dasgupta
>>>> wrote:
>>>>>
>>>>> You'll need to install the fonts in your system add the same in
>>>>> font_properties and language_specific.sh for fine-tuning or training from
>>>>> scratch. For further details please see
>>>>> https://github.com/tesseract-ocr/tesseract/issues/1672.
>>>>>
>>>>> On Tue, Oct 16, 2018 at 3:40 PM kislay bajpai <kislay....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Thanks for prompt reply, I want to train tesseract 4.0 alpha for font
>>>>>> E13B. How could i train? Please share the knowledge.
>>>>>>
>>>>>> On Tuesday, October 16, 2018 at 1:57:17 PM UTC+5:30, Soumik Ranjan
>>>>>> Dasgupta wrote:
>>>>>>>
>>>>>>> Please see
>>>>>>> https://github.com/tesseract-ocr/tesseract/wiki/Fonts#fonts-for-tesseract-training
>>>>>>> .
>>>>>>>
>>>>>>> On Tue, Oct 16, 2018 at 1:49 PM <kislay...@imageinfosystems.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I want to train tesseract 4.0 alpha for a new font, is there anyone
>>>>>>>> who can help me on this topic.
>>>>>>>>
>>>>>>>> On Monday, May 14, 2018 at 7:15:15 PM UTC+5:30, reza wrote:
>>>>>>>>>
>>>>>>>>> hi
>>>>>>>>> i tested tesseract 4 beta on persian lang , the results was good.
>>>>>>>>> but i think needs more training on more fonts and texts.
>>>>>>>>> how could we train more fonts and texts on model that exist in
>>>>>>>>> tesseract 4 beta for persian lang ?
>>>>>>>>>
>>>>>>>>> and last question is, how could we apply dictionary to correct
>>>>>>>>> that words OCRing with error ?
>>>>>>>>>
>>>>>>>>> thanks
>>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/1ee9528e-d8fd-4438-9cd0-4925ae7763d5%40googlegroups.com
>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/1ee9528e-d8fd-4438-9cd0-4925ae7763d5%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Soumik Ranjan Dasgupta