Try the traineddata from tessdata_best and tessdata_fast.

On Thu 12 Jul, 2018, 6:45 PM mahendrag gajera, <mahendra.gaj...@gmail.com> wrote:
> Hello all
>
> I am trying to OCR Japanese images with the code below, but it gives junk
> characters. My Tesseract version is 4.0.
>
> Please let me know what is missing here.
>
> #include <cstdio>
> #include <cstdlib>
> #include <tesseract/baseapi.h>
> #include <leptonica/allheaders.h>
>
> void Test(char* imagePath)
> {
>     char *outText;
>
>     tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
>     // Initialize tesseract-ocr with Japanese, using the tessdata path D:\tessdata
>     if (api->Init("D:\\tessdata", "jpn",
>                   tesseract::OcrEngineMode::OEM_TESSERACT_ONLY))
>     {
>         fprintf(stderr, "Could not initialize tesseract.\n");
>         exit(1);
>     }
>
>     // Open input image with leptonica library
>     Pix *image = pixRead(imagePath);
>     api->SetImage(image);
>     // Get OCR result
>     outText = api->GetUTF8Text();
>     printf("OCR output:\n%s", outText);
>
>     // Destroy used object and release memory
>     api->End();
>     delete api;
>     delete[] outText;
>     pixDestroy(&image);
> }
>
> Using train data from here:
>
> https://github.com/tesseract-ocr/tessdata
>
> Test data image:
>
> <https://lh3.googleusercontent.com/-nn1FgPUWwZA/W0S_PJ_D8UI/AAAAAAAACaY/Y9Y6uByvN3kP1vN8tKFP8VMKlIwPIPwyACLcBGAs/s1600/japan4.png>
>
> Thanks,
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/7bfe8e31-91ea-491c-8e8c-61bdab47dff4%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
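
To expand a little on the tessdata_best / tessdata_fast suggestion: as far as I know, the models in those two repositories are LSTM-only, so they should be loaded with OEM_LSTM_ONLY (or the default engine mode) rather than OEM_TESSERACT_ONLY. Below is a rough sketch of what that could look like, not a definitive implementation. The helper names hasTraineddata and ocrWithLstm and the WITH_TESSERACT build macro are my own inventions for illustration; the sketch also assumes the data directory is the folder that directly contains jpn.traineddata, as in Tesseract 4.

```cpp
// Sketch only. To build the OCR part (assumed flags, adjust for your setup):
//   g++ -std=c++11 -DWITH_TESSERACT ocr.cpp -ltesseract -llept
#include <cstdio>
#include <fstream>
#include <string>

#ifdef WITH_TESSERACT  // hypothetical macro, defined only when Tesseract is installed
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#endif

// Hypothetical helper: true if <dir>/<lang>.traineddata can be opened.
// Lets us report a missing model before Tesseract is initialized.
bool hasTraineddata(const std::string& dir, const std::string& lang) {
    std::ifstream f(dir + "/" + lang + ".traineddata", std::ios::binary);
    return f.good();
}

#ifdef WITH_TESSERACT
// Hypothetical helper: run OCR with the LSTM engine.
// Returns the recognized UTF-8 text, or an empty string on failure.
std::string ocrWithLstm(const char* imagePath, const char* dataDir,
                        const char* lang) {
    if (!hasTraineddata(dataDir, lang)) {
        fprintf(stderr, "No %s.traineddata found in %s\n", lang, dataDir);
        return std::string();
    }

    tesseract::TessBaseAPI api;
    // Request the LSTM engine, since the best/fast models carry no legacy data.
    if (api.Init(dataDir, lang, tesseract::OEM_LSTM_ONLY) != 0) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        return std::string();
    }

    Pix* image = pixRead(imagePath);
    if (image == nullptr) {
        fprintf(stderr, "Could not read image %s\n", imagePath);
        api.End();
        return std::string();
    }

    api.SetImage(image);
    char* raw = api.GetUTF8Text();       // caller owns this buffer
    std::string text = raw ? raw : "";
    delete[] raw;                        // GetUTF8Text() memory is freed with delete[]

    api.End();
    pixDestroy(&image);
    return text;
}
#endif
```

With Tesseract installed, a call could look like ocrWithLstm("japan4.png", "D:\\tessdata_best", "jpn"), where D:\tessdata_best is just a placeholder for wherever you saved the downloaded jpn.traineddata.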