[tesseract-ocr] not able to run autogen.sh building tesseract-master 4.0.0
Guys,

I am trying to build tesseract 4.0.0 from the master branch and am facing the following issue when running autogen.sh:

Running aclocal
Running /usr/bin/libtoolize
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
libtoolize: copying file `config/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
Running autoheader
Running automake --add-missing --copy
unittest/Makefile.am:100: variable `EXTRA_apiexample_test_DEPENDENCIES' is defined but no program or
unittest/Makefile.am:100: library has `EXTRA_apiexample_test' as canonical name (possible typo)
Running autoconf
configure.ac:10: error: possibly undefined macro: m4_esyscmd_s
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.

Something went wrong, bailing out!

Something does not seem right with this configure.ac.

-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/9e98cf55-ce04-4c88-ac52-c55beab984ba%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
[tesseract-ocr] Re: not able to run autogen.sh building tesseract-master 4.0.0
I think you are right. I upgraded to autoconf 2.69 and the process got a bit further. But when I now run

./configure --enable-debug

I get this error:

configure: error: Your compiler does not have the necessary c++11 support! Cannot proceed.

I am looking into it.

On Wednesday, 25 July 2018 20:38:03 UTC+5:30, Yogesh Sanchihar wrote:
> I am trying to build tesseract 4.0.0 from master branch
>
> configure.ac:10: error: possibly undefined macro: m4_esyscmd_s
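For anyone hitting the same m4_esyscmd_s error, here is a rough sketch of checking the installed autoconf version before running autogen.sh. It assumes 2.69 as the known-good version only because that is what worked in this thread; the names KNOWN_GOOD, parse_autoconf_version, installed_autoconf_version and is_at_least are made up for illustration.

```python
import re
import subprocess

# Assumption: 2.69 is used as the reference only because the poster in this
# thread reports that upgrading to it got past the m4_esyscmd_s error.
KNOWN_GOOD = (2, 69)


def parse_autoconf_version(version_output):
    """Extract (major, minor) from the first line of `autoconf --version` output."""
    first_line = version_output.splitlines()[0] if version_output else ""
    match = re.search(r"(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError("no version number found in: %r" % first_line)
    return int(match.group(1)), int(match.group(2))


def installed_autoconf_version():
    """Run `autoconf --version` and parse the result (needs autoconf on PATH)."""
    completed = subprocess.run(
        ["autoconf", "--version"], capture_output=True, text=True, check=True
    )
    return parse_autoconf_version(completed.stdout)


def is_at_least(version, minimum=KNOWN_GOOD):
    """Compare (major, minor) tuples; tuples compare element by element."""
    return version >= minimum
```

Calling is_at_least(installed_autoconf_version()) on the build machine should tell you whether an upgrade is worth trying first. It says nothing about the separate c++11 error from ./configure, which depends on the compiler.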
[tesseract-ocr] tesseract does not recognize grey colored fonts in the images..
If the text in an image is not black but light grey, tesseract does not recognize it.

Is there any solution to this problem?

I have attached images of a sample bill.

Suppose I want to extract the Base Fare:

Base Fare - *Rs 500*

Since "Base Fare" is printed in light grey, tesseract does not recognize it at all.
Re: [tesseract-ocr] Re: tesseract does not recognize grey colored fonts in the images..
Okay, James. Thank you for your response. I will try it.

On Tue, Jul 31, 2018 at 5:04 PM, James Q wrote:
> It could be that a threshold operation is taking place at a lower
> brightness than your grey text. Try binarizing the image with a high
> threshold value (e.g. 200) before sending it to tesseract; this should
> make all the text black.
>
> On Saturday, July 28, 2018 at 4:00:16 PM UTC+1, Yogesh Sanchihar wrote:
>> If we have a text not black, but light greyish. tesseract does not
>> recognize it.
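To make James's suggestion concrete, here is a minimal sketch of binarizing with a high threshold. It works on a plain list of rows of 0-255 grey values so that it runs without any imaging library; the function name binarize and the sample pixel values are made up for illustration, and only the threshold of 200 comes from the thread.

```python
def binarize(gray_rows, threshold=200):
    """Map every pixel below `threshold` to 0 (black) and the rest to 255 (white).

    `gray_rows` is a list of rows, each a list of 0-255 grey values.
    """
    return [[0 if p < threshold else 255 for p in row] for row in gray_rows]


# Hypothetical 3x4 patch: white background (255), light grey text (~170)
# on the second row and near-black text (~20) on the third row.
sample = [
    [255, 255, 255, 255],
    [255, 170, 175, 255],
    [255, 20, 25, 255],
]

result = binarize(sample)
for row in result:
    print(row)
```

With the threshold at 200 both the grey and the dark pixels end up black, while a mid-range threshold such as 128 would turn the grey pixels white and lose them. On a real image, something like Pillow's Image.convert("L") followed by Image.point(...) should do the same job, though I have not tried it on the bill from this thread.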
Re: [tesseract-ocr] tesseract does not recognize grey colored fonts in the images..
Namastey!

Okay, I will try this. Could you help me build an image preprocessing pipeline, or at least list the sequential steps I should use to build one?

On Wed, Aug 1, 2018 at 1:04 PM, chandra churh chatterjee <chandrachurh.chatterje...@gmail.com> wrote:
> Binarize the image and it might give a good solution.
>
> Chandra Churh Chatterjee
>
> On Sat, Jul 28, 2018, 8:30 PM Yogesh Sanchihar wrote:
>> If we have a text not black, but light greyish. tesseract does not
>> recognize it.
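Nobody in the thread spelled out a pipeline, so the following is only one possible ordering of steps, not a recommended recipe: convert to greyscale, stretch the contrast, then binarize with the high threshold suggested earlier. The function names are made up for illustration, the images are plain lists of rows so that the sketch runs without extra libraries, and the greyscale weights are the usual ITU-R BT.601 luma coefficients.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to 0-255 grey values (BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


def stretch_contrast(gray_rows):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    flat = [p for row in gray_rows for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Uniform image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in gray_rows]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in gray_rows]


def binarize(gray_rows, threshold=200):
    """Pixels below `threshold` become black (0), the rest white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in gray_rows]


def preprocess(rgb_rows, threshold=200):
    """Run the three steps in order: greyscale, contrast stretch, binarize."""
    return binarize(stretch_contrast(to_grayscale(rgb_rows)), threshold)


# Hypothetical one-row image: white background with two light grey text pixels.
image = [[(255, 255, 255), (170, 170, 170), (170, 170, 170), (255, 255, 255)]]
print(preprocess(image))
```

The contrast stretch is there because light grey text on a white page only uses a narrow band of brightness values; spreading that band out first makes the fixed threshold less sensitive to how pale the text is. Real scans may also need steps such as rescaling, deskewing or noise removal, which are left out here.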