By figure text, do you mean "Figure 1: figure supplement 1 Vera et al."?

If so, I would take a two-pass approach: crop out the clearly separated
top-right figure text, resize the crop to a Tesseract-friendly resolution,
and then run OCR on it.

It worked for me (macOS, ImageMagick, Tesseract 3.04.01):

➜  ocr convert -crop 720x100+0+0 Figure1.jpg Figure1_Crop.jpg


➜  ocr convert -density 72 -resample 300x300 Figure1_Crop.jpg Figure1_Resampled.jpg

➜  ocr tesseract Figure1_Resampled.jpg fig1


Tesseract Open Source OCR Engine v3.04.01 with Leptonica

Warning in pixReadMemJpeg: work-around: writing to a temp file

➜  ocr cat fig1.txt


Figure 1: figure supplement 1 Vera et al.
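
If you need to repeat this for several figures, the three commands above
can be wrapped in a small shell function. This is only a rough sketch: the
names ocr_caption, run and DRY_RUN are my own and not part of ImageMagick
or Tesseract, and the default crop geometry is simply the one that worked
for this image, so expect to adjust it for other images. Note that it names
the output after the input file (Figure1.txt) rather than fig1.txt.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the crop / resample / OCR steps shown above.
ocr_caption() {
    src=$1                       # input image, e.g. Figure1.jpg
    geometry=${2:-720x100+0+0}   # crop region (assumed default; adjust per image)
    base=${src%.*}               # strip the extension: Figure1.jpg -> Figure1

    # 1. Crop the caption strip out of the full figure.
    run convert -crop "$geometry" "$src" "${base}_Crop.jpg" &&
    # 2. Resample the crop from 72 to 300 dpi so the glyphs are larger.
    run convert -density 72 -resample 300x300 "${base}_Crop.jpg" "${base}_Resampled.jpg" &&
    # 3. OCR the resampled crop; Tesseract writes ${base}.txt.
    run tesseract "${base}_Resampled.jpg" "$base"
}
```

Setting DRY_RUN=1 before calling "ocr_caption Figure1.jpg" prints the three
commands without executing them, which is a cheap way to check the crop
geometry and file names first.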



On 24 November 2016 at 05:12, <exeter.h...@gmail.com> wrote:

> Tesseract is extracting the text "Figure" as "ngure".
>
> How can I get the "Figure" text from the above image using Tesseract?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/21e4b8e3-851d-43d0-8928-c4b12b4db0af%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

