First I tested tesseract on a file generated as a flat image.
I generated Lorem Ipsum text:
5 paragraphs, 452 words, 2978 bytes, 24 lines + 4 blank lines; the longest
line in my editor was 135 characters.
Result: 100% accurate apart from two full stops, fantastic.
Next, I rotated the image. Only 0.7 degrees caused
I use Linux and prefer batch processing. I found
https://gist.github.com/endolith/334196bac1cac45a4893 (linked from
https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html#examples). It
correctly recognizes a rotation of 1.00 degree. How can it be combined with tesseract?
--
You received this message because you are subscribed to
I found mzucker/page_dewarp on GitHub, a tool for dewarping book pages and
converting color to black and white.
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to tesseract-oc
I want to:
- scan a page to an image
- generate OCR text from the image
- generate a PDF page with the image and the text
I need to know where in the image each word is located.
Does tesseract have an option that gives this information?
How can I use tesseract as a library instead of from the command line?
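
On the word-position question: the tesseract command line can write a TSV file (for example `tesseract page.png out tsv`, in versions that ship the `tsv` config), where each row carries a bounding box. The sketch below only parses such text and keeps the word-level rows. The names `WordBox` and `parseTsvWords` are hypothetical, and the column order is an assumption based on the usual TSV header (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: one recognized word and its box in image pixels.
struct WordBox {
    int left;
    int top;
    int width;
    int height;
    float conf;
    std::string text;
};

// Parse tesseract TSV text and keep only word-level rows (level == 5).
// Assumed columns: level, page_num, block_num, par_num, line_num, word_num,
// left, top, width, height, conf, text.
std::vector<WordBox> parseTsvWords(const std::string& tsv) {
    std::vector<WordBox> words;
    std::istringstream lines(tsv);
    std::string line;
    while (std::getline(lines, line)) {
        // Split the row on tab characters.
        std::vector<std::string> cols;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, '\t')) {
            cols.push_back(field);
        }
        // Skip the header, non-word rows, and rows without a text column.
        if (cols.size() < 12 || cols[0] != "5") {
            continue;
        }
        WordBox box;
        box.left = std::stoi(cols[6]);
        box.top = std::stoi(cols[7]);
        box.width = std::stoi(cols[8]);
        box.height = std::stoi(cols[9]);
        box.conf = std::stof(cols[10]);
        box.text = cols[11];
        words.push_back(box);
    }
    return words;
}
```

Each returned `WordBox` gives the top-left corner and size of one word, which is the information needed to place text over the scanned image in a PDF page.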
--
I am joining this thread. I am new to tesseract. What is tesseract's main
algorithm?
--
I installed Tesseract-OCR. Processing eurotext.tif works fine, but if I
try to recognize medium-quality text, for example the first sample from
https://www.google.com/recaptcha/digitizing, I get these results:
Th: llxnuuundge ma Lane nmomc: Elvin;
men courage Al Inc ncenz uslem nd Vic:-s, In ur-
whereas when I giv
Accuracy improves a lot when I resize the same image to 500%.
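
As an illustration of the 500% resize, here is a minimal sketch of integer upscaling with nearest-neighbour sampling on a grayscale buffer. The post does not say which tool or interpolation was used, so this is an assumption; `GrayImage` and `upscaleNearest` are hypothetical names, and a real pipeline would more likely use an image library with smoother interpolation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit grayscale image stored row by row.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Enlarge the image by an integer factor: every source pixel becomes a
// factor x factor block in the result (nearest-neighbour sampling).
GrayImage upscaleNearest(const GrayImage& src, int factor) {
    assert(factor >= 1);
    GrayImage dst{src.width * factor, src.height * factor, {}};
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            // Map the destination pixel back to the source pixel it came from.
            const std::size_t srcIndex =
                static_cast<std::size_t>(y / factor) * src.width + (x / factor);
            const std::size_t dstIndex =
                static_cast<std::size_t>(y) * dst.width + x;
            dst.pixels[dstIndex] = src.pixels[srcIndex];
        }
    }
    return dst;
}
```

Calling `upscaleNearest(image, 5)` corresponds to a 500% resize: a 100 x 40 input becomes 500 x 200.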
--
Is there any place where the algorithms that Tesseract uses are described? There is too
much code to analyse.
--
Maybe preprocessing is important. Resizing increased accuracy a lot.
Probably converting to a binary image breaks small letters.
On Wednesday, 12 March 2014 22:05:45 UTC+1, user zdenop wrote:
>
> Let's summarize it:
>
>- You used tesseract
>- http://www.free-ocr.com/ uses tesser
I am not writing a strictly OCR application; I recognize images for medicine.
After Canny edge detection I get these images: http://i.imgur.com/HIJQupz.png
and http://i.imgur.com/UNaUZZ9.png
How can I distinguish these images, with figure-eight-like or small-g-like narrowing, from
simple ellipses and lines?
I need OC
void TextWindow::recognize(const char *imagepath)
{
    Pix* pixs = pixRead(imagepath);
    if (!pixs)
    {
        fprintf(stderr, "Cannot open input file: %s\n", imagepath);
        exit(2);
    }
    tesseract::TessBaseAPI api;
    const char* lang = "pol";
    const char* datapath = "/usr/shar