Hi.

1.
I have an image whose text is set in a "Science Fiction" style font, in 
which 'E' looks similar to '='.
As a result, Tesseract recognizes the attached image as 
"AR= YOU SURE YOU WANT TO QuIT >"

However, since Tesseract is using an English dictionary, I'd expect it to 
understand that "ARE" is much more likely than "AR=".

I assume this can be controlled by some configuration option?
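
To make the question concrete, this is roughly the kind of configuration I 
have in mind. It is only a sketch under assumptions: the `tesseract` 
command-line tool is installed, the version is 4.1 or later (where, as far as 
I know, `tessedit_char_whitelist` is honored by the LSTM engine), and the 
image contains only uppercase letters and '?'. The helper names 
`build_tesseract_args` and `run_tesseract` and the file name 
`quit_prompt.png` are made up for illustration.

```python
import string
import subprocess

# Assumption: the image contains only uppercase letters and a question mark,
# so '=' and '>' can never appear in the output.
ALLOWED_CHARS = string.ascii_uppercase + "?"


def build_tesseract_args(image_path, allowed_chars=ALLOWED_CHARS):
    """Build a tesseract command line that restricts the output alphabet.

    Hypothetical helper for illustration; not part of Tesseract itself.
    """
    return [
        "tesseract",
        image_path,
        "stdout",      # print the recognized text instead of writing a file
        "-l", "eng",   # English language data
        "--psm", "6",  # page segmentation mode 6: a single uniform block of text
        "-c", "tessedit_char_whitelist=" + allowed_chars,
    ]


def run_tesseract(image_path):
    """Run tesseract (must be on PATH) and return the recognized text."""
    result = subprocess.run(
        build_tesseract_args(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Example (needs tesseract and the image on disk):
# print(run_tesseract("quit_prompt.png"))
```

I don't know whether a whitelist is the intended way to do this, or whether 
there is a setting that makes the dictionary weigh more heavily.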

2.
I tried https://www.newocr.com/, which is based on Tesseract, and it 
recognized the text correctly:
"ARE YOU SURE YOU WANT TO QUIT ?"
(I removed the line break.)

So, I assume it should be feasible.

3.
Note that https://www.newocr.com/ also correctly recognized the 'U' of 
"QUIT" as uppercase, as well as the trailing question mark. I assume this 
can also be achieved with vanilla Tesseract; the question is how?
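
If no configuration does this, I suppose I could fall back to correcting the 
output myself. Below is a minimal sketch of that idea. It is not how 
Tesseract's own dictionary works; the word list, the look-alike table, and 
the function names are all hypothetical and chosen for this one image, and 
it assumes the text is entirely uppercase.

```python
# Hypothetical table of characters that this font makes look alike.
CONFUSABLE = {"=": "E"}

# Tiny stand-in for a real English word list (uppercase only).
WORDS = {"ARE", "YOU", "SURE", "WANT", "TO", "QUIT"}


def correct_token(token, words=WORDS, confusable=CONFUSABLE):
    """Return a dictionary word for the token if one is found, else the token."""
    upper = token.upper()
    if upper in words:
        # Fixes case-only errors such as "QuIT".
        return upper
    # Swap look-alike characters and check the dictionary again.
    candidate = "".join(confusable.get(ch, ch) for ch in upper)
    if candidate in words:
        return candidate
    # No dictionary match: leave the token untouched.
    return token


def correct_line(line):
    """Apply correct_token to every whitespace-separated token in the line."""
    return " ".join(correct_token(t) for t in line.split())


print(correct_line("AR= YOU SURE YOU WANT TO QuIT >"))
# prints: ARE YOU SURE YOU WANT TO QUIT >
```

This fixes "AR=" and "QuIT" but leaves the final '>' alone, since a lone 
punctuation mark can't be checked against a word list, so I'd still prefer a 
solution inside Tesseract.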

Thanks,
Zvika
