[tesseract-ocr] Mixed Language (Greek-Latin Alphabet) OCR: OCR of Scientific Documents

Karen G Wed, 15 Mar 2023 00:01:19 -0700

I am new to OCR in general and to Tesseract OCR in particular.

I need to do OCR of scientific documents, which contain a mix of Latin 
letters (English language), Greek letters (Greek language), and 
mathematical symbols.


Our commercial vendor (Dassault Systemes, BIOVIA) recommends integrating 
with Tesseract OCR, but we are running into issues using it for our purposes.
(We are not experts in OCR -- only Ph.D. scientists developing code to 
parse scientific instrument data files & reports.)
 
*QUESTION: Can anyone please point us to any conversations or other 
references or projects that might help us optimize Tesseract OCR for this 
use case (mixed Greek-English-Math)?*

Thank you in advance for your patience & any info you might be able to 
provide.

