Thank you, I just saw from your link that it is posted!
I'm so glad to hear this news.

Ajinkya

On Wednesday, 9 April 2025 at 10:26:46 UTC+5:30 zdenop wrote:

Thank you for your tool. It is already listed in the Tesseract documentation:
https://github.com/tesseract-ocr/tessdoc/blob/main/User-Projects-%E2%80%93-3rdParty.md#4-others-utilities-tools-command-line-interfaces-cli-etc

Zdenko


On Tue, 8 Apr 2025 at 6:09, Ajinkya Bobade <ajinkya...@gmail.com> wrote:

I have noticed that text cleaning is the most difficult part of an OCR 
pipeline. I have struggled a lot with this part: without properly cleaned 
text, OCR simply fails in terms of accuracy. To handle text cleaning as a 
separate step, I created a GitHub repo that uses AI to clean up all the 
text in an image. Once the text is cleaned, we can run our own custom OCR 
models on it. I have personally seen OCR accuracy shoot up to 99% on a 
properly preprocessed and cleaned image. 

Here is the GitHub link: https://github.com/ajinkya933/ClearText
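
To make the "clean first, then OCR" idea concrete, here is a minimal sketch of a cleaning step. It is not how ClearText works (that repo uses an AI model). It swaps in classical Otsu binarization as a simple stand-in, and it assumes the image is already available as a 2D list of gray values from 0 to 255. The function names otsu_threshold and binarize are hypothetical, chosen only for this illustration.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark from light pixels.

    Classical Otsu method: pick the threshold that maximizes the
    between-class variance of the two resulting pixel groups.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray values of those pixels
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue          # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the dark class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Map a 2D list of gray values to 0 (ink) or 255 (background)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[0 if p <= t else 255 for p in row] for row in image]


# Tiny made-up "image": dark strokes (about 30) on a light page (about 210).
page = [[30, 200, 210],
        [25, 220, 35]]
clean = binarize(page)
```

The binarized result would then be handed to whichever OCR engine you prefer, for example Tesseract. A learned cleaner like the one in the repo is meant for cases where a single global threshold like this is not enough.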

Regards 
Ajinkya

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to tesseract-oc...@googlegroups.com.
To view this discussion visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAHy6iNOjhs7ZY7r26fGzqJOUr2e%2BF3bY%3DeDCHjM-VD7XH5M%3DTA%40mail.gmail.com
 
<https://groups.google.com/d/msgid/tesseract-ocr/CAHy6iNOjhs7ZY7r26fGzqJOUr2e%2BF3bY%3DeDCHjM-VD7XH5M%3DTA%40mail.gmail.com?utm_medium=email&utm_source=footer>
.

