Dear Respected group members, 

I am new to OCR and am currently working on extracting text from an image. 
With my custom image preprocessing, the results are generally good, but 
some characters are still misrecognized (for example, 'G' is detected as '6', 
'O' as 'D', and '0' as 'o'). To address this, I would like to follow the steps 
described in https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html:
-- I would like to set the configuration variables 
<https://tesseract-ocr.github.io/tessdoc/ControlParams> 
load_system_dawg and load_freq_dawg to false.
-- My output data follows a certain pattern, so I would also like to set 
tessedit_char_whitelist.

I have spent a lot of time trying to figure out how to make these changes in 
my code, but have not been able to find out how. Could you please help me 
understand how to set these parameters?

Using:
Python 3.7
pytesseract 4.1.1
IDE - Jupyter-lab
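
One possible approach, as a minimal sketch rather than a confirmed solution: 
pytesseract's image_to_string accepts a config string that is forwarded to the 
tesseract command line, where each "-c name=value" pair sets one configuration 
variable. The helper names build_tesseract_config and ocr_image, the page 
segmentation mode 6, and the whitelist characters below are illustrative 
assumptions, not something taken from the linked documentation.

```python
def build_tesseract_config(whitelist="", psm=6, use_dictionaries=False):
    """Build the string passed as pytesseract's `config` argument.

    Hypothetical helper: `psm=6` and the parameter names are assumptions
    chosen for illustration only.
    """
    parts = [f"--psm {psm}"]
    if not use_dictionaries:
        # Disable the system and frequent-word dictionaries.
        parts.append("-c load_system_dawg=false")
        parts.append("-c load_freq_dawg=false")
    if whitelist:
        # Restrict recognition to these characters. The config string is
        # split on whitespace, so the whitelist must not contain spaces.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_image(image_path, config):
    """Run OCR on one image file with the given config string."""
    # Imported here so the config helper can be used without these packages.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path), config=config)


# Example whitelist (an assumption): digits plus uppercase letters.
config = build_tesseract_config("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
print(config)
```

The resulting string would then be used as ocr_image("page.png", config), where 
"page.png" is a placeholder file name. Whether the whitelist is honoured may 
depend on the Tesseract version and the OCR engine mode in use, so the output 
should be checked on real images.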
