Hi all, I have an issue with Tesseract (tesseract.js, if that matters) 
picking up the wrong things in an image. In the following image, it detects 
the artefact in the top-right quadrant and, for some reason, outputs only 
"LEVEL", with no digits.

 [image: fail_gyarados.png] 
I realize that removing the artefacts is the best solution, but they can be 
unpredictable in position and shape.
Does anyone have good ideas, or resources they can point me towards, for 
isolating and removing these artefacts?
They always start on an edge, so my intuition is that I could (somehow) 
remove any non-background pixel that is connected to the edge, either 
directly or through a chain of adjacent non-background pixels. However, I'm 
not sure how to read and modify image data in that way, or whether I should 
use an existing library to do so. I'm also not sure what search terms to 
use to research such algorithms.
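
The edge-connected idea is essentially a flood fill seeded from the border 
pixels (useful search terms: "flood fill", "connected-component labeling", 
"clear border"). Below is a minimal sketch of that idea, not a tested 
solution for this image. It assumes a row-major grayscale buffer with dark 
ink on a light background, 4-connectivity, and a fixed threshold of 128. 
The function name clearBorderConnected and its parameters are hypothetical 
and are not part of tesseract.js.

```typescript
// Hypothetical helper, not part of tesseract.js.
// `gray` is a row-major grayscale image (assumed: 0 = dark ink, 255 = white).
function clearBorderConnected(
  gray: Uint8Array,
  width: number,
  height: number,
  threshold = 128, // assumed cut-off between ink and background
): Uint8Array {
  const out = Uint8Array.from(gray); // work on a copy, leave the input intact
  const isInk = (i: number): boolean => (out[i] ?? 255) < threshold;
  const stack: number[] = [];

  // Whiten an ink pixel and queue it, so each pixel is visited at most once.
  const seed = (x: number, y: number): void => {
    if (x < 0 || y < 0 || x >= width || y >= height) return;
    const i = y * width + x;
    if (!isInk(i)) return;
    out[i] = 255;
    stack.push(i);
  };

  // Start from every ink pixel that lies on one of the four edges.
  for (let x = 0; x < width; x++) {
    seed(x, 0);
    seed(x, height - 1);
  }
  for (let y = 0; y < height; y++) {
    seed(0, y);
    seed(width - 1, y);
  }

  // Explicit stack instead of recursion, to avoid call-stack overflow
  // on large artefacts.
  while (stack.length > 0) {
    const i = stack.pop() as number;
    const x = i % width;
    const y = (i - x) / width;
    seed(x + 1, y);
    seed(x - 1, y);
    seed(x, y + 1);
    seed(x, y - 1);
  }
  return out;
}
```

In a browser, the pixel data could presumably be read from a canvas with 
getImageData (which returns RGBA, so it would need converting to grayscale 
first) and written back before passing the canvas to the OCR step. One 
caveat: any text that itself touches the image edge would be erased as 
well, so some padding or a size check on the removed region may be needed.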
