The parameter I meant is "*textord_max_noise_size*"; it defines the maximum size of noise in pixels. You could also try the one you found in the list, "*textord_heavy_nr*".
"Opening and Closing Operators" are morphological operators. I searched Wikipedia for a nice example, but the English version is only a stub. In your case the opening operation is the way to go. Many image processing frameworks include morphological operations. If your software does not provide an opening operator, look for *erosion* and *dilation* (opening is just an erosion followed by a dilation).

I made a quick example in GIMP. The picture "before.png" shows my object (the circle) with some noise I want to remove. I executed the erosion operation on this picture with a suitable filter mask. The result is in the picture "after erosion.png". The noise is gone, but the circle has changed in size (and shape). As the last step I executed the dilation operation in GIMP. The resulting image "after dilation.png" shows only the circle.

Depending on your objects and noise, you need to choose a suitable filter mask for these operations: it has to be larger than the noise specks but smaller than the strokes you want to keep. The operation will change the shape of your characters slightly.

On Wednesday, 29 May 2013 19:34:53 UTC+2, Dmitry Katsubo wrote:
>
> Thanks for your reply.
>
> What parameters do you actually mean? I went through the list of
> them <http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version>,
> and the only two I was able to find are *matcher_avg_noise_size* and
> *textord_heavy_nr*. I have set *matcher_avg_noise_size=100* with no visual
> effect.
>
> And if you can point me to *Opening and Closing - Operators* I will
> appreciate it. Do you mean that C++ *operator-()* is overridden? Any code
> examples about their use?
>
> On Wednesday, 29 May 2013 08:39:07 UTC+2, Johannes Richter wrote:
>>
>> There are multiple options to improve this particular case.
>>
>> - You could preprocess the image to suppress this kind of noise.
>>   (Look for Opening and Closing - Operators)
>> - There is a tesseract parameter which takes the minimum size of a
>>   blob: just count the "noise", add some pixels (just to be sure) and let
>>   tesseract filter this.
>> - You could do the blob-size filtering yourself.
>>
>> Characters like {, . '} may get deleted too.
>>
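
The erosion-then-dilation opening described above can be sketched in plain Python. This is only a minimal illustration, not what GIMP or tesseract do internally. It assumes a binary image stored as a list of rows (1 = foreground, 0 = background), a square filter mask of size (2k+1) x (2k+1), and that pixels outside the image count as background. The function names `erode`, `dilate` and `opening` are made up for this example.

```python
def _window(img, y, x, k):
    """Yield the pixels under a (2k+1)x(2k+1) mask centred on (y, x).

    Assumption for this sketch: positions outside the image are background (0).
    """
    h, w = len(img), len(img[0])
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            yy, xx = y + dy, x + dx
            yield img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0


def erode(img, k=1):
    """Keep a pixel only if every pixel under the mask is foreground."""
    return [[int(all(_window(img, y, x, k))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img, k=1):
    """Set a pixel if any pixel under the mask is foreground."""
    return [[int(any(_window(img, y, x, k))) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img, k=1):
    """Opening = erosion followed by dilation with the same mask."""
    return dilate(erode(img, k), k)


# A 4x4 object plus two single-pixel noise specks (top-left and bottom-right).
noisy = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1],
]

cleaned = opening(noisy, k=1)
for row in cleaned:
    print("".join("#" if p else "." for p in row))
```

With the 3x3 mask (`k=1`) the two isolated pixels disappear in the erosion step and never come back, while the 4x4 object shrinks to 2x2 and is grown back by the dilation. A square object survives a square mask unchanged; round or thin shapes such as characters will generally come out slightly altered, and anything smaller than the mask (dots, commas) is removed together with the noise.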
<<attachment: after dilation.png>>
<<attachment: after erosion.png>>
<<attachment: before.png>>

