All,
I need some help extracting the text from this image. I'm using the
command-line version of Tesseract from UB Mannheim; I think version 5.2 is
installed. I've tried every PSM, and none of them extracts the text. If I
crop off the minus sign, it works perfectly.
Any tips at all would be appreci
Transform your image into a true black-and-white image and it works well.
> -l eng --psm 6 '
>
> On Fri, Feb 23, 2024 at 06:49, Will Fetherolf wrote:
>
>> All,
>>
>> I need some help extracting the text from this image. I'm using the
>> co
send you the Python cv2 code
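
The cv2 code mentioned here is not included in this excerpt, so the following is only a dependency-free sketch of one common way to pick a black/white threshold automatically (Otsu's method). It is not necessarily what was used on the original image. The tiny `image` grid and the function names are made up for illustration; in practice a library routine such as `cv2.threshold` with the `cv2.THRESH_OTSU` flag does the same job on a real image.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0-255) that maximises between-class variance.

    `pixels` is a flat iterable of integer grey levels in 0..255.
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of grey levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(rows, threshold):
    """Map a 2D grid of grey levels to pure black (0) or white (255)."""
    return [[255 if p > threshold else 0 for p in row] for row in rows]


# Hypothetical 3x4 grey image: dark glyph pixels (~30) on a light background (~200).
image = [
    [200, 198, 30, 201],
    [199, 32, 28, 200],
    [202, 200, 31, 197],
]
t = otsu_threshold(p for row in image for p in row)
bw = binarize(image, t)
```

With the threshold chosen this way, every pixel ends up either 0 or 255, which is the kind of "real" black-and-white input suggested earlier in the thread.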
>
> On Fri, Feb 23, 2024 at 22:56, Will Fetherolf wrote:
>
>> Do you know what threshold was used to convert the color to black on
>> white?
>> I'm doing these operations through a home-grown automated test system,
>> and
>>
>> That seems to be enough to separate out the “N” and the “/”.
>>
>> art
>>
>> *From:* tesser...@googlegroups.com *On
>> Behalf Of *Will Fetherolf
>> *Sent:* Monday, October 7, 2024 9:33 PM
>>
I also understand that part of the problem is the kerning used by the
TrueType fonts, and I do not have the ability to get it switched to a
monospaced font. If that were the case this would be easy.
On Wednesday, October 9, 2024 at 11:13:16 AM UTC-5 Will Fetherolf wrote:
> Using differ
try resizing the image with ImageMagick, something like:
>
> convert test.bmp -resize 200% test.png
>
> That seems to be enough to separate out the “N” and the “/”.
>
> art
>
> *From:* tesser...@googlegroups.com *On
> Behalf
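
For readers who want to see what a 200% enlargement does at the pixel level, here is a minimal pure-Python sketch. It uses nearest-neighbour scaling by an integer factor, which is a simplification: ImageMagick's `-resize` applies a smoothing filter by default, so its output will not be identical. The function name and the 2x2 `glyphs` grid are hypothetical.

```python
def upscale_nearest(rows, factor=2):
    """Enlarge a 2D pixel grid by an integer factor using nearest-neighbour."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in rows:
        # Repeat every pixel `factor` times horizontally...
        wide = [p for p in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2x2 image: 0 is black, 255 is white.
glyphs = [
    [0, 255],
    [255, 0],
]
big = upscale_nearest(glyphs, 2)
```

Each source pixel becomes a 2x2 block, so thin gaps between neighbouring glyphs get more pixels to work with.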
The application I'm attempting to OCR is using what I think is Arial for
the font, but every time I run the attached image through Tesseract 5.4.0
on Windows I get "NVA" or "NIA" depending on which PSM I use. If I use 7,
I always get back "NIA". I have tried running training on a variety of
c
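
Comparing results across page segmentation modes is easier to repeat when scripted. The sketch below builds the Tesseract command line for each PSM and collects the output. It assumes `tesseract` is on the PATH (which may not be the case for a default Windows install), and `sample.png` is a placeholder file name, not the image attached to this thread.

```python
import shutil
import subprocess


def tesseract_command(image_path, psm, lang="eng"):
    """Build the argument list for one Tesseract CLI run that prints to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_all_psms(image_path, psms=range(3, 14)):
    """Run Tesseract once per page segmentation mode and collect the output.

    Returns {psm: recognized_text}, or {} if tesseract is not on the PATH.
    """
    if shutil.which("tesseract") is None:
        return {}
    results = {}
    for psm in psms:
        proc = subprocess.run(
            tesseract_command(image_path, psm),
            capture_output=True,
            text=True,
            check=False,  # a failing mode should not abort the comparison
        )
        results[psm] = proc.stdout.strip()
    return results


# Show the command that would be run for PSM 7 (single text line).
print(tesseract_command("sample.png", 7))
```

Calling `ocr_all_psms("sample.png")` then gives a side-by-side view of which modes return "NVA", "NIA", or the expected text.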