Hi Lorenzo,

Thank you so much for your ideas! Unfortunately, I don't think I can get a 
better image quality. It's a VGA signal that's being grabbed, and that's 
simply the result. Maybe I'll try a different converter.
I did some more tests, too, and the only way I found to get slightly better 
results is to segment the image manually and then feed the individual 
segments into tesseract. My problem is that I need to be able to rely on the 
results (perhaps not 99% accuracy, but at least 90%), and that sounds pretty 
hard to achieve.
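
For reference, the manual segmentation could be scripted roughly like this. It is only a sketch: the box names and coordinates are hypothetical and would have to be measured on a real frame, and the OCR step is passed in as a callable so the cropping logic does not depend on any particular tesseract wrapper.

```python
from typing import Callable, Dict, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def ocr_segments(image: np.ndarray, boxes: Dict[str, Box],
                 ocr: Callable[[np.ndarray], str]) -> Dict[str, str]:
    """Crop every named box out of `image` and run `ocr` on each crop."""
    results = {}
    for name, (x, y, w, h) in boxes.items():
        crop = image[y:y + h, x:x + w]  # NumPy indexes rows (y) first
        results[name] = ocr(crop).strip()
    return results
```

If pytesseract is installed, the callable could be something like `lambda crop: pytesseract.image_to_string(crop, config="--psm 7")`, where `--psm 7` asks tesseract to treat each crop as a single line of text.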

Greetings, 
Chris




On Sunday, November 13, 2022 at 1:41:36 PM UTC+1 Lorenzo Blz wrote:

> I did it by hand with Gimp.
>
> The code depends on what you know about the image. If it is fixed size and 
> fixed location you can easily do this, for example, with python and opencv: 
> crop, invert header, two different thresholds.
>
> If the size/alignment are not fixed you could use SIFT to align the image 
> with a fixed template (or use Hough lines to rotate it or something similar 
> if there is not a lot of perspective correction to do).
>
> If it is aligned but not fixed size, you can detect the darkest part with 
> threshold and findContours (with open/close/erode to clean the image) or in 
> simpler ways; it really depends on how much the gray tones change between 
> frames. You could do a floodFill in a few known locations of the header with 
> a different color and find the contours for this colored region (and use 
> the rectangle rotation to rotate the image, if needed).
>
> It may take a few hours or a few days depending on the images.
>
>
> Bye
>
> Lorenzo
>
>
>
> On Sun, 13 Nov 2022 at 13:27, Mehmet Furkan <bakirmeh...@gmail.com> wrote:
>
>> Wow, good job! Could you share the source code for this OCR? If that's 
>> okay, I'd be really happy.
>>
>> On Sunday, 13 November 2022 at 14:15:17 UTC+3 Lorenzo Blz wrote:
>>
>>> Hi Chris,
>>> you should try to get something like this:
>>>
>>> [image: temp2b.jpg]
>>>
>>>
>>>
>>> I inverted the header section and then applied a different threshold to 
>>> each part. If you are not interested in the titles, you can just crop them 
>>> out.
>>>
>>> The image is blurry, maybe it was upscaled a little? If so, try 
>>> different levels of upscaling, preferably whole-number factors like 2x, 
>>> 3x, etc., to see if it improves. Or check whether other frames from the 
>>> video are better, or improve the video capture itself (resolution, 
>>> lighting, frame rate, etc.).
>>>
>>> This is what I get:
>>>
>>> Modes Dunchieachiungszet
>>>
>>> = ro
>>> oF wn [3
>>> HF omen | mm
>>> Gesamt 00s 0%
>>>
>>> Quite unusable, but at least it is starting to find something.
>>>
>>>
>>> I think training will help IF all your images have this kind of blurry 
>>> text and you use actual crops from these images for training.
>>>
>>>
>>> Bye
>>>
>>> Lorenzo
>>>
>>> On Sat, 12 Nov 2022 at 18:57, Chris E. <goaf...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to OCR this kind of image, which is from a video grabber, 
>>>> unfortunately of pretty bad quality. With the default options of 
>>>> tesseract, it's pretty useless.
>>>> Before I start digging deeper into training tesseract, I would love to 
>>>> hear some recommendations. Would it be possible to achieve a good result 
>>>> from this kind of image with proper training?
>>>> Any further ideas/tips would be appreciated!
>>>>
>>>> Greetings,
>>>> Chris
>>>>
>>>> [image: temp2.jpg]
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to tesseract-oc...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/tesseract-ocr/edf2898d-e442-46a5-bf0c-46f38561c20en%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/edf2898d-e442-46a5-bf0c-46f38561c20en%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>

