Thanks 

On Saturday, July 20, 2024 at 5:15:44 PM UTC+3 ger.h...@gmail.com wrote:

> Too little information provided for anyone to try and (at least) reproduce 
> your problem.
>
> Besides, if this is your source image you're toast anyway. For you and 
> others:
>
> [image: mekur-bad-rez2.webp]
>
>
> Your image reports as roughly 400x500-something pixels in size. (In the chart 
> image above, the axis unit is *hundreds of pixels*, i.e. '4' = 400 px.) *For 
> tesseract to have a chance at all, a single text line's C[apitals]-height 
> should be around 30px*; anything higher can be scaled down if needed during 
> the image preprocessing done before feeding your stuff to tesseract.
>
> TL;DR: that '30' number means the number of text lines in a section of 100 
> pixels should be about *3* (or rather fewer, since line-height > C-height > 
> x-height), not the *9* lines counted in your image!
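
The arithmetic above can be sketched in a few lines of Python. This is only a rough check under the rule of thumb quoted above (about 30 px cap height per text line); the helper names are hypothetical, and the line count is assumed to be counted by hand or by some other tool.

```python
# Rule-of-thumb value taken from the message above; all helper names are hypothetical.
TARGET_CAP_HEIGHT_PX = 30


def line_pitch_px(line_count: int, region_height_px: int) -> float:
    """Average vertical space available per text line, in pixels."""
    if line_count <= 0 or region_height_px <= 0:
        raise ValueError("line_count and region_height_px must be positive")
    return region_height_px / line_count


def max_lines_per_100px(cap_height_px: int = TARGET_CAP_HEIGHT_PX) -> int:
    """Upper bound on lines per 100 px, since line pitch must be >= cap height."""
    return 100 // cap_height_px


def resolution_looks_sufficient(line_count: int, region_height_px: int) -> bool:
    """True if the average line pitch leaves room for the target cap height."""
    return line_pitch_px(line_count, region_height_px) >= TARGET_CAP_HEIGHT_PX


if __name__ == "__main__":
    # 9 lines counted in a 100 px tall strip, as in the image discussed above.
    print(line_pitch_px(9, 100))
    print(max_lines_per_100px())
    print(resolution_looks_sufficient(9, 100))
```

A pitch well below 30 px per line means the cap height cannot reach the target, whatever the language of the text.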
>
> I don't know this language, but for you & anyone else who likes to have at 
> least a fighting chance of OCR-ing something: a 30px C-height implies a 
> ball-park number of 20px for the x-height and "reasonable" line heights of 
> 40px or more. And, please, don't get me started on "I resize the image if 
> you want it to be bigger!" 🤦  To the machine, the above image is just a 
> bunch of pixelated noise, alas, irrespective of what language the original 
> was ever written in. Are your pixel measurements lower, falling short of the 
> benchmark of 30px per line? Then redo your scans, get better hardware, and do 
> a better job at the image preprocessing (this image is also failing on that 
> count, incidentally, but one can write a book on that subject alone, so 
> we'll leave that out).
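
As a small illustration of those ball-park figures, here is a sketch that derives the x-height and minimum line height from a cap height, plus a scale factor for preprocessing. The ratios are simply taken from the numbers quoted above (20/30 and 40/30), the function names are hypothetical, and the factor is deliberately capped at 1.0 because enlarging a low-resolution image does not add detail.

```python
# Ratios derived from the figures quoted above: 30 px cap height,
# about 20 px x-height, 40 px or more line height. Names are hypothetical.
X_HEIGHT_RATIO = 20 / 30
MIN_LINE_HEIGHT_RATIO = 40 / 30


def metrics_for_cap_height(cap_height_px: float) -> dict:
    """Ball-park x-height and minimum line height for a given cap height."""
    return {
        "x_height_px": cap_height_px * X_HEIGHT_RATIO,
        "min_line_height_px": cap_height_px * MIN_LINE_HEIGHT_RATIO,
    }


def downscale_factor(measured_cap_height_px: float, target_px: float = 30.0) -> float:
    """Factor that brings an oversized cap height down to the target.

    Never returns more than 1.0: a source that is too small should be
    rescanned at higher resolution rather than enlarged.
    """
    if measured_cap_height_px <= 0:
        raise ValueError("measured_cap_height_px must be positive")
    return min(1.0, target_px / measured_cap_height_px)


if __name__ == "__main__":
    print(metrics_for_cap_height(30))
    print(downscale_factor(60))  # oversized scan: shrink it
    print(downscale_factor(11))  # undersized scan: factor stays at 1.0
```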
>
>
>
> Met vriendelijke groeten / Best regards,
>
> Ger Hobbelt
>
> --------------------------------------------------
> web:    http://www.hobbelt.com/
>         http://www.hebbut.net/
> mail:   g...@hobbelt.com
> mobile: +31-6-11 120 978
> --------------------------------------------------
>
>
> On Wed, Jul 10, 2024 at 11:12 AM Mekuriaw Aze <mekur...@gmail.com> wrote:
>
>> Dear All
>> Cooperation request
>> My question is: when I try again and again in Python to convert the image 
>> to readable text, it gives me an error. Can you help me?
>> Is the image attached below? Is Geez an Ethiopian language?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/4f47a021-d4ee-4994-bb1b-65009a443153n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/tesseract-ocr/4f47a021-d4ee-4994-bb1b-65009a443153n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

