[tesseract-ocr] Tesseract completely fails to recognize consolas font from high resolution image

Are Fri, 28 Apr 2023 10:18:03 -0700

Hello,

I have this simple Tesseract code, which takes the attached image and prints 
the result to the console.
I cropped the image to include only the necessary information (the full 
document contains sensitive information). With either the cropped image or 
the full one, it successfully reads most of the text, except for the text 
set in the Consolas font.


The output I get from the attached image is: ">BUWVveAmæUw >» >> U U"
Although, when I use the full image, it is able to read the bot

I'm using nor.traineddata, but the result is very similar with 
eng.traineddata.



Here's my code:

using System;
using Tesseract;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var engine = new TesseractEngine(@"./tessdata", "nor", EngineMode.Default))
            {
                using (var img = Pix.LoadFromFile(@"./images/unnamed2.jpg"))
                {
                    using (var page = engine.Process(img))
                    {
                        var text = page.GetText();
                        Console.WriteLine(text);
                    }
                }
            }
        }
    }
}



*Here's the image:*

[image: unnamed2.jpg]
