Hi,

I am new to Tesseract. I was looking for an OCR library, found Tesseract, and 
now I want to use it for a specific measurement protocol.

I built Tesseract 5.3.2 from source, together with the dependencies 
leptonica-1.83, libpng and OpenJPEG, using the latest Visual C++ compiler 
for Windows, x64.

Then I ran some first tests based on the examples from the documentation 
(*Basic_example* and *SetRectangle_example*). As the data set I use 
*eng.traineddata* from the *tessdata_best* repo.

Now I see behaviour that I can't explain. I tried to recognize a float 
value in a given rectangle (with *SetRectangle*). Tesseract didn't convert 
it (empty result). Then I manually cropped the rectangle and saved it to a 
new file (see attached Single_Number.png). I then tried this file without 
the *SetRectangle* call, and it works.

The attached *Protocol_table.png* is the original image, but with everything 
else removed, so it is empty except for the number at its original position. 
With this image I get the following behaviour: in DEBUG mode the conversion 
works, in RELEASE mode it does not.

I also tried to slightly enlarge the rectangle area (see the last 
SetRectangle call in the code below), but then I got a runtime exception. 
The resolution of the picture is 2625x1682, so there should be no buffer 
overflow?!

Am I doing something wrong here? Or what is the cause of this behaviour?

This is my basic code:

//std includes
#include <cstdio>
#include <iostream>

//tesseract includes
#include "tesseract/baseapi.h"

//Leptonica includes
#include "allheaders.h"

// Runs OCR on a sub-rectangle of the test image and prints the result
int main()
{
    tesseract::TessBaseAPI api;
    // Initialize tesseract-ocr with English, without specifying tessdata path
    if (api.Init(nullptr, "eng"))
    {
        std::cout << "Could not initialize tesseract." << std::endl;
        return 1;
    }

    // Load the input image with Leptonica
    Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Protocol_Table.png");
    //Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Single_Number.png");
    if (!image)
    {
        std::cout << "Could not read the image." << std::endl;
        api.End();
        return 1;
    }
    api.SetImage(image);
    // Restrict recognition to a sub-rectangle of the image
    // SetRectangle(left, top, width, height)
    api.SetRectangle(807, 1393, 93, 49);
    //api.SetRectangle(707, 1293, 193, 149);
    // Get OCR result
    char* outText = api.GetUTF8Text();
    if (outText)
        printf("OCR output:\n%s", outText);

    // Destroy used objects and release memory
    delete[] outText;
    api.End();
    pixDestroy(&image);
    return 0;
}
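
To rule out an out-of-bounds rectangle as the cause of the exception, the 
SetRectangle arguments can be checked against the image size before the 
call. The following is only a minimal sketch: rectFitsImage is a 
hypothetical helper of my own, not part of the Tesseract or Leptonica API, 
and it assumes the same (left, top, width, height) convention as 
SetRectangle.

```cpp
// Hypothetical helper (not part of the Tesseract API): returns true when the
// rectangle (left, top, width, height) lies completely inside an image of
// imgWidth x imgHeight pixels.
bool rectFitsImage(int left, int top, int width, int height,
                   int imgWidth, int imgHeight)
{
    // Reject negative origins and empty or negative sizes.
    if (left < 0 || top < 0 || width <= 0 || height <= 0)
        return false;
    // Compare against the remaining space instead of computing left + width,
    // so the check itself cannot overflow.
    return width <= imgWidth - left && height <= imgHeight - top;
}
```

In the program above it could be called with pixGetWidth(image) and 
pixGetHeight(image) right before api.SetRectangle. For a 2625x1682 image 
both rectangles used above pass this check, so a rectangle outside the image 
does not seem to be the explanation.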
