Greetings,

I found a potential solution: rewrite each pixel to either white or black 
based on a set threshold. OpenCV's "threshold" function does just that, but 
Tesseract was still finding "ghost" characters in the white areas of the 
image, so I also had to find where the string starts and grab an ROI from 
that point.  Note that the THRESH_BINARY_INV flag inverts as well as 
binarizes: pixels above the threshold become black and everything else 
becomes white.  From what I've read, Tesseract prefers black characters on 
a white background.
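
One way to find where the string starts is to take the bounding box of all 
text pixels and pad it slightly. The following is a minimal sketch in plain 
C++ (no OpenCV calls), not the exact code from the application. It assumes 
the image has already been thresholded to dark text on a light background 
and is stored as a row-major 8-bit buffer. The names Box and inkBoundingBox 
and the margin parameter are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type; not part of OpenCV or Tesseract.
// width == 0 means no text pixels were found.
struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Returns the smallest box containing every pixel with value <= thresh,
// grown by `margin` pixels on each side and clamped to the image bounds.
// Assumes `gray` is a row-major, single-channel, 8-bit image.
Box inkBoundingBox(const std::vector<std::uint8_t>& gray, int width,
                   int height, std::uint8_t thresh, int margin) {
    assert(width > 0 && height > 0 && margin >= 0);
    assert(gray.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            if (gray[idx] <= thresh) {  // dark pixel, treated as text
                minX = std::min(minX, x);
                maxX = std::max(maxX, x);
                minY = std::min(minY, y);
                maxY = std::max(maxY, y);
            }
        }
    }

    Box box = {0, 0, 0, 0};
    if (maxX < 0) {
        return box;  // blank image: nothing darker than the threshold
    }

    // Pad the box, but keep it inside the image.
    minX = std::max(0, minX - margin);
    minY = std::max(0, minY - margin);
    maxX = std::min(width - 1, maxX + margin);
    maxY = std::min(height - 1, maxY + margin);

    box.x = minX;
    box.y = minY;
    box.width = maxX - minX + 1;
    box.height = maxY - minY + 1;
    return box;
}
```

The resulting values could then be passed to an OpenCV Rect(x, y, width, 
height) to crop before OCR. A single stray dark pixel will stretch the box, 
so a real image may need a blur or a minimum-count filter first.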

The solution I came up with, using OpenCV and Tesseract, is the following:
  
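    // Needs <string>, <opencv2/opencv.hpp> and <tesseract/baseapi.h>
    Mat img;  // should already hold the captured image
    Mat cropped;
    Mat grayed;
    Mat inverted;

    // Crop the original image to the defined ROI (x, y, width, height)
    Rect roi(xStart, yStart, xMove, yMove);
    cropped = img(roi);

    // Convert the image to grayscale (one channel)
    cvtColor(cropped, grayed, COLOR_BGR2GRAY);

    // Binarize and invert: pixels above 100 become black, the rest white
    threshold(grayed, inverted, 100, 255, THRESH_BINARY_INV);

    // Use Tesseract to OCR (Init returns 0 on success; check it in real code)
    tesseract::TessBaseAPI *ocr = new tesseract::TessBaseAPI();
    ocr->Init(NULL, "eng", tesseract::OEM_LSTM_ONLY);

    ocr->SetPageSegMode(tesseract::PSM_SINGLE_WORD);

    // 4th argument: bytes per pixel (1 for grayscale); 5th: bytes per line
    ocr->SetImage(inverted.data, inverted.cols, inverted.rows, 1,
                  static_cast<int>(inverted.step));

    // GetUTF8Text returns a buffer that the caller must free with delete []
    char *text = ocr->GetUTF8Text();
    popupNum = text ? string(text) : string();
    delete [] text;

    // Release Tesseract's resources
    ocr->End();
    delete ocr;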


NOTE: Be careful with the 4th parameter of the ocr->SetImage function.  
It is the number of bytes per pixel (not bits).  After converting to 
grayscale it is 1, not 3.  I forgot about this and was getting 3 strings 
back, which was quite strange.
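
SetImage takes the bytes per pixel as its 4th argument and the bytes per 
line as its 5th, so a mismatch between the two can be caught before the 
call. Below is a minimal sketch of such a check in plain C++; the function 
name layoutIsConsistent is hypothetical and not part of Tesseract or 
OpenCV. For an 8-bit cv::Mat, passing mat.channels() instead of a 
hard-coded 1 or 3 is another way to avoid the mistake.

```cpp
#include <cstddef>

// Hypothetical sanity check, not a Tesseract or OpenCV function.
// A row of `width` pixels at `bytesPerPixel` bytes each must fit inside
// one line of `bytesPerLine` bytes (the line may be longer due to padding).
bool layoutIsConsistent(int width, int bytesPerPixel,
                        std::size_t bytesPerLine) {
    if (width <= 0 || bytesPerPixel <= 0) {
        return false;
    }
    const std::size_t needed =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(bytesPerPixel);
    return needed <= bytesPerLine;
}
```

For a 120-pixel-wide grayscale image with 120 bytes per line, 
layoutIsConsistent(120, 1, 120) is true, while claiming 3 bytes per pixel 
for the same buffer, layoutIsConsistent(120, 3, 120), is false.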




On Thursday, February 24, 2022 at 11:02:27 PM UTC-7 Ed Dow wrote:

> Greetings,
>
> I'm using tesseract 4.0.0 in a C/C++ application where I capture an image 
> and then "scrape" text/data from it.  I am having issues with tesseract 
> recognizing the ROI with just several characters ( see attached).  
>
> The attached image is:  *014*
> Recognized as:  */~—6h014 5*
>
> If I get rid of extra space around the number it gets better but the 
> problem is sometimes the string of characters is outside the ROI so I have 
> to increase the size to get all of them.
>
> I've tried using OpenCV to grayscale, blur and resize which has seemed to 
> help a little.  I've also tried all the PSM modes.
>
> The other thing that is puzzling is that from the command line it works 
> great.  Maybe this is due to the image being saved as a jpg first before 
> the OCR is done.  Inside the application it's raw data.
>
> Any thoughts?
> Ed
>
>
> Tesseract Version:
>
> tesseract 4.0.0-beta.1
>  leptonica-1.75.3 
>   libgif 5.1.4 : 
>   libjpeg 8d (libjpeg-turbo 1.5.2) :
>    libpng 1.6.34 :
>   libtiff 4.0.9 :
>   zlib 1.2.11 :
>   libwebp 0.6.1 :
>   libopenjp2 2.3.0
>
