Yes, there is a problem with SetRectangle, or at least a mismatch between
it and other API functions (e.g. GetThresholdedImage).
The attached simple code demonstrates it.

According to the API [1], the signature is SetRectangle(left, *top*, width,
height), so e.g. SetRectangle(left, top, width, height * .3) should OCR the
first (top) 30% of the image. Indeed, GetThresholdedImage returns that region
correctly. But GetUTF8Text() OCRs the *last* 30% of the image, i.e. it acts
like SetRectangle(left, *bottom*, width, height).
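
To make the two readings concrete, here is a minimal sketch that computes
which pixel rows each one would cover. RowRange, RowsTopOrigin and
RowsBottomOrigin are hypothetical helper names, not part of the tesseract
API, and the bottom-origin variant only models the symptom described above;
it is not a claim about what tesseract does internally.

```cpp
#include <cassert>

// Half-open range of pixel rows, counted from the top edge of the image.
struct RowRange {
  int first;  // first row included
  int end;    // one past the last row included
};

// Documented reading: `top` is measured from the top edge of the image.
RowRange RowsTopOrigin(int top, int rect_height) {
  return {top, top + rect_height};
}

// Hypothetical bottom-origin reading: the same numbers, but with the
// offset measured from the bottom edge instead.
RowRange RowsBottomOrigin(int image_height, int top, int rect_height) {
  assert(top >= 0 && rect_height >= 0 && top + rect_height <= image_height);
  return {image_height - top - rect_height, image_height - top};
}
```

For a 1000-row image and a rectangle with top = 0 and height = 300, the first
reading covers rows 0 to 299 and the second covers rows 700 to 999, which
matches the "last 30%" output.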

IMO a safer solution is to crop the image first and pass the cropped image
to SetImage.
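
As a rough sketch of the cropping approach, the helper below computes the
box for the top part of an image. CropBox and TopFractionBox are hypothetical
names introduced only for illustration; the sketch assumes the origin is the
top-left corner and that the fraction is given as a value between 0 and 1.

```cpp
#include <algorithm>
#include <cassert>

// Crop box in pixels, with the origin at the top-left corner of the image.
struct CropBox {
  int left;
  int top;
  int width;
  int height;
};

// Box covering the top `fraction` of a width x height image.
// The fraction is clamped to [0, 1] and the box is at least one row tall.
CropBox TopFractionBox(int width, int height, double fraction) {
  assert(width > 0 && height > 0);
  double f = std::min(1.0, std::max(0.0, fraction));
  int box_height = std::max(1, static_cast<int>(height * f));
  return {0, 0, width, box_height};
}
```

The resulting values could then be used to crop the Pix before calling
SetImage, for example with Leptonica's boxCreate and pixClipRectangle, so
that no SetRectangle call is needed at all.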

[1]
https://github.com/tesseract-ocr/tesseract/blob/0768e4ff4c21aaf0b9beb297e6bb79ad8cb301b0/include/tesseract/baseapi.h#L340


Zdenko


On Tue, 1 Aug 2023 at 20:40, CraigLandrum <cra...@mindwrap.com> wrote:

> We use tesseract in our document imaging app - first started with version
> 2.x and recently upgraded from 3.05 to 5.3.1, and something broke.  We
> supply images to tesseract using SetImage and then SetRectangle.  In one of
> our apps, we often OCR the top third of invoices to gather info on a
> vendor.  This worked fine in 3.05 but not in 5.3.1.  If I specify the full
> image dimensions in SetRectangle (as provided to SetImage), all works fine,
> but if I specify dimensions in SetRectangle to just do the top third of the
> image, I get total garbage back. We are providing one-bit B&W images to
> SetImage (white = 1) and specify the target area in pixels. Something
> changed between 3.05 and 5.3.1 to make this not work.  Is there something I
> missed in the interim?  Perhaps SetRectangle(x,y,w,h) wants dimensions that
> start on 8-bit bounds or something equally restrictive?  Any suggestions
> welcome.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/3959f739-c152-4526-93bc-3ea63b9e088an%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/3959f739-c152-4526-93bc-3ea63b9e088an%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

/*
invoice.png -> 
https://images.ctfassets.net/lzny33ho1g45/5HzGPfsoZo3g7klt0Aww6X/89adc1672b7872667eb5f781adeccfac/fcb74faee4c0576ceaacf82777f6bc93__1_.png?w=1400
*/

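#include <cstdio>

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

int main() {
  tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
  // Init returns 0 on success.
  if (api->Init(nullptr, "eng") != 0) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    delete api;
    return 1;
  }

  Pix *image = pixRead("invoice.png");
  if (image == nullptr) {
    fprintf(stderr, "Could not read invoice.png.\n");
    api->End();
    delete api;
    return 1;
  }
  api->SetImage(image);
  int w = pixGetWidth(image);
  int h = pixGetHeight(image);
  // Height of the top 30% of the image, in pixels.
  int h_adj = static_cast<int>(h * .3);

  // Case 1: restrict recognition to the top 30% with SetRectangle.
  api->SetRectangle(0, 0, w, h_adj);
  char *outTextSR = api->GetUTF8Text();
  printf("********\tOCR output after SetRectangle:\n%s", outTextSR);
  // Save the thresholded image of the rectangle for visual inspection.
  Pix *rect_pix = api->GetThresholdedImage();
  pixWrite("ocred_pix.png", rect_pix, IFF_PNG);

  // Case 2: pass the cropped image to SetImage and OCR it as a whole.
  api->SetImage(rect_pix);
  char *outTextSI = api->GetUTF8Text();
  printf("\n********\tOCR output SetImage:\n%s", outTextSI);

  // Release the engine, both images, and the text buffers.
  api->End();
  pixDestroy(&image);
  pixDestroy(&rect_pix);
  delete[] outTextSR;
  delete[] outTextSI;
  delete api;
  return 0;
}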
