Yes, there is a problem with SetRectangle, or a mismatch between it and other API functions (e.g. GetThresholdedImage). It can be demonstrated with the attached simple code.

According to the API [1], the signature is SetRectangle(left, *top*, width, height), so e.g. SetRectangle(left, top, width, height * .3) should OCR the first (top) 30% of the image. Indeed, GetThresholdedImage returns that region correctly. But GetUTF8Text() OCRed the "last" 30% of the image, i.e. it acts like SetRectangle(left, *bottom*, width, height).

IMO a safer solution is to use the cropped image for SetImage.

[1] https://github.com/tesseract-ocr/tesseract/blob/0768e4ff4c21aaf0b9beb297e6bb79ad8cb301b0/include/tesseract/baseapi.h#L340

Zdenko

On Tue, Aug 1, 2023 at 20:40, CraigLandrum <cra...@mindwrap.com> wrote:

> We use tesseract in our document imaging app - first started with version
> 2.x and recently upgraded from 3.05 to 5.3.1, and something broke. We
> supply images to tesseract using SetImage and then SetRectangle. In one of
> our apps, we often OCR the top third of invoices to gather info on a
> vendor. This worked fine in 3.05 but not in 5.3.1. If I specify the full
> image dimensions in SetRectangle (as provided to SetImage), all works fine,
> but if I specify dimensions in SetRectangle to just do the top third of the
> image, I get total garbage back. We are providing one-bit B&W images to
> SetImage (white = 1) and specify the target area in pixels. Something
> changed between 3.05 and 5.3.1 to make this not work. Is there something I
> missed in the interim? Perhaps SetRectangle(x,y,w,h) wants dimensions that
> start on 8-bit bounds or something equally restrictive? Any suggestions
> welcome.
/* invoice.png -> https://images.ctfassets.net/lzny33ho1g45/5HzGPfsoZo3g7klt0Aww6X/89adc1672b7872667eb5f781adeccfac/fcb74faee4c0576ceaacf82777f6bc93__1_.png?w=1400 */

#include <cstdio>

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

int main() {
  tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
  if (api->Init(NULL, "eng") != 0) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    delete api;
    return 1;
  }

  Pix *image = pixRead("invoice.png");
  if (image == NULL) {
    fprintf(stderr, "Could not read invoice.png.\n");
    api->End();
    delete api;
    return 1;
  }
  api->SetImage(image);

  int w = pixGetWidth(image);
  int h = pixGetHeight(image);
  // int h_adj = h - h * .7;
  int h_adj = h * .3;  // height of the top 30% of the image

  // Case 1: restrict recognition to the top 30% with SetRectangle.
  api->SetRectangle(0, 0, w, h_adj);
  char *outTextSR = api->GetUTF8Text();
  printf("********\tOCR output after SetRectangle:\n%s", outTextSR);

  // Save the thresholded image for the rectangle to inspect which region was used.
  Pix *rect_pix = api->GetThresholdedImage();
  pixWrite("ocred_pix.png", rect_pix, IFF_PNG);

  // Case 2: pass that cropped image to SetImage and OCR all of it.
  api->SetImage(rect_pix);
  char *outTextSI = api->GetUTF8Text();
  printf("\n********\tOCR output SetImage:\n%s", outTextSI);

  // Clean up.
  api->End();
  pixDestroy(&image);
  pixDestroy(&rect_pix);
  delete[] outTextSR;
  delete[] outTextSI;
  delete api;
  return 0;
}
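
The workaround suggested in the reply (give SetImage an image that is already cropped, rather than relying on SetRectangle) could be sketched roughly as below. This is only an illustration under stated assumptions: `Rect`, `TopFraction` and `CropGray8` are hypothetical helper names, not part of Tesseract or Leptonica, and the buffer is assumed to be tightly packed 8-bit grayscale (one byte per pixel, no row padding), not the one-bit images mentioned in the original question. The origin is assumed to be the top-left corner, as the SetRectangle documentation describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a pixel rectangle with its origin at the top-left.
struct Rect {
  int left, top, width, height;
};

// Rectangle covering the top `fraction` (0 < fraction <= 1) of the image.
Rect TopFraction(int width, int height, double fraction) {
  assert(width > 0 && height > 0 && fraction > 0.0 && fraction <= 1.0);
  int h = static_cast<int>(height * fraction);
  if (h < 1) h = 1;  // always keep at least one row
  return Rect{0, 0, width, h};
}

// Copy the pixels inside `r` out of a tightly packed 8-bit grayscale buffer
// (row-major, one byte per pixel, bytes per line == width).
std::vector<std::uint8_t> CropGray8(const std::vector<std::uint8_t>& src,
                                    int width, int height, const Rect& r) {
  assert(src.size() == static_cast<std::size_t>(width) * height);
  assert(r.left >= 0 && r.top >= 0 && r.width > 0 && r.height > 0);
  assert(r.left + r.width <= width && r.top + r.height <= height);

  std::vector<std::uint8_t> dst;
  dst.reserve(static_cast<std::size_t>(r.width) * r.height);
  for (int y = r.top; y < r.top + r.height; ++y) {
    // Start of the requested span within source row y.
    const std::uint8_t* row =
        src.data() + static_cast<std::size_t>(y) * width + r.left;
    dst.insert(dst.end(), row, row + r.width);
  }
  return dst;
}
```

The cropped buffer could then be passed to the raw-buffer overload of SetImage, e.g. `api->SetImage(crop.data(), r.width, r.height, 1, r.width)`, and recognized without calling SetRectangle at all. When the image is already a Leptonica Pix, `pixClipRectangle` with a Box created by `boxCreate` should do the same cropping without a hand-written copy.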