[image: image.png]

Zdenko

On Thu, 14 Sep 2023 at 17:10, Gilad Pellaeon <ld.pella...@gmail.com> wrote:

> Is it still broken in version 5? The thread you posted is from 2017!
>
> One thing I noticed in the meantime: I saved my PNGs with paint.net using
> automatic bit depth detection. The image is grayscale, so it was not
> stored with 32-bit color depth.
> Now I forced paint.net to save the image with 32-bit color depth, and it
> works. So I assume it's a bug in the bit depth handling. This would also
> explain the memory access violation even though the rectangle does not
> exceed the image dimensions: if the library internally assumes a larger
> depth, it also tries to process larger memory chunks.
>
> Is this problem known? Otherwise I can create a bug report.
>
> Best regards
>
> On Thursday, 14 September 2023 at 16:52:58 UTC+2, zdenop wrote:
>
>> https://github.com/tesseract-ocr/tesseract/issues/845
>>
>> Zdenko
>>
>> On Thu, 14 Sep 2023 at 16:49, Gilad Pellaeon <ld.pe...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am new to Tesseract. I searched for an OCR library, found Tesseract
>>> and now I want to use it for a specific measurement protocol.
>>>
>>> I built Tesseract 5.3.2 and its dependencies leptonica-1.83, libpng and
>>> OpenJPEG from source with the latest Visual C++ compiler for
>>> Windows, x64.
>>>
>>> Then I ran some first tests based on the examples from the documentation
>>> (*Basic_example* and *SetRectangle_example*). As language data I use
>>> *eng.traineddata* from the *testdata_best* repo.
>>>
>>> Now I see a behaviour which I can't classify. I tried to recognize a
>>> float value in a given rectangle (with *SetRectangle*). Tesseract
>>> didn't recognize it (empty return). Then I manually copied the rectangle
>>> and saved it in a new file (see attached Single_Number.png). Then I tried
>>> this file without the *SetRectangle* call. Now it works.
>>>
>>> The attached *Protocol_table.png* is the original image, but I removed
>>> everything else from the picture, so it is empty except for the number
>>> at its original position. Now I see the following behaviour: in DEBUG
>>> mode the recognition works, in RELEASE mode it does not.
>>>
>>> I also tried to slightly enlarge the rectangle area (see last
>>> SetRectangle call in the code below). But then I got a runtime exception.
>>> The resolution of the picture is 2625x1682, so there should be no buffer
>>> overflow?!
>>>
>>> Am I doing something wrong here? Or what causes this behaviour?
>>>
>>> This is my basic code:
>>>
>>> // std includes
>>> #include <cstdio>
>>> #include <iostream>
>>>
>>> // tesseract includes
>>> #include "tesseract/baseapi.h"
>>>
>>> // Leptonica includes
>>> #include "allheaders.h"
>>>
>>> int main()
>>> {
>>>     tesseract::TessBaseAPI api;
>>>     // Initialize tesseract-ocr with English, without specifying
>>>     // the tessdata path
>>>     if (api.Init(nullptr, "eng"))
>>>     {
>>>         std::cout << "Could not initialize tesseract." << std::endl;
>>>         return 1;
>>>     }
>>>
>>>     // Load the input image with Leptonica
>>>     Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Protocol_Table.png");
>>>     //Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Single_Number.png");
>>>     if (!image)
>>>     {
>>>         std::cout << "Could not read the image." << std::endl;
>>>         return 1;
>>>     }
>>>     api.SetImage(image);
>>>     // Restrict recognition to a sub-rectangle of the image
>>>     // SetRectangle(left, top, width, height)
>>>     api.SetRectangle(807, 1393, 93, 49);
>>>     //api.SetRectangle(707, 1293, 193, 149);
>>>     // Get OCR result
>>>     char* outText = api.GetUTF8Text();
>>>     if (outText)
>>>         printf("OCR output:\n%s", outText);
>>>
>>>     // Destroy used objects and release memory
>>>     api.End();
>>>     delete[] outText;
>>>     pixDestroy(&image);
>>>     return 0;
>>> }
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/626e06c6-ea15-45d2-86da-1bba6c069e1cn%40googlegroups.com
>>> .
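
The bounds question and the bit depth hypothesis raised in the thread can both be checked with simple arithmetic. The following is a minimal sketch in plain C++ and is not taken from Tesseract or Leptonica: `Rect`, `fitsInside` and `bytesPerRow` are hypothetical names, and the row layout (each row padded to whole 32-bit words) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, not part of the Tesseract or Leptonica API.
struct Rect {
    int left;
    int top;
    int width;
    int height;
};

// Returns true if the rectangle lies completely inside an image
// of the given pixel dimensions.
bool fitsInside(const Rect& r, int imageWidth, int imageHeight) {
    assert(imageWidth > 0 && imageHeight > 0);
    return r.left >= 0 && r.top >= 0 && r.width > 0 && r.height > 0 &&
           r.left + r.width <= imageWidth &&
           r.top + r.height <= imageHeight;
}

// Returns the number of bytes in one image row, assuming (for this sketch)
// that every row is padded to a whole number of 32-bit words.
std::int64_t bytesPerRow(int imageWidth, int bitsPerPixel) {
    assert(imageWidth > 0 && bitsPerPixel > 0);
    const std::int64_t bits =
        static_cast<std::int64_t>(imageWidth) * bitsPerPixel;
    return 4 * ((bits + 31) / 32);
}
```

Under these assumptions, both rectangles from the post, (807, 1393, 93, 49) and (707, 1293, 193, 149), fit inside the 2625x1682 image, so the rectangle itself is not out of range. A row takes 2628 bytes at 8 bits per pixel but 10500 bytes at 32 bits per pixel, so code that wrongly assumed the larger depth for 8-bit data could read past the end of the buffer, which would be consistent with the reported access violation.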