>
> Is it still broken in version 5? The thread you posted is from 2017!

[image: image.png]


Zdenko


On Thu, 14 Sep 2023 at 17:10, Gilad Pellaeon <ld.pella...@gmail.com> wrote:

> Is it still broken in version 5? The thread you posted is from 2017!
>
> One thing I noticed in the meantime: I stored my PNGs with paint.net using
> automatic bit depth detection. The image is grayscale, so it wasn't
> stored in 32-bit.
> Now I forced the image to be saved with 32-bit color depth, and it works.
> So I assume it's a bug in the bit depth handling. This would also explain
> the memory access violation, although the pixel range doesn't exceed the
> image dimensions. If the system internally assumes a larger depth, then it
> also tries to process larger memory chunks.
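
The arithmetic behind this hypothesis can be sketched as follows. This is only an illustration under stated assumptions: the stored image is assumed to be 8 bits per pixel (the post only says "grayscale"), rows are padded to whole 32-bit words as Leptonica does for a Pix, and `row_bytes` and `last_byte_end` are hypothetical helpers, not Tesseract or Leptonica functions. It does not show what Tesseract actually does internally.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per image row when every row is padded to whole 32-bit words.
constexpr std::int64_t row_bytes(std::int64_t width, int depth_bits) {
    assert(width > 0 && depth_bits > 0);
    return 4 * ((width * depth_bits + 31) / 32);
}

// Hypothetical helper: offset one past the last byte touched when the
// bottom-right pixel of the rectangle is addressed at the given depth.
constexpr std::int64_t last_byte_end(int left, int top, int width, int height,
                                     int img_w, int depth_bits) {
    return (top + height - 1) * row_bytes(img_w, depth_bits) +
           ((left + width) * depth_bits + 7) / 8;
}

// Size of the pixel buffer of a 2625x1682 image stored at 8 bpp (assumed).
constexpr std::int64_t kBuffer8 = row_bytes(2625, 8) * 1682;

// Addressed at the real depth, the enlarged rectangle stays inside the buffer.
static_assert(last_byte_end(707, 1293, 193, 149, 2625, 8) <= kBuffer8,
              "in range at 8 bpp");
// Addressed as if the data were 32 bpp, the same rectangle ends far past it.
static_assert(last_byte_end(707, 1293, 193, 149, 2625, 32) > kBuffer8,
              "out of range if 32 bpp is assumed");
```

Both rectangles from the original post share the same bottom-right corner (900, 1442), so this arithmetic alone would not explain why only the enlarged one raised an exception.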
>
> Is this problem known? Otherwise I can create a bug report.
>
>
> Best regards
>
> zdenop wrote on Thursday, 14 September 2023 at 16:52:58 UTC+2:
>
>>
>> https://github.com/tesseract-ocr/tesseract/issues/845
>>
>> Zdenko
>>
>>
>> On Thu, 14 Sep 2023 at 16:49, Gilad Pellaeon <ld.pe...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am new to Tesseract. I searched for an OCR library, found Tesseract
>>> and now I want to use it for a specific measure protocol.
>>>
>>> I built Tesseract 5.3.2 from source, together with the dependencies
>>> leptonica-1.83, libpng and OpenJPEG, using the latest Visual C++ compiler
>>> for Windows, x64.
>>>
>>> Then I did some first tests based on the examples from the documentation
>>> (*Basic_example* and *SetRectangle_example*). As data set I use
>>> *eng.traineddata* from the *tessdata_best* repo.
>>>
>>> Now I see behaviour that I can't explain. I tried to recognize a
>>> float value in a given rectangle (with *SetRectangle*). Tesseract
>>> didn't convert it (empty return). Then I manually copied the rectangle
>>> and saved it in a new file (see attached Single_Number.png). Then I tried
>>> this file without the *SetRectangle* call. Now it works.
>>>
>>> The attached *Protocol_table.png* is the original image, but I removed
>>> everything else in the picture. So it's empty except for the number at
>>> its original position. Now I have the following behaviour: in DEBUG mode
>>> the conversion works, in RELEASE mode it does not.
>>>
>>> I also tried to slightly enlarge the rectangle area (see the last
>>> SetRectangle call in the code below). But now I get a runtime exception.
>>> The resolution of the picture is 2625x1682. So there should be no buffer
>>> overflow?!
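
The pixel-coordinate part of that reasoning can be checked with a few lines. This is a minimal sketch; `rect_inside` is a hypothetical helper introduced here for illustration and is not part of the Tesseract or Leptonica API.

```cpp
// Hypothetical helper: true if the rectangle (left, top, width, height)
// lies completely inside an image of img_w x img_h pixels.
constexpr bool rect_inside(int left, int top, int width, int height,
                           int img_w, int img_h) {
    return left >= 0 && top >= 0 && width > 0 && height > 0 &&
           left + width <= img_w && top + height <= img_h;
}

// Both rectangles passed to SetRectangle fit into the 2625x1682 image.
static_assert(rect_inside(807, 1393, 93, 49, 2625, 1682),
              "original rectangle is inside the image");
static_assert(rect_inside(707, 1293, 193, 149, 2625, 1682),
              "enlarged rectangle is inside the image");
```

So in terms of pixel coordinates neither call is out of range; the check says nothing about how many bytes per pixel are read.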
>>>
>>> Am I doing something wrong here? Or what is causing this
>>> behaviour?
>>>
>>> This is my basic code:
>>>
>>> // std includes
>>> #include <cstdio>
>>> #include <iostream>
>>>
>>> // tesseract includes
>>> #include "tesseract/baseapi.h"
>>>
>>> // Leptonica includes
>>> #include "allheaders.h"
>>>
>>> int main()
>>> {
>>>     tesseract::TessBaseAPI api;
>>>     // Initialize tesseract-ocr with English, without specifying tessdata path
>>>     if (api.Init(nullptr, "eng"))
>>>     {
>>>         std::cout << "Could not initialize tesseract." << std::endl;
>>>         return 1;
>>>     }
>>>
>>>     // Load the input image; pixRead returns nullptr on failure
>>>     Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Protocol_Table.png");
>>>     //Pix* image = pixRead("D:/projects/cpp/Tesseract-Test/Single_Number.png");
>>>     if (!image)
>>>     {
>>>         std::cout << "Could not read image." << std::endl;
>>>         return 1;
>>>     }
>>>     api.SetImage(image);
>>>     // Restrict recognition to a sub-rectangle of the image
>>>     // SetRectangle(left, top, width, height)
>>>     api.SetRectangle(807, 1393, 93, 49);
>>>     //api.SetRectangle(707, 1293, 193, 149);
>>>     // Get OCR result
>>>     char* outText = api.GetUTF8Text();
>>>     if (outText)
>>>         printf("OCR output:\n%s", outText);
>>>
>>>     // Destroy used objects and release memory
>>>     delete[] outText;
>>>     api.End();
>>>     pixDestroy(&image);
>>>     return 0;
>>> }
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/626e06c6-ea15-45d2-86da-1bba6c069e1cn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/626e06c6-ea15-45d2-86da-1bba6c069e1cn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
