I said that the problem was in AdaptiveClassifierIsEmpty because Windows 
dumped the state of the process when the read-access violation occurred, 
and AdaptiveClassifierIsEmpty was the currently-executing function at the 
top of the call stack.  This was deep within a call to the public function 
Recognize.

I have since found these problems:

1. During the Init call, if the eng.traineddata file is not found, 
init_tesseract_lang_data calls tprintf with an error message and then 
returns -1.  At this point, AdaptedTemplates is still a null pointer 
because its object is allocated later.  I believe that this is not a bug.
2. We were not getting the error message generated by tprintf.  We have our 
own version of tprintf that sprintf's to a string (which is sent to a 
searchable logging system) rather than fprintf's to stderr.  But our own 
version wasn't getting linked in so the error message was lost.  I believe 
that this is a bug on our side.
3. We were calling Init but not checking the return value.  It appears to 
have returned -1, which we ignored, and we called Recognize anyway.  I 
believe that this is a bug on our side.
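
For anyone hitting the same crash, the fix for item 3 looks roughly like the 
sketch below.  It is not our actual code.  FakeOcrEngine is a hypothetical 
stand-in so the example compiles without Tesseract, and InitAndRecognize is 
a helper name invented for illustration.  The only assumption about the real 
API is that TessBaseAPI::Init returns 0 on success and -1 on failure.

```cpp
#include <cstdio>

// Hypothetical stand-in for tesseract::TessBaseAPI so that this sketch
// compiles on its own.  It mimics only the two calls discussed above.
struct FakeOcrEngine {
  explicit FakeOcrEngine(bool available) : data_available(available) {}

  // Mimics Init: returns 0 on success, -1 if the traineddata is missing.
  int Init(const char* /*datapath*/, const char* /*language*/) {
    if (!data_available) return -1;
    initialized = true;
    return 0;
  }

  // Mimics Recognize: returns 0 on success.  The stand-in fails cleanly
  // when uninitialized; the real engine crashed for us in that state.
  int Recognize(void* /*monitor*/) { return initialized ? 0 : -1; }

  bool data_available;
  bool initialized = false;
};

// Calls Recognize only if Init succeeded.  Returns true when both succeed.
template <typename Engine>
bool InitAndRecognize(Engine& engine, const char* datapath,
                      const char* language) {
  if (engine.Init(datapath, language) != 0) {
    std::fprintf(stderr, "Init failed for language '%s'; skipping Recognize\n",
                 language);
    return false;
  }
  return engine.Recognize(nullptr) == 0;
}
```

The helper is a template only so that the stand-in can be swapped for the 
real engine type.  The important part is the early return when Init fails.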

I have resolved the bugs in 2 and 3 and things seem to be normal again.
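
For item 2, the idea behind our replacement is a printf-style function that 
formats into a string and hands it to a logger instead of writing to stderr.  
A minimal sketch follows.  LogPrintf and LogSink are hypothetical names, the 
vector stands in for our logging system, and this does not show how to get 
such a replacement linked in place of tprintf.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sink standing in for a searchable logging system.
std::vector<std::string>& LogSink() {
  static std::vector<std::string> messages;
  return messages;
}

// printf-style logger that formats into a std::string and stores it,
// rather than writing to stderr.
void LogPrintf(const char* format, ...) {
  va_list args;
  va_start(args, format);

  // First pass: measure the formatted length using a copy of the arguments.
  va_list args_copy;
  va_copy(args_copy, args);
  const int needed = std::vsnprintf(nullptr, 0, format, args_copy);
  va_end(args_copy);

  if (needed < 0) {  // Formatting error: drop the message.
    va_end(args);
    return;
  }

  // Second pass: format into a buffer with room for the terminating NUL.
  std::string message(static_cast<std::size_t>(needed) + 1, '\0');
  std::vsnprintf(&message[0], message.size(), format, args);
  va_end(args);

  message.resize(static_cast<std::size_t>(needed));  // Drop the NUL.
  LogSink().push_back(message);
}
```

With something like this in place, a call such as 
LogPrintf("Error opening data file %s\n", path) ends up in the log store 
rather than on stderr.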

Thank you for your time.
On Thursday, September 22, 2022 at 2:15:53 PM UTC-4 zdenop wrote:

> Tesseract 4.x is an old and unsupported version.
>
> So it would be nice if you could provide an example code with the public 
> API that causes the read-access violation problem.
> The function AdaptiveClassifierIsEmpty is not part of the public API (
> https://github.com/tesseract-ocr/tesseract/tree/main/include/tesseract). 
>
>
> Zdenko
>
>
> On Thu, Sep 22, 2022 at 8:42, Darren Morby <darren...@gmail.com> wrote:
>
>> This is in Tesseract 4.01.
>>
>> I get a read-access violation in this function in classify.h:
>>
>>   bool AdaptiveClassifierIsEmpty() const {
>>     return AdaptedTemplates->NumPermClasses == 0;
>>   }
>>
>> This function does not check whether AdaptedTemplates is nullptr.  It is 
>> being called by Tesseract::recog_all_words, which in turn 
>> is being called by TessBaseAPI::Recognize.  Is there a function that I 
>> should call to make sure that the Tesseract object is being initialized 
>> correctly?
>>
>> I notice that only two functions actually make sure that AdaptedTemplates 
>> is not nullptr: InitForAnalysePage and FindLines().  Should I be calling 
>> one of these functions before Recognize?  (InitForAnalysePage curiously 
>> says that "Calls that attempt recognition will generate an error" but I 
>> don't see why.)
>>
>> Thanks.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/19c511d3-0544-469c-add3-a9ecea3efb68n%40googlegroups.com
>>
>
