I said that the problem was in AdaptiveClassifierIsEmpty because Windows dumped the state of the process when the read-access violation occurred, and AdaptiveClassifierIsEmpty was the currently-executing function at the top of the call stack. This was deep within a call to the public function Recognize.
I have since found these problems:

1. During the Init call, if the eng.traineddata file is not found, then init_tesseract_lang_data called tprintf with an error message and then returned -1. At this point, AdaptedTemplates is still a null pointer because its object is allocated later. I believe that this is not a bug.

2. We were not getting the error message generated by tprintf. We have our own version of tprintf that sprintf's to a string (which is sent to a searchable logging system) rather than fprintf's to stderr. But our own version wasn't getting linked in, so the error message was lost. I believe that this is a bug on our side.

3. We were calling Init but not checking the return value. It appears to have returned -1, and we ignored it and called Recognize anyway. I believe that this is a bug on our side.

I have resolved the bugs in 2 and 3 and things seem to be normal again.

Thank you for your time.

On Thursday, September 22, 2022 at 2:15:53 PM UTC-4 zdenop wrote:

> Tesseract 4.x is an old and unsupported version.
>
> So it would be nice if you could provide an example code with the public
> API that causes the read-access violation problem.
> function AdaptiveClassifierIsEmpty is not part of the public API
> (https://github.com/tesseract-ocr/tesseract/tree/main/include/tesseract).
>
> Zdenko
>
> On Thu, Sep 22, 2022 at 8:42, Darren Morby <darren...@gmail.com> wrote:
>
>> This is in Tesseract 4.01.
>>
>> I get a read-access violation in this function in classify.h:
>>
>> bool AdaptiveClassifierIsEmpty() const {
>>   return AdaptedTemplates->NumPermClasses == 0;
>> }
>>
>> This function does not check that AdaptedTemplates is nullptr or not
>> nullptr. It is being called by Tesseract::recog_all_words, which in turn
>> is being called by TessBaseAPI::Recognize. Is there a function that I
>> should call to make sure that the Tesseract object is being initialized
>> correctly?
>>
>> I notice that only two functions actually make sure that AdaptedTemplates
>> is not nullptr: InitForAnalysePage and FindLines(). Should I be calling
>> one of these functions before Recognize? (InitForAnalysePage curiously
>> says that "Calls that attempt recognition will generate an error" but I
>> don't see why.)
>>
>> Thanks.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/19c511d3-0544-469c-add3-a9ecea3efb68n%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2af2e1ee-b7e5-4086-aa20-9b621fbc4637n%40googlegroups.com.
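
P.S. For anyone who finds this thread later, here is a minimal sketch of the check from item 3: only call Recognize when Init has returned 0. This is not our production code. FakeEngine and InitThenRecognize are hypothetical names invented for illustration; FakeEngine is a stand-in so the snippet compiles without Tesseract installed, and it only mimics the behaviour described above (a failed Init returns -1 and leaves internal state unallocated).

```cpp
#include <cstdio>

// Hypothetical stand-in for the OCR engine (NOT part of Tesseract).
// It mimics the failure mode from this thread: when the language data
// is missing, Init returns -1 and the engine stays uninitialized.
struct FakeEngine {
  bool data_available;   // pretend "eng.traineddata was found"
  bool initialized;

  explicit FakeEngine(bool available)
      : data_available(available), initialized(false) {}

  int Init(const char* /*datapath*/, const char* /*language*/) {
    if (!data_available) return -1;
    initialized = true;
    return 0;
  }

  // The real engine dereferenced a null pointer at this point; the
  // stand-in just reports failure instead of crashing.
  int Recognize(void* /*monitor*/) { return initialized ? 0 : -1; }
};

// Hypothetical helper: call Recognize only if Init succeeded.
// Returns true when both calls report success (return value 0).
template <typename Engine>
bool InitThenRecognize(Engine& engine, const char* datapath,
                       const char* language) {
  if (engine.Init(datapath, language) != 0) {
    std::fprintf(stderr, "Init failed for language '%s'; not calling Recognize\n",
                 language);
    return false;
  }
  return engine.Recognize(nullptr) == 0;
}
```

The helper is a template so the same pattern could, in principle, be applied to tesseract::TessBaseAPI, whose Init returns 0 on success and -1 on failure. I have not tested it that way, and a real call would also need an image set before Recognize.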