You should use the eng.traineddata file from the Tesseract "best" repository
for what you are trying to do:
https://github.com/tesseract-ocr/tessdata_best

That error suggests you are using the wrong eng.traineddata file (an integer
"fast" model rather than a float "best" model).
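
As a rough sketch of the above (not tested on your setup): judging by the
path in the error message, data/eng/Apex.lstm is extracted from the
START_MODEL file found in the TESSDATA directory, so that directory needs
the float model from tessdata_best. The helper names below are made up for
illustration, the directory is the one from your make command, and the
"main" branch in the download URL is an assumption you should check against
the repository.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the URL branch is an assumption.

# Directory passed to tesstrain as TESSDATA (taken from the command in this thread).
TESSDATA_DIR="../tesseract/tessdata"

# Print the download URL of the float ("best") model for a language code.
# Assumes tessdata_best serves raw files from a branch named "main".
best_model_url() {
    printf 'https://github.com/tesseract-ocr/tessdata_best/raw/main/%s.traineddata\n' "$1"
}

# Download the best model into TESSDATA_DIR, overwriting any fast/integer copy.
fetch_best_model() {
    mkdir -p "$TESSDATA_DIR" &&
    curl -fL -o "$TESSDATA_DIR/$1.traineddata" "$(best_model_url "$1")"
}

# Print the training command from this thread so it can be reviewed before running.
# Arguments: model name, start model, maximum iterations.
training_command() {
    printf 'make training MODEL_NAME=%s START_MODEL=%s TESSDATA=%s MAX_ITERATIONS=%s\n' \
        "$1" "$2" "$TESSDATA_DIR" "$3"
}
```

Typical use would be `fetch_best_model eng`, then deleting the stale
data/eng/Apex.lstm so that it is extracted again from the new file, then
running the command printed by `training_command Apex eng 100`.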

On Sunday, 23 March 2025 at 01:59:37 UTC+7, zdenop wrote:

> Hello,
>
> I notice there may be some gaps in your understanding of Tesseract and its 
> training requirements. Training Tesseract effectively requires careful 
> adherence to its documentation and established processes. Proceeding 
> without this foundation risks wasting both your time and ours. Anyway, I
> have put some notes below (inline, in blue).
>
> Kind regards,
>
> Zdenko
>
> On Fri, 21 Mar 2025 at 18:56, Mitya <mityah...@gmail.com> wrote:
>
>> <https://stackoverflow.com/posts/79526256/timeline>
>>
>> I’ve been following this tutorial from YouTube: Guide to Tesseract 
>> Training https://www.youtube.com/watch?v=KE4xEzFGSU8&t=13s and its 
>> corresponding GitHub repository: astutejoe/tesseract_tutorial. 
>> https://github.com/astutejoe/tesseract_tutorial
>>
>> The tutorial walks through the process of training a custom Tesseract 
>> model, but I've run into an issue when trying to continue training the model.
>>
> If the tutorial doesn't produce working results, you should contact its 
> author.
>
>> *What we tried*: Setup: I followed the steps in the tutorial to set up 
>> the environment, downloaded the necessary files, and began the training 
>> process using the base eng.traineddata model.
>>
>> *Training Command*: After preparing the training data and ground truth, 
>> I ran the following command to initiate the training:
>>     make training MODEL_NAME=Apex START_MODEL=eng TESSDATA=../tesseract/tessdata MAX_ITERATIONS=100
>>
>> *Model Generation*: This command successfully generated the Apex.lstm 
>> model file. However, I encountered an issue when trying to use the 
>> Apex.lstm file for further training.
>>
> What does the statement '*Model Generation*: This command successfully
> ...' mean? Which command did you run? What is the Apex.lstm model file?
> Tesseract uses traineddata files for models, correct?
>
>> *Error:* When attempting to continue training the model, 
>>
> Could you describe how you attempted to continue training the model? Also,
> can you specify which part of the Tesseract documentation (
> https://tesseract-ocr.github.io/tessdoc/tess5/TrainingTesseract-5.html)
> or which tesstrain step (https://github.com/tesseract-ocr/tesstrain) you
> were following?
>
>> I received the following error:
>> Error, data/eng/Apex.lstm is an integer (fast) model, cannot continue training
>>
>> *What we faced*: I have verified that the eng.traineddata file is
>> located correctly in /usr/share/tesseract-ocr/5/tessdata/ (path may differ
>> depending on installation). Despite following the tutorial and using the
>> correct paths for the eng.traineddata,
>>
> I am not sure what you are trying to communicate here, since you use
> `../tesseract/tessdata` for training, which seems to be a different location
> than `/usr/share/tesseract-ocr/5/tessdata/`.
>
>> I’m getting an error saying that the model is an "integer model" and that
>> training cannot continue. I tried downloading the latest eng.traineddata
>> from GitHub, but the error persists.
>>
> Try searching, e.g.:
> https://github.com/search?q=org%3Atesseract-ocr%20integer%20model&type=code
>
>> *Questions*: What does the "integer (fast) model" error mean, and how
>> can I resolve it? Is there something I missed in the training process that
>> would allow me to continue training Apex.lstm? Any advice or insights would
>> be greatly appreciated.
>>
>> *Environment*:
>> Tesseract version: 5.3.0
>> OS: Ubuntu 20.04 (MacBook Pro)
>> Tesseract Data Path: /usr/share/tesseract-ocr/5/tessdata/
>> Base Model: eng.traineddata
>> Makefile: https://github.com/tesseract-ocr/tesstrain/blob/43ff10012af31914bb5b72304d9c21c8fdf4f464/Makefile
>>
>> Thank you in advance for your help!
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/d09b45da-1e8a-4194-ad28-505857f0ad54n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/tesseract-ocr/d09b45da-1e8a-4194-ad28-505857f0ad54n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

