Hi,
I am trying to fine-tune eng.traineddata on my own images. I have tried to
train several times, but every time I get stuck somewhere. Can you tell me how
to proceed?
Current steps:
Step 1: make box files
%%bash
for file in *.jpg; do
  echo "$file"
  # Strip the .jpg extension to get the output base name
  base=$(basename "$file" .jpg)
  tesseract "$file" "$base" lstmbox
done
Step 2: make .lstmf files
%%bash
for file in *.jpg; do
  echo "$file"
  # Strip the .jpg extension to get the output base name
  base=$(basename "$file" .jpg)
  tesseract "$file" "$base" lstm.train
done
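Before moving on, it may help to confirm that step 2 produced a non-empty .lstmf file for every image. A minimal sketch, assuming the images and the .lstmf files sit in the same directory; `missing_lstmf` is a hypothetical helper, not part of Tesseract:

```shell
# Hypothetical helper: print every .jpg in a directory that has no
# non-empty .lstmf file with the same base name next to it.
missing_lstmf() {
  dir="${1:-.}"
  for img in "$dir"/*.jpg; do
    [ -e "$img" ] || continue      # the glob matched nothing
    base="${img%.jpg}"
    # -s is true only if the file exists and is not empty
    [ -s "$base.lstmf" ] || echo "$img"
  done
}
```

Running `missing_lstmf .` in the image directory prints nothing when every image has its .lstmf file.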
Step 3: create the unicharset
%%bash
# Print "$2<i>$3" for i = 0..$1, one name per line
function wrap {
  for i in $(seq 0 "$1"); do
    echo "$2$i$3"
  done
}
N=0
unicharset_extractor $(wrap "$N" "eng.arial.exp" ".box")
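With N=0, `wrap` expands to a single name, eng.arial.exp0.box, so unicharset_extractor only sees one box file. The log in step 4 lists .lstmf files numbered up to exp9, so the sketch below uses N=9. That value is an assumption: it presumes the box files are named eng.arial.exp0.box through eng.arial.exp9.box, and N should be set to the real highest index.

```shell
# Same helper as in step 3: print "$2<i>$3" for i = 0..$1.
wrap() {
  for i in $(seq 0 "$1"); do
    echo "$2$i$3"
  done
}

N=9   # assumption: box files run from exp0 to exp9
wrap "$N" "eng.arial.exp" ".box"
```

The printed names can be passed to unicharset_extractor in the same way as in step 3.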
Step 4: start training
!lstmtraining \
--model_output output/ \
--continue_from lstm_model/eng.lstm \
--traineddata /usr/share/tesseract-ocr/5/tessdata/eng.traineddata \
--train_listfile list.train \
--eval_listfile list.eval \
--max_iterations 400
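The command reads list.train and list.eval, and the steps shown do not include how those files were made. Each one is a plain text file with one .lstmf path per line. A minimal sketch of one way to build them, assuming all .lstmf files sit in one directory and that holding out every fifth file for evaluation is acceptable; `make_lists` is a hypothetical helper, not part of Tesseract:

```shell
# Hypothetical helper: write list.train and list.eval from the .lstmf files
# in a directory, sending every 5th file to the evaluation list.
make_lists() {
  dir="${1:-.}"
  : > "$dir/list.train"            # start with empty list files
  : > "$dir/list.eval"
  n=0
  for f in "$dir"/*.lstmf; do
    [ -e "$f" ] || continue        # the glob matched nothing
    n=$((n + 1))
    if [ $((n % 5)) -eq 0 ]; then
      echo "$f" >> "$dir/list.eval"
    else
      echo "$f" >> "$dir/list.train"
    fi
  done
}
```

Calling `make_lists .` in the directory that holds the .lstmf files writes both lists there, with paths relative to that directory.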
Step 4 gives the following output:
Loaded file lstm_model/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from lstm_model/eng.lstm
Loaded 128/128 lines (1-128) of document eng.arial.exp0.lstmf
Loaded 131/131 lines (1-131) of document eng.arial.exp9.lstmf
Loaded 135/135 lines (1-135) of document eng.arial.exp7.lstmf
Loaded 114/114 lines (1-114) of document eng.arial.exp2.lstmf
Loaded 93/93 lines (1-93) of document eng.arial.exp6.lstmf
Loaded 104/104 lines (1-104) of document eng.arial.exp4.lstmf
Loaded 88/88 lines (1-88) of document eng.arial.exp5.lstmf
Loaded 131/131 lines (1-131) of document eng.arial.exp3.lstmf
After this, training does not proceed.
Can you tell me what I should change to get training to run successfully?