1. I used lstmbox to get a box file (my.Font.box) from some
images (my.Font.tif).
2. I used jTessBoxEditor to fine-tune the box file.
3. I used lstm.train to get the lstmf file (my.Font.lstmf).
lstmtraining \
--model_output ./out/finetune \
--continue_from ./best_eng.lstm \ # combine_tessdata from
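Steps 1 and 3 above can be sketched as two tesseract calls that differ only in the config name. This is a minimal sketch, not the poster's actual script: the helper `print_steps` is hypothetical, it only prints the commands instead of running them, and no page segmentation mode is chosen here.

```shell
#!/bin/sh
# Print (do not run) the commands for steps 1 and 3 for one image.
# "base" is a hypothetical argument, e.g. my.Font for my.Font.tif.
print_steps() {
    base="$1"
    # Step 1: the lstmbox config writes ${base}.box
    echo "tesseract ${base}.tif ${base} lstmbox"
    # Step 3: the lstm.train config writes ${base}.lstmf
    # (run it after the box file has been corrected in step 2)
    echo "tesseract ${base}.tif ${base} lstm.train"
}

print_steps my.Font
```

Printing the commands first makes it easy to check the file names before spending time on a real run.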
lstmtraining --traineddata data/tamtrain/tamtrain.traineddata
--old_traineddata tesseract/tessdata/tam.traineddata --continue_from
data/tam/tam.lstm --net_spec '[Lfx256 O1c111]' --model_output
data/checkpoints --learning_rate 20e-4 --train_listfile
data/list.train --eval_listfil
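The file passed to --train_listfile is a plain text file with one .lstmf path per line. A minimal sketch of building such a list, assuming all .lstmf files sit under one directory; the helper name `make_listfile` and its arguments are hypothetical:

```shell
#!/bin/sh
# Write the paths of all .lstmf files under a directory to a list file,
# one path per line, sorted so the result is reproducible.
make_listfile() {
    dir="$1"   # directory that holds the .lstmf files (assumed layout)
    out="$2"   # list file to create, e.g. data/list.train
    find "$dir" -name '*.lstmf' | sort > "$out"
}

# Example call (hypothetical paths):
# make_listfile data data/list.train
```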
No, I always train from scratch.
The best/fast traineddata doesn't recognize eng and Persian, and the accuracy is
too low for some fonts.
I want to solve this problem.
Fine-tuning can use a different unicharset. As I read in the Tesseract
wiki, it is the number of classes of the LSTM. So if Mr. Smit has trained
>By version alpha, I trained about 1000 line and it is not so bad
You must have only done fine-tuning of the model then, and now you are trying to
train from scratch.
On Wed, 26 Sep 2018, 04:01 Khosrobeigy.zohreh wrote:
> I know, actually I am master in lstm. I want to resolve all error and then
>
I know, actually I am a master in LSTM. I want to resolve all the errors and then
train on a big text.
With the alpha version, I trained on about 1000 lines and it was not so bad. But in
version beta 4 I get many errors.
In alpha,
# Use LSTM
tessedit_ocr_engine_mode 1
tessedit_pageseg_mode 6
# Arabic page layout variable
--fontlist "Arial"
Does that have good coverage for Farsi?
--max_iterations 5000
You are trying to train from scratch with 18000 lines of text and only 5000
iterations. That will not work.
Ray has trained on hundreds of thousands of lines of text and millions of
iterations.
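The mismatch can be put in numbers. A rough sketch, assuming one training iteration consumes one line of text (an assumption, not stated in this thread); the helper `epoch_percent` is hypothetical:

```shell
#!/bin/sh
# Percentage of a single pass over the training lines that a given
# number of iterations covers, assuming one line per iteration.
epoch_percent() {
    iterations="$1"
    lines="$2"
    # Integer arithmetic, rounded down.
    echo $(( iterations * 100 / lines ))
}

epoch_percent 5000 18000   # prints 27
```

Under that assumption, 5000 iterations do not even pass once over 18000 lines.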
On Tue, 25 Sep 2
Hi, I use this:
tesseract 4.0.0-beta.4
leptonica-1.74.4
libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib
1.2.8
Found AVX2
Found AVX
Found SSE
I've trained on about 18000 lines for the Persian language. I use this command:
bash -x tesstrain.sh --fonts_dir /usr/share/fonts --