Hi Ramon,
I do not have the source files for the DAWG dictionaries and I am not able to
"decompile" them. Anyway, I think creating dictionaries is the easiest
part of tesseract training: according to the wiki[1], the input is a simple
UTF-8 file with one word per line. This file is split into several files:
* lang.pu
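
A minimal sketch of preparing the input described above (a plain UTF-8 file
with one word per line), assuming the words come from some sample text. The
helper names and the output file name words.txt are hypothetical, and the
later split into the separate dictionary files is not covered here.

```python
import re

def build_word_list(text):
    """Collect the unique words of a text, lower-cased and sorted."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore),
    # including accented letters such as those used in Catalan or Spanish.
    words = re.findall(r"[^\W\d_]+", text)
    return sorted({w.lower() for w in words})

def write_word_list(words, path):
    """Write one word per line as UTF-8 (hypothetical output path)."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in words:
            f.write(word + "\n")

if __name__ == "__main__":
    sample = "El gat i el gos"
    write_word_list(build_word_list(sample), "words.txt")
```

The resulting file can then be used as the word list input mentioned in the
wiki.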
Hi Patrick,
Do you have any experience of it working (e.g. does it produce different
output for different "Page seg mode" values)?
I tried several options but I got the same output. I used a scan of a
4-column magazine page as the input file.
Maybe I did something wrong, or maybe I do not understand what the result
should be.
that was awesome. thanks...
On Apr 15, 5:10 pm, namenick wrote:
> hi all...
>
> is there a way to instruct tesseract to ignore anything that is not
> trained to read. like the lines around the date and time in this
> image:http://quereven.com/images/moo_time.jpg
>
> the simple reason is that wi
Thanks for your quick answer, Zdenko.
As you pointed out, I'm already using the tif / box pair from the Spanish
language to train my Catalan .traineddata language (as Spanish
characters suit Catalan characters too).
But doing just this (with no words in the dictionary files), the dictionary
is not very good. I t