A solution is requested urgently.

On Wed, Dec 2, 2015 at 4:25 PM, sriranga(83yrsold) <
[email protected]> wrote:

>
>  I have created kan.unicharambigs (attached below) based on the output text
> of the Kan.training_text file (which is big). I could not work out how to
> test the attached file and find out whether it works.
> Kindly point out my mistakes in the attached file, if any, for which
> I shall be thankful to you. I would prefer a command-line test if possible.
>
> ==========================================================================
> Based on the wiki instructions (extract reproduced below for ready reference):
>
> The rules are not bidirectional, so if you want 'rn' to be considered when
> 'm' is detected and vice versa, you need a rule for each.
>
> Version 3.03 and on supports a new, simpler format for the unicharambigs
> file:
>
> v2
> '' " 1
> m rn 0
> iii m 0
>
> In this format, the "error" and "correction" are simple utf-8 strings
> separated by *a space*, and, after another space, the same type specifier
> as v1 (0 for optional and 1 for mandatory substitution). Note the downside
> of this simpler format is that Tesseract has to encode the utf-8 strings
> into the components of the unicharset. In complex scripts, this encoding
> may be ambiguous. In this case, the encoding is chosen such as to use the
> least utf-8 characters for each component, i.e. the shortest unicharset
> components will make up the encoding.
>
> Like most other files used in training, the 'unicharambigs' file must be
> encoded as UTF8, and must end with a newline character. The unicharambigs
> format is also described in the unicharambigs(5) man page
> <https://tesseract-ocr.googlecode.com/svn-history/r683/trunk/doc/unicharambigs.5.html>.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/0d30025d-cc11-4f69-9e98-ec919d3f43df%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/0d30025d-cc11-4f69-9e98-ec919d3f43df%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

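The format rules in the wiki extract quoted above can be checked mechanically before the file goes anywhere near training. The following is only a minimal sketch, assuming the v2 format exactly as the extract describes it (a first line of `v2`, then `error correction type` separated by single spaces, UTF-8 encoded, ending with a newline). The function name `check_unicharambigs_v2` is hypothetical and not part of Tesseract. It checks syntax only and cannot tell whether Tesseract maps the strings onto the unicharset as intended, or whether the rules improve recognition.

```python
def check_unicharambigs_v2(data: bytes) -> list:
    """Return a list of format problems; an empty list means the basic checks passed.

    Hypothetical helper, not a Tesseract API. It only checks the rules stated
    in the quoted wiki extract for the v2 unicharambigs format.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return ["not valid UTF-8: %s" % exc]

    problems = []
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")

    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece left after the final newline

    if not lines or lines[0] != "v2":
        problems.append("first line is not 'v2'")

    for number, line in enumerate(lines[1:], start=2):
        fields = line.split(" ")
        # Assumption: blank lines and extra spaces are reported as problems,
        # since the extract describes exactly three space-separated fields.
        if len(fields) != 3 or "" in fields:
            problems.append("line %d: expected 'error correction type'" % number)
            continue
        if fields[2] not in ("0", "1"):
            problems.append("line %d: type must be 0 or 1" % number)
    return problems
```

For a command-line check, something like `print(check_unicharambigs_v2(open("kan.unicharambigs", "rb").read()))` in a Python session should print `[]` when none of the above problems are found. Because rules are not bidirectional, a pair such as `m rn 0` needs its own reverse line `rn m 0` if both directions are wanted; this sketch does not check for that.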