Hi Pierre,

The FAQ states that SetVariable must be called before the Init
function; in your code it is called after Init.

Regards,
Quan

On Apr 18, 12:50 pm, MARTIN Pierre <hicksc...@gmail.com> wrote:
> Dear NGuyenQ,
>
> > From the page http://www.pixel-technology.com/freeware/tessnet2/
> > tessnet2.Tesseract ocr = new tessnet2.Tesseract();
> > ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // If digit only
>
> This is brilliant advice you just gave him. It is very effective; I just 
> tested it on a document with only digits and a few special characters.
> Since I'm working with C++ only (no .NET wrapper), here is what I 
> recommend:
>
>         // Init your tess API.
>         _tessApi        = new tesseract::TessBaseAPI();
>         // Set up the current directory and language prefix.
>         _tessApi->Init("./", "cst");
>         // This is only important if you'll be parsing pictures with only one
>         // line of text (which is my case).
>         _tessApi->SetPageSegMode(tesseract::PSM_SINGLE_LINE);
>         // Here is the trick, as explained and pointed out by NGuyenQ:
>         _tessApi->SetVariable("tessedit_char_whitelist", "<0123456789");
>
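>         // Then, in a loop for each of my documents, here is the idea:
>         PIX     *pix    = pixReadMemTiff(
>                 (const l_uint8*)buffer.buffer().constData(), buffer.size(), 0);
>         _tessApi->SetImage(pix);
>         // Run recognition; GetUTF8Text() returns a buffer the caller must delete[].
>         char    *text   = _tessApi->GetUTF8Text();
>         doc.setRecognizedData("OCRLine", QString(text).trimmed());
>         delete []       text;
>         // pixDestroy() frees the image and sets pix to NULL; no extra delete needed.
>         pixDestroy(&pix);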
>
>         // Release everything.
>         _tessApi->Clear();
>         _tessApi->End();
>         delete _tessApi;
>
> The very interesting part is that before, I was getting "D" and "O" 
> instead of zeros, sometimes even "A" for "4", and "[]" and "[)" instead of 
> zeros, despite my disambiguation file. Now I'm getting everything correct, 
> which means the whitelist / blacklist are not just post-processing filters, 
> but real "recognition clues".
>
> I recommend that everyone take note (well... I'm discovering this feature 
> and its real consequences, maybe you're not :D).
>
> Pierre.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "tesseract-ocr" group.
> To post to this group, send email to tesseract-...@googlegroups.com.
> To unsubscribe from this group, send email to 
> tesseract-ocr+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/tesseract-ocr?hl=en.
