OK, I feel a bit less bad now. combine_tessdata segfaults on both Ubuntu and
OS X:
182:tess max$ combine_tessdata -u eng.traineddata eng
Extracting tessdata components from eng.traineddata
tesseract::TessdataManager::TessdataTypeFromFileName( filename, &type,
&text_file):Error:Assert failed:in f
OK, I found the problem. The fix is described here:
http://code.google.com/p/tesseract-ocr/issues/detail?id=356
The output dir needs to end in a period.
My bad.
max
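A minimal sketch of the fix reported above, assuming (as this thread says) that the output prefix passed to combine_tessdata -u has to end in a period. The helper name unpack_command is hypothetical and only builds the command line; it does not run anything.

```python
import shlex


def unpack_command(traineddata: str, prefix: str) -> str:
    """Build a 'combine_tessdata -u' command line.

    Assumption taken from this thread: the output prefix must end
    in a period (e.g. 'eng.'), so one is appended when missing.
    """
    if not prefix.endswith("."):
        prefix += "."
    parts = ["combine_tessdata", "-u", traineddata, prefix]
    # Quote each argument so the result can be pasted into a shell.
    return " ".join(shlex.quote(part) for part in parts)


print(unpack_command("eng.traineddata", "eng"))
```

With the prefix "eng" the printed command ends in "eng." instead, which matches the "tessdata/eng." form of the prefix quoted elsewhere in this thread.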
On May 9, 2011, at 3:30 PM, zdenko podobny wrote:
> no problem :-) I think you will like option "-o" too.
>
> Zdenko
>
>
That's weird, I find tesseract works better with 150dpi. I can never get it
to return meaningful results at 300dpi. Maybe it is just my documents? Or
maybe I need to force them to grayscale? They are color documents (but all
black and white anyway).
On 9 May 2011 17:00, Quan Nguyen wrote:
> Did
In testing Tesseract x, we see that the OCR accuracy
decreases under two conditions:
a. the text lines are adjacent to each other (or characters are
vertically adjacent to each other).
b. the text characters are horizontally adjacent to each other.
I wonder if there is tesseract
Did you scan them correctly, with appropriate pixel resolution (~300
DPI) and monochrome/grayscale settings?
On May 9, 10:20 am, Giby_the_kid wrote:
> I've tested with the sample text in the sources... it worked...
> Now if I try any other scanned document, I get an empty text
> file.
Take a look at the source code of VietOCR.NET, which uses tessnet2
library.
http://vietocr.sf.net
On May 9, 10:08 am, Vignesh Raj wrote:
> Hi. I am very new to this and I need some help on how to set up tessnet
> for my .NET (C#) based application.
> I have not done anything yet and any link on th
I've tested with the sample text in the sources... it worked...
Now if I try any other scanned document, I get an empty text
file.
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesseract-ocr@goo
Hi. I am very new to this and I need some help on how to set up tessnet
for my .NET (C#) based application.
I have not done anything yet, and any link to basic study material would be
very helpful.
I have gone through "http://www.pixel-technology.com/freeware/tessnet2/",
but could not find more info on the
Thanks for the information.
Now to your question. I use the UNLV format because I output the string to a
simple edit control (text box) in Windows, which I think supports ISO-8859-1
encoded strings, since special UTF-8 characters such as äöü look funny.
Since I found no way how to get a UTF-8 support
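To illustrate why the characters look funny, here is a small sketch, assuming the text box interprets bytes as ISO-8859-1 (which the poster only suspects). It shows UTF-8 bytes being misread as ISO-8859-1, and one way to re-encode before display. The helper name utf8_to_latin1 is hypothetical and not part of Tesseract.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1.

    Characters with no ISO-8859-1 equivalent are replaced with '?'.
    """
    return data.decode("utf-8").encode("iso-8859-1", errors="replace")


text = "äöü"
utf8_bytes = text.encode("utf-8")

# What an ISO-8859-1 control shows when handed raw UTF-8 bytes:
# each two-byte UTF-8 sequence is displayed as two separate characters.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)

# After conversion, the same control decodes the bytes correctly.
fixed = utf8_to_latin1(utf8_bytes).decode("iso-8859-1")
print(fixed)
```

The conversion is lossy for anything outside ISO-8859-1, so it only suits text that stays within Western European characters.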
no problem :-) I think you will like option "-o" too.
Zdenko
On Mon, May 9, 2011 at 8:27 AM, Max Cantor wrote:
> I feel really dumb now. Sorry for the bother.
>
>
> Thanks, max
>
> On May 9, 2011, at 14:01, zdenko podobny wrote:
>
> Please try to read (to look is not enough ;-) ) [1] :
>
> //
Hi Max,
Look at:
Extracts all component files from .traineddata
combine_tessdata -u tessdata/ell.traineddata /home/$USER/temp/ell
combine_tessdata language_data_path_prefix (e.g. tessdata/eng.)
Combines all individual tessdata components (unicharset, DAWGs, classifier
templates, ambiguities, lan
I feel really dumb now. Sorry for the bother.
Thanks, max
On May 9, 2011, at 14:01, zdenko podobny wrote:
> Please try to read (to look is not enough ;-) ) [1] :
>
>
> // Specify option -u to unpack all the components to the specified path:
> //
>
>
> // combine_tessdata -u tessdata/eng.tr