RE: tesseract api, how do you get the bbox co-ordinates in commandline using the exe in win32

2011-03-26 Thread Adetokunbo Bamidele
Thanks. I use notepad++. :-) -----Original Message----- From: Dmitri Silaev Sent: 26 March 2011 20:57 To: tesseract-ocr@googlegroups.com; BYTEFX Subject: Re: tesseract api, how do you get the bbox co-ordinates in commandline using the exe in win32 Well, wingrep is not a must have. I just mentione

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread TP
On Sat, Mar 26, 2011 at 7:42 AM, zdenko podobny wrote: >> Can somebody explain why a tif size (2480x3508 @ 8BPP) is not processed? The test image has 16 bpp. > This is not a tesseract issue but a leptonica issue (the library used for image handling). > When I ran it on linux I got an error message coming from l

Re: tesseract api, how do you get the bbox co-ordinates in commandline using the exe in win32

2011-03-26 Thread Dmitri Silaev
Great, I use it too, it's one of the famous free text editors )) However, it's not capable of massive automated text file processing, which I think is what you need to achieve your goal... On Sun, Mar 27, 2011 at 12:25 AM, Adetokunbo Bamidele wrote: > Thanks. I use notepad++. :

Re: tesseract improve the reject rate ?

2011-03-26 Thread Dmitri Silaev
When you have a small trained alphabet, Tesseract's classifier sometimes might not find suitable matches, in which case it will output a null character that is then converted to a space. However in your case, there are Chinese characters that have many strokes and outlines, many of which somehow (parti

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread Dmitri Silaev
Guys, I still can't understand what error Tesseract produces. Let's wait for the error screenshot. Or did you understand everything already? Richard says he's got an error message... Warm regards, Dmitri Silaev On Sat, Mar 26, 2011 at 5:42 PM, zdenko podobny wrote: > > > On Fri, Ma

Re: tesseract api, how do you get the bbox co-ordinates in commandline using the exe in win32

2011-03-26 Thread Dmitri Silaev
Well, wingrep is not a must have. I just mentioned it as an example. After all, it's shareware )) You need a program that is capable of processing text files and doing some basic operations with words or numbers within a text line. There's a vast number of such programs on the Internet, you'll pr
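As a rough illustration of the kind of text-line processing described above, here is a minimal Python sketch. It assumes the coordinates are already available in a Tesseract box file, where each line has the form `symbol left bottom right top`, optionally followed by a page number. The function names `parse_box_lines` and `bounding_box` and the sample lines are hypothetical, and the exact output of a given Tesseract build may differ.

```python
def parse_box_lines(lines):
    """Parse box-file-style lines into (symbol, left, bottom, right, top) tuples.

    Assumed line format: "symbol left bottom right top [page]".
    Blank or malformed lines are skipped.
    """
    boxes = []
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # not enough fields for a symbol plus four coordinates
        try:
            left, bottom, right, top = (int(p) for p in parts[1:5])
        except ValueError:
            continue  # coordinates were not integers
        boxes.append((parts[0], left, bottom, right, top))
    return boxes


def bounding_box(boxes):
    """Return the smallest (left, bottom, right, top) rectangle enclosing all boxes."""
    if not boxes:
        return None
    return (
        min(b[1] for b in boxes),
        min(b[2] for b in boxes),
        max(b[3] for b in boxes),
        max(b[4] for b in boxes),
    )


# Hypothetical sample data, not taken from a real Tesseract run.
sample = ["T 10 20 18 34 0", "e 20 20 27 30 0", "", "bad line"]
boxes = parse_box_lines(sample)
print(boxes)
print(bounding_box(boxes))
```

To process a real file, the same functions could be fed the lines of the box file, for example via `open(path, encoding="utf-8")`.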

Re: tesseract api, how do you get the bbox co-ordinates in commandline using the exe in win32

2011-03-26 Thread Dmitri Silaev
Hi, Glad you've made some progress with your goal. As for parameters that can influence speed vs. accuracy, there are many. Just to name a few: classify_class_pruner_threshold, classify_class_pruner_multiplier, classify_cp_cutoff_strength, classify_integer_matcher_multiplier. These relate to the pr

Re: simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread zdenko podobny
On Sat, Mar 26, 2011 at 3:56 PM, Robert P. J. Day wrote: > On Sat, 26 Mar 2011, zdenko podobny wrote: > > > On Sat, Mar 26, 2011 at 2:34 PM, rpjday wrote: > > long story short, i'm seeing this issue on my ubuntu 10.10 system: > > > >http://ubuntuforums.org/showthread.php?t=1599686 >

Re: simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread Robert P. J. Day
On Sat, 26 Mar 2011, zdenko podobny wrote: > On Sat, Mar 26, 2011 at 2:34 PM, rpjday wrote: > long story short, i'm seeing this issue on my ubuntu 10.10 system: > >  http://ubuntuforums.org/showthread.php?t=1599686 > > the packages i have installed: > >  * tesseract-ocr >

Re: simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread zdenko podobny
On Sat, Mar 26, 2011 at 2:34 PM, rpjday wrote: > long story short, i'm seeing this issue on my ubuntu 10.10 system: > > http://ubuntuforums.org/showthread.php?t=1599686 > > the packages i have installed: > > * tesseract-ocr > * tesseract-ocr-eng > > which version you installed? > i took a si

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread zdenko podobny
On Fri, Mar 25, 2011 at 5:40 PM, Lutz, Michael wrote: > Hi, > > I just ran your tif file, I get no results, it must have something to do > with the size of the image. If I try to run a portion of tiff something > smaller than 1000x1000 then I get results. > > Can somebody explain why a tif size

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread zdenko podobny
Convert it to PNG: you get a smaller picture with the same quality, and tesseract should process it without problems. Zdenko On Fri, Mar 25, 2011 at 5:03 PM, Richard Genthner wrote: > Here is the screenshot and the tif file. Dmitri if you rename the .exe that > should work. I'm trying to get the tr

simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread rpjday
long story short, i'm seeing this issue on my ubuntu 10.10 system: http://ubuntuforums.org/showthread.php?t=1599686 the packages i have installed: * tesseract-ocr * tesseract-ocr-eng i took a simple screenshot of some text, saved it to a .tif file, then ran: $ tesseract tess.tif tess

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread Sriranga(78yrsold)
According to IrfanView, the file is an LZW-compressed TIFF at 300 DPI. What Quan says is correct: the image is a heavily compressed TIFF. In my experience, Tesseract-OCR supports only *uncompressed TIFF* files. On Sat, Mar 26, 2011 at 6:17 PM, Quan Nguyen wrote: > The image appears to have been
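For anyone who wants to check the compression of a TIFF without an image viewer, here is a minimal Python sketch that reads the Compression tag (tag 259 in the TIFF 6.0 specification) from the first image directory. It only reports the tag value (1 means uncompressed, 5 means LZW); whether a given Tesseract build accepts the file is a separate question that this sketch does not answer. The function name `tiff_compression` and the in-memory sample are hypothetical.

```python
import struct

COMPRESSION_TAG = 259  # "Compression" tag in TIFF 6.0
COMPRESSION_NAMES = {1: "uncompressed", 5: "LZW"}  # only the two values discussed here


def tiff_compression(data):
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        (tag,) = struct.unpack(endian + "H", data[start:start + 2])
        if tag == COMPRESSION_TAG:
            # The SHORT value sits in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # the TIFF default when the tag is absent is "no compression"


# Hypothetical in-memory TIFF header with a single IFD entry: Compression = LZW.
sample = (
    b"II"
    + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHIHH", COMPRESSION_TAG, 3, 1, 5, 0)
    + struct.pack("<I", 0)
)
value = tiff_compression(sample)
print(value, COMPRESSION_NAMES.get(value, "other"))
```

For a file on disk, the bytes could be obtained with `open(path, "rb").read()` and passed to the same function.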

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread Quan Nguyen
The image appears to have been heavily compressed. OCRing the whole image did not yield anything. Doing it blockwise, I got some results, but they were not very accurate: Ch Juhe 24, 2@@9 the ACHP vctect ct: revisect teccmmehdettcns tcr mee_s1es-muhqes-t'ube[[e (NR/H~ ‘evictetnce ct tmmuhity’ requtrementstcr he