date:20121115

Re: Chinese OCR - top-down right-left orientation and training

2012-11-15 Thread Devin Bean

Thanks, I appreciate the suggestions! On Friday, November 2, 2012 1:48:45 PM UTC-4, sventech wrote: > > Cutting off the borders and possibly adding white borders might help. > Normalizing out the text that bleeds through the page would also help. > The text is clear, so you might not need to ret

How to build the tesseract 3.02.02 project in Eclipse at Ubuntu?

2012-11-15 Thread Linda Li

I want to build the tesseract 3.02.02 project so that I can modify some code to tune it for a specific task. Version: tesseract 3.02.02, Ubuntu 12.04, Eclipse Juno. I put tesseract into an Eclipse project. Include directories: /usr/local/include, /usr/local, /usr/include/leptonica and all fil

Problem with ViewerDebugging with tesseract 3.02.02

2012-11-15 Thread Linda Li

Version: tesseract 3.02.02, Ubuntu 12.04, Eclipse Juno. I am trying to use ViewerDebugging. Following the instructions in http://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging I installed javac, downloaded piccolo-1.2.jar and piccolox-1.2.jar, and built ScrollView.jar. Then I used export to set the

Re: inconsistent results from tesseract when the same TessBaseAPI object is used for decoding multiple images

2012-11-15 Thread newtotesseract

Hi Dmitri, How do we clear the adaptive classifier? Which API call or function clears it? Best Regards, - ganesh On Friday, November 16, 2012 3:39:22 AM UTC+8, Dmitri Silaev wrote: > > Sriranga, > > All you can specify in the command line can be s

Re: Word Search Using Tessnet

2012-11-15 Thread Sven Pedersen

There is a newer wrapper for 3.x version: http://code.google.com/p/tesseractdotnet/w/list I think it was made by the developer of VietOCR --Sven On Thu, Nov 15, 2012 at 5:06 PM, zdenko podobny wrote: > On Fri, Nov 9, 2012 at 1:43 PM, Troy Frazier wrote: > >> Is it possible to search an image

Re: Word Search Using Tessnet

2012-11-15 Thread zdenko podobny

On Fri, Nov 9, 2012 at 1:43 PM, Troy Frazier wrote: > Is it possible to search an image for a particular word using the Tessnet > wrapper? I see that it is possible to limit your scan to certain > characters, but what I would like to do is to input a word and have all > instances of that word be
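One way to picture the word search asked about above is as a filter over whatever word-level results an OCR engine returns. The following is only a minimal sketch in Python, not the Tessnet API: the `Word` record and the `find_word` helper are hypothetical names introduced for illustration, and the sketch assumes the OCR step has already produced a list of recognized words with bounding boxes.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """Hypothetical record for one recognized word (not a Tessnet type)."""
    text: str
    bbox: Tuple[int, int, int, int]  # left, top, right, bottom in pixels


def find_word(words: List[Word], query: str) -> List[Word]:
    """Return every recognized word matching `query`.

    Matching ignores case and surrounding punctuation, so "invoice."
    matches a search for "Invoice".
    """
    needle = query.casefold()
    punctuation = ".,;:!?\"'()"
    return [w for w in words if w.text.strip(punctuation).casefold() == needle]


# Usage with made-up OCR output:
page = [
    Word("Invoice", (10, 10, 90, 30)),
    Word("total", (100, 10, 150, 30)),
    Word("invoice.", (10, 50, 95, 70)),
]
hits = find_word(page, "invoice")
```

Exact matching is the simplest choice; OCR errors in the recognized text would need a fuzzy comparison instead.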

Re: Tesseract Forms Recognition,

2012-11-15 Thread Sven Pedersen

Hi Rey, The Shared Questionnaire System (SQS) is doing something here under the Apache license: http://dev.sqs2.net/projects/ (in Java, XSLT and JavaScript). queXF (under GPLv2) assumes you create the forms yourself: http://quexf.sourceforge.net/ For tesseract's license, check here: http://www.apache.

Re: Can I configure Tesseract to always match a dictionary word?

2012-11-15 Thread Zdenko Podobný

Regarding "user_patterns_suffix", have a look at the tesseract manual page [1]. I am not sure whether it is possible to force tesseract to choose OCR output from the dictionary (I never tried it ;-) ), but you can increase dictionary strength with the variables language_model_penalty_non_freq_dict_word and la

Re: Having traindata files uncombined

2012-11-15 Thread Zdenko Podobný

Can you please use version 3.02 instead of 3.01 and post the exact error message? You can copy text from the Windows console: select the relevant text/lines while holding the left mouse button, then right-click outside the selected text but inside the console window - highlight will di

Re: ocr of image fails

2012-11-15 Thread Sven Pedersen

Yes, I think the text size (x-height) was too small. Also, the English language data may be trained with more fonts, given that Google created it. --Sven On Thu, Nov 15, 2012 at 6:43 AM, sascha4j wrote: > after converting the image with imagmagick the result is better. not 100% > but nearly. >

Re: ocr of image fails

2012-11-15 Thread sascha4j

After converting the image with ImageMagick the result is better: not 100%, but nearly. The options for ImageMagick were: convert -colorspace gray -resize 200% -unsharp 0x8+1.5+0.05 On Thursday, 15 November 2012 10:26:21 UTC+1, sascha4j wrote: > Hi, > > i try to ocr some scanned text w
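The ImageMagick options quoted above can be wrapped in a small script to run the same preprocessing before OCR. This is a minimal sketch in Python, assuming the `convert` executable is on the PATH. The message does not give the input and output file names, so `src` and `dst` are placeholders, and the helper names `build_convert_command` and `preprocess` are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_convert_command(src: str, dst: str) -> List[str]:
    """Build the ImageMagick call using the options quoted in the message.

    The input file comes first and the output file last, as in the usual
    `convert input [options] output` form.
    """
    return [
        "convert", src,
        "-colorspace", "gray",        # convert to grayscale
        "-resize", "200%",            # double the image size
        "-unsharp", "0x8+1.5+0.05",   # unsharp mask, values from the message
        dst,
    ]


def preprocess(src: str, dst: str) -> None:
    """Run the conversion; raises if ImageMagick is missing or fails."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    subprocess.run(build_convert_command(src, dst), check=True)
```

Calling `preprocess("scan.png", "scan_prepared.png")` (both file names are examples) would write the enlarged grayscale image, which can then be passed to tesseract.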

ocr of image fails

2012-11-15 Thread sascha4j

Hi, I am trying to OCR some scanned text with tesseract-ocr. For some images the result is quite good, but for this one (see attached file) the result is poor. Any hints why, and what could I do to get a better result? I use tesseract 3.0.2 with the German language. Greetings, sascha4j

Re: Confidence in HOCR file

2012-11-15 Thread José Luis Rey

Thanks very much for your responses zdenop, I'm not used to dev in open source projects like this, perhaps you may help me to understand, for example if I implement a feature to add character rect&confidence to the hocr output, how this is translated to the main project (if it is good enough

Re: Confidence in HOCR file

2012-11-15 Thread zdenko podobny

On Thu, Nov 15, 2012 at 10:15 AM, José Luis Rey wrote: > Thanks very much for your responses zdenop, > > I'm not used to dev in open source projects like this, perhaps you may > help me to understand, for example if I implement a feature to add > character rect&confidence to the hocr output, how

inconsistent results from tesseract when the same TessBaseAPI object is used for decoding multiple images

2012-11-15 Thread newtotesseract

Hi friends, I am using a static TessBaseAPI object in my application. This object is initialized, and reads and processes the training data, at application startup. The application then processes multiple scanned images through the TESS_API TessBaseAPI::ProcessPages() function, using


15 matches
