Any idea of Tesseract 3.0 release date

2010-08-21 Thread Maggie
Dear all, I am working on a project that badly needs the 3.0 release to support image conversion to Chinese text. Does anyone know the release date of 3.0? Will it be released before the end of the year? Any information is greatly appreciated. Maggie.

Re: Line of equals symbols not recognized

2010-08-21 Thread Colin Beckingham
On 08/20/2010 02:02 PM, Jimmy O'Regan wrote: On 20 August 2010 12:53, colbec wrote: Using Tesseract 3.00 on openSUSE 11.2, from the CLI, as in: tesseract file.tif file. In an image that contains a line of '=' signs, the recognition is much worse than if these lines are removed, e.g.: line 1 and stuff =

Re: recognition languages sets? with hierarchy?

2010-08-21 Thread tt
I'm not much of a programmer, but could you point me to the code doing that? -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To post to this group, send email to tesseract-...@googlegroups.com. To unsubscribe from this group, send email to tes

Re: recognition languages sets? with hierarchy?

2010-08-21 Thread Jimmy O'Regan
On 21 August 2010 10:12, tt wrote:
> Is it possible for Tesseract to run OCR with an ordered set of languages? I have lots of text to OCR, consisting primarily of lang1, with small portions in lang2 and lang3 (quotes and refs). It would be ideal for Tesseract to recognise "what it can" in l

recognition languages sets? with hierarchy?

2010-08-21 Thread tt
Is it possible for Tesseract to run OCR with an ordered set of languages? I have lots of text to OCR, consisting primarily of lang1, with small portions in lang2 and lang3 (quotes and refs). It would be ideal for Tesseract to recognise "what it can" in lang1 (e.g., to a 90% match), then switch to th
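The ordered fallback described in this question can be sketched independently of Tesseract itself. The following is only an illustration of the idea, not something Tesseract 3.0 is known to provide: the function name `recognize_with_fallback`, the `recognizers` mapping, and the per-region confidence score are all hypothetical, and each recognizer is assumed to return a `(text, confidence)` pair with confidence in the range 0 to 1.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical recognizer: takes one image region (here just an opaque
# object) and returns (recognized_text, confidence between 0 and 1).
Recognizer = Callable[[object], Tuple[str, float]]


def recognize_with_fallback(
    region: object,
    languages: List[str],
    recognizers: Dict[str, Recognizer],
    threshold: float = 0.9,
) -> Tuple[str, str, float]:
    """Try languages in priority order.

    Returns (language, text, confidence) for the first language whose
    confidence reaches the threshold. If none does, returns the result
    with the highest confidence seen.
    """
    if not languages:
        raise ValueError("at least one language is required")

    best = None
    for lang in languages:
        text, confidence = recognizers[lang](region)
        if confidence >= threshold:
            # Good enough: stop here and do not try lower-priority languages.
            return lang, text, confidence
        if best is None or confidence > best[2]:
            best = (lang, text, confidence)
    return best
```

With this arrangement lang1 is always tried first, and lang2 and lang3 are consulted only for regions where lang1 falls below the threshold (0.9 here, mirroring the 90% figure in the question).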

Re: Announcement: new version of pyTesseractTrainer available

2010-08-21 Thread tt
Thank you. The 1st link gives 404 error, btw.

Re: Announcement: new version of pyTesseractTrainer available

2010-08-21 Thread tt
Okay, I didn't notice the link in the announcement leads to the original Trainer. The diff, however, is valid (Trainer patched with this works with v3 boxes on my system). Also, regarding the Trainer (not the authors' 1.01 but the original with v3 boxes read/write added): The incredibly slow open

Re: Announcement: new version of pyTesseractTrainer available

2010-08-21 Thread zdenko podobny
Hi, your problem is that you are using tesseractTrainer.py, which was written in 2007, and not pyTesseractTrainer.py (2010), which corrected this issue. I would suggest using http://code.google.com/p/pytesseracttrainer/downloads/detail?name=pyTesseractTrainer-1.01.py or (if you are brave enough) the devel version: h

Re: Announcement: new version of pyTesseractTrainer available

2010-08-21 Thread tt
This Trainer variant won't open a v3 box file:
Traceback (most recent call last):
  File "/home/ty/files/tesseractTrainer.py", line 546, in doFileOpen
    self.loadImageAndBoxes(fileName, chooser)
  File "/home/ty/files/tesseractTrainer.py", line 471, in loadImageAndBoxes
    self.boxes = loadBoxData
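The traceback is cut off, so the actual error is not visible here. One possible cause, stated purely as an assumption, is the extra field in Tesseract 3 box files: each line carries a trailing page number (character, left, bottom, right, top, page), while older box files have only the first five fields. A minimal sketch of a reader that accepts both layouts follows. The names `Box` and `parse_box_line` are hypothetical and are not taken from tesseractTrainer.py or pyTesseractTrainer.

```python
from typing import NamedTuple


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse one box-file line in either the 5-field or 6-field layout."""
    parts = line.split()
    if len(parts) == 5:
        # Older layout: no page column, so default to page 0.
        page = 0
    elif len(parts) == 6:
        # Tesseract 3 layout: the last column is the page number.
        page = int(parts[5])
    else:
        raise ValueError("unexpected number of fields: %r" % line)
    left, bottom, right, top = (int(value) for value in parts[1:5])
    return Box(parts[0], left, bottom, right, top, page)
```

A loader would call `parse_box_line` once per non-empty line of the box file. Handling the field count explicitly, rather than unpacking a fixed number of values, keeps one code path working for both box-file versions.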