[tesseract-ocr] Re: [tesseract-dev] Re: Training tools linking failure, icu_48::*

2014-07-31 Thread Jeff Breidenbach
Done. Bonus points if someone can remember to remove the instructions when they become obsolete in October. -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to tess

[tesseract-ocr] Re: [tesseract-dev] Re: Training tools linking failure, icu_48::*

2014-07-31 Thread Shree
It may be helpful to add these instructions for compiling Tesseract on Ubuntu with training tools to the wiki. On Thursday, July 31, 2014 9:05:08 PM UTC+5:30, Jeff Breidenbach wrote: > > For me the errors came from some debris in the training > directory, and "make clean" in that directory too

Re: [tesseract-ocr] Re: Tesseract 3.03: PDF-OCR generated PDFs show coding artefacts => do not use lossy (jpg) compression! Use lossless compression (png)!!

2014-07-31 Thread Jim O'Regan
On 31 July 2014 20:27, Tom wrote: > Dear Jim, > Tom, I want to say before anything else that I very much appreciate this followup message. I'm glad that you took the time to rephrase your position. > thanks for your explanation, I also studied the two pieces of code (one part is in > Leptonica, the other

Re: [tesseract-ocr] Re: Tesseract 3.03: PDF-OCR generated PDFs show coding artefacts => do not use lossy (jpg) compression! Use lossless compression (png)!!

2014-07-31 Thread zdenko podobny
I have not had time to look at this issue yet, but forcing users to use lossless compression is not the right way, IMO. The right way is to implement an option that lets the user force Tesseract to use lossless compression, but this feature is not provided by your "patch"... Zdenko On Thu, Jul 31, 2014 at

[tesseract-ocr] Re: Tesseract 3.03: PDF-OCR generated PDFs show coding artefacts => do not use lossy (jpg) compression! Use lossless compression (png)!!

2014-07-31 Thread Tom
@jimregan Dear Jim, thanks for your explanation, I also studied the two pieces of code (one part is in Leptonica, the other, more important in Tesseract). I think forcing the use of "FLATE" just before the image is rendered into the PDF page is the best solution. I kindly ask you to try my (short and easy

[tesseract-ocr] Re: Problem with single line three character Chinese

2014-07-31 Thread Paul
Did you try the options -psm 7 or -psm 8? You will probably get better results with one of them. Paul On Thursday, 31 July 2014 08:36:12 UTC+2, Sayang wrote: > > a) Tesseract correctly OCR'd eight (>30 character) lines of Chinese, > scanned from a book > > b) Tesseract seemed to fail OC
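For anyone who wants to try the suggestion above, here is a minimal sketch of how the two command lines could be built from Python. It only assembles and prints the commands, so it does not need Tesseract installed. The image name, the output base names and the `chi_sim` language code are assumptions for illustration; the thread does not say which Chinese traineddata file was used. The single-dash `-psm` spelling matches the 3.x command line discussed here.

```python
import shlex

# Page segmentation modes named in the reply above (Tesseract 3.x CLI).
PSM_SINGLE_LINE = 7  # treat the image as a single text line
PSM_SINGLE_WORD = 8  # treat the image as a single word


def build_tesseract_command(image_path, output_base, psm, lang="chi_sim"):
    """Return the argv list for one Tesseract 3.x command-line call.

    lang defaults to "chi_sim" purely as an assumption for this example.
    """
    return ["tesseract", image_path, output_base, "-l", lang, "-psm", str(psm)]


if __name__ == "__main__":
    # "xingqisi.png" is a hypothetical file name for the three-character image.
    for psm in (PSM_SINGLE_LINE, PSM_SINGLE_WORD):
        cmd = build_tesseract_command("xingqisi.png", "xingqisi_psm%d" % psm, psm)
        print(" ".join(shlex.quote(part) for part in cmd))
```

With Tesseract installed, each list could be passed to `subprocess.check_call(cmd)` to run the recognition and compare the two text outputs.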

[tesseract-ocr] OCR using C

2014-07-31 Thread Rara
Hello, I've begun my first experience with Tesseract in order to implement a program for biometric document recognition. I'm searching for a detailed guide for development with Tesseract and a tutorial explaining how to use and test this platform on Windows. Looking forward to your answer! Be

[tesseract-ocr] Re: Tesseract 3.03: PDF-OCR generated PDFs show coding artefacts => do not use lossy (jpg) compression! Use lossless compression (png)!!

2014-07-31 Thread Tom
For the application of Tesseract as an OCR engine for texts (with or without images, B/W or colour), anything other than lossless compression is stupid. So, respectfully stated, I cannot accept your "work-around". Please see my patch (on Github). It fully fixes the issue - we are talking only about

[tesseract-ocr] Re: Works Tesseract with Neuronal Networks ???

2014-07-31 Thread Martin M
Thanks a lot, Paul, I will take a look at it.

Re: [tesseract-ocr] Tracking the operations in Tesseract ( Debugging Process)

2014-07-31 Thread sibi kanagaraj
Hi, This is with respect to the debugging process. I have followed the steps given here. /// " Building and installing *On Linux:* - Copy piccolo2d-core-3.0.jar and piccolo2d-extras-3.0.jar to tesseract/java. - cd java - make ScrollView.jar - Set the SCROLLVIEW_PAT

[tesseract-ocr] Problem with single line three character Chinese

2014-07-31 Thread Sayang
a) Tesseract correctly OCR'd eight (>30 character) lines of Chinese, scanned from a book b) Tesseract seemed to fail OCR'ing a single line image with three characters (xingqisi - Thursday) (i) Four different fonts were tried - so four different single line images - attached. (ii) The bi

[tesseract-ocr] Re: Tesseract 3.03: PDF-OCR generated PDFs show coding artefacts => do not use lossy (jpg) compression! Use lossless compression (png)!!

2014-07-31 Thread Tom
Solution: see fix in github https://code.google.com/p/tesseract-ocr/issues/detail?id=1263#c4