Re: [tesseract-ocr] Black line deleted

2017-12-11 Thread ShreeDevi Kumar
You have not mentioned which version of tesseract you are using. I tested just now with tesseract 4.0alpha and the PDF has the original image with lines. See attached. However, as Zdenko had pointed out before, the OCR is NOT accurate. ShreeDevi
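Since the result may depend on the tesseract version, here is a minimal sketch for reporting the installed version alongside the command quoted in this thread (`tesseract img.tif out -l fra pdf`). The helper names `build_pdf_command` and `tesseract_version` are hypothetical and only for illustration; the sketch assumes the `tesseract` binary is on the PATH and does not claim to reproduce the missing-lines problem.

```python
import shutil
import subprocess


def build_pdf_command(image_path, output_base, lang="fra"):
    """Return the argument list for the command quoted in this thread.

    Hypothetical helper: it only assembles the arguments, it does not run them.
    """
    # The trailing "pdf" is the config name that asks tesseract to write
    # <output_base>.pdf (page image plus a text layer).
    return ["tesseract", image_path, output_base, "-l", lang, "pdf"]


def tesseract_version():
    """Return the first line printed by `tesseract --version`, or None.

    None means the tesseract binary was not found on the PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    # stderr is merged into stdout because some builds print the version there.
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print("version:", tesseract_version())
    cmd = build_pdf_command("img.tif", "out")
    print("command:", " ".join(cmd))
    # To actually produce out.pdf, run the command (needs img.tif to exist):
    # subprocess.run(cmd, check=True)
```

Including the printed version line in a report makes it easier to tell whether the behaviour differs between 3.x and 4.0alpha.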

Re: [tesseract-ocr] Black line deleted

2017-12-10 Thread ShreeDevi Kumar
PDF generation is done by tesseract itself. I had cc'ed Jeff, who is the main developer for the PDF-related code. On 10-Dec-2017 11:03 PM, "lelive" wrote: OK, thanks for your reply! If I use tesseract img.tif out -l fra pdf, which software does the conversion to PDF? Olivier On Sunday 10

Re: [tesseract-ocr] Black line deleted

2017-12-10 Thread lelive
OK, thanks for your reply! If I use tesseract img.tif out -l fra pdf, which software does the conversion to PDF? Olivier On Sunday, 10 December 2017 at 10:02:30 UTC+1, shree wrote: > > I think the question is related to pdf generation and not the actual OCR. > > The resulting pdf should inc

Re: [tesseract-ocr] Black line deleted

2017-12-10 Thread ShreeDevi Kumar
I think the question is related to PDF generation and not the actual OCR. The resulting PDF should include the original image with the text layer. It seems the lines are deleted in the generated PDF. On 10-Dec-2017 1:25 PM, "lelive" wrote: > Hello, > yes I know that, but I have the same problem wit

Re: [tesseract-ocr] Black line deleted

2017-12-09 Thread lelive
Hello, yes, I know that, but I have the same problem with ordinary tables on an A4 page. All the lines disappear! Please help! On Thursday, 7 December 2017 at 10:05:15 UTC+1, zdenop wrote: > > I do not think that images like this are appropriate for OCR (at least not > for tesseract). IMO you should do prepr

Re: [tesseract-ocr] Black line deleted

2017-12-07 Thread Zdenko Podobný
I do not think that images like this are appropriate for OCR (at least not for tesseract). IMO you should preprocess them and pass only the areas with text to tesseract. Tesseract is very noise-sensitive (at least the 3.x versions). Zdenko On Wed, Dec 6, 2017 at 8:32 PM, lelive wrote: > Hi all

[tesseract-ocr] Black line deleted

2017-12-06 Thread lelive
Hi all, I use tesseract for technical documents and produce searchable PDFs. But if the picture contains lines, they are deleted in the resulting PDF file