Re: reading text in pdf, some working sample code

2017-11-22 Thread Bob van der Poel
On Wed, Nov 22, 2017 at 5:39 PM, Jon Ribbens wrote:
> On 2017-11-21, Daniel Gross wrote:
> > I am new to Python and jumped right into trying to read out (English) text
> > from PDF files.
>
> That's not a trivial task. However I just released pycpdf, which might
> help you out. Check out https

Re: reading text in pdf, some working sample code

2017-11-22 Thread Jon Ribbens
On 2017-11-21, Daniel Gross wrote:
> I am new to Python and jumped right into trying to read out (English) text
> from PDF files.

That's not a trivial task. However, I just released pycpdf, which might help you out. Check out https://github.com/jribbens/pycpdf which shows an example of extracting

Re: reading text in pdf, some working sample code

2017-11-21 Thread dieter
Daniel Gross writes:
> I am new to Python and jumped right into trying to read out (English) text
> from PDF files.
>
> I tried various libraries (including slate)

You could give "pdfminer" a try. Note, however, that it may not be possible to extract the text: PDF is a generic format which works

Re: reading text in pdf, some working sample code

2017-11-21 Thread Paul Moore
I haven't tried it, but a quick Google search found PyPDF2 - https://stackoverflow.com/questions/34837707/extracting-text-from-a-pdf-file-using-python

You don't give much detail about what you tried and how it failed, so if the above doesn't work for you, I'd suggest providing more detail as to wh

reading text in pdf, some working sample code

2017-11-21 Thread Daniel Gross
Hi, I am new to Python and jumped right into trying to read out (English) text from PDF files. I tried various libraries (including slate) out there but am running into diverse problems, such as encoding errors or "buffer too small" errors deep inside some decompression code. Essentially, I want
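
For anyone wondering why the replies call this non-trivial, and where decompression errors like the ones described can come from, here is a rough sketch using only the Python standard library. It is not how pycpdf, pdfminer, PyPDF2, or slate work internally. It assumes the page content sits in Flate-compressed streams and that text is drawn with plain literal strings and the Tj operator. The function name extract_simple_text is made up for this illustration. Real PDFs often use other filters, custom font encodings, hex strings, or TJ arrays, none of which this handles.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of an object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A literal string operand "(...)" followed by the Tj (show text) operator.
TJ_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj", re.DOTALL)
# Backslash escapes for parentheses and the backslash itself.
ESCAPE_RE = re.compile(rb"\\([()\\])")


def extract_simple_text(pdf_bytes):
    """Return the strings shown by Tj operators in Flate-compressed streams.

    Hypothetical helper for illustration only: streams that are not valid
    zlib data (images, other filters) are skipped.
    """
    pieces = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            content = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data, or corrupt: skip this stream
        for raw in TJ_RE.findall(content):
            unescaped = ESCAPE_RE.sub(rb"\1", raw)
            # Assumption: one byte per character; real fonts may differ.
            pieces.append(unescaped.decode("latin-1"))
    return pieces
```

To try it on a file, read the file in binary mode and pass the bytes to extract_simple_text. If this returns nothing or gibberish for a given PDF, that is a hint the file uses one of the features the sketch ignores, and a full library is the better route.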