Daniel Gross <gross...@gmail.com> writes:
> I am new to python and jumped right into trying to read out (english) text
> from PDF files.
>
> I tried various libraries (including slate)

You could give "pdfminer" a try.

Note, however, that it may not be possible to extract the text: PDF is a generic format that works by mapping character codes to glyphs (i.e. visual symbols). If your PDF uses a special map for this (especially with non-standard glyph collections, aka "fonts"), then the text extraction, which in fact extracts sequences of character codes, can give unusable results.
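
As a rough sketch of what that could look like in practice: the code below assumes the maintained "pdfminer.six" fork and its `extract_text` helper in `pdfminer.high_level` (the original pdfminer package may differ). The `looks_like_english` check, its name, and the 0.8 threshold are my own illustration, not part of pdfminer. It also assumes that unmapped glyphs show up as "(cid:N)" placeholders, which is what I have seen pdfminer produce but may not hold for every file.

```python
import re
import string

# Placeholder text assumed to appear when a glyph has no Unicode mapping.
CID_PLACEHOLDER = re.compile(r"\(cid:\d+\)")

# Characters one would expect to dominate plain English text.
EXPECTED = set(string.ascii_letters + string.digits + string.punctuation)


def looks_like_english(text, threshold=0.8):
    """Crude heuristic (hypothetical helper): True if at least `threshold`
    of the non-whitespace symbols are ordinary ASCII characters.
    Each "(cid:N)" placeholder counts as one unusable symbol."""
    placeholders = len(CID_PLACEHOLDER.findall(text))
    remaining = CID_PLACEHOLDER.sub("", text)
    chars = [c for c in remaining if not c.isspace()]
    total = len(chars) + placeholders
    if total == 0:
        return False  # nothing extracted at all
    good = sum(c in EXPECTED for c in chars)
    return good / total >= threshold


def extract_pdf_text(path):
    """Extract text from the PDF at `path` using pdfminer.six."""
    # Imported here so the heuristic above works without pdfminer installed.
    from pdfminer.high_level import extract_text
    return extract_text(path)
```

You would then call something like `looks_like_english(extract_pdf_text("some.pdf"))`, where "some.pdf" stands for your own file. If it returns False, the PDF probably uses one of the special maps mentioned above, and trying a different library is unlikely to help much.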