On 21.07.2015 04:55, Chris Angelico wrote:
On Tue, Jul 21, 2015 at 12:49 PM, ryguy7272 <ryanshu...@gmail.com> wrote:
I'm trying to copy some Python code from a PDF book that I'm reading.  I want 
to test out the code, and I can copy it, but when I paste it into the Shell, 
everything is all screwed up because of the indentation. Every time I paste in 
any kind of code, it seems like everything is immediately left-justified, and 
then nothing works.

Any idea how to make this work easily?  Without re-typing hundreds of lines of 
code...

Sounds like a flaw in the PDF - it creates indentation in some way
other than leading spaces/tabs.

PDF does not use tabs or spaces for indentation. In a PDF file, each word is typically placed individually by a text-drawing operator, and the gaps between words are just empty areas that your eyes read as spaces when you look at the page. Space characters do exist in fonts, but they are practically never used. Often there are breaks even inside a word, because of kerning corrections. When you copy text, the PDF reader has to guess where the word breaks are and which strings belong together. Acrobat does a good job in general, but fails in this special situation. Sometimes it even fails with a narrow-running font and copies the string without any word breaks.

Laura's method works because pdftotext can reproduce the PDF's visual layout using spaces in its output. An OCR program with good layout detection might also work.
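
To illustrate the idea of rebuilding indentation from positions, here is a rough Python sketch. It is not how pdftotext is actually implemented; the function name layout_text, the (x, y, text) fragment format, the fixed char_width and the assumption that y grows down the page are all made up for this example.

```python
from collections import defaultdict


def layout_text(fragments, char_width=6.0):
    """Rebuild lines of text from individually positioned fragments.

    fragments:  iterable of (x, y, text) tuples (hypothetical format).
                y is assumed to grow down the page in this sketch.
    char_width: assumed width of one character cell, in the same
                units as x (a monospaced font is assumed).
    """
    # Group fragments that share a baseline into the same output line.
    rows = defaultdict(list)
    for x, y, text in fragments:
        rows[y].append((x, text))

    lines = []
    for y in sorted(rows):
        line = ""
        for x, text in sorted(rows[y]):
            # Convert the horizontal position into a character column
            # and pad with spaces until that column is reached. If the
            # column is already passed, the text is simply appended.
            column = round(x / char_width)
            line = line.ljust(column) + text
        lines.append(line)
    return "\n".join(lines)


# Four "words" placed on the page without any space characters.
fragments = [
    (0, 0, "def"), (24, 0, "f(x):"),
    (24, 12, "return"), (66, 12, "x"),
]
print(layout_text(fragments))
```

With these sample positions the second line comes out indented by four spaces, although no space character was ever stored. With the real tool, the -layout option of pdftotext asks for this kind of layout-preserving output.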

Christian

--
https://mail.python.org/mailman/listinfo/python-list
