On 2:59 PM, flebber wrote:
<snip>
Traceback (most recent call last):
   File "C:/Python26/Pdfread", line 16, in <module>
     open('x.txt', 'w').write(content)
NameError: name 'content' is not defined
When I use:

import pyPdf

def getPDFContent(path):
     content = "C:\Components-of-Dot-NET.txt"
     # Load PDF into pyPDF
     pdf = pyPdf.PdfFileReader(file(path, "rb"))
     # Iterate pages
     for i in range(0, pdf.getNumPages()):
         # Extract text from page and add to content
         content += pdf.getPage(i).extractText() + "\n"
     # Collapse whitespace
     content = " ".join(content.replace(u"\xa0", " ").strip().split())
     return content

print getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
"ignore")
open('x.txt', 'w').write(content)

There's no global variable named content; that name was local to the function, so it's lost when the function exits. The function does return the value, but you hand it straight to print and don't save it anywhere.

data = getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
"ignore")

outfile = open('x.txt', 'w')
outfile.write(data)

outfile.close()

I used a different name to emphasize that this is *not* the same variable as content inside the function. In this case, it happens to have the same value. And if you used the same name, you could be confused about which is which.
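
If it helps, here is a minimal sketch of the same point without pyPdf. The function build_text and the sample strings are made up purely for illustration; only the scoping behaviour matters. It should run the same way on Python 2.6 and later.

```python
def build_text(parts):
    # 'content' is local to build_text; it exists only while the function runs
    content = ""
    for part in parts:
        content += part + "\n"
    # Collapse whitespace, as the original function does
    return " ".join(content.split())

# Save the returned value under a name of our own
data = build_text(["hello", "world"])

# The function's local name is not visible out here
try:
    content
    leaked = True
except NameError:
    leaked = False

print(data)
print(leaked)
```

The returned string survives only because it was assigned to data; looking up content at module level raises the same NameError as in your traceback.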


DaveA

--
http://mail.python.org/mailman/listinfo/python-list
