On 2:59 PM, flebber wrote:
<snip>
Traceback (most recent call last):
  File "C:/Python26/Pdfread", line 16, in <module>
    open('x.txt', 'w').write(content)
NameError: name 'content' is not defined
When I use:
import pyPdf

def getPDFContent(path):
    content = "C:\Components-of-Dot-NET.txt"
    # Load PDF into pyPDF
    pdf = pyPdf.PdfFileReader(file(path, "rb"))
    # Iterate pages
    for i in range(0, pdf.getNumPages()):
        # Extract text from page and add to content
        content += pdf.getPage(i).extractText() + "\n"
    # Collapse whitespace
    content = " ".join(content.replace(u"\xa0", " ").strip().split())
    return content

print getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
    "ignore")
open('x.txt', 'w').write(content)
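There's no global variable named content; that name was local to the
function, so it's lost when the function exits. The function does return
the value, but you hand it straight to print and don't save it anywhere.
Save the return value in a variable of your own and write that instead: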
data = getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
    "ignore")
outfile = open('x.txt', 'w')
outfile.write(data)
outfile.close()
I used a different name to emphasize that this is *not* the same
variable as content inside the function. In this case, it happens to
hold the value the function returned, but if you used the same name you
could be confused about which is which.
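
To see the scoping point in isolation, here is a minimal sketch that
does not need pyPdf at all. The names get_content, leaked and out_path
are made up for illustration, and the file goes to a temporary directory
rather than your x.txt. It also uses a with block, which is one way to
make sure the file gets closed.

```python
import os
import tempfile


def get_content():
    # 'content' is local to this function; the name disappears on return.
    content = "page one\npage two"
    return content


# Keep the returned value under a name of our own choosing.
data = get_content()

# The function's local name is not visible at module level.
try:
    content
    leaked = True
except NameError:
    leaked = False

# Write the saved value; 'with' closes the file even if write() raises.
out_path = os.path.join(tempfile.mkdtemp(), "x.txt")
with open(out_path, "w") as outfile:
    outfile.write(data)
```

Looking up content outside the function raises the same NameError as in
your traceback, while data holds the returned text and can be written
out.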
DaveA