My previous question asked how to read a file into a structure one line at a time. I figured that out. Now I'm trying to use .find to separate out the PDF objects (see the code below). PROBLEM/QUESTION: my call to lines[i].find does NOT find all instances of endobj. Any help or insights?
#!/usr/bin/python

inputfile = file('sample.pdf','rb')    # This is PDF with which we will work
lines = inputfile.readlines()          # read file one line at a time

linestart = []                         # Starting address for each line
lineend = []                           # Ending address for each line
linetype = []

print len(lines)                       # print number of lines

i = 0                                  # define an iterator, i
addr = 0                               # and address pointer

while i < len(lines):                  # Go through each line
    linestart = linestart + [addr]
    length = len(lines[i])
    lineend = lineend + [addr + (length-1)]
    addr = addr + length
    i = i + 1

i = 0
while i < len(lines):                  # Initialize line types as normal
    linetype = linetype + ['normal']
    i = i + 1

i = 0
while i < len(lines):
    if lines[i].find(' obj') > 0:
        linetype[i] = 'object'
        print "At address ",linestart[i],"object found at line ",i,": ", lines[i]
    if lines[i].find('endobj') > 0:
        linetype[i] = 'endobj'
        print "At address ",linestart[i],"endobj found at line ",i,": ", lines[i]
    i = i + 1
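
A possible explanation, sketched under the assumption that the missed endobj markers sit at the very start of their lines (common in PDF files, but not checked against sample.pdf): str.find returns the index of the match, which is 0 for a match at the start of the line, and -1 when nothing is found. A test of the form find(...) > 0 therefore rejects a match at column 0. The names sample_lines and classify below are made up for illustration, and the data is invented text rather than bytes read from a real PDF.

```python
# Sketch only: sample_lines is invented data, not taken from sample.pdf.
sample_lines = ["1 0 obj\n", "<< /Type /Catalog >>\n", "endobj\n", "trailer\n"]


def classify(line):
    """Return 'endobj', 'object', or 'normal' for one line of text."""
    # str.find returns -1 when the substring is absent and 0 when the
    # substring starts the line, so test with `in` (or `!= -1`), not `> 0`.
    if 'endobj' in line:
        return 'endobj'
    if ' obj' in line:
        return 'object'
    return 'normal'


linetype = [classify(line) for line in sample_lines]

# Lines that contain 'endobj' but that the original `> 0` test would skip.
missed = [line for line in sample_lines
          if 'endobj' in line and not line.find('endobj') > 0]
```

If this assumption holds, changing both comparisons in the loop from > 0 to != -1 (or using the in operator) should be enough. The ' obj' test only worked because an object number always precedes the match on that line.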