I am new to Python. I am trying to extract text from the bookmarks in a PDF file to provide the data for a Word template merge. Using the PyPDF2 module, I have gotten as far as a string of text pulled out of the list object that PyPDF2 gave me. I am stuck on how to get the data I need out of that string. I am calling it a string, but Python recognizes it as a dictionary object.
Here is the string:

{'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}

What I want is for the following to end up as fields in my Word template merge:

MedSourceFirstName: "John"
MedSourceLastName: "Milani"
MedSourceLastTreatment: "05/28/2014"

If I use keys() on the dictionary I get this:

['/Title', '/Page', '/Type']

I was hoping "Src." and "Tmt. Dt." would be treated as keys. It seems like the key/value pairs of a dictionary would translate nicely to field name and field data for a Word document merge.

Here is my code so far.

[python]
import PyPDF2

pdfFileObj=open('x.pdf','rb')
pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
MyList=pdfReader.getOutlines()
MyDict=(MyList[-1][0])
print(isinstance(MyDict,dict))
print(MyDict)
print(list(MyDict.keys()))
[/python]

I get this output in Sublime Text:

True
{'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
['/Title', '/Page', '/Type']
[Finished in 0.4s]

Thank you in advance for any suggestions.
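
For reference, here is one possible way the value stored under '/Title' could be split into the three merge fields with a regular expression. This is only a minimal sketch: it assumes every bookmark title follows the same "Src.: LAST, FIRST ... Tmt. Dt.: start - end" layout as the example in this post, and the names TITLE_PATTERN and parse_title are made up for illustration; they are not part of PyPDF2.

```python
import re

# Hypothetical pattern, assuming titles always look like the sample in this post:
# "... Src.: LAST, FIRST <optional middle> Tmt. Dt.: MM/DD/YYYY - MM/DD/YYYY ..."
TITLE_PATTERN = re.compile(
    r"Src\.:\s*(?P<last>[^,]+),\s*(?P<first>\S+)"      # e.g. "MILANI, JOHN"
    r".*?Tmt\. Dt\.:\s*(?P<start>\d{2}/\d{2}/\d{4})"   # first treatment date
    r"\s*-\s*(?P<end>\d{2}/\d{2}/\d{4})"               # last treatment date
)


def parse_title(title):
    """Return a dict of merge fields, or None if the title does not match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    return {
        # str.title() turns "JOHN" into "John"; names such as "McDONALD"
        # would need special handling.
        "MedSourceFirstName": match.group("first").title(),
        "MedSourceLastName": match.group("last").strip().title(),
        "MedSourceLastTreatment": match.group("end"),
    }


# The sample title copied from the dictionary shown in this post.
title = ("1F: Progress Notes Src.: MILANI, JOHN C "
         "Tmt. Dt.: 05/12/2014 - 05/28/2014 (9 pages)")
fields = parse_title(title)
print(fields)
```

With the dictionary from the post, the title would be read with MyDict['/Title'] and passed to parse_title. Returning None for a non-matching title is a design choice in this sketch so that bookmarks with a different layout can be skipped instead of raising an error.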