On Tue, 13 Feb 2018 13:42:08 +0000, Rhodri James wrote:

> On 13/02/18 13:11, Stanley Denman wrote:
>> I am trying to performance a regex on a "string" of text that python
>> isinstance is telling me is a dictionary. When I run the code I get
>> the following error:
>>
>> {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>>
>> Traceback (most recent call last):
>>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in
>>   <module>
>>     x=MyRegex.findall(MyDict)
>> TypeError: expected string or bytes-like object
>>
>> Here is the "string" of code I am working with:
>>
>> {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>>
>> I want to grab the name "MILANI, JOHN C" and the last date
>> "-mm/dd/yyyy" as a pair such that if I have X numbers of string like
>> the above I will end out with N pairs of values (name and date)/ Here
>> is my code:
>>
>> import PyPDF2,re
>> pdfFileObj=open('x.pdf','rb')
>> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
>> Result=pdfReader.getOutlines()
>> MyDict=(Result[-1][0])
>> print(MyDict)
>> print(isinstance(MyDict,dict))
>> MyRegex=re.compile(r"MILANI,")
>> x=MyRegex.findall(MyDict)
>> print(x)
>
> As the error message says, re.findall() expects a string. A dictionary
> is in no sense a string, so passing it in whole like that won't work.
> If you know that the name will always show up in the title field, you
> can pass just the title:
>
> x = MyRegex.findall(MyDict['/Title'])
>
> Otherwise you will have to loop through all the entries in the
> dictionary:
>
> for entry in MyDict.values():
>     x = MyRegex.findall(entry)
>     # ...and do something with x
>
> I rather suspect you are going to find that the titles aren't in a very
> systematic format, though.
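
If the regex route is kept, here is a minimal sketch of pulling the
(name, last date) pair out of the '/Title' value only. It assumes every
title follows the layout of the one sample quoted above ("Src.: <name>
Tmt. Dt.: <start> - <end>"), which may well not hold for the whole
outline. TITLE_RE and name_and_last_date are hypothetical names, and the
sample dictionary leaves out the '/Page' entry so that the snippet runs
without PyPDF2.

```python
import re

# Hypothetical pattern, assuming titles look like the quoted sample:
# "... Src.: <NAME> Tmt. Dt.: mm/dd/yyyy - mm/dd/yyyy (...)"
TITLE_RE = re.compile(
    r"Src\.:\s*(?P<name>.+?)\s+Tmt\. Dt\.:\s*"
    r"\d{2}/\d{2}/\d{4}\s*-\s*(?P<end>\d{2}/\d{2}/\d{4})"
)


def name_and_last_date(entry):
    """Return (name, end_date) for one outline entry, or None if no match."""
    title = entry.get("/Title")
    # Only search real strings; other values would raise the TypeError
    # shown in the traceback above.
    if not isinstance(title, str):
        return None
    match = TITLE_RE.search(title)
    if match is None:
        return None
    return match.group("name"), match.group("end")


# Sample entry copied from the quoted message ('/Page' omitted).
sample = {
    "/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Type": "/FitB",
}

print(name_and_last_date(sample))  # ('MILANI, JOHN C', '05/28/2014')
```

Returning None for anything that does not match makes it easy to see
which titles break the assumed layout before trusting the results.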

For what purpose are you trying to run this regex anyway? It is almost
certainly the wrong approach for your task.

-- 
Larkinson's Law: All laws are basically false.