Re: Can't match str/unicode

2017-01-07 Thread CM
On Sunday, January 8, 2017 at 1:17:56 AM UTC-5, Steven D'Aprano wrote:
> On Sunday 08 January 2017 15:33, CM wrote:
>
> > On Saturday, January 7, 2017 at 7:59:01 PM UTC-5, Steve D'Aprano wrote:
> [...]
> >> Start by printing repr(candidate_text) and see what you really have.
> >
> > Yes, that did

Re: Can't match str/unicode

2017-01-07 Thread Steven D'Aprano
On Sunday 08 January 2017 15:33, CM wrote:
> On Saturday, January 7, 2017 at 7:59:01 PM UTC-5, Steve D'Aprano wrote:
[...]
>> Start by printing repr(candidate_text) and see what you really have.
>
> Yes, that did it. The repr of that one was, in fact:
>
> u'match /r'

Are you sure it is a forwar
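
If the extra characters turn out to be trailing whitespace such as a carriage return (an assumption here; the repr quoted above shows '/r', and the reply questions whether that slash is really a forward slash), one option is to strip the text before comparing. A minimal sketch, with a made-up helper name and a made-up sample value, written so that it runs on Python 2.7 and Python 3:

```python
def is_match(candidate_text, target=u'match'):
    # Hypothetical helper: strip() removes leading and trailing
    # whitespace, including spaces, carriage returns and newlines.
    return candidate_text.strip() == target

# Hypothetical paragraph text ending in a space and a carriage return.
paragraph_text = u'match \r'

print(repr(paragraph_text))         # the repr exposes the trailing characters
print(paragraph_text == u'match')   # raw comparison
print(is_match(paragraph_text))     # comparison after stripping
```

Stripping only helps when the difference is whitespace at the ends of the string; anything else still has to be found by inspecting the repr.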

Re: Can't match str/unicode

2017-01-07 Thread Chris Angelico
On Sun, Jan 8, 2017 at 3:31 PM, CM wrote:
> On Saturday, January 7, 2017 at 6:42:25 PM UTC-5, Chris Angelico wrote:
>
>> What happens if you print the repr of each string? Or, if one of them
>> truly is a literal, just print the repr of the one you got from
>> win32com.
>>
>> ChrisA
>
> Yes, that

Re: Can't match str/unicode

2017-01-07 Thread CM
On Saturday, January 7, 2017 at 7:59:01 PM UTC-5, Steve D'Aprano wrote:
> On Sun, 8 Jan 2017 08:40 am, CM wrote:
>
> > So what's going on here? Why isn't a string with the content 'match' equal
> > to another string with the content 'match'?
>
> You don't know that the content is 'match'. All you

Re: Can't match str/unicode

2017-01-07 Thread CM
On Saturday, January 7, 2017 at 6:42:25 PM UTC-5, Chris Angelico wrote:
> What happens if you print the repr of each string? Or, if one of them
> truly is a literal, just print the repr of the one you got from
> win32com.
>
> ChrisA

Yes, that did it. The repr of that one was, in fact:

u'match /

Re: Can't match str/unicode

2017-01-07 Thread Steve D'Aprano
On Sun, 8 Jan 2017 08:40 am, CM wrote:
> So what's going on here? Why isn't a string with the content 'match' equal
> to another string with the content 'match'?

You don't know that the content is 'match'. All you know is that when printed, it *looks like* 'match'.

Hint:

s = 'match '
print 'mat
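
The point about printed output hiding characters can be sketched as follows. The values and the helper name are invented for illustration and are not taken from the original post; the code is written to run on Python 2.7 and Python 3.

```python
# Hypothetical values: the trailing space is invisible when printed.
candidate_text = u'match '
target = u'match'

print(candidate_text)            # looks the same as the target on screen
print(repr(candidate_text))      # the repr shows the quotes and the trailing space
print(candidate_text == target)  # the comparison sees the difference


def differs_invisibly(a, b):
    # Hypothetical helper: True when a and b are unequal but become
    # equal once leading and trailing whitespace is removed.
    return a != b and a.strip() == b.strip()


print(differs_invisibly(candidate_text, target))
```

Printing the repr before comparing is a quick way to see what the string really contains.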

Re: Can't match str/unicode

2017-01-07 Thread Chris Angelico
On Sun, Jan 8, 2017 at 8:40 AM, CM wrote:
>
> This is candidate_text: match
>
>
> False
>
> and, of course, doesn't enter that "do something" loop since apparently
> candidate_text != 'match'...even though it seems like it does.
>
> So what's going on here? Why isn't a string with the content '

Can't match str/unicode

2017-01-07 Thread CM
This is probably very simple, but I get confused when it comes to encoding and am generally rusty. (What follows is in Python 2.7; I know.) I'm scraping a Word docx using win32com and am just trying to do some matching rules to find certain paragraphs that, for testing purposes, equal the word