On 12/17/2009 11:24 AM, Richard Brodie wrote:
A raw string is not a distinct type from an ordinary string
in the same way byte strings and Unicode strings are. It
is merely a notation for constants, like writing integers
in hexadecimal.

(r'\n', u'a', 0x16)
('\\n', u'a', 22)



Yes, that was a mistake.  But the problem remains::

        >>> import re
        >>> re.sub('abc', r'a\nb\n.c\a','123abcdefg') == re.sub('abc', 'a\\nb\\n.c\\a','123abcdefg') == re.sub('abc', 'a\nb\n.c\a','123abcdefg')
        True
        >>> r'a\nb\n.c\a' == 'a\\nb\\n.c\\a' == 'a\nb\n.c\a'
        False

Why are the first two strings being treated as if they are the last one?
That is, why isn't '\\' being processed in the obvious way?
This still seems wrong.  Why isn't it?
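
One way to look at this is to separate two layers of escape processing: the Python string literal, and the replacement template that `re.sub` parses when `repl` is a string. The sketch below is only illustrative. The variable names are invented for this example, and it relies on the documented behaviour that backslash escapes in a string `repl` are processed by `re` itself.

```python
import re

raw = r'a\nb\n.c\a'          # 10 characters, contains real backslashes
doubled = 'a\\nb\\n.c\\a'    # the same 10 characters, written without r''
cooked = 'a\nb\n.c\a'        # 7 characters: newline and bell, no backslash

# Layer 1: the Python literal. raw and doubled are the same string object value;
# cooked is a different string.
assert raw == doubled
assert raw != cooked

# Layer 2: re.sub treats a string repl as a template and converts the
# two-character sequences \n and \a itself. cooked has no backslashes,
# so it is inserted unchanged.
from_raw = re.sub('abc', raw, '123abcdefg')
from_cooked = re.sub('abc', cooked, '123abcdefg')

assert from_raw == from_cooked == '123' + cooked + 'defg'
```

Seen this way, the first two calls receive identical arguments, and the template step turns them into what the third call already contains.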

More simply, consider::

        >>> re.sub('abc', '\\', '123abcdefg')
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "C:\Python26\lib\re.py", line 151, in sub
            return _compile(pattern, 0).sub(repl, string, count)
          File "C:\Python26\lib\re.py", line 273, in _subx
            template = _compile_repl(template, pattern)
          File "C:\Python26\lib\re.py", line 260, in _compile_repl
            raise error, v # invalid expression
        sre_constants.error: bogus escape (end of line)

Why is this the proper handling of what one might think would be an
obvious substitution?
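
For comparison, here is a small sketch of two ways to get a single literal backslash into the result. It assumes the documented rules that a string `repl` is parsed as a template and that a function `repl` has its return value used as is. The names `text`, `doubled`, and `via_func` are made up for the example.

```python
import re

text = '123abcdefg'

# A lone backslash is an incomplete escape at the template level,
# so re refuses it rather than guessing.
try:
    re.sub('abc', '\\', text)
    lone_backslash_failed = False
except re.error:
    lone_backslash_failed = True

# Two backslashes in the template produce one literal backslash.
doubled = re.sub('abc', r'\\', text)

# A function repl is not parsed as a template; its return value is
# inserted unchanged.
via_func = re.sub('abc', lambda m: '\\', text)

assert lone_backslash_failed
assert doubled == via_func == '123\\defg'
```

The function form may be the more convenient one when the replacement text comes from elsewhere and should never be interpreted.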

Thanks,
Alan Isaac