[issue12177] re.match raises MemoryError

2011-05-28 Thread Matthew Boehm
Matthew Boehm added the comment: Here are some windows results with Python 2.7: >>> import re >>> re.match("()*?1", "1") <_sre.SRE_Match object at 0x025C0E60> >>> re.match("()+?1", "1") >>> re.ma

[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm
New submission from Matthew Boehm: A file opened with codecs.open() splits on a form feed character (\x0c) while a file opened with open() does not. >>> with open("formfeed.txt", "w") as f: ... f.write("line \fone\nline two\n") ... >>> with
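The difference described in this report can be sketched in Python 3 by comparing how open() iterates over lines with how str.splitlines() splits the same text. This is a minimal illustration, not the reporter's original session: the temporary directory is an assumption made for the example, and str.splitlines() stands in here for Unicode-aware line splitting in general.

```python
import os
import tempfile

# Sample text from the report: a form feed (\x0c) inside the first line.
text = "line \fone\nline two\n"

# Write and re-read the file with the built-in open(), which only treats
# "\n", "\r" and "\r\n" as line endings in text mode.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "formfeed.txt")
    with open(path, "w") as f:
        f.write(text)
    with open(path) as f:
        file_lines = list(f)

# str.splitlines() also treats the form feed as a line boundary,
# so the first line is split in two.
split_lines = text.splitlines(True)

print(file_lines)
print(split_lines)
```

The file iteration yields two lines, while str.splitlines() yields three because it breaks at the form feed.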

[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm
Matthew Boehm added the comment: Thanks for explaining the reasoning. Perhaps I should add this to the Python wiki (http://wiki.python.org/moin/Unicode)? It would be nice if it fit in the docs somewhere, but I'm not sure where. I'm curious how (or if) 2to3 would handle this as

[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm
Changes by Matthew Boehm: resolution: -> wont fix, status: open -> closed. Python tracker <http://bugs.python.org/issue12855>

[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm
Matthew Boehm added the comment: I'll suggest a patch for the documentation when I get to my home computer in an hour or two. -- assignee: -> docs@python components: +Documentation -Interpreter Core nosy: +docs@python resolution: wont fix -> status: clo

[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm
Matthew Boehm added the comment: I'm taking a look at the docs now. I'm considering adding a table/list of characters Python treats as newlines, but it seems like this might fit better as a note in http://docs.python.org/library/stdtypes.html#str.splitlines or somewhere else i

[issue12855] linebreak sequences should be better documented

2011-08-29 Thread Matthew Boehm
Matthew Boehm added the comment: I've attached a patch for Python 2.7 that adds a small note to library/stdtypes.html#str.splitlines explaining which sequences are treated as line breaks: """ Note: Python recognizes "\r", "\n", and "\r\n" as

[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm
Matthew Boehm added the comment: I can fix the patch to list all the Unicode line boundaries. The three places I've considered putting it are: 1. On the howto/unicode.html 2. Somewhere in the stdtypes.html#typesseq description (maybe with other notes at the bottom) 3. As a note t
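One way to see which characters count as line boundaries is to probe str.splitlines() directly. The following Python 3 sketch is not part of the attached patch; the helper name line_boundary_code_points and the scan limit of 0x3000 are assumptions chosen to keep the example quick to run.

```python
def line_boundary_code_points(limit):
    """Return code points below `limit` that str.splitlines() splits on."""
    found = []
    for cp in range(limit):
        # A boundary character placed between "a" and "b" yields two pieces.
        if len(("a" + chr(cp) + "b").splitlines()) == 2:
            found.append(cp)
    return found


# Hypothetical limit: scan only the lower part of the Basic Multilingual Plane.
boundaries = line_boundary_code_points(0x3000)
print([hex(cp) for cp in boundaries])
```

Raising the limit to sys.maxunicode + 1 would scan every code point, at the cost of a longer run.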

[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm
Matthew Boehm added the comment: I've attached a patch for 2.7 and will attach one for 3.2 in a minute. I built the docs for both 2.7 and 3.2 and verified that there were no warnings and that the resulting web pages looked okay. Things to consider: * Placement of unicode.splitlines() m

[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm
Changes by Matthew Boehm: Added file: http://bugs.python.org/file23077/linebreakdoc.v2.py32.patch