[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Committed in r72283 and r72285. Thanks! -- resolution: accepted -> fixed status: open -> closed

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Thanks for the update. Functions like PyUnicode_EncodeUTF7() are part of the public C API, therefore their semantics can't be changed lightly. The patch looks ok to me, apart from minor style issues. -- assignee: -> pitrou resolution: -> accepted vers

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread Nick Barnes
Nick Barnes added the comment: This was my first contribution to Python. I don't know what the rules are on changing the arguments of an internal function such as PyUnicode_EncodeUTF7(). Since I was rewriting the whole function anyway, I tried to give it arguments which made more sense with re

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: A quick comment on the patch: it seems to invert (quite futilely, I'd say) the meaning of the arguments given to PyUnicode_EncodeUTF7, which is an incompatible API change. I'm in favour of reworking this patch in order to keep the original API. If I'm not mistaken,

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread STINNER Victor
STINNER Victor added the comment: (oops, I stripped spaces in my last patch) -- Added file: http://bugs.python.org/file13868/issue4426.patch

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file13867/issue4426.patch

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread STINNER Victor
STINNER Victor added the comment: I updated the patch to Python trunk. It was hard because "patch -p1" failed in many places, so please double-check the updated patch. Changes from utf7patch: * I removed the test "c >=0" in DECODE_DIRECT(c) because c type is "Py_UNICODE" and the type is alway

[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread STINNER Victor
STINNER Victor added the comment: Copy of msg76404 from duplicate issue (#4425): '/'.encode('utf7') returns '+AC8-'. It should return '/'. See RFC 2152. '/'.decode('utf7') raises an exception (this is a special case of a general problem with UTF-7 decoding, which I will report as a sepa
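The behaviour described in this comment can be checked with a small sketch like the one below. It assumes a current Python 3 interpreter, where `str.encode` returns `bytes` and decoding is done on `bytes` (the report itself uses Python 2 syntax). The helper name `check_slash` is hypothetical and not part of the tracker discussion.

```python
def check_slash():
    """Return (encoded, decoded) for '/' using the utf-7 codec.

    Assumption: Python 3, so encoding yields bytes and decoding
    takes bytes. '/' is in RFC 2152 Set D, so it should pass
    through unchanged in both directions.
    """
    encoded = "/".encode("utf-7")    # expected to stay a plain slash
    decoded = b"/".decode("utf-7")   # expected not to raise
    return encoded, decoded


if __name__ == "__main__":
    print(check_slash())
```

The base64 form `+AC8-` mentioned in the report is still a legal way to spell `/` on input, so a decoder should accept both.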

[issue4426] UTF7 decoding is far too strict

2008-12-16 Thread Antoine Pitrou
Antoine Pitrou added the comment: I'm not in a position to comment on the encoding algorithm itself but I have a couple of comments: * I get the following compilation warning: Objects/unicodeobject.c: In function ‘PyUnicode_DecodeUTF7Stateful’: Objects/unicodeobject.c:1531: attention : ‘shiftOu

[issue4426] UTF7 decoding is far too strict

2008-12-05 Thread Gabriel Genellina
Changes by Gabriel Genellina <[EMAIL PROTECTED]>: -- nosy: +gagenellina

[issue4426] UTF7 decoding is far too strict

2008-12-01 Thread Nick Barnes
Nick Barnes <[EMAIL PROTECTED]> added the comment: Here is my patch. This is a rewrite of the UTF7 encoder and decoder. It now handles surrogate pairs correctly, so non-BMP characters work with this codec. And my motivating example ('/'.decode('utf7')) works OK. I'm not totally confident of t
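As a rough illustration of the surrogate-pair point, a non-BMP character should survive a round trip through the codec. This is only a sketch assuming Python 3; the helper `utf7_roundtrip` and the sample string are hypothetical and do not come from the patch.

```python
def utf7_roundtrip(text):
    """Encode text to UTF-7 and decode it back (Python 3 assumed)."""
    return text.encode("utf-7").decode("utf-7")


if __name__ == "__main__":
    # U+1F600 lies outside the BMP, so UTF-7 carries it as a
    # base64-encoded UTF-16 surrogate pair.
    sample = "a\U0001F600b"
    print(utf7_roundtrip(sample) == sample)
```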

[issue4426] UTF7 decoding is far too strict

2008-12-01 Thread Nick Barnes
Nick Barnes <[EMAIL PROTECTED]> added the comment: My original defect report here was incorrect, or possibly only relates to a particular older Python installation. It is still the case that UTF-7 decoding is fussier than it need be (decoding should be permissive), and is broken specifically for

[issue4426] UTF7 decoding is far too strict

2008-11-29 Thread Antoine Pitrou
Changes by Antoine Pitrou <[EMAIL PROTECTED]>: -- nosy: +pitrou

[issue4426] UTF7 decoding is far too strict

2008-11-27 Thread Nick Barnes
Nick Barnes <[EMAIL PROTECTED]> added the comment: I'll try to get to this next week. Right now I'm snowed under. I don't promise to do any refactoring.

[issue4426] UTF7 decoding is far too strict

2008-11-26 Thread Marc-Andre Lemburg
Marc-Andre Lemburg <[EMAIL PROTECTED]> added the comment:
On 2008-11-25 19:56, Nick Barnes wrote:
> Nick Barnes <[EMAIL PROTECTED]> added the comment:
>
> Well, I could submit a diff for unicodeobject.c, but I have never
> contributed to Python (or used this particular tracking system) before.
>

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment:
> Is there a standard form for contributing changes? Unified diff?
Attach a patch file (unified diff, yes).

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
Nick Barnes <[EMAIL PROTECTED]> added the comment: Well, I could submit a diff for unicodeobject.c, but I have never contributed to Python (or used this particular tracking system) before. Is there a standard form for contributing changes? Unified diff?

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg <[EMAIL PROTECTED]> added the comment:
On 2008-11-25 12:11, Nick Barnes wrote:
> New submission from Nick Barnes <[EMAIL PROTECTED]>:
>
> UTF-7 decoding raises an exception for any character not in the RFC2152
> "Set D" (directly encoded characters). In particular, it raises

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
Nick Barnes <[EMAIL PROTECTED]> added the comment:
# Note, this test covers issues 4425 and 4426
# Direct encoded characters:
set_d = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'(),-./:?"
# Optional direct characters:
set_o = '!"#$%&*;<=>@[]^_`{|}'
all((c.encode('utf7') ==
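The test above is cut off in this archive, so the following is only a sketch of a check in the same spirit, not a reconstruction of the original. It assumes Python 3 (bytes in, str out). The helper names `decodes_directly` and `encodes_directly` are hypothetical.

```python
# RFC 2152 Set D (directly encoded characters), copied from the message above.
SET_D = ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
         "0123456789'(),-./:?")
# RFC 2152 Set O (optional direct characters), copied from the message above.
SET_O = '!"#$%&*;<=>@[]^_`{|}'


def decodes_directly(chars):
    """True if every character decodes to itself from its ASCII byte."""
    return all(c.encode("ascii").decode("utf-7") == c for c in chars)


def encodes_directly(chars):
    """True if every character encodes to its own ASCII byte."""
    return all(c.encode("utf-7") == c.encode("ascii") for c in chars)


if __name__ == "__main__":
    # A permissive decoder accepts both sets as literal characters.
    print(decodes_directly(SET_D + SET_O))
    # An encoder must emit Set D directly; Set O is left to its discretion.
    print(encodes_directly(SET_D))
```

Set O is only checked on the decoding side because RFC 2152 lets an encoder choose whether to emit those characters directly or in base64.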

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment:
Can you write some tests to help fix this issue? Stupid example (I don't know UTF-8 encoding):
>>> all((byte.encode("utf-7") == byte) for byte in '<=>[]@')
>>> all((byte.decode("utf-7") == byte) for byte in '<=>[]@')
-- nosy: +hay

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
New submission from Nick Barnes <[EMAIL PROTECTED]>: UTF-7 decoding raises an exception for any character not in the RFC2152 "Set D" (directly encoded characters). In particular, it raises an exception for characters in "Set O" (optional direct characters), such as < = > [ ] @ etc. These charac