Antoine Pitrou <pit...@free.fr> added the comment:

> > The UTF-8 codec described by RFC 2279 didn't say so, so, since our
> > codec was following RFC 2279, it was producing valid UTF-8. With RFC
> > 3629 a number of things changed in a non-backward compatible way.
> > Therefore we couldn't just change the behavior of the UTF-8 codec nor
> > rename it to something else in Python 2. We had to wait till Python 3
> > in order to fix it.
>
> I'm a bit confused on this. You no longer fix bugs in Python 2?

In general, we try not to introduce changes that have a high probability
of breaking existing code, especially when what is being "fixed" is a
minor issue which almost nobody complains about.

This is even truer for stable branches, and Python 2 is very much a
stable branch now (no more feature releases after 2.7).

> That's why I say that you are out of conformance by having encoders and
> decoders of UTF streams tolerate noncharacters. You are not allowed to
> call something a UTF and do non-UTF things with it, because this is in
> violation of conformance requirement C2.

Perhaps, but it is not Python's fault if the IETF and the Unicode
Consortium have disagreed on what UTF-8 should be. I'm not sure what
people called "UTF-8" when support for it was first introduced in
Python, but you can't blame us for maintaining a consistent behaviour
across releases.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12729>
_______________________________________