[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-22 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I think it's better to close the ticket as "won't fix". -- resolution: -> wont fix status: open -> closed

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-20 Thread Ori Avtalion
Ori Avtalion added the comment:
> In which specific case did you find the problem you mentioned?
I didn't. I only pointed out the inconsistency. I'm happy with rejecting this bug if it's not seen as a problem.

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Ori Avtalion wrote:
> Ori Avtalion added the comment:
>
> Ignoring the custom utf-8/latin-8 conversion functions, the actual checking if a codec exists is done in Python/codecs.c's PyCodec_Decode.
>
> Is that where I should move the aforementioned o

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Mark Dickinson wrote:
> Mark Dickinson added the comment:
>
> Thanks for the patch.
>
> Rather than remove that optimization entirely, I'd consider pushing it into PyUnicode_Decode.
>
> All tests (whether for the standard library or for the core) g

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Ori Avtalion
Ori Avtalion added the comment: Ignoring the custom utf-8/latin-8 conversion functions, the actual checking if a codec exists is done in Python/codecs.c's PyCodec_Decode. Is that where I should move the aforementioned optimization to? Is it safe to assume that the decoded object is always a st

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson added the comment: And PyUnicode_Decode doesn't look up the encoding in the registry either: that's somewhere in PyCodec_Decode. I'm going to butt out now and leave this to those who know the code better. :)

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson added the comment: I take that back: test_codecs_errors isn't the right function to add these tests to. I actually don't see any current tests for invalid codecs. Part of the problem would be coming up with an invalid codec name in the first place: as I understand it, new c
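One possible shape for such a test, as a rough sketch only and not the test that was proposed on the tracker: pick a candidate name, confirm through `codecs.lookup()` that nothing is currently registered under it, then assert that decoding raises `LookupError`. The helper `unregistered_codec_name`, the class `InvalidCodecTest`, and the base name `no_such_codec_7961` are hypothetical names made up for this illustration. The sketch deliberately uses non-empty input, because the empty-input case is the behaviour under discussion.

```python
import codecs
import unittest


def unregistered_codec_name(base="no_such_codec_7961"):
    """Return a codec name that codecs.lookup() does not currently resolve.

    Hypothetical helper: it probes the registry instead of assuming that
    a hard-coded name can never be registered by someone else.
    """
    name = base
    suffix = 0
    while True:
        try:
            codecs.lookup(name)
        except LookupError:
            return name  # nothing registered under this name
        suffix += 1
        name = f"{base}_{suffix}"


class InvalidCodecTest(unittest.TestCase):
    def test_lookup_unknown_codec(self):
        # The registry itself rejects the name.
        with self.assertRaises(LookupError):
            codecs.lookup(unregistered_codec_name())

    def test_decode_nonempty_bytes_unknown_codec(self):
        # Non-empty input has to reach the codec lookup, so this raises.
        with self.assertRaises(LookupError):
            b"abc".decode(unregistered_codec_name())
```

Such a file could be run with `python -m unittest`.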

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson added the comment: Thanks for the patch. Rather than remove that optimization entirely, I'd consider pushing it into PyUnicode_Decode. All tests (whether for the standard library or for the core) go into Lib/test, so that would be the right place. test_codecs_errors in Lib/t

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Ori Avtalion
Ori Avtalion added the comment: OK. The attached patch removes the empty string check before decoding. I'm not sure where tests should go, since I can only find them in Lib/test/ and this is not a library change. -- keywords: +patch Added file: http://bugs.python.org/file16254/decode

[issue7961] Py3k: decoding empty bytestring with invalid encoding throws no error

2010-02-18 Thread Mark Dickinson
Mark Dickinson added the comment: Specifically, the behaviour comes from an early check for empty strings in the PyUnicode_FromEncodedObject function:

    /* Convert to Unicode */
    if (len == 0) {
        Py_INCREF(unicode_empty);
        v = (PyObject *)unicode_empty;
    }
    else
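Since the ticket was closed as "won't fix", code that wants an unknown encoding name rejected even for empty input can validate the name itself. The following is a minimal Python-level sketch of that idea, not a change to the interpreter; `strict_decode` is a hypothetical helper name introduced here, and the only library calls it relies on are `codecs.lookup()` and `bytes.decode()`.

```python
import codecs


def strict_decode(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Decode `data`, rejecting unknown encoding names even when `data` is empty.

    Hypothetical helper for illustration only.
    """
    # codecs.lookup() raises LookupError for a name that is not in the
    # codec registry, independently of the data being decoded.
    codecs.lookup(encoding)
    return data.decode(encoding, errors)
```

With this wrapper, `strict_decode(b"", "utf-8")` returns an empty string, while an unregistered name raises `LookupError` whether or not the input is empty. The extra registry lookup costs a little on every call, which is the trade-off the original early return avoids.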