[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
Changes by STINNER Victor : Added file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch ___ Python tracker <http://bugs.python.org/issue9425> ___ ___ Python-bug

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
STINNER Victor added the comment: Py_UNICODE_strcat.patch: create Py_UNICODE_strcat() function. Py_UNICODE_strdup.patch: create Py_UNICODE_strdup() function. -- Added file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch ___ Python tracker

[issue9713] Py_CompileString fails on non decode-able paths.

2010-08-31 Thread STINNER Victor
STINNER Victor added the comment: The problem is not specific to Py_CompileString(): all functions based (indirectly) on PyParser_ASTFromString() and PyParser_ASTFromFile() expect filenames encoded in utf-8 with the strict error handler. If we choose to use something else than utf-8 in

[issue9713] Py_CompileString fails on non decode-able paths.

2010-08-31 Thread STINNER Victor
Changes by STINNER Victor : -- components: +Unicode -None versions: +Python 3.2 ___ Python tracker <http://bugs.python.org/issue9713> ___ ___ Python-bugs-list m

[issue1552880] Unicode Imports

2010-08-31 Thread STINNER Victor
STINNER Victor added the comment: The utf-8 codec (in strict mode) rejects surrogates in Python 3, so you don't support undecodable filenames (filenames decoded using the surrogateescape error handler, which produces surrogate characters). It may be possible if you use surrogateescape every
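
The point about strict utf-8 can be illustrated with a minimal Python 3 sketch. The helper name `encodes_strictly` is hypothetical and only serves the illustration; the sample bytes are made up.

```python
# A name containing one undecodable byte (0xff), decoded with the
# surrogateescape error handler: the byte becomes a lone surrogate.
name = b'abc\xff'.decode('utf-8', 'surrogateescape')


def encodes_strictly(text, encoding='utf-8'):
    """Return True if text can be encoded with the default strict handler."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Strict utf-8 refuses the lone surrogate...
strict_ok = encodes_strictly(name)
# ...while surrogateescape restores the original bytes.
roundtrip = name.encode('utf-8', 'surrogateescape')
```

Under these assumptions, any code path that encodes filenames with strict utf-8 fails on such names, whereas using surrogateescape on both the decode and encode side preserves them.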

[issue9549] Remove sys.setdefaultencoding()

2010-09-01 Thread STINNER Victor
STINNER Victor added the comment: Ok to remove it from Python 3.2. I don't think that it is necessary to update Python 2.7 code/doc. -- ___ Python tracker <http://bugs.python.org/i

[issue1552880] Unicode Imports

2010-09-01 Thread STINNER Victor
STINNER Victor added the comment: > According to the Unicode standard the high and low surrogate halves used > by UTF-16 (...) Yes, but in Python, the U+DC80..U+DCFF range is used to store undecodable bytes. E.g. b'abc\xff'.decode('ascii', 'surrogateescape')
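
As a small sketch of the mapping described in this comment (Python 3, with the hypothetical helper `escape_byte`), each byte in the 0x80..0xFF range that the ascii codec cannot decode is stored as the code point U+DC00 plus the byte value:

```python
def escape_byte(byte):
    """Decode a single byte with ascii + surrogateescape."""
    return bytes([byte]).decode('ascii', 'surrogateescape')


# Map every non-ASCII byte value to the code point it is stored as.
mapping = {byte: ord(escape_byte(byte)) for byte in range(0x80, 0x100)}
```

The resulting dictionary covers exactly the U+DC80..U+DCFF range, which is why those lone surrogates can later be turned back into the original bytes.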

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-01 Thread STINNER Victor
New submission from STINNER Victor : Many C functions have a bytes argument (char* type) but the encoding is not documented. It would not be a problem if the encoding was always the same, but it is not. Examples: - format of PyUnicode_FromFormat() should be encoded as ISO-8859-1 - filename of

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
STINNER Victor added the comment: r84429 creates Py_UNICODE_strcat() (change with the patch: return the right value). r84430 creates PyUnicode_strdup() (change with the patch: rename the function from Py_UNICODE_strdup() to PyUnicode_strdup() and mangle the function name

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch ___ Python tracker <http://bugs.python.org/issue9425> ___ ___ Python-bug

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch ___ Python tracker <http://bugs.python.org/issue9425> ___ ___ Python-bug

[issue7077] SysLogHandler can't handle Unicode

2010-09-02 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue7077> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue9756] Crash with custom __getattribute__

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: << I found this crash while playing with proxies (thanks haypo). http://code.activestate.com/recipes/496741-object-proxying/ >> My question was: why does isinstance(Proxy('abc'), str) work (give True), whereas re.match('abc'

[issue9756] Crash with custom __getattribute__

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: >>> class Spam(object): ... def __getattribute__(self, name): ... if name == '__class__': ... return str ... raise AttributeError ... >>> spam = Spam('spam') >>> isinstance(s

[issue8678] crashers in rgbimg

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: I am able to reproduce the crash with z > 4: # (magic, type (rle, bpp), dim, x, y, z) open('image', 'wb').write(struct.pack('>hh', 0732, 1, 1, 1, 1, 10)) rgbimg.longimagedata('image') -- But not

[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

2010-09-03 Thread STINNER Victor
New submission from STINNER Victor : I'm trying to document the encoding of all bytes argument of the C API: see #9738. I tried to understand which encoding is used by PyUnicode_FromFormat*() (and PyErr_Format() which calls PyUnicode_FromFormatV()). It looks like ISO-8859-1

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: About PyErr_Format() and PyUnicode_FromFormat*() encoding: it's not exactly ISO-8859-1... there is a bug => issue #9769. -- ___ Python tracker <http://bugs.python.or

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: Another possibility is to use _Py_char2wchar() + PyUnicode_FromWideChar() / _Py_wchar2char() + PyUnicode_AsWideChar() to decode / encode filenames. These functions use the locale encoding. This solution was possible in Python 3.1, but no more in Python 3.2

[issue9632] Remove sys.setfilesystemencoding()

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: > In such environments you cannot expect the user to configure the > system properly (i.e. set an environment variable). Why would it be different for embedded Python? > Instead, the application has to provide an educated guess > to the Python in

[issue9756] Crash with custom __getattribute__

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: PyUnicode_Check(op) checks op->ob_type->tp_flags & Py_TPFLAGS_UNICODE_SUBCLASS. -- ___ Python tracker <http://bugs.python.

[issue9756] Crash with custom __getattribute__

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: I have different questions: - Should we trust PyObject_IsInstance() or PyUnicode_Check() (because they give different results)? - Should PyObject_IsInstance() and PyUnicode_Check() give the same result? - Should we fix the segfault? To fix the segfault, I
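
The disagreement between the two checks can be sketched at the Python level. This is only an analogue: `issubclass(type(obj), str)` is used here as an assumed stand-in for what PyUnicode_Check() does in C (it looks at the real type, not at `__class__`), and the class below is a simplified version of the one in the report.

```python
class Spam:
    """Object that lies about its class through __getattribute__."""

    def __getattribute__(self, name):
        if name == '__class__':
            return str
        raise AttributeError(name)


spam = Spam()

# isinstance() falls back to the __class__ attribute, so it is fooled.
claims_to_be_str = isinstance(spam, str)

# type() ignores __getattribute__ and reports the real type.
really_is_str = issubclass(type(spam), str)
```

Code that trusts the first answer and then accesses the object as a real string is the kind of mismatch that the questions above are about.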

[issue1552880] [Python2] Use utf-8 in the import machinery on Windows to support unicode paths

2010-09-03 Thread STINNER Victor
STINNER Victor added the comment: Oh, I didn't see that the issue was specific to Python2. I updated the issue's title. If I understood correctly, the issue is also specific to Windows. Do you know if your patch changes the public API? (break the compatibility) -- FYI abo

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-05 Thread STINNER Victor
STINNER Victor added the comment: > Do we really want to support this kind of configuration? There is also a problem if the directory name is b'py3k\xe9': at startup (utf-8 encoding), the name is decoded to 'py3k\udce9'. When the locale encoding is set to iso-885
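
A rough sketch of what "redecoding" such a name could look like in pure Python follows. The function `redecode` is hypothetical and is not the code of the patch; it only assumes that both steps use the surrogateescape error handler.

```python
def redecode(name, old_encoding, new_encoding):
    """Re-interpret a decoded filename under another encoding."""
    # Get the original bytes back, then decode them again.
    raw = name.encode(old_encoding, 'surrogateescape')
    return raw.decode(new_encoding, 'surrogateescape')


# At startup the directory name is decoded from utf-8.
startup_name = b'py3k\xe9'.decode('utf-8', 'surrogateescape')

# Once the real encoding is known, the same bytes give a different string.
final_name = redecode(startup_name, 'utf-8', 'iso-8859-1')
```

With these inputs the startup name contains a lone surrogate, while the redecoded name contains the intended 'é' character.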

[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

2010-09-07 Thread STINNER Victor
STINNER Victor added the comment: > PyUnicode_FromFormat("%s", text) expects a utf-8 buffer. Really? I don't see how "*s++ = *f;" (where s is Py_UNICODE* and f is char*) can decode utf-8. It looks more like ISO-8859-1. > Very recently (r84472, r84485), some
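
The argument that a byte-by-byte copy behaves like ISO-8859-1 can be checked with a short Python sketch. It assumes each byte is treated as an unsigned value, which the C snippet quoted above does not guarantee; the helper `copy_bytewise` is made up for the illustration.

```python
def copy_bytewise(data):
    """Mimic a loop that copies each byte into one character."""
    return ''.join(chr(byte) for byte in data)


utf8_bytes = 'é'.encode('utf-8')           # two bytes: 0xC3 0xA9

copied = copy_bytewise(utf8_bytes)
as_latin1 = utf8_bytes.decode('iso-8859-1')
as_utf8 = utf8_bytes.decode('utf-8')
```

The copied result matches the ISO-8859-1 decoding (two characters) rather than the utf-8 decoding (one character), which is consistent with the observation in the comment.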

[issue9632] Remove sys.setfilesystemencoding()

2010-09-07 Thread STINNER Victor
STINNER Victor added the comment: About "embedded Python interpreters or py2exe-style applications": do you mean that the application calls a C function to set the encoding before starting the interpreter? Or you mean the Python function, sys.setfilesystemencoding()? I would like

[issue9632] Remove sys.setfilesystemencoding()

2010-09-07 Thread STINNER Victor
STINNER Victor added the comment: "keep the C function" Hum, currently, Python3 only has a *private* function called _Py_SetFileSystemEncoding() which can only be called after _Py_InitializeEx() (because it relies on the codecs API). If you consider that there is a real use case,

[issue4947] sys.stdout fails to use default encoding as advertised

2010-09-08 Thread STINNER Victor
STINNER Victor added the comment: I committed my patch (with a new test, iso-8859-1:replace) to 2.7: r84621. I will not backport to 2.6 because this branch now only accepts security fixes. -- resolution: -> fixed status: open -> closed ___

[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

2010-09-08 Thread STINNER Victor
STINNER Victor added the comment: > My remark is that utf-8 tend to be applied to all kind of files; > if someone once decide that non-ascii chars are allowed in (some) > string constants, they will be stored in utf-8. In this case, it will be better to raise an error on non-a

[issue9804] ascii() does not always join surrogate pairs

2010-09-08 Thread STINNER Victor
STINNER Victor added the comment: For unicode, ascii(x) is implemented as repr(x).encode('ascii', 'backslashreplace').decode('ascii'). repr(x) is "'" + x + "'" for printable characters (eg. U+1D121), and "'U+%08x'
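
The equivalence stated in this comment can be written down as a small sketch. The function name `ascii_via_repr` is hypothetical, and the comparison is only expected to hold on a current CPython 3, which no longer has the narrow builds that this issue is about.

```python
def ascii_via_repr(obj):
    """Rebuild ascii(obj) from repr(obj) as described in the comment."""
    return repr(obj).encode('ascii', 'backslashreplace').decode('ascii')


samples = ['abc', 'a\nb', '\x85', 'é', '\U0001d121', '\ud800']
results = [(ascii_via_repr(s), ascii(s)) for s in samples]
```

Each pair in `results` can then be compared to see whether the two spellings agree for normal characters, control characters, non-ASCII characters and lone surrogates.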

[issue9804] ascii() does not always join surrogate pairs

2010-09-08 Thread STINNER Victor
STINNER Victor added the comment: > >>> s = "'\0\"\n\r\t abcd\x85é\U00012fff\U0001D121xxx\uD800." > (...) > (I think I've included everything: > - normal chars > - control chars > - one-byte non-ASCII > - two-byte non-ASCII (and lone sur

[issue9738] Document the encoding of functions bytes arguments of the C API

2010-09-09 Thread STINNER Victor
STINNER Victor added the comment: #6543 changed the encoding of the filename argument of PyRun_SimpleFileExFlags() (and all functions based on PyRun_SimpleFileExFlags) and c_filename attribute of the compiler (private) structure in Python 3.1.3: use utf-8 in strict mode instead of filesystem

[issue9713] Py_CompileString fails on non decode-able paths.

2010-09-09 Thread STINNER Victor
STINNER Victor added the comment: #6543 changed the encoding of the filename argument of PyRun_SimpleFileExFlags() (and all functions based on PyRun_SimpleFileExFlags) and c_filename attribute of the compiler (private) structure in Python 3.1.3: use utf-8 in strict mode instead of filesystem

[issue8611] Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX)

2010-09-09 Thread STINNER Victor
STINNER Victor added the comment: See also #9713 (Py_CompileString fails on non decode-able paths) and #9738 (Document the encoding of functions bytes arguments of the C API). -- ___ Python tracker <http://bugs.python.org/issue8

[issue9813] Module Name Changed

2010-09-09 Thread STINNER Victor
STINNER Victor added the comment: Do you think that it is a Python bug? You should first try to report a bug on the eGenix bug tracker: http://www.egenix.com/services/support/ -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue9

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > WARNING: The filename '@test_464_tmp-共有される' CAN be encoded > by (...) cp932 We should find a character not encodable in any Windows code page, but accepted in filenames. > characters like "\u2661" or "\u2668" (..

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-10 Thread STINNER Victor
New submission from STINNER Victor : In Python 3.2, mbcs encoding (default filesystem encoding on Windows) is now strict: raise an error on unencodable/undecodable characters/bytes. But os.listdir(b'.') encodes unencodable bytes as b'?'. Example: >>> os.mkdir

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: I found this bug while trying to find an unencodable filename for #9819 (TESTFN_UNDECODABLE). Anyway, the bytes API should be avoided on Windows since Windows native filename type is unicode. -- ___ Python

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: See also #9820. -- ___ Python tracker <http://bugs.python.org/issue9819> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue9821] Support PEP 383 on Windows: mbcs support of surrogateescape error handler

2010-09-10 Thread STINNER Victor
New submission from STINNER Victor : It would be nice to support PEP 383 (surrogateescape) on Windows, but the mbcs codec doesn't support it for performance reason. The Windows functions to encode/decode MBCS don't give the index of the unencodable/undecodable character/byte. For en

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > os.listdir(b'listdir') should raise an error (and not ignore > the filename or replaces unencodable characters by b'?'). To avoid the error, a solution is to support the PEP 383 on Windows (for the mbcs encoding). I opened a separ

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > "dir" command cannot print filename correctly, though. Who cares? We just have to be able to create a file with a name containing non encodable characters, list the directory, and then remove this evil file. -- With r84666, Python uses

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file18823/find_unencode_filename.py ___ Python tracker <http://bugs.python.org/issue9819> ___ ___ Pytho

[issue9821] Support PEP 383 on Windows: mbcs support of surrogateescape error handler

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Oh wait. PEP 383 is a solution to store undecodable bytes in an unicode string, but for mbcs I'm trying to get the opposite: store unicode in bytes and this is not possible (at least with PEP 383). Example with Python 3.1: >>> print("abcŁ

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > With r84666, Python uses "-\u5171\u6709\u3055\u308c\u308b" > suffix for TESTFN_UNENCODABLE. Backported to 3.1 as r84668. I don't want to patch Python 2.x (its unicode support is lower and the code is too different than Python3) and

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Patch: - Remove the bytes version of listdir(): reuse the unicode version but convert the filename to bytes using PyUnicode_EncodeFSDefault() if the directory name is not unicode - use Py_XDECREF(d) instead of Py_DECREF(d) at the end (because d=NULL on

[issue9821] Support PEP 383 on Windows: mbcs support of surrogateescape error handler

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Close this issue: PEP 383 is specific to filesystem using bytes, it is useless on Windows (the problem on Windows is on encoding, not on decoding). -- resolution: -> invalid status: open -> closed ___

[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: @amaury: Do you agree to reject non-ascii bytes? TODO: document format encoding in Doc/c-api/*.rst. -- ___ Python tracker <http://bugs.python.org/issue9

[issue9632] Remove sys.setfilesystemencoding()

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: I didn't propose to add a new parameter to Py_InitializeEx() (which would mean creating a new function to not break the API), I just wrote that _Py_SetFileSystemEncoding() doesn't work for your use case. > If you embed Python into another applic

[issue8603] Create a bytes version of os.environ and getenvb()

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > If you still consider that the change on .data as a bug, > I think that the fix is to remove .data (mark it as > protected: environ.data => environ._data). r84690 marks os.environ.data as protected. Close this issue again. --

[issue9402] pyexpat: replace PyObject_DEL() by Py_DECREF() to fix a crash in pydebug mode

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Fixed by r84692. -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue9402> ___ __

[issue8197] Fatal error on thread creation in low memory condition: local storage

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: I don't know how to fix this issue, and I don't know if it can be fixed. As the issue is very unlikely, I prefer to close it. -- resolution: -> wont fix status: open -> closed ___ Pytho

[issue7093] xmlrpclib.ServerProxy() doesn't support unicode uri

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Well, it was trivial to workaround this bug in my application (convert host to bytes using explicit host = str(host)). Python3 doesn't have this issue and Python 2.7 is released, I prefer to close this bug as wont fix. -- resolution: -> fixe

[issue7093] xmlrpclib.ServerProxy() doesn't support unicode uri

2010-09-10 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: fixed -> wont fix ___ Python tracker <http://bugs.python.org/issue7093> ___ ___ Python-bugs-list mailing list Un

[issue5934] fix gcc warnings: explicit type conversion for uid/gid in posix

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Well, it's not a bug, just a gcc warning. We don't need this patch. -- resolution: -> wont fix status: open -> closed ___ Python tracker <http://bugs.p

[issue9408] curses: Link against libncursesw instead of libncurses

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Does anyone agree with me? -- ___ Python tracker <http://bugs.python.org/issue9408> ___ ___ Python-bugs-list mailing list Unsub

[issue8589] test_warnings.CEnvironmentVariableTests.test_nonascii fails under an ascii terminal

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: It should be fixed by r84694. -- status: open -> closed ___ Python tracker <http://bugs.python.org/issue8589> ___ ___ Python-

[issue5016] FileIO.seekable() can return False

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: It is fixed in 2.7 with the backport of the Python3's io library (r73394). -- resolution: accepted -> fixed status: open -> closed ___ Python tracker <http://bugs.python

[issue6011] python doesn't build if prefix contains non-ascii characters

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: New patch: - add encoding option to TextFile constructor - parse_makefile() uses the heuristic from text_file.diff Note: sys.getfilesystemencoding() is always set in Python 3.2 (but it may be None in Python 2.x and Python < 3.2). -- Added f

[issue9561] distutils: set encoding to utf-8 for input and output files

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: I attached a patch to #6011 to set the encoding to read the Makefile. -- ___ Python tracker <http://bugs.python.org/issue9

[issue9579] In 3.x, os.confstr() returns garbage if value is longer than 255 bytes

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Fixed in r84696+r84697: confstr-minimal.diff + PyUnicode_DecodeFSDefaultAndSize(). -- ___ Python tracker <http://bugs.python.org/issue9

[issue9579] In 3.x, os.confstr() returns garbage if value is longer than 255 bytes

2010-09-10 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue9579> ___ ___ Python-bugs-list

[issue9580] os.confstr() doesn't decode result according to PEP 383

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Fixed in r84696+r84697: confstr-minimal.diff from #9579 + PyUnicode_DecodeFSDefaultAndSize(). Thanks for the patch, sorry for the delay. -- resolution: -> duplicate status: open -> closed ___ Python t

[issue9580] os.confstr() doesn't decode result according to PEP 383

2010-09-10 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: duplicate -> fixed ___ Python tracker <http://bugs.python.org/issue9580> ___ ___ Python-bugs-list mailing list Un

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: test_pep277.patch removes the usage of os.path.supports_unicode_filenames from test_pep277: the test still passes on Debian Sid (Linux). Can someone test the patch on Mac OS X, FreeBSD and Solaris (and maybe other POSIX/UNIX OSes)? About Windows

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Oops, forget test_pep277.patch: I misunderstood r81149 (new way to detect if the filesystem supports unicode or not). test_pep277 fails with my patch on Linux with LC_CTYPE=C. -- ___ Python tracker <h

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: r84701 fixes supports_unicode_filenames's definition in Python 3.2 (and r84702 in Python 3.1): os.listdir(str) now always returns unicode filenames (including non-ascii characters). -- ___ Python tracker

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > Maybe os.path.supports_unicode_filenames should be deprecated. > The doc currently says: > "True if arbitrary Unicode strings can be used as file names > (within limitations imposed by the file system), and if os.listdir() > returns U

[issue9408] curses: Link against libncursesw instead of libncurses

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: $ ldd $(/usr/bin/python3.1 -c 'import readline; print(readline.__file__)')|grep curses libncurses.so.5 => /lib/libncurses.so.5 (0xb7537000) $ ldd /lib/libreadline.so.6|grep curses libncurses.so.5 => /lib/libncurses.so.5 (0xb76a6

[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: Fixed by r84704 in Python 3.2. -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
STINNER Victor added the comment: > How about TESTFN_UNICODE (test_unicode_file) issue? File "e:\python-dev\py3k\lib\test\test_unicode_file.py", line 12, in TESTFN_UNICODE.encode(TESTFN_ENCODING) UnicodeEncodeError: 'mbcs' codec can't encode character

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-10 Thread STINNER Victor
Changes by STINNER Victor : Added file: http://bugs.python.org/file18845/unicode_file.patch ___ Python tracker <http://bugs.python.org/issue9819> ___ ___ Python-bug

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-11 Thread STINNER Victor
STINNER Victor added the comment: > Thank you, your patch works. Ok, patch committed to 3.2 as r84710. Thanks for your feedback. -- ___ Python tracker <http://bugs.python.org/iss

[issue8589] test_warnings.CEnvironmentVariableTests.test_nonascii fails under an ascii terminal

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: > Still happens with r84709 on PPC Tiger 3.x It's not the same error, PYTHONWARNINGS is decoded from the wrong encoding: locale encoding instead of utf-8. r84731 should fix this bug (at least, it restores the encoding used because my last commit

[issue9836] Refleak in PyUnicode_FormatV

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: Fixed by r84730, thanks for the issue. -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: > What do you gain with this patch? (i.e. what is its advantage?) You know directly that os.listdir(bytes) is unable to encode the filename, instead of manipulating an invalid filename (b'?') and getting the error later (when you use the filenam

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: > FindFirst/NextFileA will also do some other interesting conversions, > such as the best-fit conversion (which the "mbcs" code doesn't do > (anymore?)). About mbcs, mbcs codec of Python 3.1 is like .encode('mbcs', 'repl

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: It reminds me of the discussion of the issue #3187. About unencodable filenames, Guido proposed to ignore them or to use errors="replace", and wrote "Failing the entire os.listdir() call is not acceptable". (... long discussion ...) And

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-12 Thread STINNER Victor
STINNER Victor added the comment: > FindFirst/NextFileA will also do some other interesting conversions, > such as the best-fit conversion (which the "mbcs" code doesn't do > (anymore?)). If we choose to keep this behaviour, I will have to revert my commit on mbcs cod

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-13 Thread STINNER Victor
STINNER Victor added the comment: > I fail to see why removing incorrect file names from the result > list is any better than keeping them. The result list will > be incorrect either way. It depends if you focus on displaying the content of the directory, or on processing

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-13 Thread STINNER Victor
STINNER Victor added the comment: > I think trying to emulate, in Python, what the *A functions > do is futile. My problem is that some functions will use mbcs in strict mode (functions using PyUnicode_EncodeFSDefault): raise UnicodeEncodeError, and others will use mbcs in replac

[issue9820] Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames

2010-09-13 Thread STINNER Victor
STINNER Victor added the comment: - ignoring unencodable filenames is not a good idea - raising an error on unencodable filenames breaks backward compatibility - I don't think that emitting a warning will change anything Even if I don't like mbcs+replace (current behaviour of os.listdir(

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-13 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file18841/test_pep277.patch ___ Python tracker <http://bugs.python.org/issue767645> ___ ___ Python-bug

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-13 Thread STINNER Victor
STINNER Victor added the comment: r84784 sets os.path.supports_unicode_filenames to True on Mac OS X (macpath module). About test_supports_unicode_filenames.patch. test_unicode_listdir() is wrong: os.listdir(str) always returns str (see r84701). "verify that the new file's name i

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-13 Thread STINNER Victor
STINNER Victor added the comment: I backported r84701 and r84784 to Python 2.7 (r84787). -- ___ Python tracker <http://bugs.python.org/issue767645> ___ ___ Pytho

[issue9819] TESTFN_UNICODE and TESTFN_UNDECODABLE

2010-09-13 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue9819> ___ ___ Python-bugs-list

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-14 Thread STINNER Victor
STINNER Victor added the comment: > There seems to be some confusion about the macpath.py module. (...) Oops. I thought that Mac OS X uses macpath, but in fact it is posixpath. Can you try my new patch posixpath_darwin.patch? I reopen the issue because I patched the wrong module. I supp

[issue9850] obsolete macpath module dangerously broken and should be removed

2010-09-14 Thread STINNER Victor
STINNER Victor added the comment: The solution may be different depending on Python version. I propose to keep macpath in Python 2.7, just because it's too late to change such thing in Python2. But we may mark macpath as deprecated, eg. "macpath will be removed in Python 3.2"

[issue6011] python doesn't build if prefix contains non-ascii characters

2010-09-14 Thread STINNER Victor
STINNER Victor added the comment: For a non-ascii directory name but an ascii locale (eg. C locale), we have 3 choices: a- read Makefile as a binary file b- use the PEP 383 c- refuse to compile (a) doesn't seem easy because it looks like distutils uses the unicode type for all path

[issue6011] python doesn't build if prefix contains non-ascii characters

2010-09-14 Thread STINNER Victor
STINNER Victor added the comment: Warning: "use the PEP 383" may impact other distutils components because the path may be written into other files, which means that we have to use errors='surrogateescape' for these files too. -- __
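
What this warning implies for file I/O can be sketched as follows. This is a minimal illustration and not distutils code: the helper `copy_text` and the file names are made up, and the only assumption is that both the reader and the writer open their files with errors='surrogateescape'.

```python
import os
import tempfile


def copy_text(src, dst, encoding='ascii'):
    """Copy a text file while preserving bytes the encoding cannot decode."""
    # newline='' disables newline translation on both sides.
    with open(src, encoding=encoding, errors='surrogateescape',
              newline='') as fin:
        text = fin.read()
    with open(dst, 'w', encoding=encoding, errors='surrogateescape',
              newline='') as fout:
        fout.write(text)
    return text


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'Makefile')
    dst = os.path.join(tmp, 'Makefile.out')
    # A prefix with a non-ascii byte, read under an ascii locale.
    with open(src, 'wb') as f:
        f.write(b'prefix=/home/py3k\xe9\n')
    text = copy_text(src, dst)
    with open(dst, 'rb') as f:
        copied = f.read()
```

If the output file were opened with the default strict handler instead, writing the escaped path would raise UnicodeEncodeError, which is the knock-on effect the comment warns about.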

[issue8998] add crypto routines to stdlib

2010-09-17 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue8998> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue767645] incorrect os.path.supports_unicode_filenames

2010-09-17 Thread STINNER Victor
STINNER Victor added the comment: > No problems noted with a quick test of posixpath_darwin.patch > on 10.6 so looks good. Ok thanks. Fix committed to 3.2 (r84866) and 2.7 (r84868). I kept my patch on macpath (supports_unicode_filenames=True) because it is still valid (even if it is no

[issue8589] test_warnings.CEnvironmentVariableTests.test_nonascii fails under an ascii terminal

2010-09-17 Thread STINNER Victor
STINNER Victor added the comment: I don't see any test_warnings anymore on http://code.google.com/p/bbreport/wiki/PythonBuildbotReport. Close this issue. -- status: open -> closed ___ Python tracker <http://bugs.python.or

[issue4661] email.parser: impossible to read messages encoded in a different encoding

2010-09-21 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue4661> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: New version of the patch: - reencode sys.path_importer_cache (and remove the last FIXME) - fix different reference leaks - catch PyIter_Next() failures - create a subfunction to reencode sys.modules: it's easier to review and manage errors in sh

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file18561/reencode_modules_path-2.patch ___ Python tracker <http://bugs.python.org/issue9630> ___ ___

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: > I would rename the feature to something like "redecode-modules" Yes, right. I will rename the functions before committing the patch. -- ___ Python tracker <http://bugs.pyth

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: > Why is this needed ? Short answer: to support filesystem encoding different than utf-8. See #8611 for a longer explanation. Example: $ pwd /home/SHARE/SVN/py3ké $ PYTHONFSENCODING=ascii ./python test_fs_encoding.py Fatal Python error: Py_Initial

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: > Not sure it's related, but there seems to be a bug: It's not a bug, it's a feature :-) If you specify a non-existing locale, the GNU libc falls back to ascii. $ locale -a C français french fr_FR fr...@euro fr_FR.iso88591 fr_fr.iso885.

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: > Some things about your patch: > - as Amaury said, functions should be named "redecode*" > rather than "reencode*" Yes, as written before (msg117269), I will do it in my next patch. > - please use -1 for error return, n

[issue9630] Reencode filenames when setting the filesystem encoding

2010-09-24 Thread STINNER Victor
STINNER Victor added the comment: Le vendredi 24 septembre 2010 14:35:29, Marc-Andre Lemburg a écrit : > Thanks for the explanation. So the only reason why you have to go through > all those hoops is to > > * allow the complete set of Python supported encoding names
