[issue13560] Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize

2011-12-08 Thread STINNER Victor
New submission from STINNER Victor : To decode byte string from the locale encoding (LC_CTYPE), PyUnicode_DecodeFSDefault() can be used, but this function uses a constant encoding set at startup (the locale encoding at startup). The right method is currently to call _Py_char2wchar() and then

[issue13560] Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize

2011-12-08 Thread STINNER Victor
Changes by STINNER Victor : -- keywords: +patch Added file: http://bugs.python.org/file23886/pyunicode_decodelocale.patch ___ Python tracker <http://bugs.python.org/issue13

[issue13441] TestEnUSCollation.test_strxfrm() fails on Solaris

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: I collected the locale list triggering the mbstowcs() bug thanks to my previous commit: * hu_HU (ISO8859-2): character U+3020 * de_AT (ISO8859-1): character U+3076 * cs_CZ (ISO8859-2): character U+3020 * sk_SK (ISO8859-2): character U+3020

[issue4352] imp.find_module() fails with a UnicodeDecodeError when called with non-ASCII search paths

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: @Serg Asminog: What is your Python version? What is your locale encoding (print(sys.getfilesystemencoding()))? What is your Windows version? -- ___ Python tracker <http://bugs.python.org/issue4

[issue13565] test_multiprocessing.test_notify_all() hangs on "AMD64 Snow Leopard 02 03.x"

2011-12-09 Thread STINNER Victor
New submission from STINNER Victor : [333/363] test_multiprocessing Timeout (1:00:00)! Thread 0x000112d0b000: File "/Users/buildbot/buildarea/3.x.parc-snowleopard-1/build/Lib/multiprocessing/connection.py", line 411 in _recv File "/Users/buildbot/buildarea/3.x.parc-snow

[issue11894] test_multiprocessing failure on "AMD64 OpenIndiana 3.x": KeyError on id_to_obj[ident] in serve_client()

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: I didn't see this failure again since the issue was opened, so I close it as invalid. -- resolution: -> invalid status: open -> closed ___ Python tracker <http://bugs.python.

[issue13441] TestEnUSCollation.test_strxfrm() fails on Solaris

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: The Solaris buildbot is green, let's close it. I didn't report the bug upstream. Feel free to report it to Oracle! -- resolution: -> fixed status: open -> closed ___ Python tracker <htt

[issue4352] imp.find_module() fails with a UnicodeDecodeError when called with non-ASCII search paths

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: Oops, it's not sys.getfilesystemencoding(), but locale.getpreferredencoding() which is interesting. Can you give me your locale encoding? -- ___ Python tracker <http://bugs.python.org/i

[issue12567] curses implementation of Unicode is wrong in Python 3

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: > I wrote down when I set up the OpenIndiana buildbots Hum, please use the issue #13552 for curses issues on OpenIndiana/Solaris. > ... de funciones: "mvwchgat" y "wchgat" See issues #3786 and #13552 for this problem. > I insta

[issue5905] strptime fails in non-UTF locale

2011-12-09 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue5905> ___ ___ Python-bugs-list

[issue13560] Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: I fixed issue #5905 (strptime fails in non-UTF locale). The fix is not enough if the locale is changed in Python. Update the patch to fix time.strftime() (if wcsftime() is not available). -- Added file: http://bugs.python.org/file23894

[issue11886] test_time.test_tzset() fails on "x86 FreeBSD 7.2 3.x": AEST timezone called "EST"

2011-12-09 Thread STINNER Victor
STINNER Victor added the comment: The FreeBSD 7.2 3.x buildbot is green. -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/i

[issue13570] Expose faster unicode<->ascii functions in the C-API

2011-12-10 Thread STINNER Victor
STINNER Victor added the comment: On 09/12/2011 22:12, Stefan Krah wrote: > The bottleneck in _decimal is (res is ascii): > > PyUnicode_FromString(res); > > PyUnicode_DecodeASCII(res) has the same performance. > > > With this function ... > >static PyObj

[issue13572] import _curses fails because of UnicodeDecodeError('utf8' codec can't decode byte 0xb5 ...') on ARM Ubuntu 3.x

2011-12-10 Thread STINNER Victor
New submission from STINNER Victor : http://www.python.org/dev/buildbot/all/builders/ARM%20Ubuntu%203.x/builds/143/steps/test/logs/stdio --- test test_curses crashed -- Traceback (most recent call last): File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrte

[issue13572] import _curses fails because of UnicodeDecodeError('utf8' codec can't decode byte 0xb5 ...') on ARM Ubuntu 3.x

2011-12-10 Thread STINNER Victor
STINNER Victor added the comment: The compilation of the module failed for the same reason: building '_curses' extension gcc -pthread -fPIC -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -DHAVE_NCURSESW=1 -I/usr/include/ncursesw -IInclude -I. -I./Include -I/usr/include/arm-lin

[issue13572] import _curses fails because of UnicodeDecodeError('utf8' codec can't decode byte 0xb5 ...') on ARM Ubuntu 3.x

2011-12-10 Thread STINNER Victor
STINNER Victor added the comment: The problem comes maybe from the name of a curses key, keyname(). PyInit__curses() gets the name of all keys (KEY_MIN..KEY_MAX). -- ___ Python tracker <http://bugs.python.org/issue13

[issue11886] test_time.test_tzset() fails on "x86 FreeBSD 7.2 3.x": AEST timezone called "EST"

2011-12-11 Thread STINNER Victor
STINNER Victor added the comment: Hum, it's still not ok: == FAIL: test_tzset (test.test_time.TimeTestCase) -- Traceback (most recent call last):

[issue13561] os.listdir documentation should mention surrogateescape

2011-12-11 Thread STINNER Victor
STINNER Victor added the comment: Can you please write a doc patch? -- ___ Python tracker <http://bugs.python.org/issue13561> ___ ___ Python-bugs-list mailin

[issue13248] deprecated in 3.2, should be removed in 3.3

2011-12-11 Thread STINNER Victor
STINNER Victor added the comment: .. versionchanged:: 3.2 - The *strict* parameter is deprecated. HTTP 0.9-style "Simple Responses" + The *strict* parameter is removed. HTTP 0.9-style "Simple Responses" are not supported anymore. Such change looks wrong:

[issue13572] import _curses fails because of UnicodeDecodeError('utf8' codec can't decode byte 0xb5 ...') on ARM Ubuntu 3.x

2011-12-11 Thread STINNER Victor
STINNER Victor added the comment: @Barry: can you try to get a trace using gdb? Start python in gdb, set a breakpoint on PyErr_SetObject, continue, run the Python command "import _curses", get the gdb traceback (or continue if the error is not the UT

[issue13539] Return value missing in calendar.TimeEncoding.__enter__

2011-12-12 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue13539> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13390] Hunt memory allocations in addition to reference leaks

2011-12-12 Thread STINNER Victor
STINNER Victor added the comment: > How different is the performance cost of this solution compared > to inserting DTrace probe for the same purpose? DTrace is only available on some platforms (Solaris and maybe FreeBSD?). -- ___ Python t

[issue13596] Only recompile Lib/_sysconfigdata.py when needed

2011-12-13 Thread STINNER Victor
New submission from STINNER Victor : Attached patch fixes Makefile.pre.in to only recompile Lib/_sysconfigdata.py when needed. -- files: sysconfigdata.patch keywords: patch messages: 149406 nosy: haypo, pitrou priority: normal severity: normal status: open title: Only recompile Lib

[issue13596] Only recompile Lib/_sysconfigdata.py when needed

2011-12-13 Thread STINNER Victor
Changes by STINNER Victor : -- components: +Build ___ Python tracker <http://bugs.python.org/issue13596> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13604] update PEP 393 (match implementation)

2011-12-15 Thread STINNER Victor
STINNER Victor added the comment: Various comments on PEP 393 and your patch. "For compatibility with existing APIs, several representations may exist in parallel; over time, this compatibility should be phased out." and "For compatibility, redundant representations may b

[issue13545] Pydoc3.2: TypeError: unorderable types

2011-12-15 Thread STINNER Victor
STINNER Victor added the comment: > The patch in msg<148968> solves the issue for me. Cool, I applied the patch to Python 3.2 and 3.3. -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.pyt

[issue13560] Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize

2011-12-16 Thread STINNER Victor
STINNER Victor added the comment: changeset: 74002:279b0aee0cfb user:Victor Stinner date:Fri Dec 16 23:56:01 2011 +0100 files: Doc/c-api/unicode.rst Include/unicodeobject.h Modules/_localemodule.c Modules/main.c Modules/timemodule.c description: Add

[issue13617] Reject embedded null characters in wchar* strings

2011-12-16 Thread STINNER Victor
New submission from STINNER Victor : The curses module (only since Python 3.3), locale.strcoll(), locale.strxfrm(), time.strftime() and imp.NullImporter() (only on Windows) accept embedded null characters, whereas they convert the Unicode string to a wide character (wchar_t*) string. The

[issue13617] Reject embedded null characters in wchar* strings

2011-12-16 Thread STINNER Victor
STINNER Victor added the comment: PyUnicode_AsWideCharString() documentation should also warn about this issue. -- ___ Python tracker <http://bugs.python.org/issue13

[issue13618] bytes.decode() UnicodeEncodeError on Apple iOS (>16-bit) characters

2011-12-16 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue13618> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-16 Thread STINNER Victor
New submission from STINNER Victor : To factorize the code and to fix encoding issues in the time module, I added functions to decode/encode from/to the locale encoding: PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() (issue #13560). During tests, I

[issue13560] Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize

2011-12-16 Thread STINNER Victor
STINNER Victor added the comment: Ok, I think that the current code is good enough to close the issue. I opened a more global issue about the Python codec: #13619. -- resolution: -> fixed status: open -> closed ___ Python tracker

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-16 Thread STINNER Victor
Changes by STINNER Victor : -- keywords: +patch Added file: http://bugs.python.org/file23985/locale_encoding.patch ___ Python tracker <http://bugs.python.org/issue13

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-16 Thread STINNER Victor
STINNER Victor added the comment: # On FreeBSD, Solaris and Mac OS X, b'\xff' can be decoded in # the C locale. The C locale is something like ISO-8859-1, not # 7-bit ASCII. On FreeBSD, it *is* the ISO-8859-1 encoding. -- ___ Python trac

[issue13453] Tests and network timeouts

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%203.x/builds/1327/steps/test/logs/stdio == ERROR: test_list_active (test.test_nntplib.NetworkedNNTPTests

[issue11231] bytes() constructor is not correctly documented

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Oooh, I missed the important sentence "Accordingly, constructor arguments are interpreted as for bytearray()." The 5 constructors are documented in bytearray doc: http://docs.python.org/dev/library/functions.html#bytearray -- resolution:

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Patch version 2: improve the test. Try also the user locale encoding if the C locale uses ISO-8859-1 (should improve the code coverage on FreeBSD, Mac OS X and Solaris). -- Added file: http://bugs.python.org/file23987/locale_encoding-2.patch

[issue13555] cPickle MemoryError when loading large file (while pickle works)

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > Should we fix this (Py_ssize_t, overflow check before computation), as in > #11564? Yes. Use Py_ssize_t type for the buf_size attribute, and replace "bigger <= 0" (test if an overflow occurred) by "self->buf_size > (PY_SSIZE_

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: I tested locale_encoding-2.patch on Linux, FreeBSD and Windows: UTF-8 and ISO-8859-1 locales on Linux and FreeBSD, and the cp1252 ANSI code page on Windows. -- ___ Python tracker <http://bugs.python.

[issue13621] Unicode performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Sorted and grouped results. "replace", "find" and "concat" should be easy to fix, "format" is a little bit more complex, "strip" and "split" depend on "find" performance and require to

[issue13622] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Sorted and grouped results. "replace", "find" and "concat" should be easy to fix, "strip" and "split" depend on "find" performance. replace: - b"...text.with.2000.lines...replace(b"

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Grouped results. find (first): - (b"A"*1000).find(b"A"): -70% - (b"A"*1000).rfind(b"A") : -70% - (b"A"*1000).index(b"A") : -71% - (b"A"*1000).rindex(b"A") : -68% - (

[issue13622] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Boris.FELD told me that there was a bug in compare.py: all numbers are related to Unicode (see #13621), not bytes. -- ___ Python tracker <http://bugs.python.org/issue13

[issue13622] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> invalid status: open -> closed ___ Python tracker <http://bugs.python.org/issue13622> ___ ___ Python-bugs-

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: See also the issue #13621 for results on Unicode. -- ___ Python tracker <http://bugs.python.org/issue13623> ___ ___ Python-bug

[issue13621] Unicode performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: See also the issue #13623 for results on bytes. -- ___ Python tracker <http://bugs.python.org/issue13621> ___ ___ Python-bug

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-17 Thread STINNER Victor
New submission from STINNER Victor : iobench benchmarking tool showed that the UTF-8 encoder is slower in Python 3.3 than Python 3.2. The performance depends on the characters of the input string: * 8x faster (!) for a string of 50.000 ASCII characters * 1.5x slower for a string of 50.000

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +flox ___ Python tracker <http://bugs.python.org/issue13623> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13621] Unicode performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +flox ___ Python tracker <http://bugs.python.org/issue13621> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > Can you please provide your exact testing procedure? Here you have. $ cat bench.sh echo -n "ASCII: " ./python -m timeit 'x="A"*5' 'x.encode("utf-8")' echo -n "UCS-1: " ./python -m timeit

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Oh, Antoine told me that I missed the -s command line argument to timeit: $ cat bench.sh echo -n "ASCII: " ./python -m timeit -s 'x="A"*5' 'x.encode("utf-8")' echo -n "UCS-1: " ./python -m t
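The effect of the -s option can be reproduced from Python with the timeit module: the setup statement runs once and is excluded from the measurement, so only the encode call is timed. This is a minimal sketch, not the original bench.sh; the string length of 50000 is an assumption based on the "50.000 characters" mentioned in the initial report, and the loop count is arbitrary.

```python
import timeit

# Assumed size: the initial report mentions strings of 50.000 characters.
SIZE = 50000
LOOPS = 1000  # arbitrary loop count for this sketch

# Without a setup statement, building the string is timed together
# with the encode call (the mistake in the first version of the script).
with_build = timeit.timeit('x = "A" * %d; x.encode("utf-8")' % SIZE,
                           number=LOOPS)

# With a setup statement (the equivalent of "-s"), only x.encode() is timed.
encode_only = timeit.timeit('x.encode("utf-8")',
                            setup='x = "A" * %d' % SIZE,
                            number=LOOPS)

print("build + encode: %.6f s" % with_build)
print("encode only:    %.6f s" % encode_only)
```

The absolute numbers depend on the machine and the Python build, so no expected output is given here.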

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Python 3.2 (narrow): ASCII: 1 loops, best of 3: 28.2 usec per loop UCS-1: 1 loops, best of 3: 59.1 usec per loop UCS-2: 1 loops, best of 3: 88.8 usec per loop UCS-4: 1000 loops, best of 3: 254 usec per loop Python 3.2 (wide): ASCII: 1 loops

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > 8x faster (!) for a string of 50.000 ASCII characters Oooh, it's just faster because encoding ASCII to UTF-8 is now O(1). The ASCII data is shared with the UTF-8 data thanks to the PEP 393! -- ___ Python

[issue13530] Docs for os.lseek neglect to mention what it returns

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue13530> ___ ___ Python-bugs-list

[issue13530] Docs for os.lseek neglect to mention what it returns

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: Thanks for the patch Jérémy. -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue13530> ___ ___ Python-bugs-list m

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > (b"A"*1000).find(b"A"): -70% This one is a performance regression introduced by #12170. Attached patch checks object type before trying a conversion to size_t instead of catching an exception. -- keywords:

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: bytes_find.patch only works for Python int, not objects with the __index__ method. My new patch (bytes_find-2.patch) uses PyNumber_Check() instead of PyLong_Check() to be more generic. It also fixes a different issue: raise the same ValueError as bytes.find

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file24012/bytes_find-2.patch ___ Python tracker <http://bugs.python.org/issue13623> ___ ___ Python-bug

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : Added file: http://bugs.python.org/file24013/bytes_find-2.patch ___ Python tracker <http://bugs.python.org/issue13623> ___ ___ Python-bug

[issue12170] index() and count() methods of bytes and bytearray should accept byte ints

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: New changeset 75648db1b3f3 by Victor Stinner in branch 'default': http://hg.python.org/cpython/rev/75648db1b3f3 Issue #13623: Fix a performance regression introduced by issue #12170 in bytes.find() and handle correctly OverflowError (raise the same

[issue13623] Bytes performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: I checked stringbench: there is no more performance regression (difference of more than 20%). -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/i

[issue13522] Document error return values for PyFloat_* and PyComplex_*

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: _Py_c_pow() doc is wrong: + If :attr:`exp.imag` is not null, or :attr:`exp.real` is negative, + this method returns zero and sets :c:data:`errno` to :c:data:`EDOM`. The function only fails if num=0 and exp.real < 0 or if num=0 and exp.imag !

[issue13621] Unicode performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > "...text.with.2000.lines...replace("\n", " ") (*10): -37.668161% I also noticed a difference between Python 3.2 and 3.3, but Python 3.3 is 13% *faster* (and not slower). This benchmark is not really representative because str

[issue13621] Unicode performance regression in python3.3 vs python3.2

2011-12-17 Thread STINNER Victor
STINNER Victor added the comment: > I also noticed a difference between Python 3.2 and 3.3, > but Python 3.3 is 13% *faster* (and not slower). Oops, I misused the timeit module, there is a regression. > New changeset c802bfc8acfc by Victor Stinner in branch 'default': >

[issue13522] Document error return values for PyFloat_* and PyComplex_*

2011-12-17 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue13522> ___ ___ Python-bugs-list

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: Updated patch to fix also the size of the small buffer on the stack, as suggested by Antoine. -- Added file: http://bugs.python.org/file24021/utf8_encoder-2.patch ___ Python tracker <http://bugs.python.

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: utf8_encoder_prescan.patch: precompute the size of the output to avoid a PyBytes_Resize() at exit. It is much slower: ASCII: 10 loops, best of 3: 2.06 usec per loop UCS-1: 1 loops, best of 3: 123 usec per loop UCS-2: 1 loops, best of 3: 171 usec

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file24005/utf8_encoder.patch ___ Python tracker <http://bugs.python.org/issue13624> ___ ___ Python-bug

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: Patch version 3 to fix compiler warnings (avoid variables used for the error handler, unneeded for UCS-1). -- Added file: http://bugs.python.org/file24023/utf8_encoder-3.patch ___ Python tracker <h

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
Changes by STINNER Victor : -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue13624> ___ ___ Python-bugs-list

[issue13628] python-gdb.py: patch to improve support of optimized Python

2011-12-18 Thread STINNER Victor
New submission from STINNER Victor : If Python is compiled with gcc -O3, gdb is unable to get the f argument of PyEval_EvalFrameEx(). It is possible to retrieve "f" from the caller, PyEval_EvalCodeEx(). Attached patch tries to implement this idea and enable more test_gdb tests on

[issue13624] UTF-8 encoder performance regression in python3.3

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: > It's actually still O(n): the UTF-8 data still need to be copied > into a bytes object. Hum, correct, but a memory copy is much faster than having to decode UTF-8. -- ___ Python tracker <http://b

[issue13617] Reject embedded null characters in wchar* strings

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: embedded_nul-2.patch: a more complete patch that also checks for null bytes in functions calling PyUnicode_EncodeFSDefault(). -- Added file: http://bugs.python.org/file24041/embedded_nul-2.patch ___ Python tracker <h

[issue5689] Support xz compression in tarfile module

2011-12-18 Thread STINNER Victor
STINNER Victor added the comment: There is a failure on an XP buildbot. I don't know if it is a sporadic issue or not. http://www.python.org/dev/buildbot/all/builders/x86%20XP-5%203.x/builds/3921/steps/test/logs/stdio ==

[issue13628] python-gdb.py: patch to improve support of optimized Python

2011-12-19 Thread STINNER Victor
STINNER Victor added the comment: > It is possible to retrieve "f" from the caller, PyEval_EvalCodeEx() It does not always work, but it works sometimes, so it's better to try :-) I applied my fix to Python 2.7, 3.2 and 3.3. libpython.py of Python 2.7 is outdated, it should

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: > Currently when running Python on a non-OSX posix environment > under either the C locale, or with an invalid or missing locale, > it's not possible to operate using unicode filenames outside > the ascii range. It was already discussed:

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: > under either the C locale, or with an invalid or missing locale The right fix is to fix your locale, not Python. -- ___ Python tracker <http://bugs.python.org/issu

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: > If there was a separate LC_FILENAMES then Python could respect > that and insist people set it, but there isn't. For one month, we had the PYTHONFSENCODING environment variable. It was not a good idea. Again: please read the discussion (in cl

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: > There are two problems with this: one is just the practical > one that it scales poorly to have to tell every user to do this > and to take them through working out how to set this in a way > that covers cron jobs, daemons, things run over ssh, e

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: > The main problem I see being discussed is that > changing the encoding after Python starts would > be dangerous, which I agree with, but we're not > proposing to do that. Not after Python start. Using two encodings at the same would just a

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: It would be possible to implement an incremental decoder with mbsrtowcs() and an incremental encoder with wcsrtombs(), by serializing mbstate_t to a long integer (TextIOWrapper.tell() does something like that). The problem is that mbsrtowcs() and wcsrtombs() are

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-20 Thread STINNER Victor
STINNER Victor added the comment: I should not write comments so late :-p > Not after Python start. Using two encodings at the same would just ... at the same time > ... because I would like to inconsistency. because it would lead to inconsist

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: > Having more than one encoding on unix is already a reality, there's nothing > to stop someone setting LANG=de_DE.UTF-8 and LC_MESSAGES=C say. Nope. The locale encoding is chosen using LC_ALL, LC_CTYPE or LANG variable: use the first non-emp
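The lookup order described in this comment (first non-empty variable among LC_ALL, LC_CTYPE and LANG) can be sketched as follows. The helper name `locale_encoding_variable` is hypothetical and exists only for illustration; it mimics the rule stated above and does not call the C library.

```python
def locale_encoding_variable(environ):
    """Return the name of the first non-empty variable among
    LC_ALL, LC_CTYPE and LANG, or None if all are unset or empty.

    Hypothetical helper, for illustration only.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        if environ.get(name):
            return name
    return None


# The environment quoted in the comment: LC_MESSAGES is ignored,
# so the encoding comes from LANG.
env = {"LANG": "de_DE.UTF-8", "LC_MESSAGES": "C"}
print(locale_encoding_variable(env))  # prints "LANG"
```

An empty value is treated like an unset variable, which matches the "first non-empty variable" wording.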

[issue13636] Python SSL Stack doesn't have a Secure Default set of ciphers

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: > By default the Python SSL/TLS Stack (client/server) expose > unsecure protocols (SSLv2) and unsecure ciphers (EXPORT 40bit DES). If there is a problem, it should not be fixed in Python, but in the underlying library (OpenSSL) or in applications. Pytho

[issue13639] UnicodeDecodeError when creating tar.gz with unicode name

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: + self.name = self.name.encode("iso-8859-1", "replace") Why did you choose ISO-8859-1? I think that the filesystem encoding should be used instead: -self.name = self.name.encode("iso-8859-1", "replace") +

[issue8604] Adding an atomic FS write API

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: > I'm not sure about the best module to host this, though: os.path ? Some OSes don't provide atomic rename. If we only define a function when it is atomic (as we do in the posix module, only expose functions available on the OS), programs will

[issue13639] UnicodeDecodeError when creating tar.gz with unicode name

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: "The gzip format (defined in RFC 1952) allows storing the original filename (without the .gz suffix) in an additional field in the header (the FNAME field). Latin-1 (iso-8859-1) is required." Hum, it looks like the author of the gzip program (on Li

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: > it will still be passing values that can't be > interpreted by other processes as you highlighted earlier. On UNIX, data going outside Python has to be encoded: you pass byte strings, not directly Unicode. Surrogates are encoded back to ori
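The round trip mentioned here can be illustrated with the surrogateescape error handler. This is a sketch with a made-up filename and a hard-coded UTF-8 codec; Python itself applies the handler with the filesystem encoding when it decodes and encodes filenames.

```python
# A filename encoded in Latin-1: byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte 0xE9 to the
# lone surrogate U+DCE9 instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", "surrogateescape")
print(ascii(name))  # prints 'caf\udce9.txt'

# Encoding with the same handler restores the original byte.
assert name.encode("utf-8", "surrogateescape") == raw

# A strict encoder refuses the lone surrogate.
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    print("strict UTF-8 cannot encode the surrogate")
```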

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: This discussion is becoming very long, I didn't remember the original purpose. You want to use UTF-8 instead of ASCII, so what? What do you want to do with your nicely well decoded filenames? You cannot print it to your terminal nor pass it to a subpr

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: >> Nope. The locale encoding is chosen using LC_ALL, LC_CTYPE or LANG >> variable: use the first non-empty variable. LC_MESSAGES doesn't affect >> the encoding. Example: > > That's good to know, thanks. Only leaves the case

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: On 22/12/2011 02:16, Martin Pool wrote: > The proposal is that in some cases where Python currently assumes > filenames are ascii on Linux, it ought to instead assume they are > utf-8. Oh, I expected a use case describing the problem, not the

[issue13643] 'ascii' is a bad filesystem default encoding

2011-12-21 Thread STINNER Victor
STINNER Victor added the comment: > The problem as I see it is this: > > On Linux, filenames are generally (but not always) in UTF-8; people > fairly commonly end up with no locale configured, which causes Python > to decode filenames as ascii. It is easy for this to end up with

[issue13619] Add a new codec: "locale", the current locale encoding

2011-12-22 Thread STINNER Victor
STINNER Victor added the comment: + encoding = locale.getpreferredencoding() It should be locale.getpreferredencoding(False). -- ___ Python tracker <http://bugs.python.org/issue13
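A short sketch of the suggested call: passing False for the do_setlocale parameter asks for the preferred encoding without temporarily calling setlocale(), which the locale documentation describes as not thread-safe. The printed value depends on the platform and environment, so none is shown.

```python
import locale

# do_setlocale=False: query the encoding without temporarily
# calling setlocale() as a side effect.
encoding = locale.getpreferredencoding(False)
print(encoding)
```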

[issue13078] IDLE: Python Crashes When Saving Or Opening

2011-12-22 Thread STINNER Victor
Changes by STINNER Victor : -- title: Python Crashes When Saving Or Opening -> IDLE: Python Crashes When Saving Or Opening ___ Python tracker <http://bugs.python.org/issu

[issue13565] test_multiprocessing.test_notify_all() hangs on "AMD64 Snow Leopard 02 03.x"

2011-12-23 Thread STINNER Victor
STINNER Victor added the comment: > Victor, could you try the attached script on FreeBSD, > to see if you get ECONNREFUSED? Yes, I get an ECONNREFUSED. I tested backlog.py on FreeBSD 8.2. -- ___ Python tracker <http://bugs.python.org/i

[issue13674] crash in datetime.strftime

2011-12-29 Thread STINNER Victor
STINNER Victor added the comment: timemodule.c has the following check: #if defined(_MSC_VER) || defined(sun) if (buf.tm_year + 1900 < 1 || < buf.tm_year + 1900) { PyErr_SetString(PyExc_ValueError, "strftime() requires year

[issue13703] Hash collision security issue

2012-01-03 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo ___ Python tracker <http://bugs.python.org/issue13703> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13703] Hash collision security issue

2012-01-03 Thread STINNER Victor
STINNER Victor added the comment: > Unless there's evidence of performance regressions > or backward incompatibilities, I agree. If hash() is modified, str(dict) and str(set) will change for example. It may break doctests. Can we consider that the application should not rely (ind

[issue13704] Random number generator in Python core

2012-01-03 Thread STINNER Victor
Changes by STINNER Victor : -- keywords: +patch Added file: http://bugs.python.org/file24135/3106cc0a2024.diff ___ Python tracker <http://bugs.python.org/issue13

[issue13706] non-ascii fill characters no longer work in numeric formatting

2012-01-03 Thread STINNER Victor
STINNER Victor added the comment: > I assume this is left over from the PEP 393 changes. Correct. > I'm not sure such a restriction needs to exist any more. The restriction was introduced to simplify the implementation. maxchar has to be computed exactly in format_stri
