STINNER Victor added the comment:
Can't we use RegEnumValueW and RegQueryInfoKeyW?
--
___
Python tracker
<http://bugs.python.org/issue9937>
___
___
Pytho
STINNER Victor added the comment:
On Tuesday 28 September 2010 at 22:24:56, you wrote:
> I disagree. PyObject_As*Buffer functions are remnants of the old buffer
> API in Python 2.x. They are here only to ease porting of existing C
> code, but carefully written 3.x code should
New submission from STINNER Victor :
PyUnicode_AsWideChar() doesn't merge surrogate pairs on a system with 32 bits
wchar_t and Python compiled in narrow mode (sizeof(wchar_t) == 4 and
sizeof(Py_UNICODE) == 2) => see issue #8670.
It is not easy to fix this problem because the ca
STINNER Victor added the comment:
#9979 proposes to create a new PyUnicode_AsWideCharString() function.
--
___
Python tracker
<http://bugs.python.org/issue8
STINNER Victor added the comment:
New version of the patch:
- fix PyUnicode_AsWideCharString() :-)
- replace PyUnicode_AsWideChar() by PyUnicode_AsWideCharString() in most
functions using PyUnicode_AsWideChar()
- indicate that PyUnicode_AsWideCharString() raises a MemoryError on error
Keep
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19054/pyunicode_aswidecharstring.patch
___
Python tracker
<http://bugs.python.org/issue9979>
___
STINNER Victor added the comment:
See also issue #4626 which introduced PyCF_IGNORE_COOKIE and
PyPARSE_IGNORE_COOKIE flags to support unicode string for the builtin compile()
function.
--
nosy: +haypo
___
Python tracker
<http://bugs.python.
STINNER Victor added the comment:
> But shouldn't PyUnicode_AsWideCharString() merge surrogate pairs when it
> can? The implementation doesn't do this.
I don't want to do two different things at the same time. My plan is:
- create PyUnicode_AsWideCharString()
- use PyUni
STINNER Victor added the comment:
I fixed this issue in multiple commits:
- r85093: create PyUnicode_AsWideCharString()
- r85094: use it in import.c
- r85095: use it for _locale.strcoll()
- r85096: use it for time.strftime()
- r85097: use it in _ctypes module
> So, you agree with
STINNER Victor added the comment:
Forget my previous message, I forgot important points.
> So the only reason why you have to go through
> all those hoops is to
>
> * allow the complete set of Python supported encoding
> names for the PYTHONFSENCODING
>
>
STINNER Victor added the comment:
Patch version 4:
- Rename "reencode" to "redecode"
- Return -1 (instead of 1) on error
--
title: Reencode filenames when setting the filesystem encoding -> Redecode
filenames when setting the filesystem encoding
Added file:
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file18996/reencode_modules_path-3.patch
___
Python tracker
<http://bugs.python.org/issue9630>
___
STINNER Victor added the comment:
On Wednesday 29 September 2010 at 13:45:15, you wrote:
> Marc-Andre Lemburg added the comment:
>
> STINNER Victor wrote:
> > STINNER Victor added the comment:
> >
> > Forget my previous message, I forgot important points.
>
STINNER Victor added the comment:
I committed redecode_modules_path-4.patch as r85115 in Python 3.2.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.python.org/
STINNER Victor added the comment:
r85115 closes #9630: an important patch for #9425, redecode all filenames when
setting the filesystem encoding.
Next tasks (maybe not in this order):
- merge getpath.c
- redecode argv[0] used by PySys_SetArgvEx() to feed sys.path (encode argv[0]
with the
New submission from STINNER Victor :
$ PYTHONFSENCODING=latin-1 ./python Lib/test/test_warnings.py
...
==
FAIL: test_nonascii (__main__.CEnvironmentVariableTests
STINNER Victor added the comment:
If I understood correctly, you don't want the value to be truncated if the
variable grows between the two calls to confstr(). Which behaviour would you
expect? A Python exception?
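One way to avoid a truncated value when it grows between two calls is to retry with a larger buffer until the value fits, and to raise an exception if it never does. The sketch below only illustrates that pattern: `query` is a hypothetical stand-in for a confstr()-like call (it is not the real C function), and `read_untruncated`, `initial_size` and `max_tries` are names invented for this example.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for a confstr()-like call: given a buffer size, it
# returns (the bytes that fit in the buffer, the total length required).
Query = Callable[[int], Tuple[bytes, int]]


def read_untruncated(query: Query, initial_size: int = 16,
                     max_tries: int = 8) -> bytes:
    """Call query() with a growing buffer until the whole value fits."""
    size = initial_size
    for _ in range(max_tries):
        data, needed = query(size)
        if needed <= size:
            # The value fits in the buffer we offered: nothing was cut off.
            return data[:needed]
        # The value did not fit (or grew since the last call): retry.
        size = needed
    raise RuntimeError("value kept growing between calls")
```

Raising after a bounded number of attempts is one possible answer to the "which behaviour would you expect?" question; silently truncating is the alternative this sketch avoids.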
> but Victor Stinner has expressed concern that a buggy
> conf
STINNER Victor added the comment:
> OK, so who's messing up: subprocess or Py_main()?
Well, this is the real question :-)
locale encoding is used to decode command line arguments (sys.argv), filesystem
encoding is used to decode environment variables and to encode subprocess
argum
New submission from STINNER Victor :
On UNIX/BSD systems, Python decodes arguments with the locale encoding, whereas
subprocess encodes arguments with the filesystem encoding. If the two encodings
differ, we have a problem.
There was already the issue #4388 but it was closed because it
STINNER Victor added the comment:
I don't understand why you would like to implicitly convert bytes to str (which
is one of the worst design choices of Python 2). If you don't want to care about
encodings, using bytes is fine. Decoding bytes using an arbitrary encoding is the
fast
STINNER Victor added the comment:
> Indeed, the fs encoding isn't initialized until later in
> Py_InitializeEx. Maybe the PYTHONWARNINGS code should be moved
> there instead?
sys.warnopts should be filled early because it is used to initialize the
_warnings module, and the _w
STINNER Victor added the comment:
[cmdline_encoding-2.patch] Patch to use locale encoding to decode and encode
command line arguments. Remarks about the patch:
- failing to get the locale encoding (very unlikely) is a fatal error
- TODO: in initfsencoding(), Py_FileSystemDefaultEncoding
STINNER Victor added the comment:
> Maybe the PYTHONWARNINGS code should be moved there instead?
sys.warnoptions is read by the warnings module (not the _warnings module) when
this module is loaded. The warnings module is loaded by Py_InitializeEx() if
sys.warnoptions list is not empty.
STINNER Victor added the comment:
> The problem with command line arguments is that they don't necessarily
> have just one encoding (just like env vars may well use more than
> one encoding) on Unix platforms.
The issue #8776 proposes the creation of sys.argv.
> When using pa
STINNER Victor added the comment:
Extract of an interesting message (msg111432) of #8775 (issue specific to Mac
OS X):
<< A system where the filesystem encoding doesn't match the locale encoding is
hard to get right. While it would be possible to add sys.cmdlineencoding th
STINNER Victor added the comment:
> A system where the filesystem encoding doesn't match the locale
> encoding is hard to get right.
Mmmh. The problem is maybe that the new PYTHONFSENCODING environment variable
(added by #8622) introduced a horrible inconsistency between Pytho
STINNER Victor added the comment:
> Option 2 (the alternative Antoine suggested and I'm considering):
> - "decode" ... to str ...
> - ... objects are "encoded" back to actual bytes before
> they are returned
In this case, you have to be very careful to
STINNER Victor added the comment:
Update the patch for the new PyUnicode_AsWideCharString() function:
- use Py_UNICODE_SIZE and SIZEOF_WCHAR_T in the preprocessor tests
- faster loop: don't use a counter + pointer, but only use pointers (for the
stop condition)
The patch is not finish
Changes by STINNER Victor :
Removed file:
http://bugs.python.org/file17322/pyunicode_aswidechar_surrogates-py3k.patch
___
Python tracker
<http://bugs.python.org/issue8
STINNER Victor added the comment:
Patch version 3:
- fix unicode_aswidechar if Py_UNICODE_SIZE == SIZEOF_WCHAR_T and w == NULL
(return the number of characters, don't write into w!)
- improve unicode_aswidechar() comment
--
Added file: http://bugs.python.org/file
STINNER Victor added the comment:
I don't know how to test "if Py_UNICODE_SIZE == 4 && SIZEOF_WCHAR_T == 2". On
Windows, sizeof(wchar_t) is 2, but it looks like Python is not prepared to have
Py_UNICODE != wchar_t for its Windows implementation.
wchar_t is 32 bits lon
STINNER Victor added the comment:
Patch version 4:
- implement unicode_aswidechar() for 16 bits wchar_t and 32 bits Py_UNICODE
- PyUnicode_AsWideCharString() returns the number of wide characters
excluding the nul character as does PyUnicode_AsWideChar()
For 16 bits wchar_t and 32 bits
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19082/aswidechar_nonbmp-2.patch
___
Python tracker
<http://bugs.python.org/issue8670>
___
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19083/aswidechar_nonbmp-3.patch
___
Python tracker
<http://bugs.python.org/issue8670>
___
STINNER Victor added the comment:
Ooops, I lost my patch to fix the initial (ctypes) issue. Here is an updated
patch: ctypes_nonbmp.patch (which needs aswidechar_nonbmp-4.patch).
--
Added file: http://bugs.python.org/file19101/ctypes_nonbmp.patch
Changes by STINNER Victor :
--
title: Allow bytes in some APIs that use string literals internally ->
urllib.parse: Allow bytes in some APIs that use string literals internally
___
Python tracker
<http://bugs.python.org/iss
STINNER Victor added the comment:
r85172 changes PyUnicode_AsWideCharString() (don't count the trailing nul
character in the output size) and add unit tests.
r85173 patches unicode_aswidechar() to support non-BMP characters for all
known wchar_t/Py_UNICODE size combinations (2/2, 2/4
STINNER Victor added the comment:
r85174+r85177: ctypes.c_wchar supports non-BMP characters with 32 bits wchar_t
=> fix this issue
(I also committed an unwanted change on _testcapi to fix r85172 in r85174:
r85175 reverts this change, and r85176 fixes the _testcapi bug ag
STINNER Victor added the comment:
> r85173 patches unicode_aswidechar() to support non-BMP characters
> for all known wchar_t/Py_UNICODE size combinations (2/2, 2/4 and 4/2).
Oh, and 4/4 ;-)
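To make the size combinations concrete, here is a minimal Python sketch of the two conversions involved: splitting a non-BMP code point into a surrogate pair (32-bit source, 16-bit destination) and merging a pair back (16-bit source, 32-bit destination). It is an illustration only, not the C code from unicode_aswidechar(); the function names are invented for this example.

```python
from typing import List


def split_surrogates(text: str) -> List[int]:
    """Return 16-bit code units: non-BMP characters become surrogate pairs."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            cp -= 0x10000
            units.append(0xD800 + (cp >> 10))    # high surrogate
            units.append(0xDC00 + (cp & 0x3FF))  # low surrogate
        else:
            units.append(cp)
    return units


def merge_surrogates(units: List[int]) -> List[int]:
    """Return 32-bit code points: valid high+low pairs are joined."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (0xD800 <= u <= 0xDBFF
                   and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        if is_pair:
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(u)  # BMP character or lone surrogate: copied as is
            i += 1
    return out
```

When both sides have the same width (2/2 or 4/4) no conversion is needed and the units can be copied unchanged, which is why only the mixed combinations need this logic.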
--
___
Python tracker
<http://bugs.p
New submission from STINNER Victor :
In the following example, sys.path[0] should be
'/home/SHARE/SVN/py3k\udcc3\udca9' (my locale and filesystem encodings are
utf-8):
$ cd /home/SHARE/SVN/py3ké
$ echo "import sys; print(sys.path[0])" > x.py
$ ./python x.p
STINNER Victor added the comment:
See also #10014: sys.path[0] is decoded from the locale encoding instead of the
filesystem encoding.
--
___
Python tracker
<http://bugs.python.org/issue9
STINNER Victor added the comment:
> Since this was a bugfix, it should be merged back into 2.7, yes?
Mmmh, the fix requires changing the PyUnicode_AsWideChar() function (support
non-BMP characters and surrogate pairs) (and maybe also to create
PyUnicode_AsWideCharString()). I don't rea
STINNER Victor added the comment:
> If you were worried about performance, then surrogateescape is certainly
> much slower than latin1.
If you were really worried about performance, the bytes type is maybe faster
than: decode bytes to str using latin-1, process str strings, encode
STINNER Victor added the comment:
See also issue #10039.
--
___
Python tracker
<http://bugs.python.org/issue10014>
___
STINNER Victor added the comment:
> The problem is that PySys_SetArgvEx() ...
Not only PySys_SetArgvEx(). There is another issue with RunMainFromImporter()
which does: sys.path[0] = filename
--
___
Python tracker
<http://bugs.python.org/issu
New submission from STINNER Victor :
If a program's name and/or full path contains a non-ASCII character
and PYTHONFSENCODING is set to an encoding different from the locale encoding,
Python fails to open the program.
Example in the utf-8 locale:
$ PYTHONFSENCODING=ascii ./python
STINNER Victor added the comment:
This issue depends on issue #10039.
--
dependencies: +python é.py fails with UnicodeEncodeError if PYTHONFSENCODING is
used
___
Python tracker
<http://bugs.python.org/issue10
STINNER Victor added the comment:
r85302: _wrealpath() and _Py_wreadlink() support surrogates in the input path.
--
realpath_fs_encoding.patch: patch _wrealpath() to encode the resulting path
with the filesystem encoding (with surrogateescape) instead of the locale
encoding. This patch is
STINNER Victor added the comment:
I just created Python/fileutils.c: update the patch for this new file.
--
Added file: http://bugs.python.org/file19153/realpath_fs_encoding-2.patch
___
Python tracker
<http://bugs.python.org/issue10
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19147/realpath_fs_encoding.patch
___
Python tracker
<http://bugs.python.org/issue10014>
___
STINNER Victor added the comment:
There was a bug in copy_absolute(): if _Py_wgetcwd() failed, the result was
undefined (depending on the content of the "path" buffer). In particular, absolutize()
calls copy_absolute() with a buffer allocated on the stack: the content of this
buffer depe
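The failure mode described above can be sketched in Python: when the current directory cannot be read (for example because it was deleted), the function must still return a well-defined value. This is only an illustration and not the C fix; the `getcwd` parameter is added here so the failure can be simulated, and returning the relative path unchanged is an assumption made for the example.

```python
import os
from typing import Callable


def copy_absolute(path: str, getcwd: Callable[[], str] = os.getcwd) -> str:
    """Return an absolute version of path, with a defined result on failure."""
    if os.path.isabs(path):
        return path
    try:
        cwd = getcwd()
    except OSError:
        # Assumption for this sketch: if the current directory is unavailable,
        # keep the relative path as is rather than use an uninitialized value.
        return path
    return os.path.join(cwd, path)
```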
STINNER Victor added the comment:
deleted_cwd.patch, patch based on labrat's patch updated to py3k:
http://www.physics.drexel.edu/~wking/code/hg/hgwebdir.cgi/python/rev/77f3ad10ba45
Procedure to test the patch:
- go into Python source tree
- make a directory "z"
- en
Changes by STINNER Victor :
--
title: Add the null context manager to contextlib -> Add a "no-op" (null)
context manager to contextlib
___
Python tracker
<http://bugs.pytho
STINNER Victor added the comment:
About your patch:
- __enter__() might return self instead of None... I don't really know which
choice is better. "with Null() as x:" works in both cases
- __exit__() has no result value, "pass" is enough
- I don't li
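The first two review points can be illustrated with a minimal sketch. The class name `Null` follows the "with Null() as x:" example above; returning self from __enter__() is just one of the two choices mentioned, picked here for illustration.

```python
class Null:
    """A context manager that does nothing on entry or exit."""

    def __enter__(self):
        # Returning self (rather than None) keeps "with Null() as x:" useful.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # No result value: returning None lets any exception propagate.
        pass
```

Later Python versions ship contextlib.nullcontext, which covers the same use case.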
STINNER Victor added the comment:
> FWIW, this still happens on the latest of /branches/py3k,
> when LANG does not match up to the enforced fs encoding
ixokai has the bug on Snow Leopard x86.
--
___
Python tracker
<http://bugs.p
STINNER Victor added the comment:
py3k_also_no_unicode_error_on_direct_test_run.patch comes a little bit too late:
$ LANG= ./python Lib/test/regrtest.py -v test_time
== CPython 3.2a2+ (py3k, Oct 8 2010, 01:40:20) [GCC 4.4.5 20100909 (prerelease)]
== Linux-2.6.32-trunk-686-i686-with-debian
STINNER Victor added the comment:
> For the record, this can be now reproduced under Linux by forcing different
> locale and filesystem encodings:
>
> $ PYTHONFSENCODING=utf8 LANG=ISO-8859-1 ./python -m test.regrtest
> test_cmd_line
I opened a separate issue for Linux, #999
STINNER Victor added the comment:
> Perhaps. We could also declare that command line arguments and
> environment variables are always UTF-8-encoded on OSX (which I think
> would be fairly accurate)
Python uses the filesystem encoding to encode/decode environment variables,
an
STINNER Victor added the comment:
> So perhaps it would be best if Python had two external default encodings:
> the IO one (command line arguments, environment variables, text files),
> and the file name encoding (defaulting to the IO encoding if not set)
Hum, I prefer to consid
STINNER Victor added the comment:
> We run into problems because we have two inconsistent
> encodings, ...
What? No. We have problems because we don't use the same encoding to decode and
to encode the same data type. It's not a problem to use a different encoding
for each d
STINNER Victor added the comment:
> > What? No. We have problems because we don't use the same encoding to
> > decode and to encode the same data type. It's not a problem to use a
> > different encoding for each data type (stdout, filenames, environment
> > var
STINNER Victor added the comment:
> > ... So Antoine and Martin: which encoding do you prefer?
>
> I still propose to drop the fsname encoding. Then this question goes away.
You mean that we should use the following encoding for the command line
arguments, environment varia
STINNER Victor added the comment:
MvL> > - Windows: unicode for command line/env, mbcs to decode filenames
MvL> No: unicode for filenames also.
Yes, I mean unicode for everything, but decode bytes data from the mbcs
encoding.
--
_
STINNER Victor added the comment:
MAL> If you remove the PYTHONFSENCODING, then we have to reconsider
MAL> removal of sys.setfilesystemencoding().
Please, Marc, read my comments. You never consider technical problems,
you just propose to ensure that "Python just work
STINNER Victor added the comment:
MAL> You can't just tell people to go with whatever encoding setup
MAL> you prefer to make Python's guessing easier or more correct.
Python doesn't really *guess* the encoding, it just reads the encoding from the
locale.
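As a small illustration of "reads the encoding from the locale", the following sketch prints what Python reports for the locale encoding and the filesystem encoding on the current system. The function name is invented for this example, and the values depend entirely on the environment in which it runs.

```python
import locale
import sys


def report_encodings() -> dict:
    """Return the encodings Python derives from the current environment."""
    return {
        # False: do not call setlocale(), just report the current setting.
        "locale": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in report_encodings().items():
        print(name, "=", value)
```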
What do you
STINNER Victor added the comment:
> I guess LANG and LC_CTYPE can be used for other purposes
> such as internationalization.
That's why there are different environment variables:
* LC_MESSAGES for i18n (messages)
* LC_CTYPE for the encoding
* LC_TIME for time and
STINNER Victor added the comment:
issue9992.patch:
- Remove PYTHONFSENCODING environment variable
- Mac OS X: Use utf-8 to decode command line arguments
- Fix issue #9992 (this issue): attached test, locale_fs_encoding.py, passes
- Fix issue #9988
- Fix issue #10014
- Fix issue #10039
STINNER Victor added the comment:
I think that issue9992.patch also fixes #4388 because it uses the same encoding
(FS encoding, utf8) on OSX to encode and to decode command line arguments.
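Why the same encoding matters for both directions can be shown with a short sketch: bytes decoded with one encoding and re-encoded with another do not come back unchanged. The helper name `roundtrip` is invented for this example.

```python
def roundtrip(raw: bytes, decode_with: str, encode_with: str) -> bytes:
    """Decode raw with one encoding, then encode the result with another."""
    return raw.decode(decode_with).encode(encode_with)


raw = "\u00e9".encode("utf-8")  # the two bytes 0xC3 0xA9 ("é" in UTF-8)

# Same encoding in both directions: the original bytes are preserved.
same = roundtrip(raw, "utf-8", "utf-8")

# Different encodings: each byte is read as one Latin-1 character and then
# re-encoded as UTF-8, so the argument is silently altered.
mixed = roundtrip(raw, "latin-1", "utf-8")
```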
--
___
Python tracker
<http://bugs.python.org/issue9
STINNER Victor added the comment:
> Oops, sorry. I'll withdraw my last patch.
Why? Your patch is useful to run a single test outside regrtest. But you should
not remove the hack on regrtest.py, only keep your patch on unittest/runner.py.
There are not e
New submission from STINNER Victor :
If the site module fails, the error is not logged because of a bug in
initsite(). The problem is that PyFile_WriteString() does nothing if an error
occurred.
- Edit Lib/site.py to add "raise Exception('xxx')" at the beginning of ma
STINNER Victor added the comment:
Fixed in 3.2 (r85386+r85387+r85389), 2.7 (r85390), 3.1 (r85391).
Thanks labrat for your patch. I added you to Misc/ACKS.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.p
STINNER Victor added the comment:
New version of the patch:
- use more standard function names (_Py_initsegfault => _Py_InitSegfault)
- use "#ifdef HAVE_SIGACTION" to support systems without sigaction(): fall back
to signal()
- usage of the alternative stack is now opt
STINNER Victor added the comment:
Updated example:
--
$ ./python Lib/test/crashers/recursive_call.py
Fatal Python error: segmentation fault
Traceback (most recent call first):
File "Lib/test/crashers/recursive_call.py", line 12 in
File
Changes by STINNER Victor :
--
nosy: +dmalcolm
___
Python tracker
<http://bugs.python.org/issue8863>
___
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file17507/segfault_handler.patch
___
Python tracker
<http://bugs.python.org/issue8863>
___
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file17717/segfault_handler-2.patch
___
Python tracker
<http://bugs.python.org/issue8863>
___
STINNER Victor added the comment:
> It should be tested at least on a wide build and ...
Done: it works correctly for non-BMP characters in narrow and wide builds.
Eg. in wide build with U+10 character in the path:
-
$ ./python bla.py
Fatal Python error: segmentat
STINNER Victor added the comment:
Patch version 4:
- Add segfault.c to pythoncore.vcproj
- Remove #error "nope" (I used it for debug purpose)
- Don't include on Windows
This version works on Windows.
--
Added file: http://bugs.python.org/file19210/segfault_
STINNER Victor added the comment:
> Is it always correct to decode a filename with the locale encoding
> on Unix?
Do you know something better than the locale encoding? I don't.
> Can’t each filesystem have its own encoding?
Yes, but how do you get the encoding of each files
STINNER Victor added the comment:
r85393 introduced a regression in test_runpy of Python 2.7.
--
nosy: +haypo
resolution: fixed ->
status: closed -> open
___
Python tracker
<http://bugs.python.org/i
STINNER Victor added the comment:
test_runpy fails also on Python 3.2.
--
___
Python tracker
<http://bugs.python.org/issue10068>
___
STINNER Victor added the comment:
> dmalcolm asked if it would be possible to display the
> Python backtrace on Py_FatalError()
It works :-) I fixed a bug in ceval.c (r85411) which was not directly related.
Patch version 5:
- Display the Python backtrace on Py_FatalError() (if no
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19208/segfault_handler-3.patch
___
Python tracker
<http://bugs.python.org/issue8863>
___
Changes by STINNER Victor :
Removed file: http://bugs.python.org/file19210/segfault_handler-4.patch
___
Python tracker
<http://bugs.python.org/issue8863>
___
Changes by STINNER Victor :
--
title: Segfault handler: display Python backtrace on segfault -> Display Python
backtrace on SIGSEGV, SIGFPE and fatal error
___
Python tracker
<http://bugs.python.org/iss
STINNER Victor added the comment:
I posted the patch on Rietveld for a review (as asked by Antoine):
http://codereview.appspot.com/2477041
--
___
Python tracker
<http://bugs.python.org/issue8
STINNER Victor added the comment:
Version 6:
- don't use fputc(), fputs(), fprintf() or fflush() on stderr: use write() on
file descriptor 2 (should be stderr)
- write tests: add sigsegv(), sigfpe() and fatal_error() functions to the
_testcapi module
I was too lazy to reimplement func
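The first point of version 6 (bypass buffered stdio and write directly to a file descriptor) can be approximated in Python with os.write(). This is only a sketch of the idea, not the C code from the patch; the helper name `write_raw` and the choice of the backslashreplace error handler are assumptions made for the example.

```python
import os


def write_raw(fd: int, message: str) -> None:
    """Write message to a raw file descriptor, without any buffered I/O."""
    # Assumption: keep the output ASCII and escape everything else, so that
    # no codec state or locale lookup is needed while reporting a crash.
    data = message.encode("ascii", "backslashreplace")
    while data:
        written = os.write(fd, data)  # may write fewer bytes than requested
        data = data[written:]
```

Calling write_raw(2, "Fatal Python error: segmentation fault\n") would send the message to file descriptor 2, which is normally stderr.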
STINNER Victor added the comment:
TODO (maybe): Call the original signal handler to make tools like apport or
ABRT able to catch segmentation faults.
> By the way, don't you want to handle SIGILL and SIGBUS too?
Maybe. SIGILL is a very rare exception. To test it, should I send
STINNER Victor added the comment:
I committed issue9992.patch as r85430 (remove PYTHONFSENCODING) + r85435 (OSX:
decode command line arguments from utf-8).
These commits should fix this issue. Reopen the issue if you notice new
problems, or if the problem is not fixed yet. I will watch Mac OS
STINNER Victor added the comment:
Fixed by r85430 (remove PYTHONFSENCODING), see #9992.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.python.org/i
STINNER Victor added the comment:
Fixed by r85430 (remove PYTHONFSENCODING), see #9992.
--
resolution: -> fixed
___
Python tracker
<http://bugs.python.org/issu
STINNER Victor added the comment:
Fixed by r85430 (remove PYTHONFSENCODING), see #9992.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.python.org/
STINNER Victor added the comment:
This issue should be fixed by r85435 (OSX: decode command line arguments from
utf-8), see #9992.
I will watch for the OSX buildbots.
--
___
Python tracker
<http://bugs.python.org/issue4
Changes by STINNER Victor :
--
status: open -> closed
___
Python tracker
<http://bugs.python.org/issue10014>
___
STINNER Victor added the comment:
> This issue should be fixed by r85435 ...
> I will watch for the OSX buildbots.
I don't know if it fixes the issue, but it introduces a regression. r85442
reverts it.
---
Revert r85435 (and r85440): decode command line arguments from utf-8
P
STINNER Victor added the comment:
osx_utf8_cmdline.patch: copy of r85435.
--
Added file: http://bugs.python.org/file19228/osx_utf8_cmdline.patch
___
Python tracker
<http://bugs.python.org/issue4
New submission from STINNER Victor :
It looks like the parser API (eg. PyParser_ParseFileFlagsEx,
PyParser_ASTFromFile) expects utf-8 filename: err_input() decodes the filename
from utf-8. But
Example in a non-ascii directory (/home/SHARE/SVN/py3kéŁ) and an ascii locale:
$ LANG
STINNER Victor added the comment:
test_undecodable_env() of test_subprocess fails. r85430 removes the following
code which was added by Antoine to fix this issue.
# Force surrogate-escaping of \xFF in the child process;
# otherwise it can be decoded as-is if the default locale
# is latin-1
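The difference described in that comment can be reproduced directly: with an ASCII locale encoding the byte 0xFF is undecodable and gets surrogate-escaped, while Latin-1 decodes it as an ordinary character. The sketch below uses only the standard codecs; the helper name `decode_env` is invented for this example.

```python
def decode_env(raw: bytes, encoding: str) -> str:
    """Decode an environment value, escaping undecodable bytes."""
    return raw.decode(encoding, "surrogateescape")


raw = b"caf\xff"

# ASCII cannot decode 0xFF: it becomes the lone surrogate U+DCFF.
escaped = decode_env(raw, "ascii")

# Latin-1 maps every byte to a character, so nothing is escaped.
as_latin1 = decode_env(raw, "latin-1")

# Encoding back with the same codec and handler restores the original bytes.
restored = escaped.encode("ascii", "surrogateescape")
```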
STINNER Victor added the comment:
With r85466+r85467, test_undecodable_env (of test_subprocess) uses the C locale
to get the ASCII locale encoding (for the first test, on unicode environment
variables). It should have the same effect as env['PYTHONFSENCODING'] =
'ascii
STINNER Victor added the comment:
Ok, the issue is not completely fixed ;-)
12:55 < py-bb> build #504 of x86 debian parallel 3.x is complete: Success
[build successful] Build details are at
http://www.python.org/dev/buildbot/all/builders/x86%20debian%20parallel%203.x/