[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-27 Thread Ezio Melotti
Ezio Melotti added the comment: > But I really hope the re module (really: the _sre extension module) > can be fixed. If you mean on 2.7/3.2, then I guess we could extract the fixes from regex, but we have to see if it's doable and someone will have to do it. Also consider that the regex modu

[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-27 Thread Ezio Melotti
Ezio Melotti added the comment: > Or the re module should be *replaced* by the code from the regex module > (but renamed to re, and with certain backwards compatibilities > restored, probably). This is what I meant. > But I really hope the re module (really: the _sre extension module) > can be

[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-08-27 Thread Steven D'Aprano
Steven D'Aprano added the comment: I'm not sure if this belongs here, or on the Google code project page, so I'll add it in both places :) Feature request: please change the NEW flag to something else. In five or six years (give or take), the re module will be long forgotten, compatibility wi

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Ezio Melotti
Ezio Melotti added the comment: FTR, with the latest Python 3.2/3.3 (narrow) I get:
  Total failures:   58 / 500 (12%)
  Total successes: 442 / 500 (88%)
and with the latest Python 3.2/3.3 (wide) I get:
  Total failures:   52 / 500 (10%)
  Total successes: 448 / 500 (90%)
-- Add

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-27 Thread Terry J. Reedy
Terry J. Reedy added the comment: Python makes it easy to transform a sequence with a generator as long as no look-ahead is needed. utf16.UTF16.__iter__ is a typical example. Whenever a surrogate is found, grab the matching one. However, grapheme clustering does require look-ahead, which is a
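The generator approach described in this comment could look roughly like the sketch below. It is not the utf16.UTF16.__iter__ code the comment refers to; the function name `code_points` and the sample data are made up for illustration. It only shows the "whenever a surrogate is found, grab the matching one" idea for a sequence of UTF-16 code units.

```python
def code_points(units):
    """Yield code points from an iterable of UTF-16 code units (hypothetical helper)."""
    it = iter(units)
    for unit in it:
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: consume the next unit, which must be a low surrogate.
            low = next(it, None)
            if low is None or not (0xDC00 <= low <= 0xDFFF):
                raise ValueError("unpaired high surrogate")
            yield 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        elif 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            raise ValueError("unpaired low surrogate")
        else:
            yield unit


# 'A', U+1D49C (encoded as the pair D835 DC9C), 'B'
sample_units = [0x0041, 0xD835, 0xDC9C, 0x0042]
decoded = list(code_points(sample_units))
```

Each pair is resolved by consuming exactly one further item from the same iterator, which is why no buffering is needed here. Grapheme clustering, as the comment notes, is a different matter.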

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Vlad Riscutia
Vlad Riscutia added the comment: Ah, I see Antoine already attached a patch. I was 3 minutes late :)

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Neither am I. Even in "old-style" English with ae and oe, one wrote > ÆGYPT and ÆSIR all caps but Ægypt and Æsir in titlecase, not *Aegypt or > *Aesir. Similarly with ŒNOLOGY / Œnology / œnology, never *Oenology. Trying to disprove you a bit: http://ecx.ima

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Matthew Barnett
Matthew Barnett added the comment: There are some oddities in Unicode case-folding. Under full case-folding, both "\N{LATIN CAPITAL LETTER SHARP S}" and "\N{LATIN SMALL LETTER SHARP S}" fold to "ss", which means that those codepoints match each other. However, under simple case-folding, they
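The full case-folding behaviour mentioned in this comment can be observed with `str.casefold()`, assuming Python 3.3 or later (the method did not exist when the thread was written). This is a small illustration, not part of the original message.

```python
capital = "\N{LATIN CAPITAL LETTER SHARP S}"  # U+1E9E
small = "\N{LATIN SMALL LETTER SHARP S}"      # U+00DF

# str.casefold() applies full case folding, so both code points fold to "ss".
folded_capital = capital.casefold()
folded_small = small.casefold()

# The two characters therefore compare equal after folding.
match_after_folding = folded_capital == folded_small
```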

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Sat, 27 Aug 2011 16:15:33 -: > Although personally I don't have much of an intuition for what > titlecase means (and why it's important), perhaps because I'm not > familiar with any language where there is a third case for s
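As a possible aid to intuition about titlecase, here is a short sketch using the Latin DZ-with-caron digraph, which has three distinct case forms in Unicode. The choice of character is an illustration of ours, not taken from the thread, and the results assume a Python 3 interpreter with full case mappings.

```python
lower = "\u01c6"       # LATIN SMALL LETTER DZ WITH CARON
upper = lower.upper()  # maps to U+01C4, both letters capitalised
title = lower.title()  # maps to U+01C5, only the first letter capitalised

# Three different code points for one digraph: lower, upper and title case.
distinct_forms = {lower, upper, title}
```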

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Guido van Rossum
Guido van Rossum added the comment: Thank you very much. We should fix the behavior in 3.3 for sure. I'm thinking that we may be able to backport the behavior fix to 2.7 and 3.2 as well, since it just makes the behavior generally "better" (and for most folks it won't matter anyway). I'm not su

[issue6560] socket sendmsg(), recvmsg() methods

2011-08-27 Thread Nick Coghlan
Nick Coghlan added the comment: Putting this back to open until we decide what to do about the OS X test failures. It sounds like it could really do with some more poking and prodding to figure out whether or not it poses a potential security risk or is just a relatively cosmetic problem with

[issue8426] multiprocessing.Queue fails to get() very large objects

2011-08-27 Thread Charles-François Natali
Charles-François Natali added the comment: > "Avoid sending very large amounts of data via queues, as you could come up > against system-dependent limits according to the operating system and whether > pipes or sockets are used. You could consider an alternative strategy, such > as writing la

[issue8296] multiprocessing.Pool hangs when issuing KeyboardInterrupt

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Note that #9205 fixed concurrent.futures, but not multiprocessing.Pool which is a different kettle of fish. -- nosy: +pitrou

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Vlad Riscutia
Vlad Riscutia added the comment: Attached updated patch which extends generrmap.c to allow for easy addition of other error mappings. Also regenerated errmap.h and unittest. -- Added file: http://bugs.python.org/file23054/issue12802_2.diff

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a new patch. -- Added file: http://bugs.python.org/file23053/winenotdir.patch

[issue8296] multiprocessing.Pool hangs when issuing KeyboardInterrupt

2011-08-27 Thread Vinay Sajip
Vinay Sajip added the comment: Closing, as Andrey Vlasovskikh has agreed that this is a duplicate of #9205. -- nosy: +vinay.sajip resolution: -> duplicate status: open -> closed

[issue11990] redirected output - stdout writes newline as \n in windows

2011-08-27 Thread Vinay Sajip
Vinay Sajip added the comment: So is this now just a documentation issue, about the changed behaviour of pipes in 3.2? -- nosy: +vinay.sajip

[issue8426] multiprocessing.Queue fails to get() very large objects

2011-08-27 Thread Vinay Sajip
Vinay Sajip added the comment: I think it's just a documentation issue. The problem with documenting limits is that they are system-specific and, even if the current limits that Charles-François has mentioned are documented, these could become outdated. Perhaps a suggestion could be added to

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Ok, running vcvarsamd64.bat seems to do the trick.

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Ok, apparently I can use errmap.mak, except that I get the following error:
  Z:\default\PC>nmake errmap.mak
  Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
  Copyright (C) Microsoft Corporation. All rights reserved.
  cl generrmap.c
  Micro

[issue10015] Creating a multiprocess.pool.ThreadPool from a child thread blows up.

2011-08-27 Thread Vinay Sajip
Changes by Vinay Sajip: -- title: Creating a multiproccess.pool.ThreadPool from a child thread blows up. -> Creating a multiprocess.pool.ThreadPool from a child thread blows up.

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Fri, 26 Aug 2011 21:11:24 -: > Would this also affect .islower() and friends? SHORT VERSION: (7 lines) I don't believe so, but the relationship between lower() and islower() is not as clear to me as I would have t

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: We could add a special case to generrmap.c (but how can I compile and execute this file? it doesn't seem to be part of the project files).

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Vlad Riscutia
Vlad Riscutia added the comment: Oh, got it. Interesting. Then should I just add a comment somewhere or should we resolve this as Won't Fix?

[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set

2011-08-27 Thread Nick Coghlan
Nick Coghlan added the comment: As noted in the commit message, I didn't backport this, since it didn't seem worth risking breaking even the unlikely case that someone actually *was* using the MAILCAP environment variable on Windows. -- resolution: -> fixed stage: patch review -> com

[issue12174] Multiprocessing logging levels unclear

2011-08-27 Thread Vinay Sajip
Vinay Sajip added the comment: Although the reference docs don't list the numeric values of logging levels, this happened during reorganising of the docs. The table has moved to the HOWTO: http://docs.python.org/howto/logging.html#logging-levels That said, I don't understand the need for spec
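For reference, the numeric values of the standard logging levels mentioned in this comment can be read directly from the `logging` module. The dictionary name `standard_levels` below is ours; the snippet only restates what the HOWTO table documents.

```python
import logging

# Map each standard level name to its numeric value.
standard_levels = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}

# Custom levels (such as those a library might define) sit between these values.
for name, value in sorted(standard_levels.items(), key=lambda item: item[1]):
    print(f"{name:<8} {value}")
```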

[issue9923] mailcap module may not work on non-POSIX platforms if MAILCAPS env variable is set

2011-08-27 Thread Roundup Robot
Roundup Robot added the comment: New changeset 7b83d2c1aad9 by Nick Coghlan in branch 'default': Fix #9923: mailcap now uses the OS path separator for the MAILCAP envvar. Not backported, since it could break cases where people worked around the old POSIX-specific behaviour on non-POSIX platfor
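The idea behind this changeset, splitting the environment variable on the OS path separator rather than a hard-coded ":", might be sketched as follows. This is not the code from the changeset; the helper name `mailcap_files` and its parameters are hypothetical, and the variable name MAILCAPS is taken from the issue title.

```python
import os


def mailcap_files(environ=os.environ, sep=os.pathsep):
    """Return the entries of MAILCAPS, split on the given separator (hypothetical helper)."""
    value = environ.get("MAILCAPS", "")
    return [entry for entry in value.split(sep) if entry]


# With the Windows separator, drive letters survive intact.
windows_env = {"MAILCAPS": r"C:\mailcap;D:\etc\mailcap"}
split_with_semicolon = mailcap_files(windows_env, sep=";")

# Splitting the same value on ":" breaks every drive letter apart.
split_with_colon = mailcap_files(windows_env, sep=":")
```

The second call illustrates why a POSIX-specific ":" separator is a problem for Windows-style paths.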

[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake

2011-08-27 Thread Nick Coghlan
Changes by Nick Coghlan: -- resolution: -> fixed stage: -> committed/rejected status: open -> closed

[issue12835] Missing SSLSocket.sendmsg() wrapper allows programs to send unencrypted data by mistake

2011-08-27 Thread Roundup Robot
Roundup Robot added the comment: New changeset b06f011a3529 by Nick Coghlan in branch 'default': Fix #12835: prevent use of the unencrypted sendmsg/recvmsg APIs on SSL wrapped sockets (Patch by David Watson) http://hg.python.org/cpython/rev/b06f011a3529 -- nosy: +python-dev

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2011-08-27 Thread Antoine Pitrou
New submission from Antoine Pitrou: In several opcodes (BINBYTES, BINUNICODE... what else?), _pickle.c happily accepts 32-bit lengths of more than 2**31, while pickle.py uses marshal's "i" typecode which means "signed"... and therefore fails reading the data. Apparently, pickle.py uses marshal
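The signed/unsigned mismatch described here can be illustrated with the `struct` module, which is used below as a stand-in for marshal's "i" typecode (an assumption made for the sake of a self-contained example). The same four bytes give different lengths depending on how they are interpreted.

```python
import struct

# A 32-bit little-endian length slightly above 2**31.
length = 2**31 + 5
raw = struct.pack("<I", length)

# Interpreted as unsigned, the original length comes back.
(as_unsigned,) = struct.unpack("<I", raw)

# Interpreted as signed, the same bytes yield a negative number.
(as_signed,) = struct.unpack("<i", raw)
```

A reader that treats the field as signed would thus see a negative length for any payload of 2**31 bytes or more.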

[issue11564] pickle not 64-bit ready

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a new patch against 3.2. I can't say it works for sure, but it should be much better. It also adds a couple more tests. There seems to be a separate issue where pure-Python pickle.py considers 32-bit lengths signed where the C impl considers them unsig

[issue12847] crash with negative PUT in pickle

2011-08-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: Same with LONG_BINPUT on a 32-bit build:
  >>> s = b'\x80\x03X\x01\x00\x00\x00ar\xff\xff\xff\xff.'
  >>> pickletools.dis(s)
      0: \x80 PROTO       3
      2: X    BINUNICODE  'a'
      8: r    LONG_BINPUT -1
     13: .    STOP
  highest protocol among opcodes = 2
  >>> pickle

[issue12847] crash with negative PUT in pickle

2011-08-27 Thread Antoine Pitrou
New submission from Antoine Pitrou: This doesn't happen on 2.x cPickle, where PUT keys are simply treated as strings.
  >>> import pickle, pickletools
  >>> s = b'Va\np-1\n.'
  >>> pickletools.dis(s)
      0: V    UNICODE 'a'
      3: p    PUT     -1
      7: .    STOP
  highest protocol among opcodes

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-27 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Sat, 27 Aug 2011 03:26:21 -: > To me, making (default) iteration deviate from indexing is anathema. So long as there's a way to iterate through a string some other way than by code unit, that's fine. However, the Java way

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-27 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Unfortunately, it won't work. _dosmaperr() is not exported by msvcrt.dll, it is only available when you link against the static version of the C runtime.

[issue12833] raw_input misbehaves when readline is imported

2011-08-27 Thread Idan Kamara
Idan Kamara added the comment: You're right, as this little C program verifies:
  #include <stdio.h>
  #include <stdlib.h>
  #include <readline/readline.h>
  int main() {
      printf("foo ");
      char* buf = readline("");
      free(buf);
      return 0;
  }
Passing ' ' seems to be a suitable workaround for those who can't pass the text directly to

[issue12833] raw_input misbehaves when readline is imported

2011-08-27 Thread Nadeem Vawda
Nadeem Vawda added the comment: Reproduced on 3.3 head. Looking at the documentation of the C readline library, it needs to know the length of the prompt in order to display properly, so this seems to be an acknowledged limitation of the underlying library rather than a bug on our side. Still,

[issue12768] docstrings for the threading module

2011-08-27 Thread Graeme Cross
Graeme Cross added the comment: I will check that the patch works with 3.2; if not, I'll redo the patch for 3.2. I will also incorporate the review changes from Ezio and Eric.