[ python-Bugs-1628484 ] Python 2.5 64 bit compile fails on Solaris 10/gcc 4.1.1
Bugs item #1628484, was opened at 2007-01-05 00:45
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628484&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bob Atkins (bobatkins)
Assigned to: Nobody/Anonymous (nobody)
Summary: Python 2.5 64 bit compile fails on Solaris 10/gcc 4.1.1

Initial Comment:
This looks like a recurring and somewhat sore topic. For those of us who have been fighting the dreaded

    ./Include/pyport.h:730:2: error: #error "LONG_BIT definition appears wrong for platform (bad gcc/glibc config?)."

when performing a 64-bit compile: I believe I have identified the problems. All of them are directly related to the Makefile(s) generated by the configure script. There does not seem to be anything wrong with the configure script itself or anything else; once all of the Makefiles are corrected, Python will build 64 bit.

It is possible to pass the following environment variables to configure, as is typical for most open source software:

    CC        C compiler command
    CFLAGS    C compiler flags
    LDFLAGS   linker flags, e.g. -L if you have libraries in a nonstandard directory
    CPPFLAGS  C/C++ preprocessor flags, e.g. -I if you have headers in a nonstandard directory
    CPP       C preprocessor

However, these flags are *not* being passed through to the generated Makefiles. This is where the problem is. configure is doing everything right and generating everything needed for a 64-bit compile, but when the compile is actually performed, the necessary CFLAGS are missing and a 32-bit compile is initiated.

Taking a close look at the first failure, I found the following:

    gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -fPIC -DPy_BUILD_CORE -o Modules/python.o ./Modules/python.c

Where are my CFLAGS??? I ran configure with:

    CFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    CXXFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    LDFLAGS="-m64 -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    ./configure --prefix=/opt \
    --enable-shared \
    --libdir=/opt/lib/sparcv9

Checking config.log and config.status, it was clear that the flags were used properly while the configure script ran; the failure is that the various Makefiles never actually reference CFLAGS and LDFLAGS. LDFLAGS is simply not included in any of the link stages in the Makefiles, and CFLAGS is overridden by BASECFLAGS, OPT and EXTRA_CFLAGS!

Ah!

    EXTRA_CFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    make

This actually got the core parts to compile for the library, but the build then failed at the library link stage because LDFLAGS was missing from the Makefile. :-(

Close examination suggests that the OPT environment variable could be used to pass the necessary flags through from configure, but this still did not help with the link stage problems.

The fixes needed to ensure that the configure variables are passed into the Makefile are pretty minimal. My patch to Makefile.pre.in is attached to this bug report. Once these changes are made, Python will build properly for both 32- and 64-bit platforms with the correct CFLAGS and LDFLAGS passed into the configure script.

BTW, while this bug is reported under a Solaris/gcc build, the patches to Makefile.pre.in should fix similar build issues on all platforms.
-- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628484&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
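Editorial note: one way to check whether flags such as -m64 actually reached the build is to ask the resulting interpreter. The sketch below is not part of the report or its patch; it assumes a modern Python 3 with the standard `sysconfig` module (on Python 2.5 the comparable call lived in `distutils.sysconfig`), and the helper name `build_summary` is made up for illustration.

```python
import struct
import sysconfig


def build_summary():
    """Return (pointer width in bits, build flags recorded in the Makefile)."""
    names = ("CC", "CFLAGS", "LDFLAGS", "OPT", "BASECFLAGS")
    # get_config_var() returns None for variables the platform does not record
    # (for example on Windows builds, which have no Makefile).
    flags = {name: sysconfig.get_config_var(name) for name in names}
    # The size of a C pointer tells us whether this is a 32- or 64-bit build.
    return struct.calcsize("P") * 8, flags


bits, flags = build_summary()
print("pointer width: %d bits" % bits)
for name in sorted(flags):
    print("%-10s %s" % (name, flags[name]))
```

If the pointer width prints as 32 after a configure run that requested -m64, the flags were most likely dropped somewhere between configure and the compile step, which is the symptom described in the report.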
[ python-Bugs-1626801 ] posixmodule.c leaks crypto context on Windows
Bugs item #1626801, was opened at 2007-01-03 12:47
Message generated for change (Comment added) made by ygale
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1626801&group_id=5470

Category: Extension Modules
Group: Python 2.6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Yitz Gale (ygale)
Assigned to: Martin v. Löwis (loewis)
Summary: posixmodule.c leaks crypto context on Windows

Initial Comment:
The Win API docs for CryptAcquireContext require that the context be released after use by calling CryptReleaseContext, but posixmodule.c fails to do so in win32_urandom().

--

>Comment By: Yitz Gale (ygale)
Date: 2007-01-05 12:01

Message:
Logged In: YES
user_id=1033539
Originator: YES

OK, then, fine. You might want to just add a comment there so that people like me won't keep filing bugs against this. :)

--

Comment By: Martin v. Löwis (loewis)
Date: 2007-01-05 02:46

Message:
Logged In: YES
user_id=21627
Originator: NO

Yes, I'm absolutely certain that terminating a process releases all handles, on Windows NT+.

--

Comment By: Yitz Gale (ygale)
Date: 2007-01-04 23:46

Message:
Logged In: YES
user_id=1033539
Originator: YES

How do you know that "it is automatically released by the operating system"?

The documentation for CryptAcquireContext states: "When you have finished using the CSP, release the handle by calling the CryptReleaseContext function." In the example code provided, the wording in the comments is even stronger: "When the handle is no longer needed, it must be released." The example code then explicitly calls CryptReleaseContext.

Do you know absolutely for certain that we are not leaking resources if we violate this clear API requirement?

Reference:
http://msdn2.microsoft.com/en-us/library/aa379886.aspx

--

Comment By: Martin v. Löwis (loewis)
Date: 2007-01-04 23:13

Message:
Logged In: YES
user_id=21627
Originator: NO

I fail to see the problem. Only a single crypto context is allocated, and it is used all the time, i.e. until the Python interpreter finishes, at which time it is automatically released by the operating system.

--

Comment By: Yitz Gale (ygale)
Date: 2007-01-03 14:12

Message:
Logged In: YES
user_id=1033539
Originator: YES

You might consider backporting this to 2.5 and 2.4.
[ python-Bugs-1627956 ] documentation error for "startswith" string method
Bugs item #1627956, was opened at 2007-01-04 11:21
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1627956&group_id=5470

Category: Documentation
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Keith Briggs (kbriggs)
>Assigned to: A.M. Kuchling (akuchling)
Summary: documentation error for "startswith" string method

Initial Comment:
At http://docs.python.org/lib/string-methods.html#l2h-241, I think

    prefix can also be a tuple of suffixes to look for.

should be

    prefix can also be a tuple of prefixes to look for.

--

>Comment By: A.M. Kuchling (akuchling)
Date: 2007-01-05 09:15

Message:
Logged In: YES
user_id=11375
Originator: NO

Fixed; thanks!
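Editorial note: the behaviour the corrected sentence documents can be shown in a few lines. This is only an illustration; the helper name and the sample strings are invented here.

```python
def has_known_prefix(name, prefixes=("python-", "jython-")):
    """Return True if name starts with any of the given prefixes."""
    # str.startswith() accepts a tuple and tests each prefix in turn.
    return name.startswith(prefixes)


print(has_known_prefix("python-bugs-list"))
print(has_known_prefix("perl-bugs-list"))
```

Note that the argument has to be a tuple; passing a list raises TypeError.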
[ python-Bugs-1625205 ] sqlite3 documentation omits: close(), commit(), autocommit
Bugs item #1625205, was opened at 2006-12-30 23:34 Message generated for change (Settings changed) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1625205&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: kitbyaydemir (kitbyaydemir) >Assigned to: Gerhard Häring (ghaering) Summary: sqlite3 documentation omits: close(), commit(), autocommit Initial Comment: The Python 2.5 Library documentation (HTML format), Section 13.13 (sqlite3) fails to mention several important methods of Connection objects. Specifically, the close() and commit() methods. Considering that autocommit mode is not the default, I'm not sure how a user is supposed to figure out that they need to call these methods to ensure that changes are reflected on disk. (The only reason I discovered these was from http://initd.org/tracker/pysqlite/wiki/basicintro .) Furthermore, Section 13.13.5 mentions the existence of "autocommit mode", but fails to describe what that mode is and why it might be useful. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1625205&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
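Editorial note: the report is about what happens when commit() is never called. The sketch below is not taken from the sqlite3 documentation; it assumes Python 3's `sqlite3` module with its default (legacy) transaction handling, and the table name and helper functions are invented. It uses a second connection to show when changes become visible, and `isolation_level=None` to show autocommit mode.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a fresh connection and count the rows it can see."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    finally:
        conn.close()


def demo():
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE notes (body TEXT)")
        conn.commit()

        # The INSERT implicitly opens a transaction; nothing is visible
        # to other connections until commit() is called.
        conn.execute("INSERT INTO notes VALUES (?)", ("first",))
        before_commit = count_rows(path)
        conn.commit()
        after_commit = count_rows(path)
        conn.close()

        # isolation_level=None selects autocommit mode: every statement
        # takes effect immediately, with no explicit commit().
        auto = sqlite3.connect(path, isolation_level=None)
        auto.execute("INSERT INTO notes VALUES (?)", ("second",))
        after_autocommit = count_rows(path)
        auto.close()
        return before_commit, after_commit, after_autocommit
    finally:
        os.remove(path)


print(demo())
```

Calling close() without a preceding commit() discards the pending transaction, which is why the missing documentation of both methods matters.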
[ python-Bugs-1622533 ] null bytes in docstrings
Bugs item #1622533, was opened at 2006-12-26 12:47
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1622533&group_id=5470

Category: Python Library
>Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Fredrik Lundh (effbot)
>Assigned to: A.M. Kuchling (akuchling)
Summary: null bytes in docstrings

Initial Comment:
the following docstrings contain bogus control characters:

    module difflib, function _mdiff, contains four invalid bytes:
    ['\x00', '\x00', '\x00', '\x01']

    module StringIO, method readline, contains a null byte:
    ['\x00']

since this breaks help() and probably a bunch of other documentation tools, it would probably be a good idea to add the missing backslashes...

--

>Comment By: A.M. Kuchling (akuchling)
Date: 2007-01-05 09:24

Message:
Logged In: YES
user_id=11375
Originator: NO

Fixed in trunk rev. 53262, 25-maint rev. 53263. Thanks!

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2006-12-26 13:27

Message:
Logged In: YES
user_id=80475
Originator: NO

Clearer and simpler to make the whole docstring raw.
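Editorial note: a small illustration of why an unescaped `\0` in a docstring ends up as a real NUL byte, and how a raw docstring (the approach suggested in the thread) avoids it. The two functions are invented for this example and are not the difflib or StringIO code.

```python
def plain_docstring():
    "Fields are separated by '\0' markers."


def raw_docstring():
    r"Fields are separated by '\0' markers."


# In the plain string the escape is interpreted, so the docstring
# contains an actual NUL character.
print(repr(plain_docstring.__doc__))
# In the raw string the backslash and the zero are kept as two characters.
print(repr(raw_docstring.__doc__))
```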
[ python-Bugs-831574 ] Solaris term.h needs curses.h
Bugs item #831574, was opened at 2003-10-28 01:53 Message generated for change (Comment added) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=831574&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Build Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Anthony Baxter (anthonybaxter) Assigned to: Anthony Baxter (anthonybaxter) Summary: Solaris term.h needs curses.h Initial Comment: Solaris' term.h requires curses.h to be included first. This causes the configure script to emit lines about a bug in autoconf. From the autoconf mailing lists, their standard response is to fix the configure script, see e.g. http://mail.gnu.org/archive/html/bug-autoconf/2003-05/msg00118.html The following patch against 2.3 branch for configure and configure.in makes things a bit happier. Note that Include/py_curses.h already includes curses.h before term.h, this just fixes the breakage of configure. -- >Comment By: A.M. Kuchling (akuchling) Date: 2007-01-05 09:33 Message: Logged In: YES user_id=11375 Originator: NO Is this bug still relevant to Python 2.5? -- Comment By: Martin v. Löwis (loewis) Date: 2003-10-31 10:22 Message: Logged In: YES user_id=21627 I find it confusing that the test for curses.h already refers to HAVE_CURSES_H; I think you should first check for curses.h, and then use HAVE_CURSES_H in the test for term.h I also agree that #ifdef is better than #if, even though it should not matter in an ISO C compiler (which replaces undefined symbols by 0 in an #if). -- Comment By: Anthony Baxter (anthonybaxter) Date: 2003-10-28 20:38 Message: Logged In: YES user_id=29957 Dunno if #ifdef is better or not - I just worked from the example in the attached autoconf mailing list message. 
-- Comment By: Neal Norwitz (nnorwitz) Date: 2003-10-28 08:08 Message: Logged In: YES user_id=33168 Should the #if be an #ifdef ? Looks fine to me, but I don't know much about autoconf. :-) I think Martin is the expert. Martin do you have an opinion? -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=831574&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1119331 ] curses.initscr - initscr exit w/o env(TERM) set
Bugs item #1119331, was opened at 2005-02-09 09:51
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1119331&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jacob Lilly (jrlilly)
Assigned to: Michael Hudson (mwh)
Summary: curses.initscr - initscr exit w/o env(TERM) set

Initial Comment:
The initscr in ncurses will cause an immediate exit if the environment doesn't have the TERM variable set. Could curses.initscr be changed so it tests whether TERM is set and raises an exception? It would be helpful to be able to try/except this instead of just having ncurses exit for you.

--

>Comment By: A.M. Kuchling (akuchling)
Date: 2007-01-05 09:36

Message:
Logged In: YES
user_id=11375
Originator: NO

Patch #2 looks OK. Any objections if I just commit it (to trunk only)?

--

Comment By: Michael Hudson (mwh)
Date: 2005-06-13 14:13

Message:
Logged In: YES
user_id=6656

How about the attached, then? (sorry for the long, long wait)

--

Comment By: Jacob Lilly (jrlilly)
Date: 2005-02-10 08:41

Message:
Logged In: YES
user_id=774886

The only thing that worries me about that is that it takes a different path than ncurses does (or at least 5.4 does). If the env variable isn't set, initscr in ncurses assumes the term type is "unknown" and passes "unknown" along, whereas setupterm, if you pass it NULL for the term and the env isn't set, simply returns 0. I doubt anyone will have a valid term setup for unknown, but who knows. Beyond that, works for me.

BTW, the GNU ncurses guys say this is the behavior in the standard.

--

Comment By: Michael Hudson (mwh)
Date: 2005-02-10 06:22

Message:
Logged In: YES
user_id=6656

The motivation for calling setupterm() ourselves was that I noticed

    TERM=garbage python -c 'import curses; curses.initscr()'

hit the irritating error path too. I also hadn't realised there was a version of initscr in curses/__init__.py, which makes things easier... how about the attached?

--

Comment By: Jacob Lilly (jrlilly)
Date: 2005-02-09 19:06

Message:
Logged In: YES
user_id=774886

If you pass setupterm 0 for the term name, it just tries to get the env variable, so the env test should cover it pretty well. It might make more sense to check the env first and then pass "unknown", i.e. setupterm("unknown", -1). This would seem to match the path that curses initscr follows. This would also raise the exception shared by _curses and curses.

--

Comment By: Michael Hudson (mwh)
Date: 2005-02-09 18:19

Message:
Logged In: YES
user_id=6656

Yeah, I noticed that. We could at least call setupterm(0, NULL) first, I guess...

--

Comment By: Jacob Lilly (jrlilly)
Date: 2005-02-09 14:51

Message:
Logged In: YES
user_id=774886

that is, any return of 0 from newterm

--

Comment By: Jacob Lilly (jrlilly)
Date: 2005-02-09 14:49

Message:
Logged In: YES
user_id=774886

Sorry, I should have done that in the beginning; I have it raising a RuntimeError, I think that's what it is. This doesn't really solve the whole problem, since ncurses initscr has lots of ways it could decide to exit (any return value from newterm causes it to exit), but it does solve a more common one. Anything else would require modifying ncurses to be responsible.

--

Comment By: Michael Hudson (mwh)
Date: 2005-02-09 13:45

Message:
Logged In: YES
user_id=6656

How amazingly terrible (on ncurses' part). Do you want to/are you able to work on a patch?
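Editorial note: the approach discussed in this thread (call setupterm() first, falling back to the terminal name "unknown" when TERM is unset) can be sketched as follows. This is an illustration and not the attached patch; the function names `term_name` and `safe_initscr` are invented, and it relies on the assumption that `curses.setupterm()` reports a bad terminal by raising `curses.error` rather than exiting.

```python
import os
import sys


def term_name(environ=None):
    """Return the terminal type, or "unknown" if TERM is not set."""
    environ = os.environ if environ is None else environ
    return environ.get("TERM", "unknown")


def safe_initscr():
    """Initialise curses, turning a bad TERM into an exception."""
    import curses  # imported lazily: the module is not available everywhere

    # Look the terminal up first; a failure here surfaces as curses.error,
    # which the caller can catch, before initscr() gets a chance to exit.
    curses.setupterm(term=term_name(), fd=sys.__stdout__.fileno())
    return curses.initscr()
```

A caller would wrap `safe_initscr()` in try/except `curses.error` and fall back to plain output when no usable terminal is available.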
[ python-Bugs-849046 ] gzip.GzipFile is slow
Bugs item #849046, was opened at 2003-11-25 10:45
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=849046&group_id=5470

Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 3
Private: No
Submitted By: Ronald Oussoren (ronaldoussoren)
>Assigned to: Bob Ippolito (etrepum)
Summary: gzip.GzipFile is slow

Initial Comment:
gzip.GzipFile is significantly (an order of magnitude) slower than using the gzip binary. I've been bitten by this several times, and have replaced "fd = gzip.open('somefile', 'r')" by "fd = os.popen('gzcat somefile', 'r')" on several occasions.

Would a patch that implemented GzipFile in C have any chance of being accepted?

--

>Comment By: A.M. Kuchling (akuchling)
Date: 2007-01-05 09:42

Message:
Logged In: YES
user_id=11375
Originator: NO

Patch #1281707 improved readline() performance and has been applied. I'll close this bug; please re-open if there are still performance issues.

--

Comment By: April King (marumari)
Date: 2005-05-04 12:18

Message:
Logged In: YES
user_id=747439

readlines(X) is even worse, as all it does is call readline() X times. readline() is also biased towards files where each line is less than 100 characters:

    readsize = min(100, size)

So, if a line is longer than that, it calls read, which calls _read, and so on. I've found using popen to be roughly 20x faster than using the gzip module. That's pretty bad.

--

Comment By: Ronald Oussoren (ronaldoussoren)
Date: 2003-12-28 11:25

Message:
Logged In: YES
user_id=580910

Leaving out the assignment sure sped things up, but only because the input didn't contain lines anymore ;-)

I did an experiment where I replaced self.extrabuf by a list, but that did slow things down.
This may be because there seemed to be very few chunks in the buffer (most of the time just 2).

According to profile.run('testit()'), the function below spends about 50% of its time in the readline method:

    import gzip

    def testit():
        # Python 2 syntax, as in the original report
        fd = gzip.open('testfile.gz', 'r')
        ln = fd.readline()
        cnt = bcnt = 0
        while ln:
            ln = fd.readline()
            cnt += 1
            bcnt += len(ln)
        print bcnt, cnt
        return bcnt, cnt

testfile.gz is a simple text file containing 40K lines of about 70 characters each.

Replacing the 'buffers' in readline by a string (instead of a list) slightly speeds things up (about 10%).

Other experiments did not bring any improvement. Even writing a simple C function to split the buffer returned by self.read() didn't help a lot (splitline(strval, max) -> match, rest, where match is strval up to the first newline and at most max characters, and rest is the rest of strval).

--

Comment By: A.M. Kuchling (akuchling)
Date: 2003-12-23 12:10

Message:
Logged In: YES
user_id=11375

It should be simple to check if the string operations are responsible -- comment out the 'self.extrabuf = self.extrabuf + data' in _add_read_data. If that makes a big difference, then _read should probably be building a list instead of modifying a string.

--

Comment By: Brett Cannon (bcannon)
Date: 2003-12-04 14:51

Message:
Logged In: YES
user_id=357491

Looking at GzipFile.read and ._read, I think a large chunk of time is burned in the decompression of small chunks of data. It initially reads and decompresses 1024 bytes, and then, if that read did not hit the EOF, it doubles the read size and continues until the EOF is reached and then finishes up. The problem is that for each read a call to _read is made that sets up a bunch of objects. I would not be surprised if the object creation and teardown is hurting the performance. I would also not be surprised if the reading of small chunks of data is an initial problem as well. This is all guesswork, though, since I did not run the profiler on this.
*But*, there might be a good reason for reading small chunks. If you are decompressing a large file, you might run out of memory very quickly by reading the file into memory *and* decompressing at the same time. Reading it in successively larger chunks means you don't hold the file's entire contents in memory at any one time. So the question becomes whether causing your memory to get overloaded and major thrashing on your swap space is worth the performance increase. There is also the option of inlining _read into 'read', but since it makes two calls that seems like poor abstraction and t
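Editorial note: the trade-off discussed in this thread (larger reads for speed, bounded reads for memory) can be sketched with `zlib` directly. This is not the code from patch #1281707 or from the gzip module; it assumes Python 3, handles only single-member gzip files, and the function names and the default chunk size are invented for illustration.

```python
import gzip
import os
import tempfile
import zlib


def _split_lines(data):
    """Split data into complete lines plus an unterminated remainder."""
    parts = data.split(b"\n")
    remainder = parts.pop()
    return [part + b"\n" for part in parts], remainder


def iter_gzip_lines(path, chunk_size=64 * 1024):
    """Yield lines from a gzip file, reading compressed data in fixed chunks."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    remainder = b""
    with open(path, "rb") as raw:
        while True:
            chunk = raw.read(chunk_size)
            if not chunk:
                break
            lines, remainder = _split_lines(
                remainder + decompressor.decompress(chunk))
            for line in lines:
                yield line
    # Whatever the decompressor still holds belongs to the final lines.
    lines, remainder = _split_lines(remainder + decompressor.flush())
    for line in lines:
        yield line
    if remainder:
        yield remainder
```

Only one compressed chunk and its decompressed output are held at a time, so memory use is governed by `chunk_size` and the compression ratio rather than by the file size.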
[ python-Bugs-756982 ] mailbox should use email not rfc822
Bugs item #756982, was opened at 2003-06-18 22:19 Message generated for change (Comment added) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=756982&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Extension Modules Group: Python 2.3 Status: Open Resolution: None >Priority: 1 Private: No Submitted By: Ben Leslie (benno37) Assigned to: Barry A. Warsaw (bwarsaw) Summary: mailbox should use email not rfc822 Initial Comment: The mailbox module uses the rfc822 module as its default factory for creating message objects. The rfc822 documentation claims that its use is deprecated. The mailbox module should probably use the new email module as its default factory. Of course this has backward compatibility issues, in which case it should at least be mentioned in the mailbox documentation that it uses the deprecated rfc822 module, and provide an example of how to use the email module instead. -- >Comment By: A.M. Kuchling (akuchling) Date: 2007-01-05 09:46 Message: Logged In: YES user_id=11375 Originator: NO The reworking of mailbox.py introduced in Python 2.5 adds new mailbox classes that do use email.Message. Arguably we could begin deprecating the old classes (or just remove them all for Python 3000?). -- Comment By: Anthony Baxter (anthonybaxter) Date: 2005-01-10 02:56 Message: Logged In: YES user_id=29957 Given the amount of code out there using rfc822, should we instead PDW it? In any case, I'm -0 on putting a DeprecationWarning on it unless we've removed all use of it from the stdlib. -- Comment By: Barry A. Warsaw (bwarsaw) Date: 2005-01-08 10:49 Message: Logged In: YES user_id=12800 It's a good question. I'd like to say yes so that we can start adding deprecation warnings to rfc822 for Python 2.5. 
-- Comment By: Johannes Gijsbers (jlgijsbers) Date: 2005-01-08 09:22 Message: Logged In: YES user_id=469548 So, with the plans to seriously start working deprecating rfc822, should we use the email module as the default factory now? -- Comment By: Barry A. Warsaw (bwarsaw) Date: 2003-06-20 17:48 Message: Logged In: YES user_id=12800 I've added some sample code to the mailbox documentation that explain how to use the email package with the mailbox module. We can't change the default for backward compatibility reasons, as you point out. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=756982&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
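Editorial note: with the mailbox classes introduced in Python 2.5 (and still present in Python 3), messages come back as objects built on the email package, so no rfc822 factory is involved. The sketch below assumes Python 3's `mailbox` and `email` modules; the file name, addresses and the `demo` helper are invented for illustration.

```python
import email.message
import mailbox
import os
import shutil
import tempfile


def demo():
    """Store one message in a temporary mbox and read it back."""
    tmpdir = tempfile.mkdtemp()
    box = mailbox.mbox(os.path.join(tmpdir, "inbox.mbox"))
    try:
        msg = email.message.Message()
        msg["From"] = "sender@example.com"
        msg["Subject"] = "hello"
        msg.set_payload("body\n")
        box.add(msg)
        box.flush()
        # Iterating over the mailbox yields mailbox.mboxMessage objects,
        # which are subclasses of email.message.Message.
        return [(message["Subject"],
                 isinstance(message, email.message.Message))
                for message in box]
    finally:
        box.close()
        shutil.rmtree(tmpdir)


print(demo())
```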
[ python-Feature Requests-1627266 ] optparse "store" action should not gobble up next option
Feature Requests item #1627266, was opened at 2007-01-03 13:46
Message generated for change (Comment added) made by draghuram
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1627266&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Raghuram Devarakonda (draghuram)
Assigned to: Nobody/Anonymous (nobody)
Summary: optparse "store" action should not gobble up next option

Initial Comment:
Hi,

Check the following code:

--opttest.py--
    from optparse import OptionParser

    def process_options():
        global options, args, parser
        parser = OptionParser()
        parser.add_option("--test", action="store_true")
        parser.add_option("-m", metavar="COMMENT", dest="comment", default=None)
        (options, args) = parser.parse_args()
        return

    process_options()

    print "comment (%r)" % options.comment
-

    $ ./opttest.py -m --test
    comment ('--test')

I was expecting this to give an error, as "--test" is an option. But it looks like even the C library's getopt() behaves similarly. It would be nice if optparse could report an error in this case.

--

>Comment By: Raghuram Devarakonda (draghuram)
Date: 2007-01-05 10:19

Message:
Logged In: YES
user_id=984087
Originator: YES

I am attaching the code fragment as a file since the indentation got all messed up in the original post.
File Added: opttest.py
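Editorial note: until optparse itself reports this case, a caller can add the check after parsing. The sketch below is one possible workaround, not a proposed change to optparse; it assumes Python 3's `optparse`, and the function names are invented. It also rejects legitimate comments that begin with "-", which may or may not be acceptable.

```python
from optparse import OptionParser


def build_parser():
    """Build the same parser as in the report."""
    parser = OptionParser()
    parser.add_option("--test", action="store_true")
    parser.add_option("-m", metavar="COMMENT", dest="comment", default=None)
    return parser


def parse_strict(argv):
    """Parse argv, refusing option-like values for -m."""
    parser = build_parser()
    options, args = parser.parse_args(argv)
    if options.comment is not None and options.comment.startswith("-"):
        # parser.error() prints a usage message and exits with status 2.
        parser.error("-m requires a comment, got option-like value %r"
                     % options.comment)
    return options, args
```

With the plain parser, `-m --test` stores "--test" as the comment and never sets the `--test` flag; `parse_strict` turns the same command line into a usage error.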
[ python-Bugs-1599254 ] mailbox: other programs' messages can vanish without trace
Bugs item #1599254, was opened at 2006-11-19 11:03 Message generated for change (Settings changed) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1599254&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.5 Status: Open Resolution: None >Priority: 9 Private: No Submitted By: David Watson (baikie) Assigned to: A.M. Kuchling (akuchling) Summary: mailbox: other programs' messages can vanish without trace Initial Comment: The mailbox classes based on _singlefileMailbox (mbox, MMDF, Babyl) implement the flush() method by writing the new mailbox contents into a temporary file which is then renamed over the original. Unfortunately, if another program tries to deliver messages while mailbox.py is working, and uses only fcntl() locking, it will have the old file open and be blocked waiting for the lock to become available. Once mailbox.py has replaced the old file and closed it, making the lock available, the other program will write its messages into the now-deleted "old" file, consigning them to oblivion. I've caused Postfix on Linux to lose mail this way (although I did have to turn off its use of dot-locking to do so). A possible fix is attached. Instead of new_file being renamed, its contents are copied back to the original file. If file.truncate() is available, the mailbox is then truncated to size. Otherwise, if truncation is required, it's truncated to zero length beforehand by reopening self._path with mode wb+. In the latter case, there's a check to see if the mailbox was replaced while we weren't looking, but there's still a race condition. Any alternative ideas? Incidentally, this fixes a problem whereby Postfix wouldn't deliver to the replacement file as it had the execute bit set. -- Comment By: A.M. 
Kuchling (akuchling) Date: 2006-12-20 14:48 Message: Logged In: YES user_id=11375 Originator: NO Committed length-checking.diff to trunk in rev. 53110. -- Comment By: David Watson (baikie) Date: 2006-12-20 14:19 Message: Logged In: YES user_id=1504904 Originator: YES File Added: mailbox-test-lock.diff -- Comment By: David Watson (baikie) Date: 2006-12-20 14:17 Message: Logged In: YES user_id=1504904 Originator: YES Yeah, I think that should definitely go in. ExternalClashError or a subclass sounds fine to me (although you could make a whole taxonomy of these errors, really). It would be good to have the code actually keep up with other programs' changes, though; a program might just want to count the messages at first, say, and not make changes until much later. I've been trying out the second option (patch attached, to apply on top of mailbox-copy-back), regenerating _toc on locking, but preserving existing keys. The patch allows existing _generate_toc()s to work unmodified, but means that _toc now holds the entire last known contents of the mailbox file, with the 'pending' (user-visible) mailbox state being held in a new attribute, _user_toc, which is a mapping from keys issued to the program to the keys of _toc (i.e. sequence numbers in the file). When _toc is updated, any new messages that have appeared are given keys in _user_toc that haven't been issued before, and any messages that have disappeared are removed from it. The code basically assumes that messages with the same sequence number are the same message, though, so even if most cases are caught by the length check, programs that make deletions/replacements before locking could still delete the wrong messages. This behaviour could be trapped, though, by raising an exception in lock() if self._pending is set (after all, code like that would be incorrect unless it could be assumed that the mailbox module kept hashes of each message or something). 
Also attached is a patch to the test case, adding a lock/unlock around the message count to make sure _toc is up-to-date if the parent process finishes first; without it, there are still intermittent failures. File Added: mailbox-update-toc.diff -- Comment By: A.M. Kuchling (akuchling) Date: 2006-12-20 09:46 Message: Logged In: YES user_id=11375 Originator: NO Attaching a patch that adds length checking: before doing a flush() on a single-file mailbox, seek to the end and verify its length is unchanged. It raises an ExternalClashError if the file's length has changed. (Should there be a different exception for this case, perhaps a subclass of ExternalClashError?) I verified that this change works by running a program that added 25 messages, pausing betwe
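Editorial note: the length check described in this thread (remember the file's size, verify it is unchanged before writing) can be sketched in isolation. This is not length-checking.diff; the class and exception names are invented, the exception merely stands in for the ExternalClashError mentioned above, and no locking is done, so a window remains between the check and the write.

```python
import os
import tempfile


class ExternalChangeError(Exception):
    """Hypothetical stand-in for the ExternalClashError named in the thread."""


class LengthCheckedFile:
    """Append to a file only if nobody else changed its length meanwhile."""

    def __init__(self, path):
        self._path = path
        self._known_length = os.path.getsize(path)

    def _check_unchanged(self):
        current = os.path.getsize(self._path)
        if current != self._known_length:
            raise ExternalChangeError(
                "size changed from %d to %d bytes"
                % (self._known_length, current))

    def append(self, data):
        # Refuse to write if another program has grown or shrunk the file.
        self._check_unchanged()
        with open(self._path, "ab") as handle:
            handle.write(data)
        self._known_length = os.path.getsize(self._path)
```

A length comparison cannot detect a change that leaves the size identical, which is one reason the thread goes on to discuss regenerating the table of contents under a lock.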
[ python-Bugs-1552726 ] Python polls unnecessarily every 0.1 second when interactive
Bugs item #1552726, was opened at 2006-09-05 10:42 Message generated for change (Settings changed) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1552726&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: None Status: Open Resolution: Fixed >Priority: 9 Private: No Submitted By: Richard Boulton (richardb) Assigned to: A.M. Kuchling (akuchling) Summary: Python polls unnecessarily every 0.1 second when interactive Initial Comment: When python is running an interactive session, and is idle, it calls "select" with a timeout of 0.1 seconds repeatedly. This is intended to allow PyOS_InputHook() to be called every 0.1 seconds, but happens even if PyOS_InputHook() isn't being used (ie, is NULL). To reproduce: - start a python session - attach to it using strace -p PID - observe that python repeatedly This isn't a significant problem, since it only affects idle interactive python sessions and uses only a tiny bit of CPU, but people are whinging about it (though some appear to be doing so tongue-in-cheek) and it would be nice to fix it. The attached patch (against Python-2.5c1) modifies the readline.c module so that the polling doesn't happen unless PyOS_InputHook is not NULL. -- Comment By: Richard Boulton (richardb) Date: 2006-09-08 10:30 Message: Logged In: YES user_id=9565 I'm finding the function because it's defined in the compiled library - the header files aren't examined by configure when testing for this function. (this is because configure.in uses AC_CHECK_LIB to check for rl_callback_handler_install, which just tries to link the named function against the library). 
Presumably, rlconf.h is the configuration used when the readline library was compiled, so if READLINE_CALLBACKS is defined in it, I would expect the relevant functions to be present in the compiled library. In any case, this isn't desperately important, since you've managed to hack around the test anyway. -- Comment By: A.M. Kuchling (akuchling) Date: 2006-09-08 09:12 Message: Logged In: YES user_id=11375 That's exactly my setup. I don't think there is a -dev package for readline 4. I do note that READLINE_CALLBACKS is defined in /usr/include/readline/rlconf.h, but Python's readline.c doesn't include this file, and none of the readline headers include it. So I don't know why you're finding the function! -- Comment By: Richard Boulton (richardb) Date: 2006-09-08 05:34 Message: Logged In: YES user_id=9565 HAVE_READLINE_CALLBACK is defined by configure.in whenever the readline library on the platform supports the rl_callback_handler_install() function. I'm using Ubuntu Dapper, and have libreadline 4 and 5 installed (more precisely, 4.3-18 and 5.1-7build1), but only the -dev package for 5.1-7build1. "info readline" describes rl_callback_handler_install(), and configure.in finds it, so I'm surprised it wasn't found on akuchling's machine. I agree that the code looks buggy on platforms in which signals don't necessarily get delivered to the main thread, but looks no more buggy with the patch than without. -- Comment By: A.M. Kuchling (akuchling) Date: 2006-09-07 10:38 Message: Logged In: YES user_id=11375 On looking at the readline code, I think this patch makes no difference to signals. The code in readline.c for the callbacks looks like this: has_input = 0; while (!has_input) { ... has_input = select.select(rl_input); } if (has_input > 0) {read character} elif (errno == EINTR) {check signals} So I think that, if a signal is delivered to a thread and select() in the main thread doesn't return EINTR, the old code is just as problematic as the code with this patch. 
The (while !has_input) loop doesn't check for signals at all as an exit condition. I'm not sure what to do at this point. I think the new code is no worse than the old code with regard to signals. Maybe this loop is buggy w.r.t. to signals, but I don't know how to test that. -- Comment By: A.M. Kuchling (akuchling) Date: 2006-09-07 10:17 Message: Logged In: YES user_id=11375 HAVE_READLINE_CALLBACK was not defined with readline 5.1 on Ubuntu Dapper, until I did the configure/CFLAG trick. I didn't think of a possible interaction with signals, and will re-open the bug while trying to work up a test case. -- Comment By: Michael Hudson (mwh) Date: 2006-09-07 10:12 Message: Logged In: YES user_id=6656 I'd be cautious
[ python-Bugs-1628895 ] Pydoc sets choices for doc locations incorrectly
Bugs item #1628895, was opened at 2007-01-05 10:24
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628895&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Skip Montanaro (montanaro)
Assigned to: Nobody/Anonymous (nobody)
Summary: Pydoc sets choices for doc locations incorrectly

Initial Comment:
In pydoc.Helper.__init__ I see this code:

    execdir = os.path.dirname(sys.executable)
    homedir = os.environ.get('PYTHONHOME')
    for dir in [os.environ.get('PYTHONDOCS'),
                homedir and os.path.join(homedir, 'doc'),
                os.path.join(execdir, 'doc'),
                '/usr/doc/python-docs-' + split(sys.version)[0],
                '/usr/doc/python-' + split(sys.version)[0],
                '/usr/doc/python-docs-' + sys.version[:3],
                '/usr/doc/python-' + sys.version[:3],
                os.path.join(sys.prefix, 'Resources/English.lproj/Documenta$
        if dir and os.path.isdir(os.path.join(dir, 'lib')):
            self.docdir = dir

I think the third choice in the list of candidate directories is wrong. execdir is the directory of the Python executable (e.g., it's /usr/local/bin by default). I think it should be set as

    execdir = os.path.dirname(os.path.dirname(sys.executable))

You're not going to find a "doc" directory in /usr/local/bin.

--

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628895&group_id=5470
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
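To make the difference between the two execdir computations in this report concrete, here is a minimal sketch. It assumes an interpreter installed at /usr/local/bin/python; the helper name candidate_doc_dir is hypothetical and is not part of pydoc. It is written for Python 3 and uses posixpath so the result does not depend on the host platform.

```python
import posixpath


def candidate_doc_dir(executable, levels=1):
    """Return the 'doc' directory derived from an interpreter path.

    levels=1 mirrors the code quoted in the report (dirname once);
    levels=2 mirrors the proposed change (dirname twice).
    Hypothetical helper for illustration only, not pydoc API.
    """
    base = executable
    for _ in range(levels):
        base = posixpath.dirname(base)
    return posixpath.join(base, 'doc')


exe = '/usr/local/bin/python'  # assumed install location
current = candidate_doc_dir(exe, levels=1)
proposed = candidate_doc_dir(exe, levels=2)
print(current)
print(proposed)
```

With one dirname call the candidate sits inside the bin directory; with two it sits next to it, which is what the report argues for.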
[ python-Feature Requests-1627266 ] optparse "store" action should not gobble up next option
Feature Requests item #1627266, was opened at 2007-01-03 13:46 Message generated for change (Comment added) made by goodger You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1627266&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Raghuram Devarakonda (draghuram) >Assigned to: Greg Ward (gward) Summary: optparse "store" action should not gobble up next option Initial Comment: Hi, Check the following code: --opttest.py-- from optparse import OptionParser def process_options(): global options, args, parser parser = OptionParser() parser.add_option("--test", action="store_true") parser.add_option("-m", metavar="COMMENT", dest="comment", default=None) (options, args) = parser.parse_args() return process_options() print "comment (%r)" % options.comment - $ ./opttest.py -m --test comment ('--test') I was expecting this to give an error as "--test" is an option. But it looks like even C library's getopt() behaves similarly. It will be nice if optparse can report error in this case. -- >Comment By: David Goodger (goodger) Date: 2007-01-05 11:28 Message: Logged In: YES user_id=7733 Originator: NO I think what you're asking for is ambiguous at best. In your example, how could optparse possibly decide that the "--test" is a second option, not an option argument? What if you *do* want "--test" as an argument? Assigning to Greg Ward. Recommend closing as invalid. -- Comment By: Raghuram Devarakonda (draghuram) Date: 2007-01-05 10:19 Message: Logged In: YES user_id=984087 Originator: YES I am attaching the code fragment as a file since the indentation got all messed up in the original post. 
File Added: opttest.py -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1627266&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1628902 ] xml.dom.minidom parse bug
Bugs item #1628902, was opened at 2007-01-05 17:37 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628902&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.4 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Pierre Imbaud (pmi) Assigned to: Nobody/Anonymous (nobody) Summary: xml.dom.minidom parse bug Initial Comment: xml.dom.minidom was unable to parse an xml file that came from an example provided by an official organism.(http://www.iptc.org/IPTC4XMP) The parsed file was somewhat hairy, but I have been able to reproduce the bug with a simplified version, attached. (ends with .xmp: its supposed to be an xmp file, the xmp standard being built on xml. Well, thats the short story). The offending part is the one that goes: xmpPLUS='' it triggers an exception: ValueError: too many values to unpack, in _parse_ns_name. Some debugging showed an obvious mistake in the scanning of the name argument, that goes beyond the closing " ' ". I digged a little further thru a pdb session, but the bug seems to be located in c code. Thats the very first time I report a bug, chances are I provide too much or too little information... To whoever it may concern, here is the invoking code: from xml.dom import minidom ... 
class xmp(dict): def __init__(self, inStream): xmldoc = minidom.parse(inStream) x = xmp('/home/pierre/devt/port/IPTCCore-Full/x.xmp') traceback: /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xmpLib.py in __init__(self, inStream) 26 def __init__(self, inStream): 27 print minidom ---> 28 xmldoc = minidom.parse(inStream) 29 xmpmeta = xmldoc.childNodes[1] 30 rdf = xmpmeta.childNodes[1] /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/nxml/dom/minidom.py in parse(file, parser, bufsize) /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parse(file, namespaces) 922 fp = open(file, 'rb') 923 try: --> 924 result = builder.parseFile(fp) 925 finally: 926 fp.close() /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parseFile(self, file) 205 if not buffer: 206 break --> 207 parser.Parse(buffer, 0) 208 if first_buffer and self.document.documentElement: 209 self._setup_subset(buffer) /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in start_element_handler(self, name, attributes) 743 def start_element_handler(self, name, attributes): 744 if ' ' in name: --> 745 uri, localname, prefix, qname = _parse_ns_name(self, name) 746 else: 747 uri = EMPTY_NAMESPACE /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in _parse_ns_name(builder, name) 125 localname = intern(localname, localname) 126 else: --> 127 uri, localname = parts 128 prefix = EMPTY_PREFIX 129 qname = localname = intern(localname, localname) ValueError: too many values to unpack The offending c statement: /usr/src/packages/BUILD/Python-2.4/Modules/pyexpat.c(582)StartElement() The returned 'name': (Pdb) name Out[5]: u'XMP Photographic Licensing Universal System (xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS' Its obvious the scanning went beyond the attribute. 
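The traceback suggests that _parse_ns_name splits the name reported by expat on spaces and expects either two or three parts. The sketch below is a simplified stand-in for that logic (an assumption based on the lines shown in the traceback, not a copy of expatbuilder), written for Python 3. The "good" name is a hypothetical well-formed example; the "bad" name is the value printed in the pdb session. It only shows why a namespace string that itself contains spaces produces "too many values to unpack"; it does not establish whether the fault lies in the file or in pyexpat.

```python
def split_ns_name(name):
    """Simplified stand-in for expatbuilder._parse_ns_name.

    Assumption: the real function splits the expat-reported name on
    single spaces into (uri, localname[, prefix]).
    """
    parts = name.split(' ')
    if len(parts) == 3:
        uri, localname, prefix = parts
    else:
        # Fails when the namespace string itself contains spaces.
        uri, localname = parts
        prefix = ''
    return uri, localname, prefix


# Hypothetical well-formed name: URI, local name, prefix.
good = 'http://ns.adobe.com/xap/1.0/PLUS/ CreditLineReq xmpPLUS'
# Name value reported in the pdb session above.
bad = ('XMP Photographic Licensing Universal System '
       '(xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS')

print(split_ns_name(good))
try:
    split_ns_name(bad)
except ValueError as exc:
    print('ValueError:', exc)
```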
[ python-Bugs-1628906 ] clarify 80-char limit
Bugs item #1628906, was opened at 2007-01-05 11:45
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628906&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Documentation
Group: Python 3000
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jim Jewett (jimjjewett)
Assigned to: Nobody/Anonymous (nobody)
Summary: clarify 80-char limit

Initial Comment:
PEP 3099 says:

"""
Coding style

* The (recommended) maximum line width will remain 80 characters, for both C and Python code.

  Thread: "C style guide", http://mail.python.org/pipermail/python-3000/2006-March/000131.html
"""

It should be clarified that this really means 72-79 characters, perhaps by adding the following sentence:

    Note that according to PEP 8, this actually means no more than 79 characters in a line, and no more than about 72 in docstrings or comments.
[ python-Feature Requests-1627266 ] optparse "store" action should not gobble up next option
Feature Requests item #1627266, was opened at 2007-01-03 13:46 Message generated for change (Comment added) made by draghuram You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1627266&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Raghuram Devarakonda (draghuram) Assigned to: Greg Ward (gward) Summary: optparse "store" action should not gobble up next option Initial Comment: Hi, Check the following code: --opttest.py-- from optparse import OptionParser def process_options(): global options, args, parser parser = OptionParser() parser.add_option("--test", action="store_true") parser.add_option("-m", metavar="COMMENT", dest="comment", default=None) (options, args) = parser.parse_args() return process_options() print "comment (%r)" % options.comment - $ ./opttest.py -m --test comment ('--test') I was expecting this to give an error as "--test" is an option. But it looks like even C library's getopt() behaves similarly. It will be nice if optparse can report error in this case. -- >Comment By: Raghuram Devarakonda (draghuram) Date: 2007-01-05 12:58 Message: Logged In: YES user_id=984087 Originator: YES It is possible to deduce "--test" as an option because it is in the list of options given to optparse. But your point about what if the user really wants "--test" as an option argument is valid. I guess this request can be closed. Thanks, Raghu. -- Comment By: David Goodger (goodger) Date: 2007-01-05 11:28 Message: Logged In: YES user_id=7733 Originator: NO I think what you're asking for is ambiguous at best. In your example, how could optparse possibly decide that the "--test" is a second option, not an option argument? What if you *do* want "--test" as an argument? Assigning to Greg Ward. 
Recommend closing as invalid. -- Comment By: Raghuram Devarakonda (draghuram) Date: 2007-01-05 10:19 Message: Logged In: YES user_id=984087 Originator: YES I am attaching the code fragment as a file since the indentation got all messed up in the original post. File Added: opttest.py -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1627266&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
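The behaviour discussed in this thread can be reproduced without a script file by passing the argument list directly. The sketch below is written for Python 3 (the original post uses the Python 2 print statement). The helper looks_like_known_option is hypothetical: it shows one caller-side way to act on the observation that "--test" is in the list of options given to optparse, and it is not something optparse does on its own.

```python
from optparse import OptionParser


def build_parser():
    """Build the parser from the original report."""
    parser = OptionParser()
    parser.add_option("--test", action="store_true")
    parser.add_option("-m", metavar="COMMENT", dest="comment", default=None)
    return parser


def looks_like_known_option(parser, value):
    """Hypothetical caller-side check: is value a registered option string?"""
    return value is not None and parser.has_option(value)


parser = build_parser()
# "-m" takes one argument, so optparse consumes "--test" as its value.
options, args = parser.parse_args(["-m", "--test"])
print("comment (%r)" % options.comment)

if looks_like_known_option(parser, options.comment):
    print("warning: -m consumed a known option as its argument")
```

A check like this stays in application code, so a caller who really wants "--test" as an option argument is not prevented from passing it.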
[ python-Bugs-1628987 ] inspect trouble when source file changes
Bugs item #1628987, was opened at 2007-01-05 13:43 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628987&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.4 Status: Open Resolution: None Priority: 5 Private: No Submitted By: phil (philipdumont) Assigned to: Nobody/Anonymous (nobody) Summary: inspect trouble when source file changes Initial Comment: backtrace (relevant outer frames only): File "/path/to/myfile", line 1198, in get_hook_name for frame_record in inspect.stack(): File "/usr/mbench2.2/lib/python2.4/inspect.py", line 819, in stack return getouterframes(sys._getframe(1), context) File "/usr/mbench2.2/lib/python2.4/inspect.py", line 800, in getouterframes framelist.append((frame,) + getframeinfo(frame, context)) File "/usr/mbench2.2/lib/python2.4/inspect.py", line 775, in getframeinfo lines, lnum = findsource(frame) File "/usr/mbench2.2/lib/python2.4/inspect.py", line 437, in findsource if pat.match(lines[lnum]): break IndexError: list index out of range Based on a quick look at the inspect code, I think this happens when you: - Start python and load a module - While it's running, edit the source file for the module (before inspect tries to look into it). - Call a routine in the edited module that will lead to a call to inspect.stack(). During an inspect.stack() call, inspect will open source files to get the source code for the routines on the stack. If the source file that is opened doesn't match the byte compiled code that's being run, there are problems. Inspect caches the files it reads (using the linecache module), so if the file gets cached before it is edited, nothing should go wrong. 
But if the source file is edited after the module is loaded and before inspect has a chance to cache the source, you're out of luck. Of course, this shouldn't be a problem in production code, but it has bit us more than once in a development environment. Seems like it would be easy to avoid by just comparing the timestamps on the source/object files. If the source file is newer, just behave the same as if it wasn't there. Attached is a stupid little python script that reproduces the problem. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628987&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
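The timestamp comparison suggested in this report could look roughly like the following. This is a minimal sketch for Python 3; source_is_newer is a hypothetical helper (inspect has no function by this name), and the file names and modification times in the demonstration are made up.

```python
import os
import tempfile


def source_is_newer(source_path, compiled_path):
    """Return True if the source was modified after the compiled file.

    Hypothetical helper illustrating the suggested check.
    """
    try:
        return os.path.getmtime(source_path) > os.path.getmtime(compiled_path)
    except OSError:
        # A missing file on either side: report "not newer" and leave the
        # normal missing-source handling to the caller.
        return False


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, 'mod.py')
    pyc = os.path.join(d, 'mod.pyc')
    for path in (src, pyc):
        with open(path, 'w'):
            pass
    os.utime(pyc, (1000, 1000))
    os.utime(src, (2000, 2000))   # source edited after compilation
    stale = source_is_newer(src, pyc)
    os.utime(src, (500, 500))     # source older than the compiled file
    fresh = source_is_newer(src, pyc)

print(stale, fresh)
```

A caller that gets True back could treat the source as unavailable, which is the behaviour the report proposes.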
[ python-Bugs-1599254 ] mailbox: other programs' messages can vanish without trace
Bugs item #1599254, was opened at 2006-11-19 11:03 Message generated for change (Comment added) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1599254&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.5 Status: Open Resolution: None Priority: 9 Private: No Submitted By: David Watson (baikie) Assigned to: A.M. Kuchling (akuchling) Summary: mailbox: other programs' messages can vanish without trace Initial Comment: The mailbox classes based on _singlefileMailbox (mbox, MMDF, Babyl) implement the flush() method by writing the new mailbox contents into a temporary file which is then renamed over the original. Unfortunately, if another program tries to deliver messages while mailbox.py is working, and uses only fcntl() locking, it will have the old file open and be blocked waiting for the lock to become available. Once mailbox.py has replaced the old file and closed it, making the lock available, the other program will write its messages into the now-deleted "old" file, consigning them to oblivion. I've caused Postfix on Linux to lose mail this way (although I did have to turn off its use of dot-locking to do so). A possible fix is attached. Instead of new_file being renamed, its contents are copied back to the original file. If file.truncate() is available, the mailbox is then truncated to size. Otherwise, if truncation is required, it's truncated to zero length beforehand by reopening self._path with mode wb+. In the latter case, there's a check to see if the mailbox was replaced while we weren't looking, but there's still a race condition. Any alternative ideas? Incidentally, this fixes a problem whereby Postfix wouldn't deliver to the replacement file as it had the execute bit set. -- >Comment By: A.M. 
Kuchling (akuchling) Date: 2007-01-05 14:24 Message: Logged In: YES user_id=11375 Originator: NO As a step toward improving matters, I've attached the suggested doc patch (for both 25-maint and trunk). It encourages people to use Maildir :), explicitly states that modifications should be bracketed by lock(), and fixes the examples to match. It does not say that keys are invalidated by doing a flush(), because we're going to try to avoid the necessity for that. File Added: mailbox-docs.diff -- Comment By: A.M. Kuchling (akuchling) Date: 2006-12-20 14:48 Message: Logged In: YES user_id=11375 Originator: NO Committed length-checking.diff to trunk in rev. 53110. -- Comment By: David Watson (baikie) Date: 2006-12-20 14:19 Message: Logged In: YES user_id=1504904 Originator: YES File Added: mailbox-test-lock.diff -- Comment By: David Watson (baikie) Date: 2006-12-20 14:17 Message: Logged In: YES user_id=1504904 Originator: YES Yeah, I think that should definitely go in. ExternalClashError or a subclass sounds fine to me (although you could make a whole taxonomy of these errors, really). It would be good to have the code actually keep up with other programs' changes, though; a program might just want to count the messages at first, say, and not make changes until much later. I've been trying out the second option (patch attached, to apply on top of mailbox-copy-back), regenerating _toc on locking, but preserving existing keys. The patch allows existing _generate_toc()s to work unmodified, but means that _toc now holds the entire last known contents of the mailbox file, with the 'pending' (user-visible) mailbox state being held in a new attribute, _user_toc, which is a mapping from keys issued to the program to the keys of _toc (i.e. sequence numbers in the file). When _toc is updated, any new messages that have appeared are given keys in _user_toc that haven't been issued before, and any messages that have disappeared are removed from it. 
The code basically assumes that messages with the same sequence number are the same message, though, so even if most cases are caught by the length check, programs that make deletions/replacements before locking could still delete the wrong messages. This behaviour could be trapped, though, by raising an exception in lock() if self._pending is set (after all, code like that would be incorrect unless it could be assumed that the mailbox module kept hashes of each message or something). Also attached is a patch to the test case, adding a lock/unlock around the message count to make sure _toc is up-to-date if the parent process finishes first; without it, there are still intermittent failures. File Added: mailbox-update-toc.diff
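The length check mentioned in this thread (seek to the end before flush() and raise ExternalClashError if the length has changed) can be sketched as below. This is an illustration for Python 3 under stated assumptions, not the committed patch: check_length_unchanged is a hypothetical helper, and an in-memory BytesIO stands in for the mailbox file.

```python
import io
import mailbox
import os


def check_length_unchanged(f, expected_length):
    """Raise ExternalClashError if the file size differs from the last known size.

    Hypothetical helper; the committed patch may be structured differently.
    """
    f.seek(0, os.SEEK_END)
    current = f.tell()
    if current != expected_length:
        raise mailbox.ExternalClashError(
            'Size of mailbox file changed (expected %d, found %d)'
            % (expected_length, current))


f = io.BytesIO(b'From a@example.com\n\nhello\n')
known = len(f.getvalue())
check_length_unchanged(f, known)  # nothing changed, so no error

# Simulate another program appending a message behind our back.
f.write(b'From b@example.com\n\nnew mail\n')
try:
    check_length_unchanged(f, known)
    clash = False
except mailbox.ExternalClashError:
    clash = True
print(clash)
```

As noted above, a check like this only detects changes in length; edits that leave the size unchanged would need a different mechanism.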
[ python-Bugs-1599254 ] mailbox: other programs' messages can vanish without trace
Bugs item #1599254, was opened at 2006-11-19 11:03 Message generated for change (Comment added) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1599254&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.5 Status: Open Resolution: None Priority: 9 Private: No Submitted By: David Watson (baikie) Assigned to: A.M. Kuchling (akuchling) Summary: mailbox: other programs' messages can vanish without trace Initial Comment: The mailbox classes based on _singlefileMailbox (mbox, MMDF, Babyl) implement the flush() method by writing the new mailbox contents into a temporary file which is then renamed over the original. Unfortunately, if another program tries to deliver messages while mailbox.py is working, and uses only fcntl() locking, it will have the old file open and be blocked waiting for the lock to become available. Once mailbox.py has replaced the old file and closed it, making the lock available, the other program will write its messages into the now-deleted "old" file, consigning them to oblivion. I've caused Postfix on Linux to lose mail this way (although I did have to turn off its use of dot-locking to do so). A possible fix is attached. Instead of new_file being renamed, its contents are copied back to the original file. If file.truncate() is available, the mailbox is then truncated to size. Otherwise, if truncation is required, it's truncated to zero length beforehand by reopening self._path with mode wb+. In the latter case, there's a check to see if the mailbox was replaced while we weren't looking, but there's still a race condition. Any alternative ideas? Incidentally, this fixes a problem whereby Postfix wouldn't deliver to the replacement file as it had the execute bit set. -- >Comment By: A.M. 
Kuchling (akuchling) Date: 2007-01-05 14:51 Message: Logged In: YES user_id=11375 Originator: NO Question about mailbox-update-doc: the add() method still returns self._next_key - 1; should this be self._next_user_key - 1? The keys in _user_toc are the ones returned to external users of the mailbox, right? (A good test case would be to initialize _next_key to 0 and _next_user_key to a different value like 123456.) I'm still staring at the patch, trying to convince myself that it will help -- haven't spotted any problems, but this bug is making me nervous... -- Comment By: A.M. Kuchling (akuchling) Date: 2007-01-05 14:24 Message: Logged In: YES user_id=11375 Originator: NO As a step toward improving matters, I've attached the suggested doc patch (for both 25-maint and trunk). It encourages people to use Maildir :), explicitly states that modifications should be bracketed by lock(), and fixes the examples to match. It does not say that keys are invalidated by doing a flush(), because we're going to try to avoid the necessity for that. File Added: mailbox-docs.diff -- Comment By: A.M. Kuchling (akuchling) Date: 2006-12-20 14:48 Message: Logged In: YES user_id=11375 Originator: NO Committed length-checking.diff to trunk in rev. 53110. -- Comment By: David Watson (baikie) Date: 2006-12-20 14:19 Message: Logged In: YES user_id=1504904 Originator: YES File Added: mailbox-test-lock.diff -- Comment By: David Watson (baikie) Date: 2006-12-20 14:17 Message: Logged In: YES user_id=1504904 Originator: YES Yeah, I think that should definitely go in. ExternalClashError or a subclass sounds fine to me (although you could make a whole taxonomy of these errors, really). It would be good to have the code actually keep up with other programs' changes, though; a program might just want to count the messages at first, say, and not make changes until much later. 
I've been trying out the second option (patch attached, to apply on top of mailbox-copy-back), regenerating _toc on locking, but preserving existing keys. The patch allows existing _generate_toc()s to work unmodified, but means that _toc now holds the entire last known contents of the mailbox file, with the 'pending' (user-visible) mailbox state being held in a new attribute, _user_toc, which is a mapping from keys issued to the program to the keys of _toc (i.e. sequence numbers in the file). When _toc is updated, any new messages that have appeared are given keys in _user_toc that haven't been issued before, and any messages that have disappeared are removed from it. The code basically assumes that messages with the same sequence number are the sam
[ python-Feature Requests-698900 ] Provide "plucker" format docs.
Feature Requests item #698900, was opened at 2003-03-06 13:45
Message generated for change (Settings changed) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=698900&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

>Category: Documentation
>Group: None
Status: Open
Resolution: None
Priority: 4
Private: No
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Provide "plucker" format docs.

Initial Comment:
There have been a few requests for documents to be provided in the "plucker" format for use on PDAs. Plucker has the advantage of being free software (both in terms of liberty and price), whereas iSilo is merely low-priced (free in some flavors?).

Information on Plucker can be found at www.plkr.org. Documentation for the conversion tool appears slim.
[ python-Bugs-956303 ] Update pickle docs to describe format of persistent IDs
Bugs item #956303, was opened at 2004-05-18 18:45 Message generated for change (Settings changed) made by akuchling You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=956303&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Allan Crooks (amc1) Assigned to: Nobody/Anonymous (nobody) >Summary: Update pickle docs to describe format of persistent IDs Initial Comment: There is a bug in save_pers in both the pickle and cPickle modules in Python. It occurs when someone uses a Pickler instance which is using an ASCII protocol and also has persistent_id defined which can return a persistent reference that can contain newline characters in. The current implementation of save_pers in the pickle module is as follows: def save_pers(self, pid): # Save a persistent id reference if self.bin: self.save(pid) self.write(BINPERSID) else: self.write(PERSID + str(pid) + '\n') The else clause assumes that the 'pid' will not be a string which one or more newline characters. If the pickler pickles a persistent ID which has a newline in it, then an unpickler with a corresponding persistent_load method will incorrectly unpickle the data - usually interpreting the character after the newline as a marker indicating what type of data should be expected (usually resulting in an exception being raised when the remaining data is not in the format expected). I have attached an example file which illustrates in what circumstances the error occurs. Workarounds for this bug are: 1) Use binary mode for picklers. 2) Modify subclass implementations of save_pers to ensure that newlines are not returned for persistent ID's. 
Although you may assume in general that this bug would only occur on rare occasions (due to the unlikely situation where someone would implement persistent_id so that it would return a string with a newline character embedded), it may occur more frequently if the subclass implementation of persistent_id uses a string which has been constructed using the marshal module.

This bug was discovered when our code implemented the persistent_id method, which was returning the marshalled format of a tuple which contained strings. It occurred when one or more of the strings had a length of ten characters - the marshalled format of that string contains the string's length, where the byte used to represent the number 10 is the same as the one which represents the newline character:

    >>> marshal.dumps('a' * 10)
    's\n\x00\x00\x00aaaaaaaaaa'
    >>> chr(10)
    '\n'

I have replicated this bug on Python 1.5.2 and Python 2.3b1, and I believe it is present on all 2.x versions of Python.

Many thanks to SourceForge user (and fellow colleague) SMST who diagnosed the bug and provided the test cases attached.

--

Comment By: Martin v. Löwis (loewis)
Date: 2006-07-03 08:41

Message:
Logged In: YES
user_id=21627

Also lowering the priority. amc1, if you are still interested, are you willing to provide a documentation patch?

--

Comment By: Tim Peters (tim_one)
Date: 2004-11-07 17:40

Message:
Logged In: YES
user_id=31435

Unassigned myself (I don't have time for it), but changed the Category to Documentation. (Changing what a persistent ID can be would need to be a new feature request.)

--

Comment By: Allan Crooks (amc1)
Date: 2004-05-19 11:30

Message:
Logged In: YES
user_id=39733

I would at least like the documentation modified to make it clearer that certain characters are not permitted for persistent ID's.
I think the text which indicates the requirement of printable ASCII characters is too subtle - there should be something which makes the requirement more obvious, the use of a "must" or "should" would help get the point across (as would some text after the statement indicating that characters such as '\b', '\n', '\r' are not permitted). Perhaps it would be an idea for save_pers to do some argument checking before storing the persistent ID, perhaps using an assertion statement to verify that the ID doesn't contain non-permitted characters (or at least, checking for the presence of a '\n' character embedded in the string). I think it is preferable to have safeguards implemented in Pickler to prevent potentially dodgy data being stored - I would rather have an error raised when I'm trying to pickle something than have the data stored and corrupted, only to notice it when it is unpickled (when it is too late). Confusingly, the code in save_pers in the pickle module seems to indicate that i
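Two points from this report can be illustrated with a short sketch, written for Python 3. First, marshalling a ten-byte string produces a newline byte, because the length 10 is encoded as byte 0x0a. Second, the kind of argument check suggested above could be as small as the hypothetical check_persistent_id below; it is not part of the pickle module, and the example IDs are made up.

```python
import marshal


def check_persistent_id(pid):
    """Reject persistent IDs that would break a newline-terminated record.

    Hypothetical safeguard along the lines suggested in the report.
    """
    text = str(pid)
    if '\n' in text:
        raise ValueError('persistent ID contains a newline: %r' % (text,))
    return text


# The length of the ten-byte string is stored as byte 0x0a, i.e. a newline.
data = marshal.dumps(b'a' * 10)
print(b'\n' in data)

accepted = check_persistent_id('record-42')
try:
    check_persistent_id('bad\nid')
    rejected = False
except ValueError:
    rejected = True
print(accepted, rejected)
```

Raising at pickling time matches the preference stated above for an early error over corrupted data that is only noticed when unpickling.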
[ python-Bugs-1629125 ] Incorrect type in PyDict_Next() example code
Bugs item #1629125, was opened at 2007-01-05 15:15
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1629125&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jason Evans (jasonevans)
Assigned to: Nobody/Anonymous (nobody)
Summary: Incorrect type in PyDict_Next() example code

Initial Comment:
In the PyDict_Next() documentation, there are two example snippets of code. In both snippets, the line:

    int pos = 0;

should instead be:

    ssize_t pos = 0;

or perhaps:

    Py_ssize_t pos = 0;

On an LP64 system, the unfixed snippets will cause a compiler warning due to size mismatch between int and ssize_t.

Using Python 2.5 on RHEL WS 4, x86_64.

--

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1629125&group_id=5470
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
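
The size mismatch behind the warning can be inspected from Python itself. This is a small sketch using ctypes, which exposes C's int and ssize_t as `c_int` and `c_ssize_t`; the printed sizes depend on the platform, so no particular output is assumed.

```python
import ctypes

# Sizes, in bytes, of the two C types named in the report.
int_bytes = ctypes.sizeof(ctypes.c_int)
ssize_bytes = ctypes.sizeof(ctypes.c_ssize_t)

print("int: %d bytes, ssize_t: %d bytes" % (int_bytes, ssize_bytes))
if int_bytes != ssize_bytes:
    # This is the LP64 case: a pointer to int is not a pointer to ssize_t.
    print("int and ssize_t differ in size on this platform")
```

On an LP64 system the two sizes would typically be 4 and 8 bytes, which is why declaring `pos` as int draws the warning.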
[ python-Bugs-1629125 ] Incorrect type in PyDict_Next() example code
Bugs item #1629125, was opened at 2007-01-05 18:15
Message generated for change (Settings changed) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1629125&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jason Evans (jasonevans)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Incorrect type in PyDict_Next() example code

Initial Comment:
In the PyDict_Next() documentation, there are two example snippets of code. In both snippets, the line:

    int pos = 0;

should instead be:

    ssize_t pos = 0;

or perhaps:

    Py_ssize_t pos = 0;

On an LP64 system, the unfixed snippets will cause a compiler warning due to size mismatch between int and ssize_t.

Using Python 2.5 on RHEL WS 4, x86_64.
[ python-Bugs-1628902 ] xml.dom.minidom parse bug
Bugs item #1628902, was opened at 2007-01-05 17:37
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628902&group_id=5470

Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Duplicate
Priority: 5
Private: No
Submitted By: Pierre Imbaud (pmi)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.dom.minidom parse bug

Initial Comment:
xml.dom.minidom was unable to parse an XML file that came from an example provided by an official organization (http://www.iptc.org/IPTC4XMP).

The parsed file was somewhat hairy, but I have been able to reproduce the bug with a simplified version, attached. (It ends with .xmp: it's supposed to be an XMP file, the XMP standard being built on XML. Well, that's the short story.)

The offending part is the one that goes:

    xmpPLUS=''

It triggers an exception: ValueError: too many values to unpack, in _parse_ns_name. Some debugging showed an obvious mistake in the scanning of the name argument, which goes beyond the closing " ' ". I dug a little further through a pdb session, but the bug seems to be located in C code.

This is the very first time I have reported a bug, so chances are I provide too much or too little information...

To whoever it may concern, here is the invoking code:

    from xml.dom import minidom
    ...
    class xmp(dict):
        def __init__(self, inStream):
            xmldoc = minidom.parse(inStream)

    x = xmp('/home/pierre/devt/port/IPTCCore-Full/x.xmp')

traceback:

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xmpLib.py in __init__(self, inStream)
         26     def __init__(self, inStream):
         27         print minidom
    ---> 28         xmldoc = minidom.parse(inStream)
         29         xmpmeta = xmldoc.childNodes[1]
         30         rdf = xmpmeta.childNodes[1]

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/nxml/dom/minidom.py in parse(file, parser, bufsize)

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parse(file, namespaces)
        922         fp = open(file, 'rb')
        923         try:
    --> 924             result = builder.parseFile(fp)
        925         finally:
        926             fp.close()

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parseFile(self, file)
        205                 if not buffer:
        206                     break
    --> 207                 parser.Parse(buffer, 0)
        208                 if first_buffer and self.document.documentElement:
        209                     self._setup_subset(buffer)

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in start_element_handler(self, name, attributes)
        743     def start_element_handler(self, name, attributes):
        744         if ' ' in name:
    --> 745             uri, localname, prefix, qname = _parse_ns_name(self, name)
        746         else:
        747             uri = EMPTY_NAMESPACE

    /home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in _parse_ns_name(builder, name)
        125         localname = intern(localname, localname)
        126     else:
    --> 127         uri, localname = parts
        128         prefix = EMPTY_PREFIX
        129     qname = localname = intern(localname, localname)

    ValueError: too many values to unpack

The offending C statement:

    /usr/src/packages/BUILD/Python-2.4/Modules/pyexpat.c(582)StartElement()

The returned 'name':

    (Pdb) name
    Out[5]: u'XMP Photographic Licensing Universal System (xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS'

It's obvious the scanning went beyond the attribute.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-01-06 01:46

Message:
Logged In: YES
user_id=21627
Originator: NO

Dupe of 1627096
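
The unpacking failure in the traceback can be reproduced without expat. This is a simplified sketch, not the actual expatbuilder code: `split_expat_name` is a hypothetical stand-in for `_parse_ns_name`, assuming the name arrives as space-separated fields (namespace, local name, optional prefix). The `well_formed` value is an assumed example built from the URI in the report.

```python
def split_expat_name(name):
    """Split a space-separated 'uri localname [prefix]' name into three fields."""
    parts = name.split(" ")
    if len(parts) == 3:
        uri, localname, prefix = parts
    elif len(parts) == 2:
        uri, localname = parts
        prefix = ""
    else:
        # Any extra space in the namespace part ends up here.
        raise ValueError("cannot split %r into 2 or 3 fields" % (name,))
    return uri, localname, prefix


# Assumed well-formed case: the namespace contains no spaces.
well_formed = "http://ns.adobe.com/xap/1.0/PLUS/ CreditLineReq xmpPLUS"
print(split_expat_name(well_formed))

# The name from the pdb session: the namespace part itself contains spaces.
reported = ("XMP Photographic Licensing Universal System "
            "(xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS")
try:
    split_expat_name(reported)
except ValueError as exc:
    print("ValueError:", exc)
```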
[ python-Bugs-1628484 ] Python 2.5 64 bit compile fails on Solaris 10/gcc 4.1.1
Bugs item #1628484, was opened at 2007-01-05 09:45
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1628484&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bob Atkins (bobatkins)
Assigned to: Nobody/Anonymous (nobody)
Summary: Python 2.5 64 bit compile fails on Solaris 10/gcc 4.1.1

Initial Comment:
This looks like a recurring and somewhat sore topic. For those of us that have been fighting the dreaded:

    ./Include/pyport.h:730:2: error: #error "LONG_BIT definition appears wrong for platform (bad gcc/glibc config?)."

when performing a 64 bit compile. I believe I have identified the problems, all of which are directly related to the Makefile(s) that are generated as part of the configure script. There does not seem to be anything wrong with the configure script or anything else; once all of the Makefiles are corrected, Python will build 64 bit.

Although it is possible to pass the following environment variables to configure, as is typical on most open source software:

    CC          C compiler command
    CFLAGS      C compiler flags
    LDFLAGS     linker flags, e.g. -L if you have libraries in a nonstandard directory
    CPPFLAGS    C/C++ preprocessor flags, e.g. -I if you have headers in a nonstandard directory
    CPP         C preprocessor

these flags are *not* being processed through to the generated Makefiles. This is where the problem is. configure is doing everything right and generating all of the necessary stuff for a 64 bit compile, but when the compile is actually performed the necessary CFLAGS are missing and a 32 bit compile is initiated.

Taking a close look at the first failure I found the following:

    gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -fPIC -DPy_BUILD_CORE -o Modules/python.o ./Modules/python.c

Where are my CFLAGS???

I ran the configure with:

    CFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    CXXFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    LDFLAGS="-m64 -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    ./configure --prefix=/opt \
    --enable-shared \
    --libdir=/opt/lib/sparcv9

Checking the config.log and config.status, it was clear that the flags were used properly as the configure script ran; however, the failure is in the various Makefiles to actually reference the CFLAGS and LDFLAGS. LDFLAGS is simply not included in any of the link stages in the Makefiles, and CFLAGS is overridden by BASECFLAGS, OPT and EXTRA_CFLAGS!

Ah!

    EXTRA_CFLAGS="-O3 -m64 -mcpu=ultrasparc -L/opt/lib/sparcv9 -R/opt/lib/sparcv9" \
    make

actually got the core parts to compile for the library and then failed to build the library, because LDFLAGS was missing from the Makefile for the library link stage :-(

Close examination suggests that the OPT environment variable could be used to pass the necessary flags through from configure, but this still did not help the link stage problems.

The fixes are pretty minimal to ensure that the configure variables are passed into the Makefile. My patch to the Makefile.pre.in is attached to this bug report.

Once these changes are made Python will build properly for both 32 and 64 bit platforms with the correct CFLAGS and LDFLAGS passed into the configure script.

BTW, while this bug is reported under a Solaris/gcc build, the patches to Makefile.pre.in should fix similar build issues on all platforms.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-01-06 01:52

Message:
Logged In: YES
user_id=21627
Originator: NO

Can you please report what the actual problem is that you got? I doubt it's the #error, as that error is generated by the preprocessor, yet your fix seems to deal with LDFLAGS only. So please explain what command you invoked, what the actual output was, and what the expected output was.
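
Once an interpreter has been built, the flags that actually reached the Makefile can be read back from Python. This is a minimal sketch for a present-day interpreter using the standard sysconfig module; `build_summary` is a hypothetical helper name, and on platforms that are not built from a Makefile (for example Windows) the values may simply be None.

```python
import struct
import sysconfig


def build_summary():
    """Return the pointer width in bits and the recorded build variables."""
    names = ("CC", "CFLAGS", "BASECFLAGS", "OPT", "LDFLAGS")
    flags = {name: sysconfig.get_config_var(name) for name in names}
    # The size of a C pointer shows whether this is a 32 or 64 bit build.
    pointer_bits = struct.calcsize("P") * 8
    return pointer_bits, flags


bits, flags = build_summary()
print("pointer size: %d bits" % bits)
for name, value in flags.items():
    print("%-10s %s" % (name, value))
```

If -m64 was passed to configure but does not show up in any of these variables, that would be consistent with the flags being dropped between configure and the generated Makefile.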
[ python-Bugs-1409443 ] frame->f_lasti not always correct
Bugs item #1409443, was opened at 2006-01-18 16:57
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1409443&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: John Ehresman (jpe)
Assigned to: Raymond Hettinger (rhettinger)
Summary: frame->f_lasti not always correct

Initial Comment:
Contrary to the comment in ceval.c, the f_lasti field is not always correct because it is not updated by the PREDICT / PREDICTED macros. This means that when a GET_ITER is followed by a FOR_ITER, f_lasti will be left at the index of the GET_ITER the first time FOR_ITER is executed. I don't think this is a problem for YIELD_VALUE because it's not predicted to follow any other opcode.

I'm running into this when examining bytecode in calling frames within a debugger callback. I suggest either documenting that f_lasti may be incorrect or adjusting it in the PREDICTED macro.

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 20:17

Message:
Logged In: YES
user_id=80475
Originator: NO

Expanded comment in rev 53285.

IMO, the f->f_lasti is not incorrect. In effect, a successful prediction links the opcodes so that two codes function as a single new code (GET_ITER, FOR_ITER) --> GET_ITER_FOR_ITER.

--

Comment By: Neal Norwitz (nnorwitz)
Date: 2006-01-19 00:32

Message:
Logged In: YES
user_id=33168

Raymond? Given that PREDICTED was added for performance, I would lean toward updating the doc. I didn't look at the code, but I'm pretty sure John's description is accurate.
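
The opcode pair discussed in this report can be seen by disassembling any for loop. This is a small sketch for a present-day Python 3 using the dis module; the function `loop` is only an illustration, and the exact instruction stream varies between interpreter versions.

```python
import dis


def loop(items):
    """A plain for loop, compiled to a GET_ITER followed by a FOR_ITER."""
    total = 0
    for item in items:
        total += item
    return total


opnames = [ins.opname for ins in dis.get_instructions(loop)]
start = opnames.index("GET_ITER")
# Show GET_ITER and whatever instruction follows it.
print(opnames[start:start + 2])
```

A debugger callback that reads f_lasti from a calling frame is indexing into this same instruction stream, which is why a skipped update between the two opcodes is observable there.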
[ python-Bugs-1514428 ] NaN comparison in Decimal broken
Bugs item #1514428, was opened at 2006-06-29 11:19
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1514428&group_id=5470

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nick Maclaren (nmm)
>Assigned to: Tim Peters (tim_one)
Summary: NaN comparison in Decimal broken

Initial Comment:
Methinks this is a bit off :-) True should be False.

    Python 2.5b1 (trunk:47059, Jun 29 2006, 14:26:46)
    [GCC 4.1.0 (SUSE Linux)] on linux2
    >>> import decimal
    >>> d = decimal.Decimal
    >>> inf = d("inf")
    >>> nan = d("nan")
    >>> nan > inf
    True
    >>> nan < inf
    False
    >>> inf > nan
    True
    >>> inf < nan
    False

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 21:05

Message:
Logged In: YES
user_id=80475
Originator: NO

The Decimal Arithmetic Specification says that NaN comparisons should return NaN. The decimal module correctly implements this through the compare() method:

    >>> nan.compare(nan)
    Decimal('NaN')

Since Python's < and > operators return a boolean result, the standard is silent on what should be done. The current implementation uses the __cmp__ method, which can only return -1, 0, or 1, so there is not a direct way to make < and > both return False.

If you want to go beyond the standard and have both < and > return False for all NaN comparisons, then the __cmp__ implementation would need to be replaced with rich comparisons. I'm not sure that this is desirable. IMO, that would be no better than the current arbitrary choice where all comparisons involving NaN report self > other. If someone has an application that would be harmed by the current implementation, then it should almost certainly be using the standard-compliant compare() method instead of the boolean < and > operators.

Tim, what say you?

--

Comment By: CharlesMerriam (charlesmerriam)
Date: 2006-08-23 03:43

Message:
Logged In: YES
user_id=1581732

More specifically, any comparison with a NaN should equal False, even inf, per IEEE 754. A good starting point to convince oneself of this is http://en.wikipedia.org/wiki/NaN.

--

Comment By: Nick Maclaren (nmm)
Date: 2006-07-13 05:35

Message:
Logged In: YES
user_id=42444

It's still there in Beta 2.
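
The two positions in this thread can be put side by side. This is a minimal sketch for a present-day Python 3: `compare()` is the existing Decimal method mentioned above, while `nan_aware_less_than` is a hypothetical helper that applies the "any comparison with a NaN is False" rule explicitly instead of relying on the < operator.

```python
from decimal import Decimal


def nan_aware_less_than(a, b):
    """Return a < b, treating any comparison that involves a NaN as False."""
    if a.is_nan() or b.is_nan():
        return False
    return a < b


nan = Decimal("NaN")
inf = Decimal("Infinity")

# The specification-level comparison yields a NaN, not a boolean.
result = nan.compare(inf)
print(result.is_nan())

# The helper answers False in both directions when a NaN is involved.
print(nan_aware_less_than(nan, inf), nan_aware_less_than(inf, nan))
print(nan_aware_less_than(Decimal(1), inf))
```

Checking `is_nan()` first keeps the answer independent of how the interpreter's own < and > treat NaN operands.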
[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method
Bugs item #1105286, was opened at 2005-01-19 10:04
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
>Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: YoHell (yohell)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Undocumented implicit strip() in split(None) string method

Initial Comment:
Hi!

I noticed that the string method split() first does an implicit strip() before splitting when it's used with no arguments or with None as the separator (sep in the docs). There is no mention of this implicit strip() in the docs.

Example 1:

    s = " word1 word2 "
    s.split()

then returns ['word1', 'word2'] and not ['', 'word1', 'word2', ''] as one might expect.

WHY IS THIS BAD?

1. Because it's undocumented. See: http://www.python.org/doc/current/lib/string-methods.html#l2h-197

2. Because it may lead to unexpected behavior in programs.

Example 2: FASTA sequence headers are one line descriptors of biological sequences and are on this form:

    ">" + Identifier + whitespace + free text description.

Let sHeader be a Python string containing a FASTA header. One could then use the following syntax to extract the identifier from the header:

    sID = sHeader[1:].split(None, 1)[0]

However, this does not work if sHeader contains a faulty FASTA header where the identifier is missing or consists of whitespace. In that case sID will contain the first word of the free text description, which is not the desired behavior.

WHAT SHOULD BE DONE?

The implicit strip() should be removed, or at least programmers should be given the option to turn it off. At the very least it should be documented so that programmers have a chance of adapting their code to it.

Thank you for an otherwise splendid language!

/Joel Hedlund
Ph.D. Student
IFM Bioinformatics
Linköping University

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 21:16

Message:
Logged In: YES
user_id=80475
Originator: NO

I think the current wording is clear enough and that further attempts to specify corner cases will only make the docs harder to understand and less useful.

--

Comment By: YoHell (yohell)
Date: 2006-11-07 09:11

Message:
Logged In: YES
user_id=1008220

*resubmission: grammar corrected*

I'm opening this again, since the docs still don't reflect the behavior of the method.

from the docs:
"""
If sep is not specified or is None, a different splitting algorithm is applied. First, whitespace characters (spaces, tabs, newlines, returns, and formfeeds) are stripped from both ends.
"""

This is not true when maxsplit is given. Example:

    >>> " foo bar ".split(None)
    ['foo', 'bar']
    >>> " foo bar ".split(None, 1)
    ['foo', 'bar ']

Whitespace is obviously not stripped from the ends before the rest of the string is split.

--

Comment By: YoHell (yohell)
Date: 2006-11-07 09:06

Message:
Logged In: YES
user_id=1008220

I'm opening this again, since the docs still don't reflect the behavior of the method.

from the docs:
"""
If sep is not specified or is None, a different splitting algorithm is applied. First, whitespace characters (spaces, tabs, newlines, returns, and formfeeds) are stripped from both ends.
"""

This is not true when maxsplit is given. Example:

    >>> " foo bar ".split(None)
    ['foo', 'bar']
    >>> " foo bar ".split(None, 1)
    ['foo', 'bar ']

Whitespace is obviously not stripping whitespace from the ends of the string before splitting the rest of the string.

--

Comment By: Wummel (calvin)
Date: 2005-01-24 07:51

Message:
Logged In: YES
user_id=9205

This should probably also be added to rsplit()?

--

Comment By: Terry J. Reedy (tjreedy)
Date: 2005-01-24 02:15

Message:
Logged In: YES
user_id=593130

To me, the removal of whitespace at the ends (stripping) is consistent with the removal (or collapsing) of extra whitespace in between so that .split() does not return empty words anywhere. Consider:

    >>> ',1,,2,'.split(',')
    ['', '1', '', '2', '']

If ' 1  2 '.split() were to return null strings at the beginning and end of the list, then to be consistent, it should also put one in the middle. One can get this by being explicit (mixed WS can be handled by translation):

    >>> ' 1  2 '.split(' ')
    ['', '1', '', '2', '']

Having said this, I also agree that the extra words proposed by jj are helpful.

BUG?? In 2.2, splitting an empty or whitespace only s
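
The FASTA pitfall from the initial comment of this report can be avoided by checking for the identifier before splitting. This is a minimal sketch under the header layout given there (">" + identifier + whitespace + description); `fasta_identifier` is a hypothetical helper name, and returning None for a missing identifier is an assumption, not an established convention.

```python
def fasta_identifier(header):
    """Return the identifier of a FASTA header, or None if it is missing."""
    if not header.startswith(">"):
        raise ValueError("not a FASTA header: %r" % (header,))
    body = header[1:]
    # The identifier must follow '>' directly; leading whitespace means it is absent.
    if not body or body[0].isspace():
        return None
    return body.split(None, 1)[0]


print(fasta_identifier(">seq1 some description"))
print(fasta_identifier("> some description"))

# The one-liner from the report silently picks up the first description word.
print("> some description"[1:].split(None, 1)[0])
```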
[ python-Bugs-1380970 ] split() description not fully accurate
Bugs item #1380970, was opened at 2005-12-14 18:33
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1380970&group_id=5470

Category: Documentation
Group: Python 2.4
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: K.C. (kace)
Assigned to: Raymond Hettinger (rhettinger)
Summary: split() description not fully accurate

Initial Comment:
The page http://docs.python.org/lib/string-methods.html reads, in part,

"If sep is not specified or is None, a different splitting algorithm is applied. First, whitespace characters (spaces, tabs, newlines, returns, and formfeeds) are stripped from both ends."

However, this is not the behaviour that I'm seeing. (Although, I should note that I'd find the described behaviour more desirable.)

Example,

    >>> trow = '1586\tsome-int-name\tNODES: 111_222\n'
    >>> print trow
    1234some-int-name NODES: 111_222

    >>> trow.split(None,2)
    ['1234', 'some-int-name', 'NODES: 111_222\n']
    # end example.

Notice that the trailing newline has not been stripped as the documentation said it should be.

Thanks all.
K.C.

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 21:19

Message:
Logged In: YES
user_id=80475
Originator: NO

I prefer the docs as they currently read.

--

Comment By: Collin Winter (collinwinter)
Date: 2006-01-26 11:04

Message:
Logged In: YES
user_id=1344176

I've provided a patch for this: #1414934.

--

Comment By: K.C. (kace)
Date: 2005-12-14 18:36

Message:
Logged In: YES
user_id=741142

Also, (oops) the example comes from the most recent version:

    $ python
    Python 2.4.2 (#2, Oct 4 2005, 13:57:10)
    [GCC 3.4.2 [FreeBSD] 20040728] on freebsd5
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
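
The behaviour reported here is easy to reproduce, along with a workaround. This is a short sketch for a present-day Python 3 using the tab-separated row from the report; stripping explicitly before the call is one option, not the only one.

```python
row = "1586\tsome-int-name\tNODES: 111_222\n"

# Without maxsplit every run of whitespace is a separator, so no newline survives.
print(row.split())

# With maxsplit=2 the remainder is kept as-is, trailing newline included.
print(row.split(None, 2))

# Stripping first gives the result the documentation wording seems to promise.
print(row.strip().split(None, 2))
```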
[ python-Bugs-1414673 ] Underspecified behaviour of string methods split, rsplit
Bugs item #1414673, was opened at 2006-01-25 10:23
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1414673&group_id=5470

Category: Documentation
Group: Python 2.4
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Private: No
Submitted By: Collin Winter (collinwinter)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Underspecified behaviour of string methods split, rsplit

Initial Comment:
The documentation for the string methods split and rsplit does not address the case where sep=None and maxsplit=0. Should this strip off the leading and trailing whitespace, but not do any splits? Should it simply return the invocant string?

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 21:23

Message:
Logged In: YES
user_id=80475
Originator: NO

The docs for split() and rsplit() have reached their limits of complexity. Let the corner cases be defined by what the implementation currently does.

IMO, any more attempts to expand these docs can only result in a decrease in clarity and usability. What is there now does a good job at showing you what you need to know to use the methods effectively.

--

Comment By: Matt Fleming (splitscreen)
Date: 2006-02-18 07:12

Message:
Logged In: YES
user_id=1126061

From the documentation of split():

"If maxsplit is given, splits at no more than maxsplit places (resulting in at most maxsplit+1 words)."

I know that at the moment rsplit() and split() remove any leading whitespace but leave trailing space intact, but I would have thought leaving the string entirely intact would make more sense. Surely, to comply with the statement "(resulting in at most maxsplit+1 words)", the entire string should be returned when maxsplit=0.

I can see the point that the leading whitespace isn't actually returned, but I don't see why it should be discarded.

Just a thought.
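
Since the corner case is left to be defined by the implementation, it can simply be probed. This is a small sketch for a present-day CPython 3; the behaviour of other versions or implementations is not assumed.

```python
text = "  a b  "

# sep=None with maxsplit=0: no splits are made, yet whitespace on one side is dropped.
print(repr(text.split(None, 0)))
print(repr(text.rsplit(None, 0)))

# A whitespace-only string produces no words at all.
print(repr("   ".split(None, 0)))
```

On the interpreters I am aware of, split() drops only the leading whitespace and rsplit() only the trailing whitespace here, so neither returns the invocant string unchanged.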
[ python-Bugs-1472695 ] 32/64bit pickled Random incompatiblity
Bugs item #1472695, was opened at 2006-04-18 20:10
Message generated for change (Settings changed) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1472695&group_id=5470

Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Private: No
Submitted By: Peter Maxwell (pm67nz)
Assigned to: Raymond Hettinger (rhettinger)
Summary: 32/64bit pickled Random incompatiblity

Initial Comment:
The unsigned long integers which make up the state of a Random instance are converted to Python integers via a cast to long in _randommodule.c's random_getstate function, so on a 32bit platform Random.getstate() returns a mix of positive and negative integers, while on a 64bit platform the negative numbers are replaced by larger positive numbers, their 32bit-2s-complement equivalents.

As a result, unpickling a Random instance from a 64bit machine on a 32bit platform produces the error "OverflowError: long int too large to convert to int". Unpickling a 32bit Random on a 64bit machine succeeds, but the resulting object is in a slightly confused state:

    >>> r32 = cPickle.load(open('r32_3.pickle'))
    >>> for i in range(3):
    ...     print r64.random(), r32.random()
    ...
    0.237964627092 4292886520.32
    0.544229225296 0.544229225296
    0.369955166548 4292886520.19

--

Comment By: Tim Peters (tim_one)
Date: 2006-04-25 19:26

Message:
Logged In: YES
user_id=31435

> do you think we should require that the world not
> change for 32-bit pickles?

I don't understand the question. If a pre-2.5 pickle here can be read in 2.5, where both producer & consumer are the same 32-vs-64 bit choice; and a 2.5+ pickle here is portable between 32- and 64- boxes, I'd say "good enough".

While desirable, it's not really critical that a 2.5 pickle here be readable by an older Python. While that's critical for pickle in general, and critical too for everyone-uses-'em types (ints, strings, lists, ...), when fixing a bug in a specific rarely-used type's pickling strategy some slop is OK.

IOW, it's just not worth heroic efforts to hide all pain. The docs should mention incompatibilities, though.

Does that answer the question?

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2006-04-25 18:00

Message:
Logged In: YES
user_id=80475

Tim, do you think we should require that the world not change for 32-bit pickles?

--

Comment By: Peter Maxwell (pm67nz)
Date: 2006-04-21 01:03

Message:
Logged In: YES
user_id=320286

OK, here is a candidate patch, though I don't know if it is the best way to do it or meets the style guidelines etc. It makes Random pickles interchangeable between 32bit and 64bit machines by encoding their states as Python long integers. An old pre-patch 32bit pickle loaded on a 64bit machine still fails (OverflowError: can't convert negative value to unsigned long) but I hope that combination is rare enough to ignore. Also on a 32bit machine new Random pickles can't be unpickled by a pre-patch python, but again there are limits to sane backward compatibility.

--

Comment By: Neal Norwitz (nnorwitz)
Date: 2006-04-19 02:02

Message:
Logged In: YES
user_id=33168

Peter, thanks for the report. Do you think you could work up a patch to correct this problem?
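
The mismatch between the two platforms comes down to how the same 32 bits are read. This is a minimal sketch of the two interpretations; `as_signed32` and `as_unsigned32` are hypothetical helper names and the sample word is made up for illustration, not taken from a real generator state.

```python
def as_signed32(word):
    """Read the low 32 bits of word as a signed two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - 0x100000000 if word & 0x80000000 else word


def as_unsigned32(value):
    """Read the low 32 bits of value as an unsigned integer."""
    return value & 0xFFFFFFFF


word = 0xFFE0A1B8  # hypothetical state word with the top bit set

# What a cast to a 32 bit long yields: a negative number.
print(as_signed32(word))
# What a 64 bit long holds for the same bits: a large positive number.
print(as_unsigned32(as_signed32(word)))
```

Storing the unsigned reading, as the candidate patch does by encoding states as Python long integers, gives one representation that both platforms can accept.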
[ python-Bugs-1486663 ] Over-zealous keyword-arguments check for built-in set class
Bugs item #1486663, was opened at 2006-05-11 11:17
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1486663&group_id=5470

Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: dib (dib_at_work)
>Assigned to: Georg Brandl (gbrandl)
Summary: Over-zealous keyword-arguments check for built-in set class

Initial Comment:
The fix for bug #1119418 (xrange() builtin accepts keyword arg silently) included in Python 2.4.2c1+ breaks code that passes keyword argument(s) into classes derived from the built-in set class, even if those derived classes explicitly accept those keyword arguments and avoid passing them down to the built-in base class.

Simplified version of code in attached BuiltinSetKeywordArgumentsCheckBroken.py fails at (G) due to bug #1119418 if version < 2.4.2c1; if version >= 2.4.2c1 (G) passes thanks to that bug fix, but instead (H) incorrectly-in-my-view fails.

[Presume similar cases would fail for xrange and the other classes mentioned in #1119418.]

-- David Bruce

(Tested on 2.4, 2.4.2, 2.5a2 on linux2, win32.)

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2007-01-05 21:26

Message:
Logged In: YES
user_id=80475
Originator: NO

I prefer the approach used by list().

--

Comment By: Žiga Seilnacht (zseil)
Date: 2006-05-19 20:19

Message:
Logged In: YES
user_id=1326842

See patch #1491939

--

Comment By: Žiga Seilnacht (zseil)
Date: 2006-05-19 15:02

Message:
Logged In: YES
user_id=1326842

This bug was introduced as part of the fix for bug #1119418.

It also affects collections.deque.

Can't the _PyArg_NoKeywords check simply be moved to set_init and deque_init as it was done for zipimport.zipimporter?

array.array doesn't need to be changed, since it already does all of its initialization in its __new__ method.

The rest of the types changed in that fix should not be affected, since they are immutable.

--

Comment By: Georg Brandl (gbrandl)
Date: 2006-05-11 12:23

Message:
Logged In: YES
user_id=849994

Raymond, what to do in this case? Note that other built-in types, such as list(), do accept keyword arguments.
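
The pattern the report wants to keep working looks roughly like this. It is a minimal sketch for a present-day Python 3, not the attached BuiltinSetKeywordArgumentsCheckBroken.py; `TaggedSet` and its `tag` keyword are hypothetical names.

```python
class TaggedSet(set):
    """A set subclass whose constructor takes an extra keyword argument."""

    def __init__(self, iterable=(), tag=None):
        # Only the iterable is passed down; the keyword stays in the subclass.
        super().__init__(iterable)
        self.tag = tag


s = TaggedSet([1, 2, 3], tag="example")
print(sorted(s), s.tag)
```

Because the subclass consumes the keyword itself, the base class never sees it, which is the situation the report argues should not be rejected.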