[ python-Bugs-1460493 ] Why not drop the _active list?
Bugs item #1460493, was opened at 2006-03-29 07:16 Message generated for change (Comment added) made by atila-cheops You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460493&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: HVB bei TUP (hvb_tup) Assigned to: Nobody/Anonymous (nobody) Summary: Why not drop the _active list? Initial Comment: I am using a modified version of subprocess.py, where I have removed the _active list and all references to it. I have tested it (under Windows 2000) and there were no errors. So what is the reason for managing the _active list at all? Why not drop it? -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:04 Message: Logged In: YES user_id=1276121 what happens if you are doing a _cleanup (iterating over a copy of _active) in multiple threads? can it not happen then that you clean up a process 2 times? thread 1 starts a _cleanup: makes a copy of _active[:] and starts polling thread 2 starts a _cleanup: makes a copy of _active[:] and starts polling thread 1 encounters a finished process and removes it from _active[] thread 2 does not know the process is removed, finds the same process finished and tries to remove it from _active but this fails, because thread 1 removed it already so the action of cleaning up should maybe be serialized if 1 thread is doing it, the other one should block everyone who needs this can of course patch the subprocess.py file, but shouldn't this be fixed in the library? -- Comment By: Neal Norwitz (nnorwitz) Date: 2006-03-30 07:43 Message: Logged In: YES user_id=33168 If you always called wait() the _active list isn't beneficial to you. However, many people do not call wait and the _active list provides a mechanism to cleanup zombied children. 
This is important for many users. If you need thread safety, you can handle the locking yourself before calling poll()/wait(). -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 20:41 Message: Logged In: YES user_id=21627 The purpose of the _active list is to wait(2) for open processes. It needs to stay. -- Comment By: Tristan Faujour (tfaujour) Date: 2006-03-29 13:59 Message: Logged In: YES user_id=1488657 I agree. The use of _active makes subprocess.py thread-UNsafe. See also: Bug #1199282. In order to have a thread-safe subprocess.py, I commented out the call to _cleanup() in Popen.__init__(). As a side effect, _active becomes useless. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460493&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1460493 ] Why not drop the _active list?
Bugs item #1460493, was opened at 2006-03-29 07:16 Message generated for change (Comment added) made by atila-cheops You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460493&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: HVB bei TUP (hvb_tup) Assigned to: Nobody/Anonymous (nobody) Summary: Why not drop the _active list? Initial Comment: I am using a modified version of subprocess.py, where I have removed the _active list and all references to it. I have tested it (under Windows 2000) and there were no errors. So what is the reason for managing the _active list at all? Why not drop it? -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:06 Message: Logged In: YES user_id=1276121 the same problem probably exists in popen2.py there _active is also used so if something is fixed in subprocess.py, it should probably also be fixed in popen2.py -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:04 Message: Logged In: YES user_id=1276121 what happens if you are doing a _cleanup (iterating over a copy of _active) in multiple threads? can it not happen then that you clean up a process 2 times? thread 1 starts a _cleanup: makes a copy of _active[:] and starts polling thread 2 starts a _cleanup: makes a copy of _active[:] and starts polling thread 1 encounters a finished process and removes it from _active[] thread 2 does not know the process is removed, finds the same process finished and tries to remove it from _active but this fails, because thread 1 removed it already so the action of cleaning up should maybe be serialized if 1 thread is doing it, the other one should block everyone who needs this can of course patch the subprocess.py file, but shouldn't this be fixed in the library? 
-- Comment By: Neal Norwitz (nnorwitz) Date: 2006-03-30 07:43 Message: Logged In: YES user_id=33168 If you always called wait() the _active list isn't beneficial to you. However, many people do not call wait and the _active list provides a mechanism to clean up zombied children. This is important for many users. If you need thread safety, you can handle the locking yourself before calling poll()/wait(). -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 20:41 Message: Logged In: YES user_id=21627 The purpose of the _active list is to wait(2) for open processes. It needs to stay. -- Comment By: Tristan Faujour (tfaujour) Date: 2006-03-29 13:59 Message: Logged In: YES user_id=1488657 I agree. The use of _active makes subprocess.py thread-UNsafe. See also: Bug #1199282. In order to have a thread-safe subprocess.py, I commented out the call to _cleanup() in Popen.__init__(). As a side effect, _active becomes useless.
[ python-Bugs-1460493 ] Why not drop the _active list?
Bugs item #1460493, was opened at 2006-03-29 07:16 Message generated for change (Comment added) made by atila-cheops You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460493&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: HVB bei TUP (hvb_tup) Assigned to: Nobody/Anonymous (nobody) Summary: Why not drop the _active list? Initial Comment: I am using a modified version of subprocess.py, where I have removed the _active list and all references to it. I have tested it (under Windows 2000) and there were no errors. So what is the reason for managing the _active list at all? Why not drop it? -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:17 Message: Logged In: YES user_id=1276121 also #1214859 is interesting, has a patch -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:06 Message: Logged In: YES user_id=1276121 the same problem probably exists in popen2.py there _active is also used so if something is fixed in subprocess.py, it should probably also be fixed in popen2.py -- Comment By: cheops (atila-cheops) Date: 2006-03-30 08:04 Message: Logged In: YES user_id=1276121 what happens if you are doing a _cleanup (iterating over a copy of _active) in multiple threads? can it not happen then that you clean up a process 2 times? 
Thread 1 starts a _cleanup: it makes a copy of _active[:] and starts polling. Thread 2 starts a _cleanup: it makes a copy of _active[:] and starts polling. Thread 1 encounters a finished process and removes it from _active[]. Thread 2 does not know the process is removed, finds the same process finished and tries to remove it from _active, but this fails, because thread 1 removed it already. So the action of cleaning up should maybe be serialized: if 1 thread is doing it, the other one should block. Everyone who needs this can of course patch the subprocess.py file, but shouldn't this be fixed in the library? -- Comment By: Neal Norwitz (nnorwitz) Date: 2006-03-30 07:43 Message: Logged In: YES user_id=33168 If you always called wait() the _active list isn't beneficial to you. However, many people do not call wait and the _active list provides a mechanism to clean up zombied children. This is important for many users. If you need thread safety, you can handle the locking yourself before calling poll()/wait(). -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 20:41 Message: Logged In: YES user_id=21627 The purpose of the _active list is to wait(2) for open processes. It needs to stay. -- Comment By: Tristan Faujour (tfaujour) Date: 2006-03-29 13:59 Message: Logged In: YES user_id=1488657 I agree. The use of _active makes subprocess.py thread-UNsafe. See also: Bug #1199282. In order to have a thread-safe subprocess.py, I commented out the call to _cleanup() in Popen.__init__(). As a side effect, _active becomes useless.
[ python-Bugs-1460340 ] random.sample can raise KeyError
Bugs item #1460340, was opened at 2006-03-28 19:05 Message generated for change (Comment added) made by rhettinger You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460340&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: paul rubin (phr) >Assigned to: Nobody/Anonymous (nobody) Summary: random.sample can raise KeyError Initial Comment: I have only tested this in 2.3 and the relevant code in random.py has changed in the current svn branch, but from inspection it looks to me like the bug may still be there. If you initialize a dictionary as follows: a={}.fromkeys(range(10)+range(10,100,2)+range(100,110)) then random.sample(a,3) raises KeyError most times that you call it. -- >Comment By: Raymond Hettinger (rhettinger) Date: 2006-03-30 05:18 Message: Logged In: YES user_id=80475 Bah. I'm not overly concerned. It is a Python fact of life that objects defining __getitem__ cannot always be clearly categorized as being either a sequence or a mapping but not both. You can add some additional checks like checking for a keys() method, but there is a limit to what you can do against these safe-cracking style efforts to trick the routine. I hope this quest for theoretical perfection doesn't lead to throwing the baby out with the bathwater. It would be a shame to lose the automated choice of the best performing algorithm. If that happens, somebody's real-world use cases are certain to suffer. -- Comment By: Tim Peters (tim_one) Date: 2006-03-29 08:02 Message: Logged In: YES user_id=31435 Ya, this is flaky for dicts; re-opening.
For example,

>>> d = dict((i, complex(i, i)) for i in xrange(30))
>>> random.sample(d, 5) # ask for 5 and it samples values
[(25+25j), (4+4j), (16+16j), (13+13j), (17+17j)]
>>> random.sample(d, 6) # ask for 6 and it samples keys
[20, 11, 9, 4, 14, 1]

That one doesn't have to do with internal exceptions, it has only to do with which of sample()'s internal algorithms gets used. Paul's point about potential bias is also a worry. Here's a full example:

"""
from random import sample
d = dict.fromkeys(range(24))
d['x'] = 'y'
count = hits = 0
while 1:
    count += 1
    hits += sample(d, 1) == ['x']
    if count % 1 == 0:
        print count, "%.2f%%" % (100.0 * hits / count)
"""

Since len(d) == 25, I'd hope to see 'x' selected 1/25 = 4% of the time. Instead it gets selected 0.16% of the time (1/25**2 -- and Paul's analysis of why is on target). -- Comment By: paul rubin (phr) Date: 2006-03-29 05:07 Message: Logged In: YES user_id=72053 Actually the previous comment is wrong too; 99% of the time, sample(a,1) will return None since that's the value connected to every key in the dictionary, i.e. it's population[j] for every j. The other 1% of the time, the dict gets converted to a list, and the sample returns a key from the dict rather than a value, which is certainly wrong. And you can see how the probabilities are still messed up if the values in the dict are distinct. I think it's ok to give up on dicts, but some warning about it should be added to the manual unless dict-like things somehow get detected in the code. It would be best to test for the object really being a sequence, but I don't know if such a test exists. Maybe one should be added. I'll leave it to you guys to reopen this bug if appropriate. -- Comment By: paul rubin (phr) Date: 2006-03-29 04:46 Message: Logged In: YES user_id=72053 I don't think the fix at 43421 is quite right, but I can't easily test it in my current setup.
Suppose a = dict.fromkeys(range(99) + ['x']) b = random.sample(a,1) 99% of the time, there's no KeyError and b gets set to [j] where j is some random integer. 1% of the time, there's a KeyError, random.sample is called recursively, and the recursive call returns [some integer j] 99% of the time, and returns ['x'] 1% of the time. So in total, ['x'] gets returned .01% of the time instead of 1% of the time. I think it's better to not set result[i]=population[j] inside the loop. Instead, just build up the selected set until it has enough indices; then try to make a result list using those indices, and if there's a KeyError, convert the population to a list and use the same selection set to make the results. gbrandl also correctly points out that a dict is not a sequence type, so maybe it's ok to just punt on dicts. But it's obvious from the code comments that somebody once wanted dicts to work, and it's reasonable
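The selection rates quoted in this thread, and one way to sidestep the sequence-versus-mapping ambiguity, can be sketched as below. The sketch is written for Python 3, not the 2.3/2.5 code under discussion. `sample_keys` is a hypothetical helper and not part of the random module. It simply turns the mapping's keys into a list before sampling, so that only one interpretation of the population is possible.

```python
import random


def sample_keys(mapping, k, rng=random):
    """Return k distinct keys chosen uniformly from a mapping."""
    # list(mapping) yields the keys, giving sample() a real sequence.
    return rng.sample(list(mapping), k)


d = dict.fromkeys(range(24))
d["x"] = "y"

expected_rate = 1 / len(d)        # 1/25, the 4% Tim hoped to see
reported_rate = 1 / len(d) ** 2   # 1/25**2, the 0.16% Tim reported
```

With an explicit key list, 'x' should be drawn at roughly the expected rate, at the cost of copying the keys on every call.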
[ python-Bugs-1460605 ] Python 2.4.2 does not compile on SunOS 5.10 using gcc
Bugs item #1460605, was opened at 2006-03-29 13:22 Message generated for change (Comment added) made by schiotz You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460605&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jakob Schiøtz (schiotz) Assigned to: Nobody/Anonymous (nobody) Summary: Python 2.4.2 does not compile on SunOS 5.10 using gcc Initial Comment: Core Python does not compile on my university's Sun server. $ ./configure --prefix=$HOME $ gmake [ lots of output deleted ] gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/cobject.o Objects/cobject.c gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/complexobject.o Objects/complexobject.c Objects/complexobject.c: In function `complex_pow': Objects/complexobject.c:476: error: invalid operands to binary == Objects/complexobject.c:476: error: wrong type argument to unary minus Objects/complexobject.c:476: error: invalid operands to binary == Objects/complexobject.c:476: error: wrong type argument to unary minus gmake: *** [Objects/complexobject.o] Error 1 $ uname -a SunOS hald 5.10 Generic_118822-18 sun4u sparc SUNW,Sun-Fire-15000 ~/src/Python-2.4.2 $ gcc --version gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath) Copyright (C) 2004 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. -- >Comment By: Jakob Schiøtz (schiotz) Date: 2006-03-30 13:34 Message: Logged In: YES user_id=56465 Wow! You have released a new version while I compiled the old one. The new version compiles just fine. Thanks for your help! 
Jakob -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 22:40 Message: Logged In: YES user_id=21627 Can you please try 2.4.3 instead?
[ python-Bugs-1460605 ] Python 2.4.2 does not compile on SunOS 5.10 using gcc
Bugs item #1460605, was opened at 2006-03-29 11:22 Message generated for change (Comment added) made by gbrandl You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460605&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Build Group: None >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Jakob Schiøtz (schiotz) Assigned to: Nobody/Anonymous (nobody) Summary: Python 2.4.2 does not compile on SunOS 5.10 using gcc Initial Comment: Core Python does not compile on my university's Sun server. $ ./configure --prefix=$HOME $ gmake [ lots of output deleted ] gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/cobject.o Objects/cobject.c gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/complexobject.o Objects/complexobject.c Objects/complexobject.c: In function `complex_pow': Objects/complexobject.c:476: error: invalid operands to binary == Objects/complexobject.c:476: error: wrong type argument to unary minus Objects/complexobject.c:476: error: invalid operands to binary == Objects/complexobject.c:476: error: wrong type argument to unary minus gmake: *** [Objects/complexobject.o] Error 1 $ uname -a SunOS hald 5.10 Generic_118822-18 sun4u sparc SUNW,Sun-Fire-15000 ~/src/Python-2.4.2 $ gcc --version gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath) Copyright (C) 2004 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. -- >Comment By: Georg Brandl (gbrandl) Date: 2006-03-30 11:58 Message: Logged In: YES user_id=849994 Closing as Fixed then. -- Comment By: Jakob Schiøtz (schiotz) Date: 2006-03-30 11:34 Message: Logged In: YES user_id=56465 Wow! 
You have released a new version while I compiled the old one. The new version compiles just fine. Thanks for your help! Jakob -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 20:40 Message: Logged In: YES user_id=21627 Can you please try 2.4.3 instead?
[ python-Bugs-1460493 ] Why not drop the _active list?
Bugs item #1460493, was opened at 2006-03-29 09:16 Message generated for change (Comment added) made by loewis You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460493&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: HVB bei TUP (hvb_tup) Assigned to: Nobody/Anonymous (nobody) Summary: Why not drop the _active list? Initial Comment: I am using a modified version of subprocess.py, where I have removed the _active list and all references to it. I have tested it (under Windows 2000) and there were no errors. So what is the reason for managing the _active list at all? Why not drop it? -- >Comment By: Martin v. Löwis (loewis) Date: 2006-03-30 18:31 Message: Logged In: YES user_id=21627 attila-cheops, please read the 2.5 version of popen2. popen2 now only adds processes to _active in __del__, not in __init__, so the problem with the application still wanting to wait/poll is solved. Multiple threads simultaneously isn't a problem, since it is (or should be) harmless to invoke poll on a process that has already been waited for. For only one of the poll calls, the wait will actually succeed, and that should be the one that removes it from the _active list. The same strategy should be applied to subprocess. 
-- Comment By: cheops (atila-cheops) Date: 2006-03-30 10:17 Message: Logged In: YES user_id=1276121 Also #1214859 is interesting; it has a patch. -- Comment By: cheops (atila-cheops) Date: 2006-03-30 10:06 Message: Logged In: YES user_id=1276121 The same problem probably exists in popen2.py; there _active is also used, so if something is fixed in subprocess.py, it should probably also be fixed in popen2.py. -- Comment By: cheops (atila-cheops) Date: 2006-03-30 10:04 Message: Logged In: YES user_id=1276121 What happens if you are doing a _cleanup (iterating over a copy of _active) in multiple threads? Can it not happen then that you clean up a process 2 times? Thread 1 starts a _cleanup: it makes a copy of _active[:] and starts polling. Thread 2 starts a _cleanup: it makes a copy of _active[:] and starts polling. Thread 1 encounters a finished process and removes it from _active[]. Thread 2 does not know the process is removed, finds the same process finished and tries to remove it from _active, but this fails, because thread 1 removed it already. So the action of cleaning up should maybe be serialized: if 1 thread is doing it, the other one should block. Everyone who needs this can of course patch the subprocess.py file, but shouldn't this be fixed in the library? -- Comment By: Neal Norwitz (nnorwitz) Date: 2006-03-30 09:43 Message: Logged In: YES user_id=33168 If you always called wait() the _active list isn't beneficial to you. However, many people do not call wait and the _active list provides a mechanism to clean up zombied children. This is important for many users. If you need thread safety, you can handle the locking yourself before calling poll()/wait(). -- Comment By: Martin v. Löwis (loewis) Date: 2006-03-29 22:41 Message: Logged In: YES user_id=21627 The purpose of the _active list is to wait(2) for open processes. It needs to stay. -- Comment By: Tristan Faujour (tfaujour) Date: 2006-03-29 15:59 Message: Logged In: YES user_id=1488657 I agree.
The use of _active makes subprocess.py thread-UNsafe. See also: Bug #1199282. In order to have a thread-safe subprocess.py, I commented out the call to _cleanup() in Popen.__init__(). As a side effect, _active becomes useless.
[ python-Bugs-1461610 ] xmlrpclib.binary doesn't exist
Bugs item #1461610, was opened at 2006-03-30 14:52 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1461610&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Chris AtLee (catlee) Assigned to: Nobody/Anonymous (nobody) Summary: xmlrpclib.binary doesn't exist Initial Comment: The current 2.4 and 2.5 docs mention that the xmlrpclib has a function called binary which converts any python value to a Binary object. However, this function does not exist in either 2.4.3 or 2.5. The Binary constructor accepts a data parameter, so I would say just remove mention of the binary function from the docs and leave the implementation as-is. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1461610&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
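For reference, wrapping data with the Binary constructor, as the report recommends instead of the documented binary function, can be sketched like this. The example uses Python 3, where the xmlrpclib module is named xmlrpc.client; the payload bytes are made up for illustration.

```python
from xmlrpc.client import Binary  # Python 3 name of the xmlrpclib module

# Wrap raw bytes directly with the constructor; no helper function is needed.
payload = Binary(b"\x00\x01raw bytes")

# The wrapped bytes stay available on the .data attribute.
raw = payload.data
```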
[ python-Bugs-1461783 ] Invalid modes crash open()
Bugs item #1461783, was opened at 2006-03-31 00:50 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1461783&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Interpreter Core Group: Python 2.5 Status: Open Resolution: None Priority: 5 Submitted By: splitscreen (splitscreen) Assigned to: Nobody/Anonymous (nobody) Summary: Invalid modes crash open() Initial Comment: The 2.5a0 interpreter can be crashed by opening a file with an invalid mode, i.e., open("bogus", "bogus") results in the Microsoft Visual C++ Debug Library spewing an "assertion failed" message with the expression ("Invalid file open mode", 0) in the file _open.c, line 98. Attached is a very rough patch that raises an IOError if the open mode doesn't begin with an 'a', 'w' or 'r' on the Windows platform. If this is something wrong at my end (and I feel it may be) please let me know. I'd be happy to offer any more information. BTW, I'm using Microsoft Visual Studio 2005. Thanks, Matt
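The check described in the report (reject any mode that does not start with 'a', 'w' or 'r' before it reaches the C runtime) can be illustrated in Python. The attached patch itself is not shown in this message and presumably lives in the C source, so the sketch below only mirrors the stated rule. `check_mode` is a hypothetical name.

```python
def check_mode(mode):
    """Return mode unchanged if it starts with 'r', 'w' or 'a'."""
    # An empty string or any other leading character is rejected up front,
    # so the platform's C runtime never sees it.
    if not mode or mode[0] not in "rwa":
        raise IOError("invalid mode: %r" % (mode,))
    return mode
```

A caller would validate first, for example `open(path, check_mode(mode))`, so that a bogus mode produces an exception, not an assertion dialog.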
[ python-Bugs-1460340 ] random.sample can raise KeyError
Bugs item #1460340, was opened at 2006-03-28 19:05 Message generated for change (Comment added) made by tim_one You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1460340&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: paul rubin (phr) >Assigned to: Tim Peters (tim_one) Summary: random.sample can raise KeyError Initial Comment: I have only tested this in 2.3 and the relevant code in random.py has changed in the current svn branch, but from inspection it looks to me like the bug may still be there. If you initialize a dictionary as follows: a={}.fromkeys(range(10)+range(10,100,2)+range(100,110)) then random.sample(a,3) raises KeyError most times that you call it. -- >Comment By: Tim Peters (tim_one) Date: 2006-03-30 22:55 Message: Logged In: YES user_id=31435 Assigned to me. The current situation is unacceptable, because internal code comments and the test suite were left implying that random.sample() "should work" for dicts -- but it doesn't, and the failure modes are both subtle and silent. Note that my first example was utterly vanilla, using a small dict with a contiguous range of integer keys. That's not asking sample() to crack a safe, it's asking it to borrow candy from a dead baby ;-) I don't care a lot about the second example, but it would also work right if dicts were forced into sample()'s first internal algorithm (and potential optimization be damned in the case of a dict). -- Comment By: Raymond Hettinger (rhettinger) Date: 2006-03-30 05:18 Message: Logged In: YES user_id=80475 Bah. I'm not overly concerned. It is a Python fact of life that objects defining __getitem__ cannot always be clearly categorized as being either a sequence or a mapping but not both.
You can add some additional checks like checking for a keys() method, but there is a limit to what you can do against these safe-cracking style efforts to trick the routine. I hope this quest for theoretical perfection doesn't lead to throwing the baby out with the bathwater. It would be a shame to lose the automated choice of the best performing algorithm. If that happens, somebody's real-world use cases are certain to suffer. -- Comment By: Tim Peters (tim_one) Date: 2006-03-29 08:02 Message: Logged In: YES user_id=31435 Ya, this is flaky for dicts; re-opening. For example,

>>> d = dict((i, complex(i, i)) for i in xrange(30))
>>> random.sample(d, 5) # ask for 5 and it samples values
[(25+25j), (4+4j), (16+16j), (13+13j), (17+17j)]
>>> random.sample(d, 6) # ask for 6 and it samples keys
[20, 11, 9, 4, 14, 1]

That one doesn't have to do with internal exceptions, it has only to do with which of sample()'s internal algorithms gets used. Paul's point about potential bias is also a worry. Here's a full example:

"""
from random import sample
d = dict.fromkeys(range(24))
d['x'] = 'y'
count = hits = 0
while 1:
    count += 1
    hits += sample(d, 1) == ['x']
    if count % 1 == 0:
        print count, "%.2f%%" % (100.0 * hits / count)
"""

Since len(d) == 25, I'd hope to see 'x' selected 1/25 = 4% of the time. Instead it gets selected 0.16% of the time (1/25**2 -- and Paul's analysis of why is on target). -- Comment By: paul rubin (phr) Date: 2006-03-29 05:07 Message: Logged In: YES user_id=72053 Actually the previous comment is wrong too; 99% of the time, sample(a,1) will return None since that's the value connected to every key in the dictionary, i.e. it's population[j] for every j. The other 1% of the time, the dict gets converted to a list, and the sample returns a key from the dict rather than a value, which is certainly wrong. And you can see how the probabilities are still messed up if the values in the dict are distinct.
I think it's ok to give up on dicts, but some warning about it should be added to the manual unless dict-like things somehow get detected in the code. It would be best to test for the object really being a sequence, but I don't know if such a test exists. Maybe one should be added. I'll leave it to you guys to reopen this bug if appropriate. -- Comment By: paul rubin (phr) Date: 2006-03-29 04:46 Message: Logged In: YES user_id=72053 I don't think the fix at 43421 is quite right, but I can't easily test it in my current setup. Suppose a = dict.fromkeys(range(99) + ['x']) b = random.sample(a,1) 99% of the time, there's no KeyError and b gets se
[ python-Bugs-1461855 ] fdopen() not guaranteed to have Python semantics
Bugs item #1461855, was opened at 2006-03-31 04:37 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1461855&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.4 Status: Open Resolution: None Priority: 5 Submitted By: John Levon (movement) Assigned to: Nobody/Anonymous (nobody) Summary: fdopen() not guaranteed to have Python semantics Initial Comment: The specification for seek() says: seek( offset[, whence]) Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write. Consider operating on an fdopen()ed file. The Python source simply calls into the OS-provided fdopen(): http://pxr.openlook.org/pxr/source/Modules/posixmodule.c#3530 However, the POSIX standard http://www.opengroup.org/onlinepubs/009695399/functions/fdopen.html says: "Although not explicitly required by this volume of IEEE Std 1003.1-2001, a good implementation of append (a) mode would cause the O_APPEND flag to be set." Thus, to ensure Python semantics, Python's fdopen() must perform an fcntl() to ensure O_APPEND is set. For example, on Solaris, this optional O_APPEND behaviour is not applied: http://cvs.opensolaris.org/source/xref/on/usr/src/lib/libc/port/stdio/fdopen.c?r=1.22#97 This has recently caused serious problems with the Mercurial SCM. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1461855&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
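The fcntl() step proposed in the report can be sketched at the Python level as below. This illustrates the idea only and is not the change to posixmodule.c: `fdopen_append` is a hypothetical helper, the code targets Python 3 on a POSIX system, and it assumes the descriptor was opened for writing.

```python
import fcntl
import os


def fdopen_append(fd, mode="a"):
    """Wrap fd in a file object after making sure O_APPEND is set on it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    if not flags & os.O_APPEND:
        # Ask the kernel to position every write at end-of-file, so that
        # append semantics do not depend on the C library's fdopen().
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_APPEND)
    return os.fdopen(fd, mode)
```

With the flag set on the descriptor, a seek() followed by a write still lands at the end of the file, which is the behaviour the seek() documentation quoted in the report promises for mode 'a'.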