[issue34227] Weighted random.sample() (weighted sampling without replacement)
New submission from Piotr Jurkiewicz:

The function random.choices(), added in Python 3.6, performs weighted random sampling with replacement. The function random.sample() performs random sampling without replacement, but does not support weights. I propose enhancing random.sample() to perform weighted sampling. That way all four possibilities would be supported:

- non-weighted sampling with replacement: random.choices(..., weights=None) (exists)
- weighted sampling with replacement: random.choices(..., weights=weights) (exists)
- non-weighted sampling without replacement: random.sample(..., weights=None) (exists)
- weighted sampling without replacement: random.sample(..., weights=weights) (NEW)

Rationale:

Weighted sampling without replacement is a common problem. There are a lot of questions on StackOverflow and similar sites about how to implement it. Unfortunately, many of the proposed solutions are wrong, for example:

https://stackoverflow.com/a/353510/2178047
https://softwareengineering.stackexchange.com/a/233552/161807

or have excessive computational complexity (e.g. quadratic).

Many answers suggest using numpy.random.choice(), which supports all four possibilities with a single function:

numpy.random.choice(a, size=None, replace=True, p=None)

But installing numpy just for that is overkill. I think this should be possible with the stdlib, without having to implement it yourself or install numpy, especially since it can be implemented in 2 lines (plus 4 lines of error checking), as you can see in the PR.

-- components: Library (Lib) messages: 322367 nosy: piotrjurkiewicz priority: normal severity: normal status: open title: Weighted random.sample() (weighted sampling without replacement) type: enhancement
___ Python tracker <https://bugs.python.org/issue34227> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
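For readers of the issue34227 report above, here is a minimal sketch of weighted sampling without replacement using only the stdlib. It is an illustration, not the code from the PR: the name weighted_sample and its signature are hypothetical, and the technique chosen here is the Efraimidis-Spirakis key method (draw u uniformly from [0, 1), use u ** (1 / weight) as the key, keep the k largest keys), which is assumed to be a reasonable fit for the "2 lines plus error checking" description.

```python
import heapq
import random


def weighted_sample(population, weights, k, rng=random):
    """Hypothetical helper: pick k distinct items, favoring larger weights.

    This is a sketch, not the stdlib API and not the patch from the PR.
    """
    population = list(population)
    weights = list(weights)
    if len(population) != len(weights):
        raise ValueError("population and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    # Items with zero weight can never be selected.
    selectable = sum(1 for w in weights if w > 0)
    if not 0 <= k <= selectable:
        raise ValueError("k must be between 0 and the number of items "
                         "with positive weight")
    # Efraimidis-Spirakis keys: u ** (1 / w) with u uniform in [0, 1).
    # The k items with the largest keys form the sample.
    keyed = ((rng.random() ** (1.0 / w), i)
             for i, w in enumerate(weights) if w > 0)
    return [population[i] for _, i in heapq.nlargest(k, keyed)]
```

With heapq.nlargest the cost is roughly O(n log k) rather than quadratic. A call such as weighted_sample("abcd", [1, 2, 3, 0], 3) returns some ordering of 'a', 'b' and 'c', since 'd' has zero weight.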
[issue34227] Weighted random.sample() (weighted sampling without replacement)
Change by Piotr Jurkiewicz: -- keywords: +patch pull_requests: +7988 stage: -> patch review
[issue23351] socket.settimeout(5.0) does not have any effect
New submission from Piotr Jurkiewicz:

After calling socket.settimeout(5.0), socket.send() returns immediately instead of returning after the specified timeout.

Steps to reproduce:

Open two Python interpreters.

In the first one (the receiver) execute:

>>> import socket
>>> r = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
>>> r.bind("test.sock")

In the second one (the sender) execute:

>>> import socket
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

Then run the following command 11 times:

>>> s.sendto("msg", "test.sock")

On the 12th run the command will block. This happens because the datagram socket queue on Linux is 11 messages long. Interrupt the command.

So far so good. Then set the sender socket timeout:

>>> s.settimeout(5.0)

Expected behavior: s.sendto() should block for 5 seconds and THEN raise error 11 (EAGAIN/EWOULDBLOCK).

Actual behavior: s.sendto() raises the error IMMEDIATELY.

>>> s.sendto("msg", "test.sock")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 11] Resource temporarily unavailable

So, in fact, s.settimeout(5.0) does not have any effect.

I think the problem is that settimeout() puts the socket into non-blocking mode (the docs say: "Timeout mode internally sets the socket in non-blocking mode."). As described [here](http://stackoverflow.com/q/13556972/2178047), setting a timeout on a non-blocking socket is impossible.

In fact, when I set the timeout manually with setsockopt(), everything works as expected:

>>> import struct
>>> s.setblocking(1)  # go back to blocking mode
>>> tv = struct.pack("ll", 5, 0)
>>> s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, tv)

Now s.sendto() raises the error after 5 seconds, as expected.
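The manual workaround at the end of the report can be wrapped in a small helper. This is a minimal sketch, assuming Linux and a platform where struct timeval is laid out as two C longs (the "ll" format used in the report). The name set_send_timeout is hypothetical and not part of the socket module. Unlike the Python 2 session in the report, the sketch is written for Python 3.

```python
import socket
import struct


def set_send_timeout(sock, seconds):
    """Hypothetical helper: set a kernel-level send timeout on a blocking socket.

    Assumes struct timeval is two C longs ("ll"), as on 64-bit Linux.
    """
    # SO_SNDTIMEO only applies to blocking sockets; this also clears any
    # timeout previously set with settimeout().
    sock.setblocking(True)
    # Split the timeout into whole seconds and microseconds.
    total_micros = int(round(seconds * 1_000_000))
    whole, micros = divmod(total_micros, 1_000_000)
    tv = struct.pack("ll", whole, micros)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, tv)
```

After set_send_timeout(s, 5.0), a send that cannot complete should, per the report, fail with EAGAIN after roughly 5 seconds instead of immediately.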
-- components: IO, Library (Lib) messages: 235013 nosy: piotrjurkiewicz priority: normal severity: normal status: open title: socket.settimeout(5.0) does not have any effect type: behavior versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue23351> ___
[issue23351] socket.settimeout(5.0) does not have any effect
Piotr Jurkiewicz added the comment:

Does not work on Debian 7 Wheezy, kernel 3.2.65.

$ python test.py
('sending ', 0) took 0.000s
('sending ', 1) took 0.000s
('sending ', 2) took 0.000s
('sending ', 3) took 0.000s
('sending ', 4) took 0.000s
('sending ', 5) took 0.000s
('sending ', 6) took 0.000s
('sending ', 7) took 0.000s
('sending ', 8) took 0.000s
('sending ', 9) took 0.000s
('sending ', 10) took 0.000s
('sending ', 11) took 0.000s
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    s.sendto("hello", SOCKNAME)
socket.error: [Errno 11] Resource temporarily unavailable

$ uname -a
Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u1 x86_64 GNU/Linux

-- status: pending -> open