[issue34227] Weighted random.sample() (weighted sampling without replacement)

2018-07-25 Thread Piotr Jurkiewicz


New submission from Piotr Jurkiewicz:

Function random.choices(), which appeared in Python 3.6, performs weighted 
random sampling with replacement. Function random.sample() performs random 
sampling without replacement, but it cannot be weighted.

I propose to enhance random.sample() to perform weighted sampling. That way all 
four possibilities will be supported:

- non-weighted sampling with replacement: random.choices(..., weights=None) 
(exists)

- weighted sampling with replacement: random.choices(..., weights=weights) 
(exists)

- non-weighted sampling without replacement: random.sample(..., weights=None) 
(exists)

- weighted sampling without replacement: random.sample(..., weights=weights) 
(NEW)
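
For reference, the three combinations that already exist can be exercised as 
below. This is only an illustrative sketch; the population and weights are made 
up for the example.

```python
import random

population = ["a", "b", "c", "d"]
weights = [10, 5, 1, 1]  # example values, one weight per item

# non-weighted, with replacement: the same item may appear more than once
print(random.choices(population, k=3))

# weighted, with replacement: "a" is the most likely pick on every draw
print(random.choices(population, weights=weights, k=3))

# non-weighted, without replacement: all returned items are distinct
print(random.sample(population, k=3))
```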

Rationale:

Weighted sampling without replacement is a popular problem. There are a lot of 
questions on StackOverflow and similar sites about how to implement it. 
Unfortunately, many of the proposed solutions are wrong, for example:

https://stackoverflow.com/a/353510/2178047
https://softwareengineering.stackexchange.com/a/233552/161807

or have excessive computational complexity (e.g. quadratic). There are also a 
lot of suggestions to use numpy.random.choice(), which supports all four 
possibilities with a single function:

numpy.random.choice(a, size=None, replace=True, p=None)

But of course it is overkill to install numpy just for that.

I think that this should be possible with the stdlib, without the need to 
implement it yourself or to install numpy, especially since it can be 
implemented in 2 lines (plus 4 lines of error checking), as you can see in the PR.
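
A minimal sketch of one way to do this, assuming the Efraimidis-Spirakis key 
method: each item gets the key u ** (1 / w) for a uniform random u, and the k 
items with the largest keys are kept. The name weighted_sample() and its error 
checks are hypothetical and are not necessarily what the PR contains. It runs 
in O(n log k) time rather than quadratic time.

```python
import heapq
import random


def weighted_sample(population, weights, k):
    """Hypothetical helper: pick k distinct items, favoring larger weights."""
    if len(population) != len(weights):
        raise ValueError("population and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("all weights must be positive")
    if not 0 <= k <= len(population):
        raise ValueError("sample larger than population or is negative")
    # Key for item i is u ** (1 / w_i), with u drawn uniformly from [0, 1).
    # Items with larger weights tend to get keys closer to 1.
    keyed = ((random.random() ** (1.0 / w), i) for i, w in enumerate(weights))
    # Keep the indices of the k largest keys and map them back to items.
    return [population[i] for _, i in heapq.nlargest(k, keyed)]


print(weighted_sample(["a", "b", "c", "d"], [10, 5, 1, 1], 3))
```

With k=1 this selects each item with probability proportional to its weight, 
which is the same distribution as a single weighted random.choices() draw.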

--
components: Library (Lib)
messages: 322367
nosy: piotrjurkiewicz
priority: normal
severity: normal
status: open
title: Weighted random.sample() (weighted sampling without replacement)
type: enhancement

___
Python tracker 
<https://bugs.python.org/issue34227>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34227] Weighted random.sample() (weighted sampling without replacement)

2018-07-25 Thread Piotr Jurkiewicz


Change by Piotr Jurkiewicz:


--
keywords: +patch
pull_requests: +7988
stage:  -> patch review




[issue23351] socket.settimeout(5.0) does not have any effect

2015-01-29 Thread Piotr Jurkiewicz

New submission from Piotr Jurkiewicz:

After calling socket.settimeout(5.0), socket.send() returns immediately 
instead of returning after the specified timeout.

Steps to reproduce:

Open two python interpreters.

In the first one (the receiver) execute:

>>> import socket
>>> r = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
>>> r.bind("test.sock")

In the second one (the sender) execute:

>>> import socket
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

Then run the following command 11 times:

>>> s.sendto("msg", "test.sock")

On the 12th run the command will block. This happens because the datagram 
socket queue on Linux is 11 messages long. Interrupt the command.

So far so good.

Then set sender socket timeout:

>>> s.settimeout(5.0)

Expected behavior:

s.sendto() should block for 5 seconds and THEN raise error 11 
(EAGAIN/EWOULDBLOCK).

Actual behavior:

s.sendto() raises the error IMMEDIATELY.

>>> s.sendto("msg", "test.sock")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 11] Resource temporarily unavailable

So, in fact, s.settimeout(5.0) does not have any effect.

I think the problem is that settimeout() sets the socket to non-blocking 
mode (the docs say: "Timeout mode internally sets the socket in non-blocking 
mode.").
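
The mode switch can be observed directly on the file descriptor. The sketch 
below is Unix-only and assumes CPython; fd_is_nonblocking() is a hypothetical 
helper that reads the O_NONBLOCK flag with fcntl.

```python
import fcntl
import os
import socket


def fd_is_nonblocking(sock):
    """Hypothetical helper: True if O_NONBLOCK is set on the socket's fd."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
before = fd_is_nonblocking(s)  # fresh socket, no timeout configured
s.settimeout(5.0)
after = fd_is_nonblocking(s)   # same socket, now in timeout mode
s.close()
print(before, after)
```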

As described [here](http://stackoverflow.com/q/13556972/2178047), setting a 
timeout on non-blocking sockets is impossible.

In fact, when I set timeout manually with setsockopt(), everything works as 
expected:

>>> import struct
>>> s.setblocking(1)  # go back to blocking mode
>>> tv = struct.pack("ll", 5, 0)  # struct timeval: 5 seconds, 0 microseconds
>>> s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, tv)

Now s.sendto() raises the error after 5 seconds, as expected.
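
The workaround above could be wrapped roughly like this. set_send_timeout() is 
a hypothetical helper, and the "ll" layout assumes that struct timeval consists 
of two C longs, as on 64-bit Linux.

```python
import socket
import struct


def set_send_timeout(sock, seconds):
    """Hypothetical helper: kernel-level send timeout on a blocking socket."""
    sock.setblocking(True)  # leave Python's timeout mode first
    whole = int(seconds)
    micros = int((seconds - whole) * 1000000)
    # Assumption: struct timeval is (long tv_sec, long tv_usec).
    tv = struct.pack("ll", whole, micros)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, tv)


s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
set_send_timeout(s, 5.0)
```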

--
components: IO, Library (Lib)
messages: 235013
nosy: piotrjurkiewicz
priority: normal
severity: normal
status: open
title: socket.settimeout(5.0) does not have any effect
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker 
<http://bugs.python.org/issue23351>
___



[issue23351] socket.settimeout(5.0) does not have any effect

2015-02-04 Thread Piotr Jurkiewicz

Piotr Jurkiewicz added the comment:

Does not work on Debian 7 Wheezy, kernel 3.2.65.

$ python test.py
('sending ', 0)
took 0.000s
('sending ', 1)
took 0.000s
('sending ', 2)
took 0.000s
('sending ', 3)
took 0.000s
('sending ', 4)
took 0.000s
('sending ', 5)
took 0.000s
('sending ', 6)
took 0.000s
('sending ', 7)
took 0.000s
('sending ', 8)
took 0.000s
('sending ', 9)
took 0.000s
('sending ', 10)
took 0.000s
('sending ', 11)
took 0.000s
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    s.sendto("hello", SOCKNAME)
socket.error: [Errno 11] Resource temporarily unavailable

$ uname -a
Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u1 x86_64 GNU/Linux

--
status: pending -> open
