[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

Josh Rosenberg Wed, 29 Aug 2018 10:45:01 -0700


Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:


Victor, that was a little overboard. By that logic, there doesn't need to be a 
Windows version of Python.

That said, Paul doesn't seem to understand that the real resolution limit isn't 
1 ms; that's the lower limit on arguments to the API, but the real limit is the 
system clock, which has a granularity in the 10-16 ms range. It's a problem 
with Windows in general, and the cure is worse than the disease.

Per 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724411(v=vs.85).aspx 
, the resolution of the system timer is typically in the range of 10 
milliseconds to 16 milliseconds.

Per 
https://docs.microsoft.com/en-us/windows/desktop/Sync/wait-functions#wait-functions-and-time-out-intervals
 :

> Wait Functions and Time-out Intervals

> The accuracy of the specified time-out interval depends on the resolution of 
> the system clock. The system clock "ticks" at a constant rate. If the 
> time-out interval is less than the resolution of the system clock, the wait 
> may time out in less than the specified length of time. If the time-out 
> interval is greater than one tick but less than two, the wait can be anywhere 
> between one and two ticks, and so on.

All the Windows synchronization primitives (e.g. WaitForSingleObjectEx 
https://docs.microsoft.com/en-us/windows/desktop/api/synchapi/nf-synchapi-waitforsingleobjectex
 , which is what ultimately implements timed lock acquisition on Windows) are 
based on the system clock, so without drastic measures, it's impossible to get 
better granularity than the 10-16 ms of the default system clock configuration.
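
To see the effect on a given machine, a minimal sketch along these lines can measure how long a 1 ms timed acquire really takes. The helper name average_timeout_delay is made up for illustration, and the numbers it prints depend entirely on the OS and its current timer configuration:

```python
import threading
import time


def average_timeout_delay(timeout=0.001, rounds=50):
    """Return the mean wall time (seconds) a timed lock acquire actually takes."""
    lock = threading.Lock()
    lock.acquire()  # hold the lock so every timed acquire below must time out
    total = 0.0
    for _ in range(rounds):
        start = time.perf_counter()
        acquired = lock.acquire(timeout=timeout)
        total += time.perf_counter() - start
        assert not acquired  # a plain Lock is non-reentrant, so this must fail
    lock.release()
    return total / rounds


if __name__ == "__main__":
    observed_ms = average_timeout_delay() * 1e3
    print(f"requested 1.000 ms, observed {observed_ms:.3f} ms on average")
```

time.perf_counter is used for the measurement because it is not tied to the coarse system clock that the wait itself depends on.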

The link on "Wait Functions and Time-out Intervals" does mention that this 
resolution *can* be increased, but it recommends against fine-grained tuning 
(so you can't just tweak it before a wait and undo the tweak after; the only 
safe thing to do is change it on program launch and undo it on program exit). 
Even then, it's a bad idea for Python to use it; per timeBeginPeriod's own docs 
( 
https://docs.microsoft.com/en-us/windows/desktop/api/timeapi/nf-timeapi-timebeginperiod
 ):

> This function affects a global Windows setting. Windows uses the lowest value 
> (that is, highest resolution) requested by any process. Setting a higher 
> resolution can improve the accuracy of time-out intervals in wait functions. 
> However, it can also reduce overall system performance, because the thread 
> scheduler switches tasks more often. High resolutions can also prevent the 
> CPU power management system from entering power-saving modes. Setting a 
> higher resolution does not improve the accuracy of the high-resolution 
> performance counter.

Basically, to improve the resolution of timed lock acquisition, we'd have to 
change the performance profile of the entire OS while Python was running, 
likely increasing power usage and possibly reducing performance. Global 
solutions to local problems are a bad idea.
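
For concreteness, here is roughly what the paired calls would look like from Python via ctypes. This is only a sketch to show the shape of the problem, not a recommendation: timer_resolution is a hypothetical helper, it is a no-op outside Windows, and it assumes winmm's timeBeginPeriod/timeEndPeriod return 0 (TIMERR_NOERROR) on success.

```python
import contextlib
import sys


@contextlib.contextmanager
def timer_resolution(period_ms=1):
    """Request a finer system timer for the duration of the block (Windows only)."""
    if sys.platform != "win32":
        yield False  # nothing to adjust on other platforms
        return
    import ctypes

    winmm = ctypes.WinDLL("winmm")
    # timeBeginPeriod returns 0 (TIMERR_NOERROR) when the request is accepted
    granted = winmm.timeBeginPeriod(period_ms) == 0
    try:
        yield granted
    finally:
        if granted:
            # must be paired with the same value passed to timeBeginPeriod
            winmm.timeEndPeriod(period_ms)


if __name__ == "__main__":
    with timer_resolution(1) as granted:
        print("finer timer granted:", granted)
```

Note that the finally block only runs on a normal unwind; an exit path such as os._exit skips it, which is exactly the pairing hazard with a global setting.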

The most reasonable solution to the problem is to simply document it (maybe not 
for queue.Queue, but for the threading module). We could possibly even provide 
an attribute in the threading module, similar to threading.TIMEOUT_MAX, that 
reports the system clock's granularity for informational purposes (it might 
need to be a function so it reports the potentially changing granularity).
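
As a rough idea of what such an informational function could look like, here is a sketch built on the existing time.get_clock_info. The name clock_granularity is hypothetical (nothing like it exists in threading today), and what it returns is the resolution Python reports for one of its own clocks, which may or may not match the tick the OS wait functions use on a given Python version:

```python
import time


def clock_granularity(name="monotonic"):
    """Return the resolution, in seconds, that Python reports for a clock."""
    # A function rather than a constant, since the value may change at runtime.
    return time.get_clock_info(name).resolution


if __name__ == "__main__":
    print(f"monotonic clock resolution: {clock_granularity() * 1e3:.6f} ms")
```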

Other, less reasonable solutions would be:

1. Expose a function (with prominent warnings about not using it in a 
fine-grained manner, and about the effects on power management and 
performance) that would raise the system clock resolution as far as 
timeGetDevCaps reports is possible (possibly limited to a user-provided 
suggestion, so while the clock could go to 1 ms resolution, the user could 
request only 5 ms resolution to reduce the costs of doing so). This requires 
some additional state (whether timeBeginPeriod has been called, and with what 
values) so timeEndPeriod can be called properly before each adjustment and when 
Python exits. The pro is that the code is *relatively* simple and would mostly 
fix the problem. The cons are that it wouldn't be super discoverable (unless we 
put notes in every place that uses timeouts, not just in the threading docs), 
it encourages bad behavior (one application deciding its needs are more 
important than conserving power), and we'd have to be *really* careful to pair 
our calls universally (timeEndPeriod must be called even when other cleanup is 
skipped, such as when calling os._exit; AFAICT, the docs imply that per-process 
adjustments to the clock aren't undone even when the process completes, which 
means failure to pair all calls would leave the system with a suboptimal system 
clock resolution that would remain in effect until rebooted).

2. (Likely a terrible idea, and like option 1, should be explicitly opt-in, not 
enabled by default) Offer the option to have Python lock timeouts use 
WaitForSingleObjectEx only to sleep to within one system clock tick of the 
target time (and not at all if the timeout is less than the clock resolution), 
then, before reacquiring the GIL, perform a time-slice-yielding busy loop until 
the target time has passed (as determined by a higher resolution clock than the 
system clock). This is bad for power management, bad for single core machines 
(where even with time slice yielding, you're still constantly getting 
scheduled), etc.
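
To illustrate the shape of option 2, here is a pure-Python approximation. The real thing would live in the C lock implementation and spin before reacquiring the GIL, so treat this purely as a sketch: acquire_hybrid is a hypothetical name, and the default tick of 16 ms is an assumed clock granularity rather than a measured one.

```python
import threading
import time


def acquire_hybrid(lock, timeout, tick=0.016):
    """Try to acquire `lock`, giving up close to `timeout` seconds from now."""
    deadline = time.perf_counter() + timeout
    coarse = timeout - tick
    # Coarse phase: block in the OS wait, stopping one tick short of the
    # deadline (skipped entirely when the timeout is shorter than a tick).
    if coarse > 0 and lock.acquire(timeout=coarse):
        return True
    # Fine phase: poll without blocking, giving up the time slice between
    # polls, until the high resolution clock says the deadline has passed.
    while True:
        if lock.acquire(blocking=False):
            return True
        if time.perf_counter() >= deadline:
            return False
        time.sleep(0)  # yield to other threads; this is still a busy loop


if __name__ == "__main__":
    held = threading.Lock()
    held.acquire()
    start = time.perf_counter()
    acquire_hybrid(held, 0.001)
    print(f"timed out after {(time.perf_counter() - start) * 1e3:.3f} ms")
```

The fine phase keeps the thread runnable the whole time, which is where the power and single-core costs come from.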

----------
nosy: +josh.r

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34535>
_______________________________________