Hi all,

(For those of you wondering about the subject, note that this message is sent 
in compliance with This email sent in compliance with 
<https://twistedmatrix.com/trac/wiki/CompatibilityPolicy#ProcedureforExceptionstothisPolicy>.)

I've been trying to improve the reliability of the buildbots.  (On that note, 
by the way, several builders are failing and more will be soon as security 
updates roll out, it would be great if someone could review 
<https://twistedmatrix.com/trac/ticket/7651> before the whole fleet turns red.)

In so doing I (re-)discovered this bug: 
<https://twistedmatrix.com/trac/ticket/2673>.  At first I just wanted to delete 
an intermittently failing test which appears to be nonsense.  However, after 
some prodding from exarkun, I investigated and discovered that this very 
poorly-written test case does in fact illustrate a real problem with our 
current threadpool implementation which can result in deadlocks, hangs on exit, 
and other unpleasantness.  In my use of Twisted I have encountered several 
"huh, this can't happen, maybe cosmic rays or something" bugs which might have 
been caused by this, so I would very much like to fix it.

One reason that our threadpool implementation is problematic is that it has a 
somewhat complex internal design, lots of weird little features (context 
preservation, completion callbacks, workload estimation) all built at the same 
layer with no composition.  I tried to figure out a way to salvage its current 
architecture and I could not reasonably do so.

I have prototyped a completely new threadpool implementation 
("twisted.threads") in a spike.  It is definitely not ready for review, but I 
would definitely appreciate design-level feedback on the code right now.  You 
can see the implementation here: 
<https://twistedmatrix.com/trac/browser/branches/desynchronize-2673/twisted/threads>
 (or here: 
<https://github.com/twisted/twisted/tree/desynchronize-2673/twisted/threads> if 
that's more your speed)  It's less code than the existing implementation and 
provides more useful features at the same time.  It might, in fact, provide 
some of the threading primitives that we would need for a reactor-per-thread 
implementation.

However, there is a very ugly implementation detail that prevents us from 
marching onwards to a glorious multithreading future: 
twisted.internet.interfaces.IReactorThreads.getThreadPool.  The nominally 
"public" interface provided by its documented return type, the concrete 
(old-style!) class twisted.python.threadpool.ThreadPool, is ridiculously 
over-broad.  Here are the features (attributes and methods) in its "public" 
interface:

callInThread
callInThreadWithCallback
adjustPoolsize
start
stop
q
min
max
joined
started
name
waiters
threads
working
workers
start
startAWorker
stopAWorker
dumpStats
__getstate__
__setstate__
threadFactory
currentThread

Here's what I would like its public interface to be, more or less what it's 
intended to be, based on the ways it's documented and used in Twisted and in 
the various projects I can see using it:

callInThread
callInThreadWithCallback
adjustPoolsize
start
stop

It would not be too much effort to also preserve, for legacy purposes:

min
max
joined
started
name
dumpStats

However, I would REALLY like to delete these implementation details:

workers
threads
working
startAWorker
q
__getstate__
__setstate__
threadFactory
currentThread

Deleting these implementation details straight up would also make it such that 
anyone who inherited from the public class ThreadPool would not be able to 
override any of these things (oh my goodness, why would you do that, I really 
hope nobody has done that).

The three ways I could proceed, in order of my own preference, are:

Put the new implementation with entirely new interfaces into twisted.threads, 
put a compatibility shim into twisted.python.threadpool.ThreadPool that wraps 
those objects, provides my more minimal interface proposed above, and deletes 
90% of its tests.  Change IReactorThreads.getThreadPool to return a documented 
interface considerably more restricted than 

Compatibility implications: twisted.internet.interfaces.IReactorThreads's 
contract would be relaxed.  Implementors of IReactorThreads would not see an 
impact unless they were talking to deleted parts of ThreadPool internally.  
ThreadPool's clients would have several methods removed, and subclassing would 
effectively not be possible any more (a bunch of overridable hooks would have 
been removed, and no replacements would be provided). However, I prefer this 
option because there are an extremely restrictive set of use-cases (basically: 
only logging, any behavioral changes would have been broken) where clients 
could have made legitimate use of these attributes or overriding any of these 
functions.

Delete all of twisted.python.threadpool's test coverage, deprecate the entire 
module, and delete it in the next release (14.1+1).  This has the advantage of 
making the test suite reliable, and gets the benefits for any users of 
callInThread, but does not relay any of the benefits to users of ThreadPool.  
The only ones doing this directly in Twisted are deferToThreadPool (and by 
extension, deferToThread) and WSGIResource.  I could update these in tandem, 
however; the code changes would be very small.  I don't think it would be 
incompatible to make WSGIResource itself accept a different type of constructor 
argument (although if we don't, I wonder if subclassing anything with a 
constructor can be considered applicable to the compatibility policy in 
Twisted; hmm).

Actually go through the trouble of emulating all the attributes on ThreadPool 
and emulating them as best I can.  Delete the direct tests for ThreadPool 
itself and write a private subclass that will be returned from getThreadPool 
that returns a compatibility shim which isinstance(ThreadPool) for 
compatibility and still has all the gross attributes.  This change would be 
technically compatible, but would be a huge amount of additional work and I am 
doubtful that it would provide any value.

So, does anyone out there have any code which makes use of the aforementioned 
bad attributes of ThreadPool, whose applications would break if I removed them? 
 If so, how can I make you feel as bad about yourselves for using it as I feel 
bad about myself for writing it? ;-)

-glyph


_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

Reply via email to