On Fri, Oct 24, 2008 at 4:48 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote:
> On approximately 10/24/2008 2:15 PM, came the following characters from the
> keyboard of Rhamphoryncus:
>>
>> On Oct 24, 2:59 pm, Glenn Linderman <[EMAIL PROTECTED]> wrote:
>>>
>>> On approximately 10/24/2008 1:09 PM, came the following characters from
>>> the keyboard of Rhamphoryncus:
>>>>
>>>> PyE: objects are reclassified as shareable or non-shareable, and many
>>>> types are now only allowed to be shareable.  A module and its classes
>>>> become shareable with the use of a __future__ import, and their
>>>> shareddict uses a read-write lock for scalability.  Most other
>>>> shareable objects are immutable.  Each thread runs in its own
>>>> private monitor, and is thus protected from the usual threading
>>>> memory-model nasties.  Alas, this gives you all the semantics, but
>>>> you still need scalable garbage collection... and CPython's
>>>> refcounting needs the GIL.
>>>
>>> Hmm.  So I think your PyE is an attempt to be more explicit about
>>> what I said above in PyC: PyC threads do not share data between
>>> threads except by explicit interfaces.  I consider your definitions
>>> of shared data types somewhat orthogonal to the types of threads, in
>>> that both PyA and PyC threads could use these new shared data items.
>>
>> Unlike PyC, there's a *lot* shared by default (classes, modules,
>> functions), but it requires only minimal recoding.  It's as close to
>> "have your cake and eat it too" as you're gonna get.
>
> Yes, but I like my cake frosted with performance; Guido's
> non-acceptance of granular locks in the blog entry someone referenced
> was due to the slowdown incurred with granular locking and shared
> objects.  Your PyE model, with highly granular sharing, will likely
> suffer the same fate.
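(The shareddict mentioned above would live inside the interpreter, but the
locking discipline it relies on -- many uncontended concurrent readers, one
exclusive writer -- can be sketched in plain Python.  The RWLock and
SharedDict names and the Condition-based implementation below are purely
illustrative, not PyE's actual code:)

```python
import threading

class RWLock:
    """Simple readers-writer lock: many concurrent readers, one writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()

class SharedDict:
    """Dict guarded by an RWLock; read-mostly access avoids contention."""
    def __init__(self):
        self._lock = RWLock()
        self._data = {}

    def __getitem__(self, key):
        self._lock.acquire_read()
        try:
            return self._data[key]
        finally:
            self._lock.release_read()

    def __setitem__(self, key, value):
        self._lock.acquire_write()
        try:
            self._data[key] = value
        finally:
            self._lock.release_write()
```

(For a class or module dict that is written once at import time and read
forever after, readers never block each other -- which is the scalability
argument being made.)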
No, my approach includes scalable performance.  Typical paths will
involve *no* contention (i.e. no locking).  Classes and modules use
shareddict, which is based on a read-write lock built into the
interpreter, so it's uncontended for read-only usage patterns.  Pretty
much everything else is immutable.

Of course that doesn't include the cost of garbage collection.
CPython's refcounting can't scale.

> The independent threads model, with only slight locking for a few
> explicitly shared objects, has a much better chance of getting better
> performance overall.  With one thread running, it would be the same as
> today; with multiple threads, it should scale at the same rate as the
> system... minus any locking done at the higher level.

So use processes with a little IPC for these expensive-yet-"shared"
objects.  multiprocessing does it already.

>>> I think/hope that you meant that "many types are now only allowed to
>>> be non-shareable"?  At least, I think that should be the default;
>>> they should be within the context of a single, independent
>>> interpreter instance, so other interpreters don't even know they
>>> exist, much less how to share them.  If so, then I understand most of
>>> the rest of your paragraph, and it could be a way of providing shared
>>> objects, perhaps.
>>
>> There aren't multiple interpreters under my model.  You only need
>> one.  Instead, you create a monitor, and run a thread on it.  A list
>> is not shareable, so it can only be used within the monitor it's
>> created within, but the list type object is shareable.
>
> The python interpreter code should be shareable, having been written
> in C, and being/becoming reentrant.  So in that sense, there is only
> one interpreter.  Similarly, any other reentrant C extensions would be
> that way.  On the other hand, each thread of execution requires its
> own interpreter context, so that would have to be independent for the
> threads to be independent.
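(The multiprocessing module, new in 2.6, does indeed cover the
processes-plus-IPC approach: a rough sketch, with the worker/run names
being my own, keeping a "shared" dict in a manager process and reaching
it through IPC proxies:)

```python
from multiprocessing import Manager, Process

def worker(shared, i):
    # Each write crosses process boundaries: the proxy forwards the
    # operation over IPC to the manager process that owns the real dict.
    shared[i] = i * 2

def run():
    with Manager() as mgr:
        shared = mgr.dict()  # proxy; the dict lives in the manager process
        procs = [Process(target=worker, args=(shared, i)) for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(shared)

if __name__ == "__main__":
    print(run())
```

(Every proxy access is an IPC round-trip, so this only pays off for
coarse-grained sharing -- exactly the expensive-yet-"shared" objects in
question.)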
> It is the combination of code+context that I call an interpreter, and
> there would be one per thread for PyC threads.  Bytecode for loaded
> modules could potentially be shared, if it is also immutable.
> However, that could be in my mental "phase 2", as it would require an
> extra level of complexity in the interpreter as it creates shared
> bytecode... there would likely be a memory savings from avoiding
> multiple copies of shared bytecode, and maybe also a compilation
> performance savings.  So it sounds like a win, but it is a win that
> can be deferred for initial simplicity, to prove the concept is or is
> not workable.
>
> A monitor allows a single thread to run at a time; that is the same
> situation as the present GIL.  I guess I don't fully understand your
> model.

To use your terminology, each monitor is a context.  Each thread
operates in a different monitor.  As you say, most C functions are
already thread-safe (reentrant).  All I need to do is avoid letting
multiple threads modify a single mutable object (such as a list) at a
time, which I do by containing it within a single monitor (context).

--
Adam Olsen, aka Rhamphoryncus
--
http://mail.python.org/mailman/listinfo/python-list