[ Second time lucky... ]

Paul Rubin wrote:
...
> I don't see how generators substitute for microthreads. In your example
> from another post:
I've done some digging and found what you mean by microthreads - specifically I suspect you're referring to the microthreads package for Stackless? (I tend to view an activated generator as having a thread of control, and since it's not a true thread, but is similar, I tend to view that as a microthread. However your term and mine don't coincide, and it appears to cause confusion, so I'll switch my definition to match yours, given the microthreads package, etc.)

The reason I say this is because it naturally encourages small components which are highly focussed in what they do. For example, when I was originally looking at how to wrap network handling up, it was logical to want to do this:

[ writing something probably implementable using greenlets, but definitely pseudocode ]

   @Nestedgenerator
   def runProtocol(...):
       while 1:
           data = get_data_from_connection( ... )

   # Assume non-blocking socket
   def get_data_from_connection(...):
       try:
           data = sock.recv()
           return data
       except ... :
           Yield(WaitSocketDataReady(sock))
       except ... :
           return failure

Or something - you get the idea (the above code is naff, but that's because it's late here) - the operation that would normally block, you yield inside until given a message.

The thing about this is that we wouldn't have ended up with the structure we do have - which is to have components for dealing with connected sockets, listening sockets and so on. We've been able to reuse the connected socket code between systems much more cleanly than we would have done (I suspect) if we'd been able to nest yields (as I once asked about here) or have true co-routines.

At some point it would be interesting to rewrite our entire system based on greenlets and see if that works out with more or less reuse.
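[ Editorial aside: the nested-yield idea above can be faked in plain Python with a small "trampoline" that drives a stack of generators. This is a hypothetical sketch, not Kamaelia code - the names `trampoline`, `get_data`, and `run_protocol` are invented, and it relies on generators returning values (modern Python), which wasn't available at the time of writing. ]

```python
from types import GeneratorType

def trampoline(gen):
    # Drive a stack of generators so an inner generator can suspend the
    # whole chain - a sketch of the "nested yield" idea, not Kamaelia code.
    stack = [gen]
    to_send = None
    while stack:
        try:
            yielded = stack[-1].send(to_send)
        except StopIteration as stop:
            stack.pop()
            to_send = stop.value        # return value flows back to the caller
            continue
        if isinstance(yielded, GeneratorType):
            stack.append(yielded)       # a sub-generator: descend into it
            to_send = None
        else:
            to_send = yielded           # a real scheduler would pause here
    return to_send

def get_data():
    yield "WaitSocketDataReady"         # stand-in for the blocking point
    return "some bytes"                 # pretend data arrived on resume

def run_protocol():
    data = yield get_data()             # reads like a nested call that yields
    return data.upper()
```

Calling `trampoline(run_protocol())` steps both generators; the yield inside `get_data` suspends the outer generator too, which is exactly what plain nested generators can't do on their own.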
(And more or less ability to make code more parallel or not.)

[ re-arranging order slightly of comments ]

> class encoder(component):
>    def __init__(self, **args):
>       self.encoder = unbreakable_encryption.encoder(**args)
>    def main(self):
>       while 1:
>          if self.dataReady("inbox"):
>             data = self.recv("inbox")
>             encoded = self.encoder.encode(data)
>             self.send(encoded, "outbox")
>          yield 1
> ...
> In that particular example, the yield is only at the end, so the
> generator isn't doing anything that an ordinary function closure
> couldn't:
>
>    def main(self):
>       def run_event():
>          if self.dataReady("inbox"):
>             data = self.recv("inbox")
>             encoded = self.encoder.encode(data)
>             self.send(encoded, "outbox")
>       return run_event

Indeed, in particular we can currently rewrite that particular example as:

   class encoder(component):
      def __init__(self, **args):
         self.encoder = unbreakable_encryption.encoder(**args)
      def mainLoop(self):
         if self.dataReady("inbox"):
            data = self.recv("inbox")
            encoded = self.encoder.encode(data)
            self.send(encoded, "outbox")
         return 1

And that will work today. (We have a 3-callback form available for people who aren't very au fait with generators, or are just more comfortable with callbacks.)

That's a bad example though. A more useful example is probably something more like this (changed example from accidental early post):
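[ Editorial aside: to make the equivalence concrete, here is a self-contained toy - hypothetical names, plain list-backed boxes rather than the real component class. With a single trailing yield, stepping the generator form and calling the callback form do identical work per step. ]

```python
class Echo:
    # Toy component: box names are accepted but ignored; each "box" is a list.
    def __init__(self):
        self.inbox, self.outbox = [], []

    def dataReady(self, name):
        return bool(self.inbox)

    def recv(self, name):
        return self.inbox.pop(0)

    def send(self, data, name):
        self.outbox.append(data)

    def main(self):                     # generator form: one step per next()
        while 1:
            if self.dataReady("inbox"):
                self.send(self.recv("inbox"), "outbox")
            yield 1

    def mainLoop(self):                 # callback form: one step per call
        if self.dataReady("inbox"):
            self.send(self.recv("inbox"), "outbox")
        return 1
```

Driving `next(component.main())` and calling `component.mainLoop()` move the same data; the difference only appears once yields occur at several points in the body.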
   center = list(self.rect.center)
   self.image = self.original
   current = self.image
   scale = 1.0
   angle = 1
   pos = center
   while 1:
      self.image = current
      if self.dataReady("imaging"):
         self.image = self.recv("imaging")
         current = self.image
      if self.dataReady("scaler"):
         # Scaling
         scale = self.recv("scaler")
         w,h = self.image.get_size()
         self.image = pygame.transform.scale(self.image, (w*scale, h*scale))
      if self.dataReady("rotator"):
         angle = self.recv("rotator")
         # Rotation
         self.image = pygame.transform.rotate(self.image, angle)
      if self.dataReady("translation"):
         # Translation
         pos = self.recv("translation")
      self.rect = self.image.get_rect()
      self.rect.center = pos
      yield 1

(this code is from Kamaelia.UI.Pygame.BasicSprite)

Can it be transformed to something event based? Yes, of course. Is it clear what's happening though? I would say yes.

Currently we encourage the user to look to see if data is ready before taking it, simply because it's the simplest interface that we can guarantee consistency with. For example, currently the exception based equivalent would be:

   try:
      pos = self.recv("translation")
   except IndexError:
      pass

Which isn't necessarily ideal, because we haven't really finalised the implementation of inboxes and outboxes (eg will we always throw IndexError?). We are certain though that the behaviour of send/recv/dataReady can remain consistent until then. (Some discussions with Twisted people have suggested Twisted's deferred queues might be useful here, but I haven't had a chance to look at them in detail.)

At the moment, one option that springs to mind is this:

   yield WaitDataAvailable("inbox")

(This is largely because we're looking at how to add syntactic sugar for synchronous bidirectional messaging.)

Allowing the scheduler to suspend the generator until data is ready. This however doesn't work for the example above.
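[ Editorial aside: the two styles being contrasted can be shown with a toy list-backed inbox - this is an illustration of the trade-off, not the real implementation, and the `IndexError` here comes from the list, standing in for the current (unfinalised) behaviour. ]

```python
# A single toy inbox, backed by a plain list.
inbox = []

def dataReady():
    return len(inbox) > 0

def recv():
    return inbox.pop(0)       # pop on an empty list raises IndexError

# Style 1: check first (the interface the post says Kamaelia encourages).
pos = "default"
if dataReady():
    pos = recv()

# Style 2: exception based, tied to the current IndexError behaviour -
# if the inbox implementation changes, this except clause breaks.
try:
    pos = recv()
except IndexError:
    pass                      # keep the previous value, as in the sprite loop
```

Style 1 only depends on `dataReady`/`recv` keeping their meaning; Style 2 additionally bakes in which exception an empty box raises, which is the consistency concern raised above.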
Whereas currently the following:

   self.pause()
   yield 1

Will prevent the component being run until one of the inboxes has a delivery made to it *or* a message is taken from one of its outboxes. (Very coarse grained.)

> Notice the kludge

FWIW, it's deliberate because we can maintain API consistency, until we decide on better syntactic sugar.

> The reason for that is that Python generators aren't really coroutines
> and you can't yield except from the top level function in the generator.

Agreed - as noted above. We're finding this to be a strength though. (Though to confirm/deny this properly would require a rewrite using greenlets or similar.)

> Now instead of calling .next on a generator every time you want to let
> your microthread run, just call the run_event function that main has
> returned.

But then you're building state machines. We're using generators because they allow people to write code that looks completely single threaded, throw in yields in key locations, abstract out input/output, and do all this in small gradual steps. (I reference an example of this below.)

A relevant quote that might help explain where I'm coming from is this:

   "Threads are for people who can't program state machines."
      -- Alan Cox

I'd agree really on some level, but I'm always left wondering - what about the people who can't program state machines, but don't want to use threads etc.? (For whatever reason - maybe the architecture they're running on has a poor threads implementation.) Initially co-routines struck me as the halfway house, but we decided to stick with standard Python and explore a generator based approach.

> However, I suppose there's times when you'd want to read an
> event, do something with it, yield, read another event, and do
> something different with it, before looping. In that case you can use
> yields in different parts of that state machine.

That's indeed what we do for a number of different existing components.
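[ Editorial aside: the coarse-grained pause/wake behaviour described above can be sketched with a toy scheduler - hypothetical and much cruder than Kamaelia's real one. A paused component is skipped on every pass until a delivery to any of its inboxes wakes it. ]

```python
class Comp:
    def __init__(self):
        self.inbox = []
        self.paused = False
        self.log = []
        self.gen = self.main()          # the component's generator

    def pause(self):
        self.paused = True

    def deliver(self, msg):
        self.inbox.append(msg)
        self.paused = False             # *any* delivery wakes the component

    def main(self):
        while 1:
            if self.inbox:
                self.log.append(self.inbox.pop(0))
            else:
                self.pause()            # nothing to do - ask to be skipped
            yield 1

def run_once(components):
    # One scheduler pass: step every component that isn't paused.
    for c in components:
        if not c.paused:
            next(c.gen)
```

The wakeup is coarse because `deliver` clears the pause regardless of which inbox got data or whether the component actually wanted that message - matching the "very coarse grained" caveat above.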
Also there's the viewpoint aspect - you can view the system as event based (receiving a message as an event) or you can view it as dataflow. From our perspective we view it as a dataflow system.

> But it's not that
> big a deal; you could just use multiple functions otherwise.

Having a single function with yields peppering it provides a simpler path from a single threaded, standalone program to one sitting inside a larger system whilst remaining single threaded. We have a walk through of how to write a component here [1], which is based on the experience of writing components for multicast handling.

[1] http://kamaelia.sourceforge.net/cgi-bin/blog/blog.cgi?rm=viewpost&postid=1113495151

The components written are sufficient for the tasks we need them for at present, but probably need work for the general case. However the resulting code remains close to looking single threaded - lowering the barrier for bug finding. (I'm a firm believer that > 90% of the population can't write bug free code - me included.) The final multicast transceiver may have some issues that jump out to someone else that wouldn't necessarily jump out if I'd turned the code inside out into separate state functions. I'm fairly certain it would've been less clear (to someone coming along later) how to join the sender/receiver code into a single transceiver.

The other thing is that the alternatives to generators/coroutines are:
   * threads/processes
   * state machine style approaches

Having worked on a (very) large project (in C++) which was very state machine based, I've come to have a natural dislike for them, and wondered at the time if a generator/coroutine approach would be easier to pick up and more maintainable. It might be, it might not be. The idea behind our work is to have a go, build something and see if it really is better or worse. If it's worse, that's life. If it's better, hopefully other people will copy the approach or use the tools we release.
Until then (he says optimistically) other people do have GOOD systems like Twisted, which is one of the nicest systems of its kind. (Personally I'd expect that if our stuff pans out we'd need to do a partial rewrite to simplify the process for people to cherry pick code into Twisted (or whatever), if they want it.)

> All in all, maybe I'm missing something but I don't see generators as
> being that much help here. With first-class continuations like
> Stackless used to have, the story would be different, of course.

I suppose what I'm saying is that what you're losing isn't as large as you think it is, and it brings benefits of its own along the way. This does mean though that we now have the ability to compose interesting systems in a unix pipeline approach, using graphical pipeline editors that produce code that looks like this:

   pipeline(
      ReadFileAdaptor( filename = '/data/dirac-video/bar.drc',
                       readmode = 'bitrate',
                       bitrate = 480000 ),
      SingleServer( ),
   ).activate()

   pipeline(
      TCPClient( host = "127.0.0.1", port = 1601 ),
      DiracDecoder( ),
      RateLimit( messages_per_second = 15, buffer=2 ),
      VideoOverlay( ),
   ).run()

... which creates 2 pipelines - one represents a server sending data out over a network socket, the other represents a client that connects, decodes and displays the video.

The Tk integration was relatively quick to write, because it /couldn't/ be complex. The Pygame integration was fairly simple, because it /couldn't/ be complex. (Which may be fringe benefits of generators.) We haven't looked at integrating gtk, wx or qt yet.

From our perspective the implementation of pipeline is the interesting part. Currently this is simply a wrapper component; however it is responsible for activating the components passed over, and /could/ run the generator based components in different processes (and hence potentially on different processors). Alternatively that could be left to the scheduler to do, but I suspect something with a bit of control would be nice.
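[ Editorial aside: the essence of such a pipeline wrapper can be sketched in a few lines - hypothetical names (`stage`, `pipeline_run`), far simpler than Kamaelia's real pipeline. Each stage is a generator moving data from an input list to an output list; the wrapper chains the lists and round-robins the stages until everything upstream has drained. ]

```python
def stage(func):
    # Wrap a plain function as a pipeline stage: a generator that moves
    # one item per step from its inbox list to its outbox list.
    def make(inbox, outbox):
        while 1:
            if inbox:
                outbox.append(func(inbox.pop(0)))
            yield 1
    return make

def pipeline_run(stages, source, max_rounds=1000):
    # Chain the stages: boxes[i] is stage i's inbox and stage i-1's outbox.
    boxes = [list(source)] + [[] for _ in stages]
    gens = [s(boxes[i], boxes[i + 1]) for i, s in enumerate(stages)]
    for _ in range(max_rounds):
        for g in gens:
            next(g)                     # round-robin scheduling
        if not any(boxes[:-1]):         # all upstream boxes drained
            break
    return boxes[-1]
```

For example, `pipeline_run([stage(str.upper), stage(lambda s: s + "!")], ["a", "b"])` pushes each item through both stages in order. Because the wrapper owns the boxes and the scheduling loop, it is also the natural place to farm stages out to other processes, as suggested above.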
None of this is really special to generators, as is probably obvious, but that's where we started, because we hypothesised that the resulting code *might* be cleaner, whilst potentially able to be just as efficient as more state machine based approaches. If greenlets had been available when we started I suspect we would have used those. We rejected Stackless at the time because generators were available, and whilst not as good from some perspectives /are/ part of the standard language since 2.2.something. That decision has meant that we're able to (and do) run on things like mobiles, and upwards, without changes except to packaging.

At the end of the day, the only reason I'm talking about this stuff at all is because we're finding it useful - perhaps more so than I expected when I first realised the limitations of generators :-) If you don't find it useful, then fair enough :)

Best Regards,


Michael.
--
http://mail.python.org/mailman/listinfo/python-list