On Tuesday, May 27, 2003, at 12:26 PM, Luke Palmer wrote:
We could also have things like:

    sub { ... }        closure { ... }
    coroutine { ... }  thread { ... }

I think you've finally gone crazy. :-) All four of these things are closures.

Well, yes, I've been crazy for a while now. But seriously, all four of those are closures -- but I'm theorizing that the last two constructs create something "more" -- a closure bound to an encapsulating something-else.
OTOH, the difference between a thread and a coroutine is mostly internal, not external.
Again, I beg to differ. But, these are the kinds of misunderstandings that are keeping a good coroutine proposal from getting in....
Here's how I think of things:
coroutines - Used for easy iteration over complex data structures, pipelines, and communication of the results of certain algorithms.
threads - Used for user interfaces, speed (on SMP systems), very occasionally pipelines, and headaches.
I may have missed things in the threads section, because I haven't done very much threaded programming. To be honest, my favorite use so far has been using them to (painstakingly) emulate coroutines :-)
AHA! I think what I am proposing as a "thread" is perhaps not the all-encompassing "thread" as implemented by many other languages, but merely "an encapsulated method of parallelization".
We are perhaps used to thinking of threads as intra-process versions of fork(), which I would argue is a damn annoying way to do it -- far too low-level. All a thread really has to be is a block of code that:
(a) executes in parallel with "sibling" blocks of code, and
(b) is independent of exceptions thrown by "sibling" blocks of code.
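To make (a) and (b) concrete, here's the behavior I mean, sketched in Python (just an illustration of the semantics, not a syntax proposal -- the function names are made up):

    from concurrent.futures import ThreadPoolExecutor

    def sibling_ok():
        return "finished fine"

    def sibling_boom():
        raise RuntimeError("this sibling failed")

    # (a) both blocks run in parallel; (b) an exception in one does not
    # unwind or abort the other.
    with ThreadPoolExecutor() as pool:
        ok   = pool.submit(sibling_ok)
        boom = pool.submit(sibling_boom)

        print(ok.result())            # "finished fine", regardless of the sibling
        try:
            boom.result()             # that sibling's exception surfaces only
        except RuntimeError as err:   # here, when *its* result is asked for
            print("sibling failed:", err)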
The conventional thread interface is sort of lame. What I'm very fuzzily envisioning, while hoping that my dumb-guy analysis inspires a "eureka!" moment in somebody more experienced in these implementations than I am, is "nestable" threads and subthreads, the way coroutines can be "nestable".
So it's not like doing a fork(). It's like calling a subroutine and getting a result. Now, in some cases (like a top-level event loop) that subroutine will never return, which is just as true of normal subroutines.
If you call one routine, piece o' cake, it's not a thread, and it doesn't have to do anything fancy. If you call a _junction_ of routines, however, _then_ it knows it has to do the extra fluff to make them parallel, which it then automatically does. So don't call a junction of Code blocks unless you intend them to execute in parallel!
So rather than having fork()y threads, perhaps we can use Code junctions to represent parallelization, and call threads _as if they were simply coroutines_.
(?)
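Something like this, maybe -- again only a Python sketch, with a made-up parallel() helper standing in for "calling a junction of Code blocks":

    from concurrent.futures import ThreadPoolExecutor

    def parallel(*blocks):
        """Call every block at once; return once all of them have returned."""
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(block) for block in blocks]
            return [f.result() for f in futures]

    # One routine: a plain call, nothing fancy happens.
    one = (lambda: 1)()                                   # 1

    # A "junction" of routines: the call itself implies parallelism.
    many = parallel(lambda: 1, lambda: 2, lambda: 3)      # [1, 2, 3]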
But I'm pretty sure these two concepts are things that we don't want to unify, even though they both have to do with "state". I like your musings on "state", however, and making them more explicit might enable us to come up with some very cool ideas.
If we define a thread as a coroutine that runs in parallel, the syntax might converge:
    sub foo() is cothread { ... yield() ... return() }

    # start foo() as a coroutine (blocks until it explicitly yields):
    my $results = foo(...);

    # start foo() as a parallel thread (nonblocking):
    my $results_junction = parallel( &foo(...), &bar(...), &baz(...) );
In this situation, C<parallel> would indicate that you couldn't continue on until all three threads had suspended or exited -- so in order to have truly "top-level" parallel threads, you'd have to set them up so that your entire program was encapsulated within them. (Otherwise you'd just get temporary parallelization, which is just as desirable.) (You could declare a scheduler as an optional named parameter of C<parallel>.)
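If it helps, here's a rough Python analogue of those two calling styles -- generators standing in for cothreads, and a made-up parallel() helper standing in for the junction call. It's only meant to show the blocking behavior I have in mind, not real semantics:

    from concurrent.futures import ThreadPoolExecutor

    def foo():
        # ... some work ...
        yield "foo's first result"      # the yield() in the cothread
        # ... more work after being resumed ...

    def bar():
        yield "bar's first result"

    def baz():
        yield "baz's first result"

    # "Coroutine" call: run foo until it explicitly yields, take that value.
    results = next(foo())

    def parallel(*cothreads):
        """Run each cothread in its own thread until it suspends (yields)
        or exits; don't return until every one of them has done so."""
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(next, ct(), None) for ct in cothreads]
            return [f.result() for f in futures]

    results_list = parallel(foo, bar, baz)
    # -> ["foo's first result", "bar's first result", "baz's first result"]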
So to what extent is it OK to "hide" the complexity of a coroutine, or
a thread, in order to have the caller side interface as clean and brief
as possible? (A question over which I remember having a vigorous but
unanswerable debate with someone -- specifically over C++ operator
overloading, which suffers from the problem in spades.)
There is very little complexity to a coroutine. It's only a difficult *concept* because of the way it has traditionally been explained.
Somewhat unrelatedly, I have a mini-rant about encapsulating coroutines inside the sub call interface. Why do so many people consider this a good thing? I don't go around coding all of my functions with ten C<state> variables, and I consider that good practice. My subs tend to do the same thing each time they're called... which is precisely what the concept of a sub implies. There are things that are meant to keep state, and those things are called objects! Why don't we use their interface to manage our state?
I say this because Damian's coroutine proposal could be greatly simplified (IMHO making it clearer and easier) if calling the sub didn't imply starting an implicit coroutine the first time. I might write something that exemplifies this.
I'd be quite interested in this -- please do. I *like* Damian's latest coroutine proposal quite a bit, enough so that it really got me thinking about how perplexingly lame typical thread syntax is.
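For what it's worth, the way I read Luke's objection, the coroutine becomes an object you explicitly create, and that object's interface manages the state -- roughly this, in a Python analogy with made-up names:

    def counter(start):
        """An ordinary sub: calling it does the same thing every time --
        it just hands back a fresh coroutine object."""
        n = start
        while True:
            yield n
            n += 1

    # The running state lives in objects you explicitly created,
    # not implicitly inside the sub itself:
    c1 = counter(10)
    c2 = counter(100)            # a second, independent instance of the same sub

    print(next(c1), next(c1))    # 10 11
    print(next(c2))              # 100
    print(next(c1))              # 12 -- untouched by anything done with c2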
MikeL