On Monday, May 26, 2003, at 06:51 PM, John Macdonald wrote:
> This is an interesting idea. I'd add forked processes to the list (which includes the magic opens that fork a child process on the end of a pipeline instead of opening a file).
I thought about that quite a bit, but fork() is a process-level thing, whereas even threads are more internally controllable and implementable, so folding fork into this seemed too controversial. People already *know* how to fork processes in Perl, whereas thread syntax is newer and more malleable.
> There is a danger in hiding the distinction between them too much. They have quite different performance overheads.
Some thinking out loud: a thread is significantly more expensive than a coroutine. Why? Because threads must be executable in parallel, so you're duping quite a bit of data. (You don't dup nearly as much stuff for a coroutine, _unless_ of course you're cloning an already-active coroutine.) So a conventional thread is like cloning a "very-top-level" coroutine.
Now, if we were to say:
sub foo(...) is coroutine { ... yield() ... return() }
We would expect
foo(...args...);
to resume that coroutine from its last yield point. In the above, C<yield> yields out of the coroutine, and C<return> yields-without-saving-state, such that the next foo() invocation will start from the beginning of the routine.
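(For comparison only, here's a rough analogue using Python generators -- not the proposed Perl 6 syntax, and the names are made up -- just to illustrate the two exits: a yield hands a value back and saves the resume point, while a plain return ends the routine, so the next instantiation starts from the top.)

    # Hypothetical illustration only; not the proposed Perl 6 semantics.
    def foo(limit):
        n = 0
        while True:
            n += 1
            if n > limit:
                return      # like return(): ends it; a fresh foo() starts over
            yield n         # like yield(): hand a value back, resume here later

    gen = foo(3)
    print(next(gen))   # 1  -- resumes from the last yield each time
    print(next(gen))   # 2
    print(list(gen))   # [3] -- the return ends the sequence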
Similarly, then, I would expect:
sub foo(...) is threaded { ... yield() ... return() }
foo(...args...)
to start &foo as a new thread. C<yield()> would temporarily suspend the thread, and C<return()> would end the thread. (Note that you could use &_.yield to yield the current Code object, so you can have nested yields w/out confusion -- see C<leave>, from A6.)
These are using nearly identical syntax, but there is still a clear semantic difference between them -- the trait C<threaded> is sufficient for that.
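(Again only as a present-day analogy, not the proposed semantics: in Python the caller-side difference is roughly "call it" vs. "wrap it in a thread object", and the routine itself barely changes. Hypothetical names throughout.)

    # Illustration only: the same routine run synchronously vs. as a thread.
    import threading

    def foo(name):
        print(name, "running in", threading.current_thread().name)
        return                      # returning ends the thread

    foo("plain call")               # runs in the main thread

    t = threading.Thread(target=foo, args=("threaded call",))
    t.start()                       # starts the routine in a new thread
    t.join()                        # wait for it to finish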
We could also have things like:
sub { ... }
closure { ... }
coroutine { ... }
thread { ... }
if that's a preferable syntax -- as long as they're similar, and share similar suspend/resume capabilities.
OTOH, the difference between a thread and a coroutine is mostly internal, not external. If we define a thread as a coroutine that runs in parallel, the syntax might converge:
sub foo() is cothread { ... yield() ... return() }
# start foo() as a coroutine, (blocks until explicitly yields):
my $results = foo(...);
# start foo() as a parallel thread (nonblocking):
my $results_junction = parallel( &foo(...), &bar(...), &baz(...) )
In this situation, C<parallel> would indicate that you couldn't continue on until all three threads had suspended or exited -- so in order to have truly "top-level" parallel threads, you'd have to set them up so that your entire program was encapsulated within them. (Otherwise you'd just get temporary parallelization, which is just as desirable.) (You could declare a scheduler as an optional named parameter of C<parallel>.)
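(As a sketch of those caller-side semantics in an existing language -- not a proposal for how C<parallel> would be implemented -- Python's concurrent.futures can approximate the "block until all three have finished, then hand back all the results" behavior. The foo/bar/baz bodies below are placeholders.)

    # Hypothetical sketch of parallel(&foo, &bar, &baz): run all three,
    # block until every one has exited, then hand back all the results.
    from concurrent.futures import ThreadPoolExecutor

    def foo(): return "foo done"
    def bar(): return "bar done"
    def baz(): return "baz done"

    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(f) for f in (foo, bar, baz)]
        results = [f.result() for f in futures]   # blocks until all complete

    print(results)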
So to what extent is it OK to "hide" the complexity of a coroutine, or a thread, in order to keep the caller-side interface as clean and brief as possible? (A question over which I remember having a vigorous but unanswerable debate with someone -- specifically over C++ operator overloading, which suffers from the problem in spades.)
I'm of mixed opinion. I like the notion of merging them, because I think they are largely the same concept, implemented in different ways. I _VERY MUCH_ like the idea of any arbitrary Code block being parallelizable with the addition of a single word. Few languages do a decent job of parallelization, and current thread syntax is often overly burdensome.
Importantly, I hope that it _might_ be possible, in the multiprocessor-laden future, to automagically parallelize some coroutines without going to full-fledged multiple threads, which are far too expensive to produce any net benefit for anything but the largest tasks. (The trick is in the dataflow analysis -- making sure there are no side effects on either path that will conflict, which is bloody difficult, if not impossible.) Such that, given:
my $result = foo(...);
... more stuff ...
print $result;
foo() would be recognized as something that can run in parallel with the main flow, which is then blocked at the C<print> statement only if C<$result> isn't complete yet. Until such time as dataflow analysis can accomplish that, however, it may take a keyword:
my $result = parallel foo(...);
...
print $result;
or
my $result |||= foo(...);
...
print $result;
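(That last form is essentially a future. As a rough present-day analogue only -- a made-up example, not the proposed implementation -- Python futures give the same shape: start the work, keep going, and block only where the value is actually used.)

    # Rough analogue of `my $result = parallel foo(...)`: start foo() in the
    # background, keep going, and block only when the value is finally needed.
    from concurrent.futures import ThreadPoolExecutor
    import time

    def foo():
        time.sleep(0.1)               # stand-in for real work
        return 42

    with ThreadPoolExecutor() as pool:
        result = pool.submit(foo)     # nonblocking: foo() runs in parallel
        # ... more stuff ...
        print(result.result())        # blocks here only if foo() isn't done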
MikeL