Re: Cothreads [Philosophy]

Dulcimer Thu, 29 May 2003 07:18:47 -0700

You may evaluate for yourself whether thse comments are actually worth
two cents, but:


I want to be able to tell whether my program is threading or using some
other useful-but-not-what-I'd-do-in-a-hurry tool.

I don't want to learn a thousand different details that are only there
to make sure I know which tool I'm using.

So, if 

  sub x() is threaded {...}

I'll know by looking at the code. If I'm using a 3rd party module, it
had *better* tell me in the documentation if something I'm doing
requires that I know that I'm spawning threads, or any otherimportant
overhead from coro's or whatever..... but if not, it had better handle
them efficiently and *quietly*.

At the top level, I'd prefer to be able to ask "hey, is this sub
threaded?" with something like C<foo() if &x.threaded;> but otherwise,
I don't want to be bothered with it.

Regarding advocacy, I want to be able to point someone to a syntax
reference page and not have it intimidate them any more than it already
will....

So I'd recommend 

  use threads;
  sub x () is threaded {...}

and then I don't usually need to worry about it. I just say

  x();

and hope that it doesn something useful, since it's a stupid name for a
sub. :)

In that vein, I like the idea of consolidating thread and coroutine
syntax. Hell, let's call them something else entirely, like "spawns" or
something, if it helps. I'd live if it uses keywords, I'd be happy if
it uses objects, but I like the idea of it being something simple,
however it's implemented.

  my $t = thread.new(&x,[EMAIL PROTECTED]).someMethod;
  foo();
  bar();
  print $t; 

A bit ugly, and a bit confusing because of the seperation, but hey,
it's threaded. Add a few comments and enjoy the ride. :)

Paul


--- Michael Lazzaro <[EMAIL PROTECTED]> wrote:
> 
> On Tuesday, May 27, 2003, at 07:32 PM, Jonathan Scott Duff wrote:
> > On Tue, May 27, 2003 at 02:05:57PM -0700, Michael Lazzaro wrote:
> >> If we could think about "threads" not in terms of forkyness, but 
> >> simply
> >> in terms of coroutines that can be called in parallel, it should
> be
> >> possible to create an implementation of "threading" that had to do
> a
> >> whole heck-of-a-lot less duplication of state, etc.
> >
> > See, this where I start to feel all Cozeny and wonder what the heck
> > we're doing even thinking about how it's implemented. What I want
> to
> > know is how it looks from the user perspective.
> 
> Sorry, yes, I'm not talking at all about implementation.  I'm just 
> talking about syntax/semantics of invoking them.  Underneath, threads
> 
> and coroutines may well be implemented by giant, man-eating
> woodchucks 
> battling to the death on a rainy day under a popsicle-stick bridge,
> for 
> all I care.  :-)
> 
> 
> > If, in order to
> > understand threads, I have to first understand coroutines, I think
> > that's a loss because it throws away (or at least morphs into an
> > unrecognizable form) all of collect CS knowledge of what
> "threading"
> > usually means.  In other words, I think the idea of fork-like
> > behaviour is important to threads.
> 
> [Philosophy]
> 
> Here's the possible defect in my brain that started my train of
> thought.
> 
> I hadn't used coroutines in a long while, so when the topic came up I
> 
> had to do a little reviewing to make sure I was on the same page.  
> Reviewing done, I started thinking about how coroutines were a 
> different concept from continuations, which were different from 
> threads, and we already had closures, and at what point was the 
> aggregation of all those things going to cause potential Perl6
> converts 
> to look at our hypothetical table-o-contents and say "WTF -- exactly 
> how much 'Perl' am I gonna have to learn to be considered a 'good'
> Perl 
> programmer?"
> 
> Because _all_ those things are useful concepts, but in spite of the 
> fact that they address very similar problem spaces -- allowing
> "program 
> state" to be saved and resumed in less-than-strictly-linear ways -- 
> they historically approach it in rather different ways.  Whether
> you're 
> sucking in a concept from a well-worn but sortof-crazy-aunt language 
> like Lisp, or from a possibly-created-as-the-result-of-a-wager
> language 
> like Icon, each concept largely reached its current fruition 
> independent of the others.  No currently-popular(?) language uses
> _all_ 
> of them at the same time, or to be more precise, you don't see many 
> programmers using _all_ those concepts simultaneously within a given 
> application (and my gut tells me I'd not be keen to debug it if I saw
> 
> it.)
> 
> OK.  So I get to thinking about coroutines.  Coroutines are one of 
> those things that Freak People Out, but once you get to use them, 
> they're quite easy to understand, and damn useful.  They're like 
> blocking threads; you call them like a subroutine, they (eventually) 
> return a result to you, and you go on your merry way.  They are used 
> _exactly_ as subroutines are used... the only difference is that,
> like 
> a thread, you can _resume_ the subroutine from the point you last 
> yield()ed, and continue on your way within the subroutine as if that 
> yield() never happened.  Again, like a suspended and resumed thread.
> 
> True threads, on the other hand, are one of those things that
> everyone 
> *thinks* they know, but which have so many issues with scheduling, 
> blocking, data sharing and hiding, and general wiggyness that it's a 
> lot harder to implement a well-engineered group of threads than most 
> people are willing to admit.  Much of that is intrinsic, by
> definition, 
> to _any_ attempt at parallelization.  But how much?
> 
> Threads are a low-level interface to parallelization.  But 
> realistically, people use threads for a *very* narrow set of
> real-world 
> concepts:
> 
> (Concept 1) Creating event loops, listeners, etc., in which an event 
> spawns an "action", which runs in parallel to anything else which is 
> running and is independent of all those other things -- so whether it
> 
> succeeds, fails, or just never ends, it doesn't affect the rest of
> the 
> program.  We'll call this the "subprocess" model.
> 
> (Concept 2) Splitting multiple "lengthy" tasks to run in parallel, to
> 
> allow calculations on one task to proceed even if another task is in
> a 
> blocked (waiting) state, as frequently happens in I/O, or to allow 
> multiple calculations to be run on multiple processors.  We'll call 
> this the "parallelization" model.
> 
> 
> Current popular thread semantics, I would argue, are entirely
> designed 
> around (Concept 1), and are designed to be very similar to actual 
> _process_ forking.  Which at first glance is great, because if you 
> understand process fork()ing you understand threading, but there are 
> two problems with that.  First, threads aren't processes, so the 
> analogy breaks down upon more complicated inspection -- if you think
> of 
> threads as processes, you're going to very quickly get burned by the 
> fact that they aren't.  Second, you don't _want_ them to be like 
> processes, because you often don't _want_ your entire program to be 
> duplicated as with fork(), to run off on its own like some goateed 
> Kirk/Spock/whoever from any given "transporter malfunctioning and/or 
> parallel universe" Star Trek episode -- often, the threads represent 
> small snippets of much narrower functionality.  This model is
> therefore 
> rather unhelpful for the narrower "parallel task" cases of (Concept
> 2).
> 
> If you look at (Concept 1) vs. (Concept 2), they're pretty 
> significantly different.  An interface that best supports (Concept 1)
> 
> is not going to well support (Concept 2), and vice versa -- I would 
> argue it can't be done.  They're at least as different as the
> concepts 
> of continuation vs. coroutine, for example!
> 
> So how would you design code parallelization -- multiple pieces of
> code 
> running at the same time, independent of each other, if you were 
> approaching it from the direction of solving (Concept 2)?  A pretty 
> easy question, since other languages have already answered it.  In 
> general, you'd want to call the code, and return a result, and the 
> caller wouldn't give a flying damn about whether that call resulted
> in 
> a parallel process or not, so long as the result was obtained before 
> the caller needed it:
> 
>       my $result = big_long_calculation();
>       ...
>       print $result;
> 
> You wouldn't want the caller to have to specify that a thread was
> being 
> fork()ed and joined back up again later -- the caller doesn't care.  
> It's up to big_long_calculation() to run in parallel, if it was 
> specifically designed to allow it.  MUCH LIKE COROUTINES, you don't 
> want to be exposed to the "how" of the guts of
> big_long_calculation... 
> you should just know that it works.
> 
> But if, on the other hand, you _want_ to explicitly say that two
> tasks 
> can run in parallel, you could have a syntax that could do that:
> 
>      my @results = parallel( get_slow_data($a), get_slow_data($b) );
> 
> Again, you don't really give a fuzz about _how_ that's happening, so 
> long as the results get to you by the time you need them.
> 
> How, then, would you design an interface around (Concept 1) -- the 
> listener/eventloop/subprocess model?  Pseudocode for a top-level
> event 
> loop might be something like:
> 
>      loop {
>          sleep($fraction), next
>            unless $e = get_event();
> 
>          process_event($e);
>      }
> 
> ... where process_event implicitly does so in a parallel thread. 
> Note 
> that you're not using the return value of process_event, so there's
> no 
> "result" to wait for -- the loop simply continues to the next event.
> 
> The pseudocode implies the ability to mark process_event() as allowed
> 
> to "automatically" be parallelized when called:
> 
>      thread process_event (...args...) { ... }
> or
>      sub process_event (...args...) is threaded { ... }
> 
> The point being, you would _still_ be calling process_event() as a 
> normal subroutine; you wouldn't particularly care about how it did 
> whatever it did, as long as it did it.
> 
> And the larger point being, if you were to design semantics around
> easy 
> support for the basic things threads are actually used for in the
> real 
> world, it probably wouldn't be the fork()-like threading semantics we
> 
> so often see.  It'd look, conceptually, a lot more like subroutines
> or 
> coroutines.
> 
> :-/
> 
> MikeL
> 
> 
> [*] Note, as an aside, that another aspect of this "coroutine"
> version 
> of threads is that you get to choose whether or not exceptions in
> your 
> "threads" propagate to the caller.  By default, they would, because 
> you'd want to know about them:
> 
>      my $result = big_long_calculation();
> 
> but you would also want to be able to prevent that from happening, 
> which you easily could using obvious, already-familiar constructs:
> 
>      thread process_event () {
>          ...
>          try { stuff } catch Exception { }
>      }
> 
> You're also allowed to meaningfully have subthreads within threads, 
> thread "groups" controlled by different schedulers, etc, as you would
> 
> expect.  Note also that coroutine-like semantics for threads might 
> imply that shared data between threads would largely be a matter of 
> variable scope, as in Austin's previous threads proposal.  Which
> could 
> be a lot more intuitive, if done right.
> 


__________________________________
Do you Yahoo!?
Yahoo! Calendar - Free online calendar with sync to Outlook(TM).
http://calendar.yahoo.com

Re: Cothreads [Philosophy]

Reply via email to