Re: Iterator semantics

2008-09-12 Thread Eric Wilhelm
Hi Larry,

# from Larry Wall
# on Thursday 11 September 2008 12:13:

>So when you put something into a list context, some of the values
>will be considered "easy", and some will be considered "hard".
>The basic question is whether we treat those the same or differently
>from a referential point of view.  ...
>The easy ones are the values that have already been calculated, 
>presumably...
>
>Suppose we have
>
>    my @a = 1..10;  # assume easy
>    my @b = =$socket;   # assume hard
>
>    for @a,@b {...}
>
>If the for list sees @b as hard and just copies @b's iterator into its
>own list for lazy evaluation, then it's going to end up iterating the
>socket without loading @b.  Contrariwise if @b is eagerly evaluated it
>might clobber the other iterator in the for list.  So those iterators
>must be kept unified.  It's not enough to copy out the iterators
>from the array; access to the hard elements of the array must still
>be modulated by the array container.  A snapshot of the array
> container will not do in this case.

If a lazy list is an array with an iterator welded onto the end, then:

  1 .. 10 is just: 1, 2, 3, weld_here iter(4, thru => 10)
  1 .. *  is just: 1, 2, 3, weld_here iter(4)

And because an iterator doesn't go backwards, the array has to remember 
the previous values to allow me to maintain the "just an array" 
abstraction, so after I asked for @a[3], the internal state is like:

  1 .. * becomes: 1, 2, 3, 4, weld_here iter(5)

>As a first shot at that definition, I'll submit:
>
>    1 .. $n # easy
>    1 .. *  # hard
>
>On the other hand, I can argue that if the first expression is easy,
>then the first $n elements of 1..* should also be considered easy, and
>it's not hard till you try to get to the *.  :)
>
>It could also be that I'm confusing things here, of course, and that
>1..* is something easy and immutable that nevertheless cannot be
>calculated eagerly.

But you can take a copy of the "@a = 1 .. *" however you like as long 
as "just an array" holds such that @a[41] is always the same regardless 
of whether you (internally) have to kick the welded-on iter 27 times to 
get that element or have already passed it.  So in this case you could 
(internally) end up with two copies each holding their own iter() in 
different states while still giving the same result at the "just an 
array" level.  But this is just an optimization on the general case you 
stated of "must be modulated by the array container" and it is an 
optimization which is only possible for predictable iters.

So, perhaps the case is:

  1 .. $n  # bounded and predictable
  1 .. *   # unbounded and predictable
  =$handle # bounded and unpredictable
  =$socket # unbounded and unpredictable

By "unpredictable", I'm meaning simply that the value of a given element 
cannot be calculated independently by replicating a (pure) function 
f($i).  Perhaps a filehandle isn't a thorough example of that though?

What about:

  my @a = 1..*;
  my @b = =$socket;
  for @a,@b {...; @a = something(); ...}

or:

  my @a = 1..*;
  my @b = =$socket;
  my @c = (@a, @b);
  for @c {...; @a = something(); ...}

In the first case, changing @a changes the for() iterator but in the 
second, the for() will still be nibbling on 1..*, right?  So the 
welded-on-iter is basically by-reference until I do something under 
the "just an array" paradigm which breaks the weld?

  @a = 1..*;
  @a[-1] = 9; # @a = (9) now ?

That's just my thoughts from what I understand and sorry if introducing 
welding into the analogy causes bits of molten metal to go flying 
around ;-)

--Eric
-- 
"You can't win. You can't break even. You can't quit."
--Ginsberg's Restatement of the Three Laws of Thermodynamics
---
http://scratchcomputing.com
---


Re: Iterator semantics

2008-09-12 Thread Daniel Ruoso
Qui, 2008-09-11 às 12:13 -0700, Larry Wall escreveu:
> And I guess the fundamental underlying constraint is that a list cannot
> be considered immutable unless its feeds can be considered immutable,
> at least in some kind of idempotent sense.  This conflicts with the
> whole point of reactive programming, which is that you have to react
> because don't know what's coming.

This is actually something I was talking in the IRC this week. The
amount of polymorphism Perl 6 supports makes it quite impossible to
detect if the feeds can be considered immutable or not (in terms of
concept, I mean, runtime tips could allow optimizations).

But one thing needs to be clear, "List" is immutable as a type, meaning
that the API for List only allows you to read from it, not write to it,
but it doesn't necessarily means that it is immutable as an instance,
because the List might have a "live" backend.

Since List is not a native type, the interpreter doesn't really have any
control on what it does to provide its values, and that's what I mean by
saying that we can't infer if the feeds can or cannot be considered
immutables.

> This seems like it's close to the fundamental difficulty we're trying
> to solve here.  And maybe the solution is much like the value/object
> distinction, where lists get to *decide* for themselves where they
> switch from easy eager immutable semantics to hard lazy reactive semantics.
> And if we're pretty clear about the boundary between those, it
> could work out much like the boundary between DFAable regexes and
> side-effectful regexes as defined in S05.  And maybe it's even the
> same boundary in some theoretical sense.

The problem is that this concept should apply to the entire chain, this
means that it can only be considered easy if all the feeds on the chain
are easy, and it is too easy for it to be considered hard... for
instance, is a 'map' considered easy or hard?

In the end, that means that most of the time "easy" feeds will be made
dirty by hard feeds and all the chain will be made lazy, so we have
little gain.

In SMOP, I'm probably going to presume that everything needs to be lazy,
then even:

  my @o = grep { /\d+/ } =$*IN;
  my @a = (1,2,(3,4,@o));
  my @b = (1,2,(3,4,@a));

Will still allow @b to be seen in slice context, where you would see @a
also in slice context, because @a was not eagerly evaluated when
composing @b, and eventually @o might never be evaluated.

I think this is consistent with the spec that says somewhere that the
only operators that imply eager evaluation are the short-circuit boolean
operators (besides the 'eager' operator, of course, and the use of lazy
values in void context).

Of course the spec only says that it should be lazy with the feed
operators, but in SMOP I tend to think that all this evaluation will be
lazy.

daniel



Re: Iterator semantics

2008-09-12 Thread Ashley Winters
On Fri, Sep 12, 2008 at 11:33 AM, Daniel Ruoso <[EMAIL PROTECTED]> wrote:
> In SMOP, I'm probably going to presume that everything needs to be lazy,
> then even:
>
>  my @o = grep { /\d+/ } =$*IN;
>  my @a = (1,2,(3,4,@o));
>  my @b = (1,2,(3,4,@a));

Can only one array "have" the iterator? If not, that makes for =$*IN {
=$*IN if $needs_skipping } look like a forbidden idiom. On the other
hand, if it's really really lazy, then I could run the following two
statements without reading anything out of the iterator

my @digitlike = grep { /\d+/ } =$*IN;
my @hexlike = grep { /+/ } =$*IN;

But, if the taking of an iterator causes the array to autoactualize
(made this word up to parallel autovivify), what happens to the other
lazy array? Does the iterator buffer its contents to be served up to
other readers, or is the $iterator.next function
first-come-first-serve? Or is there some possible "batching" behavior
that allows one of the iterators to gobble more than is used?

given a non-deterministic iterator which would return <1 2 3 a b c 5 6
7 d e f 8 9 0>
say @digitlike[0..4];
say @hexlike[0..4]; # what's this? <1 2 3 a b> or <7 d e f 8> or
<> or undefined behavior?

In the event you picked <1 2 3 a b>, when did the @hexlike grep block
get called?

- Ashley Winters