Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

Ilya Zakharevich Sat, 16 Sep 2000 12:45:41 -0700
On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote:
> Why is it important for overloaded objects to be used as array indices?

Overloaded objects should behave the same way as non-objects.

> Why does RFC 204 rule that out? RFC 204 simply specifies that a list
> reference as an index provides multidimensional access:
> 
>   $a[ [1,1] ] == $a[1][1];

I repeat: what does

    $a[ $ind ]

do if $ind is a (blessed) reference to array (1,1), but behaves as
if it were 11 (due to overloading)?
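
A minimal Perl 5 sketch of this ambiguity. The class name Index11 is
hypothetical and exists only for illustration; the behaviour shown is
what Perl 5 does today, not what any of the RFCs specify.

```perl
use strict;
use warnings;

# Hypothetical class: a blessed array reference holding (1,1) that
# numifies to 11 through overloading.
package Index11;
use overload
    '0+'     => sub { 11 },
    '""'     => sub { "11" },
    fallback => 1;

sub new { my ($class, @dims) = @_; return bless [@dims], $class }

package main;

my @a;
$a[11]   = 'flat element 11';
$a[1][1] = 'nested element [1][1]';

my $ind = Index11->new(1, 1);

# Perl 5 asks the object for its numeric value, so this reads $a[11].
my $as_number = $a[$ind];

# The RFC 204 reading would treat the same object as the path (1,1).
my $as_path = $a[ $ind->[0] ][ $ind->[1] ];

print "as number: $as_number\n";
print "as path:   $as_path\n";
```

The same $ind selects two different elements depending on which
reading wins, which is the question the RFC has to answer.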

> RFC 81 expands on the existing operator '..' in a list context to allow more
> generic list generation. It is particularly useful to generate lists to act
> as array slices:
> 
>   @a[ 1..5 : 3] == @a[1,3,5];
> 
> This would seem to conflict with the meaning of '..' outlined in RFC 231.

Sorry, I see no conflict.  (Assuming that ternary '..' is allowed, the
token tie::multi::range() would be followed by 3 numbers, not 2.)

These calls will result in

  tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3)
  tied(@a)->FETCH_RANGE(1, 3, 5)

If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this
*by definition* will result in the same array of keys.  If not, it
is the responsibility of FETCH_RANGE to ensure the equivalence.

And $a[ 1..5e6 ] would not need to create 5e6 Perl objects whose only
purpose is to inform the range extractor that it needs to
create an object representing the slice.
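
A rough Perl 5 sketch of the idea, covering only the binary form
(token followed by 2 numbers). range_token() and inline_keys() are
hypothetical stand-ins for tie::multi::range() and
tie::multi::inline(); their real interfaces are whatever the RFC
defines.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tie::multi::range() token: one shared
# reference, recognized by its address.
my $RANGE = \do { my $marker = 'range' };
sub range_token { return $RANGE }

# Hypothetical stand-in for tie::multi::inline(): expand every
# (token, from, to) group into explicit keys, pass other keys through.
sub inline_keys {
    my @in = @_;
    my @out;
    while (@in) {
        my $key = shift @in;
        if (ref $key && $key == $RANGE) {
            my ($from, $to) = splice @in, 0, 2;
            push @out, $from .. $to;
        }
        else {
            push @out, $key;
        }
    }
    return @out;
}

# Both spellings give the same keys once preprocessed.
my @from_token = inline_keys(range_token(), 2, 5);
my @from_list  = inline_keys(2, 3, 4, 5);

# The compact form of 1..5e6 is 3 values on the stack, not 5e6.
my @compact_args = (range_token(), 1, 5_000_000);

print "token form: @from_token\n";
print "list form:  @from_list\n";
print "values passed for 1..5e6: ", scalar(@compact_args), "\n";
```

A FETCH_RANGE that understands the token can skip the expansion
entirely and work from the two endpoints.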

> > Because ',' is already special there.  There is little chance that ';'
> > operator is created as a general-purpose operator.

> When we first discussed ';' on the list, we looked at making it special in
> an index only. But the more generic approach of making it a cartesian
> product operator seems cleaner--it avoids 'special' meanings in favour of
> providing a generic operator.

No, it is not a generic operator.  Its behavior depends on whether it
is used *inside parens*, or not.  Additionally, the behaviour of
cartesian product makes very little sense: if you did not want it 3
times, you should not insert it into the language.

> > a) "Lazy generation" is not defined, as stated it is a good wish only.
> >    What is
> >
> >      @a = (0, 2..99, 200..9998, 1000000);
> >      f(@a);

> Lazy generation is a well understood concept in other languages.

Maybe.  But it is not defined in the corresponding RFC nevertheless.
At least: all I could deduce was that the following constructs are
made synonymous:

  @a = ($a .. $b);
  tie @a, Array::Range, $a, $b;

No other usage of .. is covered.
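
For illustration, a read-only Perl 5 sketch of what such a tied class
could look like. Only the package name Array::Range comes from the
line above; the implementation is an assumption, not something the
RFC spells out.

```perl
use strict;
use warnings;

package Array::Range;
require Tie::Array;
our @ISA = ('Tie::Array');

# Remember only the endpoints; no element list is ever built.
sub TIEARRAY {
    my ($class, $from, $to) = @_;
    return bless { from => $from, to => $to }, $class;
}

# Number of elements in the range (0 if the range is empty).
sub FETCHSIZE {
    my $self = shift;
    my $n = $self->{to} - $self->{from} + 1;
    return $n > 0 ? $n : 0;
}

# Compute the element on demand from its position.
sub FETCH {
    my ($self, $i) = @_;
    return $self->{from} + $i;
}

package main;

tie my @a, 'Array::Range', 200, 9998;

my $size = scalar @a;    # calls FETCHSIZE
my $elem = $a[10];       # calls FETCH(10)

print "size: $size, element 10: $elem\n";
```

This covers the single-range assignment only; a list such as
(0, 2..99, 200..9998, 1000000) passed to a function is exactly the
case this sketch does not address.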

> > b) The call for $a[2,3;5,6] is
> >
> >   *) Put already-available SV pointers for $a, 2,3,5,6 and the cached SV*
> >      for tie::multi::separator() on stack;
> >
> >   *) Put the (cached) CV* for the method on stack;
> >
> >   *) invoke the call frame;
> >
> > This is not *very* quick, but at least it may be "not that slow".
> > While all the alternatives require creation of anonymous lists, which
> > (I expect) will slow things down 7..10 times for the call above.  For
> > $a[1..100;1..100] it may easily be 100..1000 times slower.

> Lists of lists of known simple type are proposed by RFC 203 to be stored as
> true arrays (i.e. contiguously in memory). Their overhead is not the same as
> Perl 5 lists of lists.

Maybe.  But you still need to create a 2000000-element temporary array
whose only purpose is to inform the tied array that you need
the upper-left 1000x1000 submatrix.

*You do not want to create new values unnecessarily*.  This is too
slow.  Quick operations should reuse already available values
instead.  See how scratchpads work...

Even if it is the creation of a "streamlined" array, creation will still
take much more time than operation dispatch, which is in turn
painfully slow.

> The index in $a[1..100;1..100] should be generated lazily.

This is *exactly* what my proposal is doing.  The difference is that
it defines what "lazily" means.

> > *) They are not compatible with overloading (unless overloaded things
> >    are dramatically changed);

> There are a number of RFCs proposing substantially changing overloading.
> What specific changes would we need to ensure were incorporated in P6 to
> avoid this incompatibility?

I see no way they can be made compatible.  Overloading allows
objects to behave *both* as numbers and as array references.

Well, maybe there is a solution: 2 new overloaded accessors in
addition to '""', '0+', 'bool', '@{}', '${}' etc: "extract the value
as the array/hash index", defaulting to '0+' and '""' correspondingly.
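
A Perl 5 sketch of how such an accessor might behave, emulated with
an ordinary method because no such overload key exists. The names
as_array_index, index_keys, Coord and Plain are all hypothetical, and
letting the accessor return a list of coordinates is an assumption
made for this example.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical class that numifies to 11 but wants to index as (1,1).
package Coord;
use overload '0+' => sub { 11 }, fallback => 1;
sub new { my $class = shift; return bless [@_], $class }
sub as_array_index { my $self = shift; return @$self }

# Hypothetical class with no index accessor: only '0+' is overloaded.
package Plain;
use overload '0+' => sub { 7 }, fallback => 1;
sub new { my $class = shift; return bless [], $class }

package main;

# Ask the object how it wants to be used as an index; fall back to
# numification ('0+') when it does not say.
sub index_keys {
    my $key = shift;
    return $key->as_array_index
        if blessed($key) && $key->can('as_array_index');
    return 0 + $key;
}

my @multi  = index_keys(Coord->new(1, 1));
my @plain  = index_keys(Plain->new);
my @number = index_keys(5);

print "Coord: @multi\n";
print "Plain: @plain\n";
print "5:     @number\n";
```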

> > *) They go very high on the bizarreness scale.
> >
> Bizarre??? Which RFC?

Binary ';'.

> RFCs 90 and 91: These builtins are in almost all languages with rich array
> functionality. 'merge' and 'demerge' are more frequently called 'zip' and
> 'unzip', but those terms were almost universally rejected on -language.

These are convenience functions.  I do not see what they have to do
with the language design...

> RFC 204: Isn't it fairly intuitive that:
> 
>   $a[ [1,1] ] == $a[1][1];

It may be - for people who do not understand overloading.

> RFC 205: When first proposed, everyone on this list felt that:
> 
>   @a[ 1..3 ; 1..3 ]
> 
> is a fairly intuitive way of writing a 2d list slice, since it is in line
> with how most other languages write slices. The observation that here ';' is
> acting to create a cartesian product leads to the generalisation to any list
> context.

... to *overgeneralization* ....  This *assumes* that you use
references to access elements of multi-dim arrays, and *generalizes*
this to a construct with very little usage outside of the array access.

Say, a much more useful generalization would be for

  sub foo (@a, @b);
  foo(2,3,4 ; 7,8,9)

to call foo() with @a being 2,3,4, and @b being 7,8,9.
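
For comparison, the closest Perl 5 spelling passes explicit array
references. This is only a sketch of the intended effect; the body of
foo() and the $summary variable are made up for the example.

```perl
use strict;
use warnings;

# Perl 5 flattens argument lists, so the two groups have to travel
# as references to stay separate.
sub foo {
    my ($first, $second) = @_;
    return "first=(@$first) second=(@$second)";
}

my $summary = foo([2, 3, 4], [7, 8, 9]);
print "$summary\n";
```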

But keep in mind that my proposal *does not contradict* your
definition of ';'.  It just provides the *same* semantics inside
hash/array indices, and shows how to implement it without any
extravaganza.

> I think the level of consensus achieved with the syntax proposed in RFCs 81,
> 204, and 205 speaks volumes.

I do not understand what "level of consensus" has to do with language design...

> Ilya, are you saying that we've all wasted all the time that we've put into
> these RFCs? When you say that the existing RFC can not be accepted, are you
> referring to 203, 204, 205, or some other proposal or group of proposals?

As I said: Using array references to get multidimensional access *at
least needs some work*.  Having "lazy lists" undefined does not help
either.  My RFC is a simple, completely defined
alternative which covers practically the same range of problems,
without requiring anything fancy.

Ilya