This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Arrays: Apply operators element-wise in a list context =head1 VERSION Maintainer: Jeremy Howard <[EMAIL PROTECTED]> Date: 10 Aug 2000 Last Modified: 21 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 82 Version: 4 Status: Frozen =head1 DISCUSSION The first source of discussion was around whether there is any consistent meaning to array operations. The source of this was that some felt that other people may want C<*> to mean something other than element-wise multiplication by default (e.g. matrix inner product). However no-one actually said that I<they> wanted it to work this way, only that I<others> may prefer it, particularly mathematicians. The standard use of element-wise operations in mathematical programming languages such as Mathematica and J suggests that this is unlikely to be a source of confusion in practice. The second source of discussion was around whether C<||> and C<&&> should be an exception, as specified in RFC 45. This is discussed in detail in the CONFLICTS section. =head1 ABSTRACT It is proposed that in a list context, operators are applied element-wise to their arguments. Furthermore, it is proposed that this behaviour be extended to functions that do not provide a specific list context. =head1 CHANGES =head2 Since v3 =over 4 =item * Added discussion of conflict with RFC 45 =back =head2 Since v2 =over 4 =item * Extended to work with multidimensional arrays =item * Added ability to broadcast vectors across multidimensional arrays =item * Made operating on non-equal sized lists an error =back =head2 Since v1 =over 4 =item * Added the ability to apply an operator to a scalar and a list. =item * Added more examples, including text processing examples. =back =head1 DESCRIPTION Currently, operators applied to lists in a list context behave counter-intuitively: @b = (1,2,3); @c = (2,4,6); @d = @b * @c; # Returns (9) == scalar @b * scalar @c This RFC proposes that operators in a list context should be applied element-wise to the elements of their arguments: @d = @b * @c; # Returns (2,8,18) If the lists are not of equal length, an error is raised. =head2 Multidimensional array operations RFC 202 describes multidimensional arrays in Perl 6. Element-wise list operations also apply to multidimensional arrays: my int @mat1 = ([1,2], [3,4]); my int @mat2 = ([2,2], [1,1]); my @mat3 = @mat1 * @mat2; # ([2,4],[3,4]) An error is raised if the two arrays do not have equal dimensions. =head2 Broadcasting If an operator is used in a list context with one list (or multidimensional array), and one or more scalars, the scalars are treated as if they were an array of that scalar with the same dimensions as the array: my int @mat1 = ([1,2], [3,4]); @e = @mat1 * 2; # ([2,4],[6,8]) @f = @mat1 * (2,2,2) # Same thing If one operand is a vector of the same bounds as the equivalent dimension of the other operand, the vector's elements are 'broadcast' across every other dimension of the other operand: my int @mat1 = ([1,2], [3,4]); my int @vec1 = (2,3); # 1st dimension @g = @mat1 * @vec1; # ([2,4],[9,16]) my int @vec2 = ([2],[3]); # 2nd dimension @h = @mat1 * @vec2; # ([2,6],[6,12]) If the operands are a column vector and a row vector, the elements of each vector are combined into a two dimensional array: my int @vec1 = (2,3); # 1st dimension my int @vec2 = ([2],[3]); # 2nd dimension @i = @vec1*@vec2; # ([2*2,3*2],[2*3,3*3]) == ([4,6],[6,9]) Equivalent combinatorial broadcasting occurs if the operands are perpendicular planes (creating a cube), and so forth for higher dimensional arrays. =head2 Element-wise functions Functions that do not return a list should be treated in the same way: @e = (-1,1,-3); @f = abs(@e); # Returns (1,1,3) =head1 EXAMPLES =head2 Text processing If @first_names contains a list of peoples first names, and @surnames contains their surnames, this creates a new list that concatenates the elements of the two lists: @full_names = @first_names . @surnames; To quote a number of lines of a message by prefixing them all with '> ': @quoted_lines = '> ' . @raw_lines; To create a histogram for a list of scores: @people = ('adam', 'eve ', 'bob '); @scores = (7,9,5); # Score for each person @histogram = '#' x @scores; # Returns ('xxxxxxx','xxxxxxxxx','xxxxx') print join("\n", @people . ' ' . @histogram); adam xxxxxxx eve xxxxxxxxx bob xxxxx =head2 Number crunching This snippet multiplies the absolute values of three arrays together and sums the results, in a very efficient way: @b = (1,2,3); @c = (2,4,6); @d = (-2,-4,-6); $sum = reduce ^_+^_, abs(@b * @c + @d); Lists can be reordered or sliced with list generation functions (RFC 81) allowing flexible data manipulation: @a = (3,6,9); @reverse = (3..1); @b = @a * @a[@rev]; # (3*9, 6*6, 9*3) = (27,36,27) Slicing plus array operations makes matrix algebra easy: @a = (1,2,3, 2,4,6, 3,6,9); @column1of3 = (1..7:3); # (1,4,7) - every 3rd elem from 1 to 7 @row1of3 = (1..3); # (1,2,3) $sum_col1_by_row1 = sum ( @a[@column1of3] * @a[@row1of3] ); # (1*1+2*2+3*3)=14 =head1 CONFLICTS L<RFC 45> specifies alternative semantics for C<&&> and C<||> in a list context, which is to evaluate the operands in a scalar context and then propagate the result to the left-hand side in a list context. The authors have been unable to resolve this conflict. This author does not support RFC 45, partly because it breaks consistency with RFC 82, and partly because it makes || and && act in a weird cross between scalar and list context (evaluates in scalar context, propagates in list context). Also, boolean operations are frequently applied to lists of elements. There is really nothing about this operation that makes it only useful for scalars. For example: @mask = (1,0,1,0); @a = (4,3,2,1); @b = ('a','b','c','d'); @mult_a = @mask * @a; # (4,0,2,0) @mask_a = @mask && @a; # (4,0,2,0) @mask_b = @mask && @b; # ('a',0,'c',0) Finally, the control-flow/short-circuiting side effet of || and && would be very useful for array operations. In lazily generated lists, using || or && for an element-wise operation would avoid computing elements of the RHS where not necessary to do so. RFC 45's proposal really only talks about flow-control (rather than boolean operations), but flow control statements can be used directly: @a = @c unless @a = @b; achieves the same thing as RFC 45's proposed @a = @b || @c without the double evaluation of @b caused by @a = @b ? @b : @c =head1 IMPLEMENTATION These operators and functions should be evaluated lazily. For instance: @b = (1,2,3); @c = (2,4,6); @d = (-2,-4,-6); $sum = reduce ^_+^_, @b * @c + @d; should be evaluated as if it read: $sum = 0; $sum += $b[$_] * $c[$_] + $d[_] for (0..$#a-1)); That is, no temporary list is created, and only one loop is required. The proposal to handle functions is tricky, since there is currently no obvious way to see whether a function is going to return a list. For instance, in the case: @b = abs(@a); we either need some kind of more advanced prototyping (or other way of creating a signature) so that Perl knows to apply abs() to the elements of @a, or we need to manually change abs to check for list context and Do The Right Thing. =head1 REFERENCES The Mathematica Navigator, Heikki Ruskeepää, Academic Press, ISBN 0-12-603640-3, p383. Expression Templates (C++ Implementation): http://extreme.indiana.edu/~tv- eldhui/papers/techniques/techniques01.html#l32 Implementation in Perl Data Language: http://pdl.perl.org Array operations in Numerical Python: http://starship.python.net/~da/numtut/array.html#SEC10 RFC 76: reduce RFC 23: Higher-order functions RFC 81: Lazily evaluated list generation functions RFC 202: Overview of multidimensional array RFCs