Re: Musings on operator overloading

Larry Wall Wed, 26 Mar 2008 10:59:04 -0700

On Wed, Mar 26, 2008 at 06:06:29PM +0100, TSa wrote:
> HaloO,
>
> Larry Wall wrote:
>> That interpretation doesn't help me solve my generic parsing problems,
>> which is about the relationship of op1 to op2 and op3 in
>>
>>     op1 a() op2 b() op3 c()
>>
>> and presumably the same thing for postfixes in the other order.
>
> My idea is to have a term re-writing stage before the precedence
> parser does its job. I assume that "chalkboard mathematics" means
> term re-writing. Which sort of means that infix:<->($x,$y) is a macro
> that expands to infix:<+>($x,prefix:<->($y)).


Alas, you can't figure out which terms to rewrite until you've parsed them.

>> So here's another question in the same vein.  How would mathematicians
>> read these (assuming Perl has a factorial postfix operator):
>
> Without claiming to actually be a mathematician, I'll
> give my thoughts on the subject.
>
>
>>     1 + a(x)**2!
>
> That is a poor version of the version below and obviously
> depends on the precedence that ! has relative to **.
>
>>     1 + a(x)²!
>
> This means to me to square the return value of a(x), then
> take the factorial and then add 1. Getting a(x) raised to
> 2! would require the ! to be superscripted as well. Its ASCII
> version would explicitly require a(x)**(2!). So a(x)**2! is
> either ambiguous or requires lower precedence for !. My actual
> reading of the ASCII version picks a(x) as the operation with
> highest precedence and going from there outwards encountering
> + to the left and ** to the right with ** being of higher
> precedence. Then I'm left with + to the left and ! to the right
> with precedence of ! higher than +.

That's what I thought.  Now note that ! can't easily be rewritten
as a simple binary operator (unless you do something recursive, and
then it's not simple).
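
To make the two readings concrete, here is a minimal sketch of a
precedence-climbing parser, written in Python rather than Perl 6 and not
meant to reflect how any real Perl 6 parser works.  The precedence
numbers, the name bang_prec, and the use of a bare "a" in place of a(x)
are all assumptions made for illustration.  The only point is that the
same token stream yields a different tree depending on where postfix !
sits relative to infix **.

```python
import re

# Infix table: operator -> (precedence, associativity).
# The numbers are arbitrary; only their relative order matters.
INFIX = {"+": (10, "left"), "**": (30, "right")}

def tokenize(src):
    # Integers, identifiers, "**", and the single-character tokens + ! ( )
    return re.findall(r"\d+|[A-Za-z_]\w*|\*\*|[+!()]", src)

def parse(src, bang_prec):
    """Parse src into nested tuples; bang_prec is the (hypothetical)
    precedence assigned to postfix "!"."""
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr(min_prec):
        tok = advance()
        if tok == "(":
            left = expr(0)
            advance()  # skip the closing ")"
        else:
            left = tok
        while True:
            op = peek()
            if op == "!" and bang_prec >= min_prec:
                # Postfix binds to everything parsed so far at this level.
                advance()
                left = ("!", left)
            elif op in INFIX and INFIX[op][0] >= min_prec:
                prec, assoc = INFIX[op]
                advance()
                right = expr(prec + 1 if assoc == "left" else prec)
                left = (op, left, right)
            else:
                return left

    return expr(0)

# "!" tighter than "**": reads as 1 + a**(2!)
print(parse("1 + a**2!", bang_prec=40))  # ('+', '1', ('**', 'a', ('!', '2')))
# "!" between "+" and "**": reads as 1 + (a**2)!, the superscript reading
print(parse("1 + a**2!", bang_prec=20))  # ('+', '1', ('!', ('**', 'a', '2')))
```

With bang_prec set between + and **, the right operand of ** stops before
the !, and the ! then applies to the whole a**2, which is the reading
TSa describes above.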

Now, I think I know how to make the parser use precedence on either
a prefix or a postfix to get the desired effect (but perhaps not going
both directions simultaneously).  But that leads me to a slightly
different parsing question, which comes from the asymmetry of postfix
operators.  If we make postfix:<!> do the precedence trick above with
respect to infix:<**> in order to emulate the superscript notation,
then the next question is, are these equivalent:

    1 + a(x)**2!
    1 + a(x)**2.!

likewise, should these be parsed the same?

    $a**2i
    $a**2.i

and if so, how do we rationalize a class of postfix operators that
*look* like ordinary method calls but don't parse the same?  In the
limit, suppose someone defines a postfix "say" looser than comma:

    (1,2,3)say
    1,2,3say
    1,2,3.say

Would those all do the same thing?  Or should we maybe split postfix
dot notation into two different characters depending on whether
we mean normal method call or a postfix operator that needs to be
syntactically distinguished because we can't write $a.i as $ai?

I suppose, if we allow an unspace without any space as a degenerate
case, we could write $a\i instead, and it would be equivalent to ($a)i.
And it resolves the hypothetical postfix:<say> above:

    1,2,3.say           # always a method, means 1, 2, (3.say)
    1,2,3\ say          # the normal unspace, means (1, 2, 3)say
    1,2,3\say           # degenerate unspace, means (1, 2, 3)say
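
As a rough illustration of that split, the following Python sketch (not
Perl 6, and purely hypothetical) treats ".name" as a method call that
attaches to the nearest term, and "\name" (with or without whitespace
after the backslash) as a postfix operator assumed to be looser than
comma.  The function name parse_list and the tuple shapes are invented
for the example, and it handles only integer terms, so the parenthesized
form (1,2,3)say is out of scope.

```python
import re

# One token per match: an integer, a comma, ".name", or "\name"
# (the backslash form may have whitespace before the name).
TOKEN = re.compile(r"\s*(\d+|,|\.[a-z]+|\\\s*[a-z]+)")

def parse_list(src):
    items = []  # comma-separated terms collected so far
    for tok in TOKEN.findall(src):
        if tok == ",":
            continue
        if tok.startswith("."):
            # Method call: tightest binding, wraps only the last term.
            items[-1] = ("call", tok[1:], items[-1])
        elif tok.startswith("\\"):
            # Unspace postfix, assumed looser than comma:
            # wraps the whole list seen so far.
            items = [("postfix", tok[1:].strip(), tuple(items))]
        else:
            items.append(int(tok))
    return items[0] if len(items) == 1 else tuple(items)

print(parse_list("1,2,3.say"))    # (1, 2, ('call', 'say', 3))
print(parse_list(r"1,2,3\ say"))  # ('postfix', 'say', (1, 2, 3))
print(parse_list(r"1,2,3\say"))   # ('postfix', 'say', (1, 2, 3))
```

The two backslash spellings produce the same tree, while the dot form
applies say to 3 alone.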

This may also simplify the parsing rules inside double quoted
strings if we don't have to figure out whether to include postfix
operators like .++ in the interpolation.  It does risk a visual
clash if someone defines postfix:<t>:

    $x\t                # means ($x)t
    "$x\t"              # means $x ~ "\t"

I deem that to be an unlikely failure mode, however.  So maybe .++
is just gone now, and you have to write \++ instead.  Any objections?

I suppose that means that .(), .[], and .{} would also be gone, in favor
of \(), \[], and \{}.  We'd given a different meaning to &foo\() at one
point, but I think that went away.  And since these postfixes aren't
alphanumeric, you don't need the degenerate unspace to separate them
from a variable name.  So you generally wouldn't see them unless you
wanted $a\ () with extra whitespace.

Another possible downside: getting rid of the .postop form also
has the side effect of getting rid of these bare forms

    .++         # meaning ($_)++
    .()         # meaning ($_)()
    .[]         # meaning ($_)[]
    .{}         # meaning ($_){}
    .<>         # meaning ($_)<>
    .i          # meaning ($_)i   :)

I'm not sure that's a great loss.  It does suggest an alternate
approach, though, which is to keep the .() forms with forced method
precedence, and only require \x form for alpha postfixes that want
to keep their own precedence.  Not sure if I like that entirely,
but it could fall out naturally from the definition of unspace in
the current scheme of things, so maybe it's a non-issue.

And it makes it really easy to define postfixes equivalent to
prefixes:

    &postfix:<-> ::= &prefix:<->;
    $x\-

    &postfix:<abs> ::= &prefix:<abs>;
    $x\abs

and maybe even

    &circumfix:<| |> ::= &prefix:<abs>;
    |$x|

except for the fact that that would collide with prefix:<|>...

Larry
