Re: Schwartzian Transform

Mark-Jason Dominus Wed, 28 Mar 2001 05:52:56 -0800

> So you can say
> 
>   use Memoize;
>   # ...
>   memoize 'f';
>   @sorted = sort { my_compare(f($a),f($b)) } @unsorted
> 
> to get a lot of the effect of the S word.

Yes, and of course the inline version of this technique is also
common:

   @sorted = sort { my $ac = $cache{$a} ||= f($a);
                    my $bc = $cache{$b} ||= f($b);
                    my_compare($ac,$bc);
                  } @unsorted;

Joseph Hall calls this the 'Orcish Maneuver'.

However (I don't know who suggested this, but:)

> > > > >I'd think /perl/ should complain if your comparison function isn't
> > > > >idempotent (if warnings on, of course).  If nothing else, it's probably an
> > > > >indicator that you should be using that schwartz thang.

I have to agree with whoever followed up that this is a really dumb
idea.  It reminds me of the time I was teaching the regex class at
TPC3, and I explained how the /o in

        /$foo/o

represents a promise to Perl that $foo will never change, so Perl can
skip the operation of checking to see if it has changed every time the
match is performed.  Then there was a question from someone in the
audience, asking if Perl would emit a warning if $foo changed.

On the other side of the argument, however, I should mention that I've
planned for a long time to write a Sort::Test module which *would*
check to make sure the comparator function behaved properly, and would
report problems.   When you use the module, it would make all your
sorts run really slowly, but you would get a warning if your
comparator was bad. 

Idempotency is not the important thing here.  The *important* property
that the comparator needs, and the one that bad comparators usually
lack is 
        if my_compare(a,b) < 0, and
           my_compare(b,c) < 0, then it should also be the case that
           my_compare(a,c) < 0

        for all keys a, b, and c.

Sort::Test would run a quadratic sort such as a bubble sort, and make
sure that this essential condition held true.  Note in particular that
if the comparator has the form { my_compare(f(a),f(b)) }, then it does
not matter if f() is idempotent; what really matters is that
my_compare should have the property above.

I had also planned to have optional checks:

        use Sort::Test 'self';

                (Make sure that my_compare(a,a) == 0 for all a)

        use Sort::Test 'twice';

                (Make sure that my_compare(a,b) == my_compare(a,b) for all a,b)

This last is essentially the idempotency restriction again.  The
reason I've never implemented this module is that in perl 5, sort()
cannot be overridden, so the usefulness seemed low; you would have to
rewrite your source code to use it.  I hope this limitation is fixed
in perl 6, because it would be a cool hack.

Finally, another argument in the opposite direction yet again.  It has
always seemed to me that this 'inconsistent sort comparator' thing is
a tempest in a teapot.  In the past it has gotten a lot of attention
because some system libraries have a qsort() function that dumps core
if the comparator is inconsistent.  

To me, this obviously indicates a defective implementation of
qsort().  If the sort function dumps core or otherwise detects an
inconsistent comparator, it is obviously functioning suboptimally.  An
optimal sort will not notice that the comparator is inconsistent,
because the only you can find out that the comparator is returning
inconsistent results is if you call it in a situation where you
already know what the result should be, and it returns a different
result.  An optimal sort function will not call the comparator if it
already knows what the result should be!

For example, consider the property from above:
        if my_compare(a,b) < 0, and
           my_compare(b,c) < 0, then
           my_compare(a,c) < 0.

If the qsort() already knows that a<b and b<c, what is it doing
calling my_compare(a,c)?  It should be able to figure that out without
the call!  The fact that it dumps core if c<a means that it has called
my_compare() when it didn't need to.  Similarly, a qsort that dumps
core if my_compare(a,a) != 0 has obviously wasted time by calling
my_compare() at all in this case; it should *assume* that my_compare
will return 0 here.

So perhaps the best answer to the whole discussion is that if qsort()
dumps core when given an inconsistent comparator, you should replace
it with a better qsort.
Re: Schwartzian Transform

Reply via email to