Luke Palmer wrote about:

=head1 Perl 6 and Set Theory

This document will introduce a new way of thinking about some Perl 6
constructs.  In addition, it proposes some minor changes that would
help this way of thinking be more consistent.  These changes may make
Perl 6 a better language in general, as a side effect.

Even in absence of an explicit "set" type, Perl 6 is quite proficient
in set theory.  There are two types of sets it knows about: finite and
infinite.  The former we call "junctions," while "classes" make up
infinite sets.
Theoretically at least, junctions can be non-finite too:

	$pos = any(1...);
	$big = any(1e6...);


=head2 Junctions

Junctions are a transparent mechanism for dealing with finite sets.
They indeed hold a discrete set of objects, but operations on the
junction are automatically delegated to operations on its members.
The two types of junctions, disjunctive (C<any>) and conjunctive
(C<all>) just determine how the junction is evaluated in boolean
context.
There are actually four types of junction:

	conjunction:   all(@elements)
	disjunction:   any(@elements)
	abjunction:    one(@elements)
	injunction:   none(@elements)


The expressions:

    1 | 2 | 3
    any(1, 2, 3)
    1 & 2 & 3
    all(1, 2, 3)
      1 ^ 2 ^ 3
      one(1, 2, 3)
      none(1, 2, 3)


represent the set:

{ 1, 2, 3 }

Which we will call N. Performing some
....unary...

operation I<z> on them constructs and returns the set:

    { z(x): x ∈ N }

If, for example, this operation is C<{ $^x + 2 }>, we have:

    { x + 2: x ∈ N }

or

    { 3, 4, 5 }

If that was a comparison operation, say C<{ $^x > 2 }>, the resultant
set would be:

    { undef, undef, 1 }
Err, no. The result is:

	{ 1 }

Comparison of a junction returns those states for which the comparison is true.
Furthermore, the resulting junction has a truth property assigned, according to
the truth of the comparison. Furtherurthermore, the type of the rsulting junction
is the same as the type of the junctive operand.

For example:

	all(1..4) > 2     ---->      all(3,3) but false
	all(1..4) > 0     ---->      all(1,2,3,4) but true

	any(1..4) > 2     ---->      any(3,4) but true
	any(1..4) > 5     ---->      any() but false

	one(1..4) > 2     ---->      one(3,4) but false
	one(1..4) > 3     ---->      one(4) but true

	none(1..4) > 2     ---->      none(3,4) but false
	none(1..4) > 5     ---->      none() but true



Evaluating this in boolean context should be true if the junction was
conjunctive, and false if it was disjunctive.  So, for a junction J,
boolean context evaluation is defined as follows:

          { ∃x: x ∈ J ∧ ?x, disjunctive
    ?J ≡ {
          { ∀x: x ∈ J ⇒ ?x, conjunctive
Pre-empted, of course, by the above truth/falsehood properties.

In general, the logic is:

            { ∃x: x ∈ J ∧ ?x,                            disjunctive
            {
            { ∀x: x ∈ J ⇒ ?x,                            conjunctive
      ?J ≡ {
            { ∃x: x ∈ J ∧ ?x ∧ ∀y: y ∈ J ⇒ x=y ∨ !y, abjunctive
            {
            { ∀x: x ∈ J ⇒ !x,                            injunctive



That is, the "type" of junction just determines whether the quantifier
is existential or universal.
....or exclusive or counterexistential.

There is one behavior of junctions, however, that doesn't work with
this definition:  If a boolean operation is applied in non-boolean
context to a junction, it will return a junction of only the values
for which the condition evaluated true.  But with a few minor changes,
this can be achieved:

    • Junctions discard C<undef>s on-sight (i.e. C<undef> can never be
      a member of a junction).
    • Comparison operators return C<undef> if they fail, and their
      left-hand side "C<but true>" if they succeed.
    • Comparison operators are right-associative.

Unfortunately, the last change clobbers the left-short-circuiting
behavior of comparison operators.  A way to fix this would be to treat
a chain of comparison operators as a single n-ary operator.  Other
methods for fixing this are welcome (and encouraged).
I'm not sure that any of the above semantic changes are required. All that is
needed is the define that logical operations on junctions return a junction of
the states of the leftmost junctive operand for which the operation was true.
And that the resultant junction be ascribed a C<true> or C<false> property
according to the overall truth/falsehood of the comparison.


Note that the definition of how junctions behave in itself allows
operators involving more than one junction to represent the outer
product of those junctions with respect to an operator.  Consider two
sets, A = any(1,2,3), and B = all(4,5,6).

    A + B = { x + B: x ∈ A }
          = { { x + y: y ∈ B }: x ∈ A }
          = { { 5, 6, 7 }, { 6, 7, 8 }, { 7, 8, 9 } }

Where each inner set is a conjunctive set, and the outer is a
disjunctive set.
No. At present, the proposal is that n arithmetic operation between
junctions produces a junction whose states are the distinct results
of the operation applied pairwise to all possible combinations of
states from each operand, and whose junctive type is the same as
the left-most operand.

This is consistent with the "parallelization" behaviour on arbitrary
subroutines:

	$A = any(1,2,3);
	$B = all(4,5,6);

	$C = foo($A, $B);   # $C = foo(1,4)|foo(2,4)|foo(3,4)|...|foo(3,6);

So:

      A + B = { x + y: x ∈ A, y ∈ B }
            = { 5, 6, 7, 8, 9 }

Presumably, under Luke's model:

	$C = foo($A, $B);

would be the same as:

	$C = foo(1,4)&foo(1,5)&foo(1,6) |
	     foo(2,4)&foo(2,5)&foo(2,6) |
	     foo(3,4)&foo(3,5)&foo(3,6);

See my comments on this below.



If you call this resultant set C, and ask:

    5 < C < 9

It would be true (in fact, "C<5 but true>"), as all(6, 7, 8) would
match.
Currently, it would be true because:

      5 < any(5, 6, 7, 8, 9) < 9

is true.

The semantics Luke suggests seem to be richer, but rely on the fixed
precedence relationships between the various junctive types. I'd really
like to (see someone else) explore how that expands to include abjunctions
and injunctions, before I contemplated so fundamental a change in the
semantics.



For instance, to declare a variable that can represent either an
integer or a string:

    my (Int | Str) $var;
Correct. It's not clear to me that the parens are required, though.



It is an interesting issue whether any junction can be used as a
class.  That is:

    my (1 | 2 | 4) $var;

would declare $var to only accept values 1, 2, and 4.  This probably
introduces too much complexity for what it's worth.
Yep. ;-)


I'd also like junctions and classes to be as powerful as real sets,
which means there needs to be an empty set. Is any() a good name for
this,
C<any()> works as an empty set.
So does C<all()>. And indeed C<one()>.

And which null set you use will depend on what you intend to do with it.
For example, if you're accumulating "seen" values, the empty set you need
to start with depends on whether they're conjunctively or disjunctively
being "seen". That is:

	my $seen = any();
	for <> {
	    next when $seen;
            do_something_with($_);
	    $seen |= $_;
	}

vs:

	my $seen = all();
	for <> {
	    if ($_ != $seen ) {
                do_something_with($_);
	        $seen &= $_;
	    }
	}
	

One might also contemplate C<none()> as the universal set. ;-)


> or should we use a more atomic name (like, god forbid, ∅).
I think not. At least, not without an explicit:

	use Zermelo::Fränkel;

;-)

Damian

Reply via email to