Re: A common and useful thing that doesn't appear to be easy in Perl 6

Larry Wall Tue, 06 Apr 2010 21:53:40 -0700

On Tue, Apr 06, 2010 at 08:31:24PM -0700, Damian Conway wrote:
: An issue came up in a class I was teaching today...
: 
: There doesn't seem to be an easy way to create a type that allows a set
: of enumerated bit-flags *and* all the combinations of those flags...and
: nothing else.
: 
: For example:
: 
:      enum Permissions ( Read => 0b0001, Write => 0b0010, Exec => 0b0100 );
: 
:      my Permissions $rwx = Read +| Write;    # Error


I think we're getting hung up on powers-of-two here, and the low-level
implementation details of stuffing bits into numbers.  Instead, I think
we should take a hint from that ancient tongue, Pascal, and simply
make sets of small integers do the right thing with respect to bitmaps.
Then normal enum semantics can provide those small integers, and all
the powers-of-two business is implicit in the construction of the set:

   enum Permissions <Read Write Exec>;
   subset Perms of Set of Permissions;
   my $rwx = Perms(Read,Write);

   say $rwx.perl;  # Perms(Read,Write) or some such
   say ~$rwx;      # "Read Write"
   say $rwx.elems; # 2
   say +$rwx;      # 3

That is, the numeric value would be suitable for ORing into any numeric
set of permissions.  Sets based on types that are not amenable to
representation as bitmaps would, of course, not produce a meaningful
numeric value.
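
As a rough sketch of that numification, here is the mapping done by hand.
This is an illustration only: bitmap() is a hypothetical helper name, and
the code assumes an implementation where set() builds a Set and .value
returns an enum's underlying integer.  It does not show Set numifying
this way by itself.

```raku
enum Permissions <Read Write Exec>;    # Read => 0, Write => 1, Exec => 2

# Hypothetical helper: each small integer n in the set contributes
# the bit 2**n, and the bits are ORed together.
sub bitmap(Set $s) {
    [+|] $s.keys.map({ 1 +< .value })
}

say bitmap(set(Read, Write));        # 3
say bitmap(set(Exec));               # 4
say bitmap(set(Read, Write, Exec));  # 7
```

The enum supplies only the small integers 0, 1, 2; the powers of two
appear solely inside the helper.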

Alternately, + could return the same as .elems, and a different method
could be used to get the bitmap, but then it becomes harder to write
expressions using the bitmaps.  Perhaps that's a feature.

Also, while presumably Int can hold a bitset of any size, we might also
want to play with bitsets represented as strings.  We could overload
stringification for that, but that would likely be a mistake.  The
string forms of such sets should probably remain human readable.

: I know I could use junctions:
: 
:     subset BitFlag where Read|Write|Exec;
: 
:     my BitFlag $rwx = Read | Write;
: 
: but it's not easy to recover the actual bitpattern from that, except with:
: 
:     $bitpattern = [+|] $rwx.eigenstates;
: 
: which is suboptimal in readability (not to mention that it requires
: MONKEY_TYPING to allow access to the .eigenstates method).

To the extent that Junctions can be mapped to Sets, it's actually
only "all" junctions that correspond precisely to the strict definition
of a set.  That is, a set is defined as all of its members.  It is
very much *not* a container for some subset of its members.  Hence,
it might be permissible to convert an "all" junction to a set without
invoking any monkey business.

    Set(Read | Write)   # bogus, R|W is really 3 sets, R, W, and RW!
    Set(Read & Write)   # okay, can only represent RW

(assuming here that the Set coercion can be passed a junction without
autothreading).
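
To make that count concrete, here is a small sketch that enumerates the
readings by hand.  It sidesteps junctions entirely and just lists member
combinations, so the variable names and the approach are illustrative
assumptions, not part of any proposed coercion.

```raku
enum Permissions <Read Write Exec>;

my @members = Read, Write;

# "any" reading: every non-empty combination of the members qualifies.
my @any-readings = @members.combinations(1 .. @members.elems);
say @any-readings.elems;    # 3  (R, W, and RW)

# "all" reading: only the combination holding every member qualifies.
my @all-readings = @members.combinations(@members.elems);
say @all-readings.elems;    # 1  (RW)
```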

: Ideally, what I'd like to be able to do is something like:
: 
:     enum Permissions is bitset < Read Write Exec >;
: 
:     # The 'is bitset' starts numbering at 0b0001 (instead of the usual zero)
:     # and doubles each subsequent enumeration value (instead of the usual ++).
:     # It also implicitly fills in the various other bitwise-or permutations
:     # as valid-but-nameless enumerated values
: 
:     my Permissions $rwx = Read +| Write;    # Now fine

If you really want to use +| and friends for their innate *cough*
beauty, you can still do so--presuming these bitmapping sets numerify
to the correct bitmap.  However, I suspect some people might prefer
(or come to prefer) using set operators, particularly as these get used
for collections that people think of more as sets.  In this sense,
starting out with a use case based on low-level file status bits is
perhaps not most indicative of the best way forward.

: The closest I can think of at the moment is something like:
: 
:     enum NamedPermissions ( Read => 0b0001, Write => 0b0010, Exec => 0b0100 );
: 
:     subset Permissions of Int where 0 .. [+|]NamedPermissions.enums.values;
: 
:     my Permissions $rwx = Read +| Write;    # Fine
: 
: which is still a little too constructive, too explicit (you have to get the
: bit values right), as well as being too obscure for such a common task.
: 
: 
: Of course, I could always create a macro to encapsulate the explicit
: constructive obscurity:
: 
:     macro bitset ($typename, @value_names)
:         is parsed(/:s (<ident>) '<' ( <ident> )+ '>' /)
:     {
:         # [build code to implement the above trick here]
:     }
: 
:     # and later...
: 
:     bitset Permissions < Read Write Exec >;
: 
: but this is such a common requirement in engineering applications that it
: would be great if this wheel didn't constantly have to be reinvented.
: 
: 
: If anyone can think of a cleaner way to do it within the current semantics,
: I'd be very happy to hear of it. I'm jetlagged and bleary from a full day of
: teaching and I may well be missing an obvious answer.

I think Sets could do this cleanly in the current design, given a little
shove in the Pascal direction, but we could still perhaps do with a bit
more syntactic and/or semantic sugar.  It's just a bit awkward, after
you say:

   enum Permissions <Read Write Exec>;
   subset Perms of Set of Permissions;

that the names of the single-member sets are

    Perms(Read)
    Perms(Write)
    Perms(Exec)

We can't just steal the enum names for the single-member set names
though; they would disagree by those powers of two.  So Exec is going
to numerify to 2 while Perms(Exec) will numerify to 4.  But maybe
the situation of single-member sets just doesn't arise often enough
to sweat it, and the extra burden of coercing to the set type is
actually a feature.  Most of the time you'd be coercing a bunch of
members all at once to a set, and just doing set theory from there.
Or maybe the membership ops such as ∋ can just be smart about such
bitwise members.
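
A minimal sketch of that mismatch, and of a membership test that is
smart about it.  Here bit-of() is a made-up helper name, and plain
integers stand in for the bitmapping sets, which is an assumption of
the sketch and not a feature of the design.

```raku
enum Permissions <Read Write Exec>;    # Read => 0, Write => 1, Exec => 2

# Hypothetical helper: the bit a single member would occupy in a bitmap.
sub bit-of(Permissions $p) { 1 +< $p.value }

say +Exec;           # 2  (the enum's own value)
say bit-of(Exec);    # 4  (what a one-member set would numify to)

# A membership test on the bitmap has to shift the member first.
my $rw-bits = bit-of(Read) +| bit-of(Write);
say so $rw-bits +& bit-of(Write);    # True
say so $rw-bits +& bit-of(Exec);     # False
```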

Larry
