I just implemented Bag to the point where it passes the spectests (https://github.com/masonk/rakudo/commit/2668178c6ba90863538ea74cfdd287684a20c520). However, in doing so, I discovered that I'm not really sure what Bags are for anymore.

The more I think about Bags and Sets, the more my brain hurts. They're half EnumMap and half an Iterable that does Associative but not Positional. However, I'm starting to believe that they are more like Iterables than EnumMaps. When I imagine using them, I think of Sets as a cute way to operate on the unique elements of an Iterable. I think of Bags / KeyBags as a way to remove ordering, which is a generally useful thing. (Everything that I'm about to say applies to both Bags and KeyBags, but I'm only going to talk about Bags for the rest of this post.) Most of the time we don't care about ordering, and having ordering on all of our collections even when we don't need it increases program complexity in time, much as unnecessarily global variables increased the space complexity of Perl 5.

I want to propose one major change to the Bag spec: when a Bag is used as an Iterable, you get an Iterator that yields each key as many times as it appears in the Bag. With this one change, I could use Bags whenever I don't need ordering in my lists - which is usually. Even though there are some side effects that don't rely on ordering (e.g., incrementation), the majority of them do - so by using this new kind of Bag, I would be reducing the complexity of my programs. And since Sets already give us the distinct values, having Bags do the same thing is redundant functionality where we could be getting novel functionality.

I'd like to anticipate one objection to this: the existence of the 'hyper' operator/keyword. The hyper operator says, "I am taking responsibility for this particular code block and promising that it can execute out of order and concurrently". Creating a Bag instead of an Array says, "there is no meaning to the ordering of this group of things, ever".
Basically, if I know at declaration time that my collection has no sense of ordering, then I shouldn't have to annotate every iteration of that collection as having no sense of ordering, which is nearly what hyper does (though, I readily admit, not quite, because there are unordered ways to create race conditions).

I also have some convenience syntax suggestions. I do think this is important, because Bags and Sets are competing with Arrays. If they aren't as convenient as Arrays to use, they won't get used - even though they're closer, semantically, to what the developer wants in a lot of cases.

First, we should give Bags and Sets the @ sigil instead of $. Without this convenience, I'm not likely to replace my Arrays with Bags, because going through them in a loop or map would be a pain compared to Arrays. If I have to say $bag.keys every single time, forgettaboutit. This, however, probably requires a change to S03, which says that the @ sigil is a means of coercing the object to the "Positional (or Iterable?)" role. It seems to me, based on the guiding principle that Perl 6 should support functional idioms and side-effect-free computing, that the more fundamental and important aspect of things with @ in front is that you can go through them one by one, not that they're ordered (since ordering is irrelevant in functional computing, but iterating is not). My feeling is that we should reserve the special syntax for the more fundamental of the two operations, so as not to bias the programmer towards rigid sequentiality through syntax.

Second, I would be even more likely to replace my ordered lists with Bags if there were a convenient operator for constructing Bags. I can't think of any good non-letter symbols that aren't taken right now (suggestions welcome), but, at least, &b and &s as aliases to bag and set would be convenient.

Bags and Sets thus updated would look like this in use:

C<
my @array = < a a b c >;
my @set = s...@array;
for s...@array { say $_ };
for @set { say $_ }; # same thing
# b«»a«»c«»
# ordering undefined
# most common use case for sets, I think, is "unique elements of @array", isn't it?

hyper for @bag { ... };
# a«»b«»c«» a«»
# ordering undefined => less-thinking-required hyper

b< a b c c > === b< c c b a >
# Wouldn't this be the best way to make a comparison with these semantics?
# By the way, this useful idiom works as currently specced, but doesn't work in my implementation

@bag{a}                  # 2
@bag{<a b z>}            # 2, 1, 0
[+] bag @array{<a b z>}  # 3
# this is also neat for "How many a's, b's, and z's do I have?"
+...@bag                 # 4
@bag[2]                  # I can't think of a meaning for this - not Positional - S03 needs a change?
@bag.WHAT                # Bag()
@bag.pairs               # a => 2, b => 1, c => 1 # ordering undefined
@bag.values              # 2, 1, 1 # ordering undefined
>

Junctions:

Junctions seem like one time when we care more about the values than the keys, because C<any|all|none|one> on @array and b...@array will have the same behavior (if my suggestion above is taken with respect to @bag holding < a a b c > out of order instead of < a b c > out of order). For Sets, it's the same story, with the added proviso that C<one> degenerates to C<any>.

But @bag.any > $x seems like a pretty useful idiom. It would feel inconsistent for any(@bag) and @bag.any to do different things, however.
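
For anyone who wants to poke at these semantics before they exist in Rakudo, here is a rough model in Python rather than Perl 6. It uses Python's collections.Counter as a stand-in for the proposed Bag and a plain set as a stand-in for Set. The variable names and the one_of helper are made up for illustration, and nothing here claims to match what the spec or my implementation does. It only shows the difference between iterating a Bag by multiplicity (the proposal) and by distinct keys (the current spec), plus the lookup, comparison, and junction cases above.

```python
from collections import Counter

array = ["a", "a", "b", "c"]
bag = Counter(array)    # stand-in for the proposed Bag
unique = set(array)     # stand-in for a Set

# Proposed Bag iteration: each key repeated once per occurrence.
# Sorted here only to make the result comparable; a Bag has no order.
proposed = sorted(bag.elements())

# Iteration as currently specced: the distinct keys, same as the Set.
specced = sorted(bag.keys())

# Count lookups, analogous to @bag{a} and @bag{<a b z>}.
# A Counter reports 0 for keys it has never seen.
counts = [bag[k] for k in ("a", "b", "z")]
how_many = sum(counts)

# Total number of elements, counting repeats.
size = sum(bag.values())

# Order-insensitive comparison of two bags.
same = Counter(["a", "b", "c", "c"]) == Counter(["c", "c", "b", "a"])


def one_of(items, pred):
    """Model of a 'one' junction: exactly one item satisfies pred."""
    return sum(1 for x in items if pred(x)) == 1


def is_a(x):
    return x == "a"


# With multiplicity, 'one' agrees between the array and the bag.
# Over the distinct keys it gives a different answer.
one_array = one_of(array, is_a)
one_bag = one_of(bag.elements(), is_a)
one_set = one_of(unique, is_a)

print(proposed, specced, counts, how_many, size, same)
print(one_array, one_bag, one_set)
```

The last three values are the interesting ones for the junction question: the array and the multiplicity-preserving bag both hold two a's, so "exactly one a" is false for both, while it is true for the distinct keys.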

On Oct 26, 2010, at 12:57 AM, nore...@github.com wrote:

> Branch: refs/heads/master
> Home: http://github.com/perl6/specs
>
> Commit: 32511f7db34905c740ed1030a70995239f7cfb66
>
> http://github.com/perl6/specs/commit/32511f7db34905c740ed1030a70995239f7cfb66
> Author: TimToady <la...@wall.org>
> Date: 2010-10-25 (Mon, 25 Oct 2010)
>
> Changed paths:
> M S02-bits.pod
>
> Log Message:
> -----------
> [S02] be more explicit about iterating sets/bags
>
> The intent has always been that when you use a set or bag as a list,
> it behaves as a list of its keys, regardless of any underlying hash
> interface it might also respond to. You must use .pairs explicitly
> to get the hash pairs out of a set or bag as a list.