Re: A nil puzzle

Mark Engelberg Sat, 24 Jan 2009 14:01:01 -0800

On Sat, Jan 24, 2009 at 9:06 AM, Rich Hickey <richhic...@gmail.com> wrote:
> This is already wrong, right? Don't you mean (set s)? You haven't said
> what (powerset [1 2 3 2 1]) is going to be. subsequences != subsets.


Despite the name "powerset" being inspired by sets, when I posed the
puzzle of generalizing the behavior to sequences, I meant:
(powerset [1 2 1]) to mean something like (nil (1) (2) (1) (1 2) (1 1)
(2 1) (1 2 1))
Because sets are the only things that care about uniqueness, this is
how I'd expect sequences to behave.  I didn't specify this behavior
clearly, and I can see why my open-ended question could cause
confusion.  But I don't consider this to be essential to my main
points, though.

> You gotten tied up, here and elsewhere, conflating sequence with
> container, and almost all of your problems stem from wanting to say
> sequence and have it mean container. If you want to put the contents
> of a (possibly empty) set into a sequential container, use a
> sequential container - () or []. The sequential? predicate will cover
> both of those as well as lazy sequences of set contents.

There are a lot of different collections in Clojure (can the words
collection and container be used interchangeably in Clojure?).  But
they are all connected in the sense that they all implement a
list-like view of their contents.  This is IMO one of the coolest
things about Clojure.  Because most of the core Clojure functions use
this list-like view to manipulate collections, as soon as you use one
of these functions, you get back something different.  When I apply
cons, lazy-cons, first, rest, map, filter, etc. to something like [1 2
3], #{1 2 3} or '(1 2 3), I get back something that acts like a list,
but isn't necessarily a list.  What should I call this general concept
of what these functions return?  I've been calling this general
concept a sequence (which seems to be supported by the fact that these
functions are listed in the sequence-in sequence-out section of the
docs).  And yes, I also think of these non-specific sequences as
collections, because regardless of how they are implemented, they act
like lists (which are containers).   How can I not think of (rest [1 2
3]) or (lazy-cons 2 #{3}) as both a sequence, as well as some sort of
collection of 2 and 3?  That's certainly how they act.

So yes, I agree that I'm conflating sequence with container, but I'm
still looking for a better way to think about and classify these
inputs and outputs.  I'd certainly like to have a better
classification, because as I've tried to illustrate, when you approach
things with this viewpoint, the lack of an empty sequence feels a bit
bizarre - an odd divergence in the way that sequences mirror lists.
So maybe it all comes down to needing better terminology to discuss
the inputs and outputs of Clojure's sequence functions...

When analyzing any function, we need to understand what its domain
(set of inputs) and range (set of outputs) are.  So what are the
domain and range of Clojure's core sequence functions?

The domain is the set of things on which (seq x) works.  Is there a
standard term for this (I've been using the term seq-able)?  Is there
a built-in predicate to test this important classification?

The range is (I think) a proper subset of the domain.  The outputs of
Clojure's sequence functions are all seq-able, but certain things are
missing from the set of outputs.  You won't ever get back a vector,
for example, or any empty collection.  Is there a standard term for
the range of seq?  Is there a built-in predicate to test this
important classification?

> seq? tests if something implements ISeq, no more, no less - it is not
> the defining predicate of sequence. Is that what you want to promise?
>
> Or do you want to promise you'll return something on which (seq x)
> works? It works for nil and () and [], and the user wouldn't look for
> any one specifically.
>
> Or do you want to promise that it returns the result of seq. It is
> this notion that is most commonly meant by sequence in Clojure,
> whether you like it or not.

I think that the promise I want to make is one that is as consistent
as possible with Clojure's core sequence functions.

So: the domain of powerset should be the same as the domain of seq.
The range of powerset should be a homogeneous sequence whose elements
all belong to the range of seq.

> What you can't do is say - look, I promised sequence but it's broken
> because (seq? nil) -> false.

Agreed.  But I think it is reasonable to say - hey, this thing should
return a homogeneous sequence, but I can't find a label that
accurately captures the commonality of the elements as well as the
spirit of the problem.  All the standard Clojure concepts and
predicates don't quite work, so this is an awkward decision.

But although I didn't see it at the time of my initial post, it seems
now that the crux of my problem is that we lack good labels and
predicates for the domain and range of seq, even though these are
things we use all the time.  I was complaining that predicates like
seq?, sequential?, and list? didn't provide reasonable promises, when
in fact, these predicates don't match Clojure's internal promises
either.  It just becomes harder to ignore these details when you deal
with nested sequences, and you have to figure out exactly what value
belongs inside.

Right now, a certain amount of handwaving seems fine.  But as bigger
projects get written and Clojure, and as people develop automated
tools like QuickCheck to randomly generate test cases, and test
contracts, being able to specify these promises clearly starts to
matter.

I think the concepts I'm looking for would be something like this:
Domain of seq (called seqable).
(defn seqable? [s] (or nil (sequential? s)))
Range of seq (called sequence)
(defn sequence? [s] (or nil (seq? s)))

So, using this terminology, generalized powerset takes a seqable and
returns a sequence of sequences.  Does this sound right?

But would people be able to remember the distinction between
similar-sounding seq?, sequential?, seqable?, and sequence?  Probably
not.

> I'm not arguing against a convention here, and am amenable to your
> argument about the practicality of nil, which is covered by the
> promises "something on which (seq x) works" and "the result of seq".
>

Yes.  Now that I've clarified my thought processes in terms of
matching the domain and range of seq, I'm all the more confident that
(nil (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3)) is the "right choice" for
this particular example, and similar problems.

>> In discussions about generating a
>> sequence of all permutations, all combinations, etc., you inevitably
>> end up with arguments over issues like what (permutations nil) should
>> be (some say (nil), some say (()), and some say nil).  If you have an
>> empty sequence, these issues go away.
>>
>
> Really? Don't you still have a problem with "some say (empty-seq),
> some say empty-seq" - why would that be different?

This is getting away from my main point, but since you asked, I'd like
to respond that no, you wouldn't have this problem with an empty
sequence.  Here's why:
Standard definition of permutations of lists:
The permutations of a list S is a list of all the arrangements of S.
So, the permutations of the empty list is a list of all the
arrangements of the empty list.
The only arrangement of the empty list is the empty list.
Therefore, the permutations of the empty list is the list of the empty list.

This falls straight out of the definition (and also happens to be the
natural base case for a recursive definition).  No ambiguity.  But as
soon as you try to define this for Clojure sequences, you run into
problems.  If you try to mirror the list-based definition, you'd
expect (permutations nil) to yield (nil) (and that's certainly how I'd
code it).  However, it has been said repeatedly that "nil is not the
empty sequence... nil is nothing."  So people can reasonably argue
that
(permutations nil) should be nil because it doesn't even make sense to
try to find the arrangements of "nothing".  nil has three purposes in
Clojure: it serves as an empty-like base case (e.g., (empty? nil) is
true, and (cons 1 nil) is (1)), it serves as a false value, and it
serves as a Java-like null "nothing" value.  This triple purpose
causes quite a bit of ambiguity that an empty sequence just doesn't
have.  To answer the question what does (permutations nil) mean, we
first have to figure out which conception of nil applies here.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: A nil puzzle

Reply via email to