Wow. Thanks, Cedric. I'll definitely look into those options... On Sunday, March 25, 2012 4:28:04 AM UTC+2, Cedric Greevey wrote: > > On Sat, Mar 24, 2012 at 9:39 PM, Cedric Greevey <cgree...@gmail.com> > wrote: > > In increasing order of difficulty... > > > > Option 1: > > > > Extend your comparator to sort first on the key you're actually > > interested in, then if that key isn't different on the others > > more-or-less arbitrarily. > > Or use this: > > (defn po->to [po] > (let [m (atom #{})] > (fn [a b] > (or > (po a b) > (and > (not (po b a)) > (or > (< (hash a) (hash b)) > (and > (== (hash a) (hash b)) > (not= a b) > (or (contains? @m [a b]) > (and > (not (contains? @m [b a])) > (swap! m conj [a b])))))))))) > > (defn sorted-set-po [po & keys] > (apply sorted-set-by (po->to po) keys)) > > Input: a partial order (may be true for a b, true for b a, or false for > both). > > Output: a sorted-set that uses a total order built from the partial > order. For things where the po says neither a < b nor b < a, it will > first compare their hashes. If those don't collide, the order of a and > b is decided that way. If they do, and a and b are not actually equal, > it will arbitrarily pick one to be "less than" the other and remember > its decision so that the ordering of unequal, hash-colliding items > will remain consistent. > > Notes: > 1. The memory use can grow over time due to the memoization-like > behavior using the atom. However, this only occurs for items with both > order and hash collisions, which should be extraordinarily infrequent. > (If hashes were uniform on [0,2^32-1] you'd need an average of around > 65,000 items before, per the birthday paradox, a single collision was > likely within your collection. And then it would probably be for a > pair of items that po considered unequal.) So the memory use growth > should be small (for collections of thousands to millions of items) to > nonexistent (for smaller ones). > > 2. Two sorted-set-pos with the same items, where they were added in > different sequences, may not sort them in the same order, though this > should be rare (again tied to the frequency of order+hash collisions). > But this is not really a problem since, if you're sorting with a > partial order, it makes sense to consider the ordering within equality > buckets of the partial order to be undefined (the way ordering of an > entire hash-set is undefined) and thus does not make sense to make > program semantics depend on the ordering. Also, all sets compare equal > purely by having the same contents (I checked APersistentSet.java to > be sure) so the two sorted-set-pos would still compare equal despite > traversing their elements in slightly different sequences. > > Here's a clooj REPL session testing po->to. Note that the last tests > intentionally induce an order+hash collision of unequal items and > check that the same instance of a po->to produces consistent results > when given the colliding items in either order (that is, it always > returns logical true when given the items in one order and always > returns logical false when given them in the opposite order). The very > last tests verify that it always returns false for equal items. > > user=> > (def foo (po->to #(< (:a %1) (:a %2)))) > #'user/foo > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 2 :b 1} {:a 1 :b 1}) > false > > user=> > (foo {:a 2 :b 1} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (hash {:a 1 :b 1 :c 1}) > -1253235521 > > user=> > (hash {:a 1 :b 1 :c 2}) > -1253235522 > > user=> > (hash {:a 1 :b 2 :c 1}) > -1253235520 > > user=> > (hash {:a 1 :b 0 :c 1}) > -1253235522 > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > #{[{:a 1, :c 2, :b 1} {:a 1, :c 1, :b 0}]} > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > true > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > true > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (def bar (po->to #(< (:a %1) (:a %2)))) > #'user/bar > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > #{[{:a 1, :c 1, :b 0} {:a 1, :c 2, :b 1}]} > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > true > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 1 :c 2}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 1 :c 2}) > false > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 0 :c 1}) > false >
On Sunday, March 25, 2012 4:28:04 AM UTC+2, Cedric Greevey wrote: > > On Sat, Mar 24, 2012 at 9:39 PM, Cedric Greevey <cgree...@gmail.com> > wrote: > > In increasing order of difficulty... > > > > Option 1: > > > > Extend your comparator to sort first on the key you're actually > > interested in, then if that key isn't different on the others > > more-or-less arbitrarily. > > Or use this: > > (defn po->to [po] > (let [m (atom #{})] > (fn [a b] > (or > (po a b) > (and > (not (po b a)) > (or > (< (hash a) (hash b)) > (and > (== (hash a) (hash b)) > (not= a b) > (or (contains? @m [a b]) > (and > (not (contains? @m [b a])) > (swap! m conj [a b])))))))))) > > (defn sorted-set-po [po & keys] > (apply sorted-set-by (po->to po) keys)) > > Input: a partial order (may be true for a b, true for b a, or false for > both). > > Output: a sorted-set that uses a total order built from the partial > order. For things where the po says neither a < b nor b < a, it will > first compare their hashes. If those don't collide, the order of a and > b is decided that way. If they do, and a and b are not actually equal, > it will arbitrarily pick one to be "less than" the other and remember > its decision so that the ordering of unequal, hash-colliding items > will remain consistent. > > Notes: > 1. The memory use can grow over time due to the memoization-like > behavior using the atom. However, this only occurs for items with both > order and hash collisions, which should be extraordinarily infrequent. > (If hashes were uniform on [0,2^32-1] you'd need an average of around > 65,000 items before, per the birthday paradox, a single collision was > likely within your collection. And then it would probably be for a > pair of items that po considered unequal.) So the memory use growth > should be small (for collections of thousands to millions of items) to > nonexistent (for smaller ones). > > 2. Two sorted-set-pos with the same items, where they were added in > different sequences, may not sort them in the same order, though this > should be rare (again tied to the frequency of order+hash collisions). > But this is not really a problem since, if you're sorting with a > partial order, it makes sense to consider the ordering within equality > buckets of the partial order to be undefined (the way ordering of an > entire hash-set is undefined) and thus does not make sense to make > program semantics depend on the ordering. Also, all sets compare equal > purely by having the same contents (I checked APersistentSet.java to > be sure) so the two sorted-set-pos would still compare equal despite > traversing their elements in slightly different sequences. > > Here's a clooj REPL session testing po->to. Note that the last tests > intentionally induce an order+hash collision of unequal items and > check that the same instance of a po->to produces consistent results > when given the colliding items in either order (that is, it always > returns logical true when given the items in one order and always > returns logical false when given them in the opposite order). The very > last tests verify that it always returns false for equal items. > > user=> > (def foo (po->to #(< (:a %1) (:a %2)))) > #'user/foo > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 1 :b 1} {:a 2 :b 1}) > true > > user=> > (foo {:a 2 :b 1} {:a 1 :b 1}) > false > > user=> > (foo {:a 2 :b 1} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 1} {:a 1 :b 2}) > true > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (foo {:a 1 :b 2} {:a 1 :b 1}) > false > > user=> > (hash {:a 1 :b 1 :c 1}) > -1253235521 > > user=> > (hash {:a 1 :b 1 :c 2}) > -1253235522 > > user=> > (hash {:a 1 :b 2 :c 1}) > -1253235520 > > user=> > (hash {:a 1 :b 0 :c 1}) > -1253235522 > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > #{[{:a 1, :c 2, :b 1} {:a 1, :c 1, :b 0}]} > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > true > > user=> > (foo {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > true > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (foo {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > false > > user=> > (def bar (po->to #(< (:a %1) (:a %2)))) > #'user/bar > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > #{[{:a 1, :c 1, :b 0} {:a 1, :c 2, :b 1}]} > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 1 :c 2}) > true > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 1 :c 2}) > false > > user=> > (bar {:a 1 :b 1 :c 2} {:a 1 :b 1 :c 2}) > false > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 0 :c 1}) > false > > user=> > (bar {:a 1 :b 0 :c 1} {:a 1 :b 0 :c 1}) > false > > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en