Benchmarking the JVM can be difficult, as it uses a JIT compiler. Your
examples tally more or less with my own benchmarks, but for the most
accurate results I'd recommend using Criterium
<https://github.com/hugoduncan/criterium> and ensuring that the :jvm-opts
map in your project file is empty:

    :jvm-opts ^:replace {}

The reason that Clojure distinguishes between nil and the empty list is
because seqs may be lazy and nil is not. Clojure doesn't know whether a
lazy seq is empty or not until it's realised.

Regarding your benchmarks, it's true that checking whether non-seq
collections are empty takes longer than it could do, but in practice I
don't think this matters overly much. If you need to check whether a
collection is empty many times, chances are you're iterating over the
collection, in which case it should already have been transformed into a
seq.

I guess the question is: are there any scenarios that don't involve
iteration where you'd want to check whether a collection is empty many
times?

- James

On 1 October 2015 at 20:31, Dave Tenny <dave.te...@gmail.com> wrote:

> So I understand that 'seq' is the idiomatic way to see if a
> collection/sequence is empty.
>
> Logically I'm looking for an O(1) predicate with which I can determine if
> a seq/collection is empty, and a well behaved
> one that is idempotent and side effect free (for general performance
> reasons).
>
> 'seq'  leaves something to be desired here since it does heap allocations
> when called with various types of arguments.
>
> I'll also freely confess that I dislike seq as an emptiness predicate, I
> like empty? and not-empty? types of things, even better,
> I loved Common Lisp's treatment where NIL and empty lists have the same
> value in conditional situations
> (i.e. if x is an empty list it is == nil and is false, unlike Clojure).  I
> guess what I'm saying here is that 'seq' as idiomatic clojure for
> empty tests is distasteful.
>
> In my own personal libraries I use a predicate to give the common lisp
> like semantics I desire, but it still calls clojure.core/empty?,
> which still calls 'seq'.  So all I've done is make the test more expensive
> even if I find it more readable.
>
> While looking at the Clojure 1.7 implementation of 'seq' today in RT.java,
> I happened across the implementation of 'count',
> and it occurred to me that it was a much leaner and meaner test for empty
> sequences than 'seq'.  So here are some timings.
>
> user> (doseq [thing [[] {} #{} (seq []) (into [] (range 0 1000)) (range 0
> 1000)]]
>         (println "Time for 1M seq on" (type thing))
>         (time (dotimes [i 1000000] (seq thing))))
> Time for 1M seq on clojure.lang.PersistentVector
> "Elapsed time: 18.993869 msecs"
> Time for 1M seq on clojure.lang.PersistentArrayMap
> "Elapsed time: 14.461354 msecs"
> Time for 1M seq on clojure.lang.PersistentHashSet
> "Elapsed time: 23.044994 msecs"
> Time for 1M seq on nil
> "Elapsed time: 8.291876 msecs"
> Time for 1M seq on clojure.lang.PersistentVector
> "Elapsed time: 21.214759 msecs"
> Time for 1M seq on clojure.lang.LongRange
> "Elapsed time: 6.27834 msecs"
>
> Note the time on PersistentVector, we may have an O(n) implemention of
> `seq`
> which is undesirable for an emptiness predicate.  Larger inputs require
> more time for 'seq' to return.
>
> Now compare the same timings with 'count', adding in a clojure-level
> emptyness test (which would
> no doubt be faster in RT.java):
>
> user> (doseq [thing [[] {} #{} (seq []) (into [] (range 0 1000)) (range 0
> 1000)]]
>         (println "Time for 1M count on" (type thing))
>         (time (dotimes [i 1000000] (== (count thing) 0))))
>
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 8.061005 msecs"
> Time for 1M count on clojure.lang.PersistentArrayMap
> "Elapsed time: 7.297715 msecs"
> Time for 1M count on clojure.lang.PersistentHashSet
> "Elapsed time: 10.165718 msecs"
> Time for 1M count on nil
> "Elapsed time: 4.190523 msecs"
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 8.772564 msecs"
> Time for 1M count on clojure.lang.LongRange
> "Elapsed time: 27.166397 msecs"
>
>
> Except for the LongRange input, `count` is much faster.
>
> So I'm just bringing it up for discussion, wondering if we can get a
> faster clojure.core
> implementation of `empty?`.
>
> That said, there is one apples-to-apples problem in the above.
>
> (seq x) returns nil if empty.  (== (count x) 0) returns true if nil is
> empty.
>
> Clojure neophyte that I am, I can't understand where there is not a "!="
> function for numbers.
> Wrapping the above test in a 'not' gives:
>
> user> (doseq [thing [[] {} #{} (seq []) (into [] (range 0 1000)) (range 0
> 1000)]]
>         (println "Time for 1M count on" (type thing))
>         (time (dotimes [i 1000000] (not (== (count thing) 0)))))
>
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 15.961422 msecs"
> Time for 1M count on clojure.lang.PersistentArrayMap
> "Elapsed time: 11.12519 msecs"
> Time for 1M count on clojure.lang.PersistentHashSet
> "Elapsed time: 12.899191 msecs"
> Time for 1M count on nil
> "Elapsed time: 7.119841 msecs"
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 11.454936 msecs"
> Time for 1M count on clojure.lang.LongRange
> "Elapsed time: 28.544915 msecs"
>
> The O(1) times double when wrapping the test in 'not'.
>
> Perhaps some clojurian will tell me what I should use.
>
> Btw, don't even thing about 'not=':
>
> user> (doseq [thing [[] {} #{} (seq []) (into [] (range 0 1000)) (range 0
> 1000)]]
>         (println "Time for 1M count on" (type thing))
>         (time (dotimes [i 1000000] (not= (count thing) 0))))
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 44.768754 msecs"
> Time for 1M count on clojure.lang.PersistentArrayMap
> "Elapsed time: 40.158507 msecs"
> Time for 1M count on clojure.lang.PersistentHashSet
> "Elapsed time: 41.868068 msecs"
> Time for 1M count on nil
> "Elapsed time: 36.098277 msecs"
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 42.149571 msecs"
> Time for 1M count on clojure.lang.LongRange
> "Elapsed time: 59.107886 msecs"
>
> Well, that's not fair of course since '(not (==' and '(not (='  are two
> different things.
> But then you'd expect '=' to be as slow, and it isn't, so somethings gets
> optimized with '=' vs. '=='.
>
> user> (doseq [thing [[] {} #{} (seq []) (into [] (range 0 1000)) (range 0
> 1000)]]
>         (println "Time for 1M count on" (type thing))
>         (time (dotimes [i 1000000] (= (count thing) 0))))
>
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 10.324226 msecs"
> Time for 1M count on clojure.lang.PersistentArrayMap
> "Elapsed time: 7.858452 msecs"
> Time for 1M count on clojure.lang.PersistentHashSet
> "Elapsed time: 10.132961 msecs"
> Time for 1M count on nil
> "Elapsed time: 4.097666 msecs"
> Time for 1M count on clojure.lang.PersistentVector
> "Elapsed time: 8.661771 msecs"
> Time for 1M count on clojure.lang.LongRange
> "Elapsed time: 26.705511 msecs"
>
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to