Great work. I love the Clojure state for data-structures. People often 
underestimate how much access to quality data-structures is important for a 
language. By the way, www.data.avl is not a valid link.

On Sunday, 20 November 2016 12:32:31 UTC-8, Michał Marczyk wrote:
>
> Hi,
>
> I am pleased to announce the 0.0.2 release of psq.clj, a Clojure
> priority search queue library based on Ralf Hinze's priority search
> pennants (see Ralf Hinze, "A Simple Implementation Technique for
> Priority Search Queues"):
>
>   https://github.com/michalmarczyk/psq.clj
>
>   https://clojars.org/psq.clj
>
> Priority search queues simultaneously behave like sorted maps with
> respect to their keys and like priority queues with respect to their
> entries, with the priority queue ordering based on the values, or
> priorities, attached to the keys. Additionally they support several
> operations that blend the sorted map and priority queue aspects
> together, such as "peek in key range" (retrieve a minimum-priority
> entry – NB. ties on priorities are supported – within the given key
> range) and optimized subseq-like traversals of key ranges that only
> return those entries whose priorities fall below a certain threshold
> (this is much more efficient than subseq + filter on a regular sorted
> map).
>
> psq.clj priority search queues support the full data.avl abstract data
> type (with the exception of the transient API which I am not sure
> would bring a significant return on the additional memory cost in this
> context). The full public API is as follows:
>
>   ;;; there is a single public namespace:
>   (require '[psq.clj :as psq])
>
>
>   ;;; 1. factory functions
>
>   ;; the psq.clj counterpart to clojure.core/sorted-map:
>   (psq/psqueue key priority …)
>   ;= {key priority …}
>
>   ;; a version of the above that takes keys+priorities in a seqable:
>   (psq/psqueue* [key priority …])
>
>   ;; a factory that accepts a map or seqable of map entries:
>   (psq/psq {key priority …})
>   (psq/psq [[key priority] …])
>
>   ;; versions of the above that take *two* custom comparators: the
>   ;; first one determines the PSQ's key order, the second one is used
>   ;; for priorities:
>   (psq/psqueue-by > > key priority …) ; reverse numeric order on keys
>                                       ; and priorities
>
>   psq/psqueue-by*, psq/psq-by ; like psq/psqueue*, psq/psq, but with
>                               ; custom comparators
>
>
>   ;;; 2. regular sorted map API; NB. (r)(sub)seq use key order
>
>   (seq (psq/psqueue 0 10 1 9))
>   ;= ([0 10] [1 9])
>   
>   ;; also supported: assoc, dissoc, conj, rseq, subseq, rsubseq
>
>
>   ;;; 3. nearest neighbour lookups
>   (psq/nearest (psq/psq {0 1 4 5 9 10}) > 3)
>   ;= [4 5]
>
>
>   ;;; 4. nth, rank in key order
>
>   (nth (psq/psqueue 0 3 6 -3) 0)
>   ;= [0 3]
>
>   (psq/rank (psq/psqueue 0 3 6 -3) 6) ; returns long, -1 for not found
>   ;= 1
>
>
>   ;;; 5. priority queue API based on values/priorities
>
>   (peek (psq/psqueue 0 3 1 -3))
>   ;= [1 -3]
>
>   (pop (psq/psqueue 0 3 1 -3))
>   ;= {0 3}
>
>   ;; seq over the PSQ in order of non-decreasing priorities
>   (psq/priority-seq (psq/psqueue 0 3 1 -3))
>   ;= ([1 -3] [0 3])
>
>
>   ;;; 6. splits, subranges – with structural sharing in the common
>   ;;;    parts, but no holding on to keys outside the stated range for
>   ;;;    GC purposes
>
>   ;; return a vector of
>   ;;
>   ;;   1. a fully independent PSQ comprising the entries of the
>   ;;      input PSQ to the left of the given key,
>   ;;
>   ;;   2. the entry at the given key, or nil if not present,
>   ;;
>   ;;   3. a fully independent PSQ comprising the entries of the
>   ;;      input PSQ to the right of the given key:
>   (psq/split (psq/psqueue* (range 10)) 4)
>   ;= [{0 1 2 3} [4 5] {6 7 8 9}]
>
>   ;; like subseq, but returns an independent PSQ
>   (psq/subrange (psq/psqueue* (range 10)) >= 4 < 8)
>   ;= {4 5 6 7}
>
>
>   ;;; 7. priority-bounded traversals:
>   (psq/seq<= a-psq priority-upper-bound)
>
>   ;; also supported: rseq<=, subseq<=, rsubseq<= (the latter two
>   ;; with subseq/rsubseq-like arguments to specify key bounds) and
>   ;; < variants of all of the above (seq< etc.)
>
> All of the above operations can be performed in O(log n) time, with
> the exception of peek, which takes constant time, and seq-like
> operations, which are sensitive to the number of entries actually
> produced. Priority-bounded traversals take O(r(log n - log r) + r)
> time, where r is the number of entries actually returned.
>
> In practice, psq.clj priority search queues are slower than Clojure's
> built-in sorted maps for pure sorted-map operations, typically by a
> factor of 2 to 4. In return, the PSQ operations are very performant;
> subseq<= can be over three orders of magnitude faster than the
> equivalent combination of subseq and filter applied to a regular
> sorted map, subrange takes tens of microseconds when the size of the
> input is on the order of hundreds of thousands, peek takes less than
> 10 ns etc. See end of this message for some benchmark results.
>
> Mark Engelberg's excellent data.priority-map is another natural point
> of comparison. In short, whereas data.priority-map is hash-map-like,
> with extremely fast lookups and no particular ordering of keys,
> psq.clj is sorted-map-like, with a much richer abstract data type at
> the cost of noticeably slower lookups. assoc is often faster with
> psq.clj, although there are edge cases where data.priority-map is
> faster; dissoc is generally faster with data.priority-map. In concrete
> usage scenarios, comparative benchmarks are advisable, as either data
> structure may come out ahead depending on the specific mix of
> operations. Of course psq.clj will be most beneficial in those use
> cases which benefit from the expanded abstract data type, and any
> lookup-heavy workloads where the common operations suffice will tend
> to be better served by data.priority-map.
>
> I had the pleasure of announcing the initial 0.0.1 release of psq.clj
> at EuroClojure 2016 in Bratislava:
>
>   https://www.youtube.com/watch?v=pZ9gBKO_qCA
>
> To those of you who were there – many thanks for helping make this
> year's EuroClojure such a great conference! Also, I have since
> addressed a few points from my TODO list from the final slide (notably
> psq.clj/nearest is there now, and so psq.clj really supports the full
> data.avl abstract data type).
>
> Finally I should say that at this early stage both the public API and
> the implementation strategy are subject to revision. The features that
> are there, however, I do expect to work – a thorough generative test
> suite based on collection-check
> (https://github.com/ztellman/collection-check) and test.check
> (https://github.com/clojure/test.check) is in place to ensure this.
>
> To include this version in your project:
>
>   [psq.clj "0.0.2"]
>
>   <dependency>
>     <groupId>psq.clj</groupId>
>     <artifactId>psq.clj</artifactId>
>     <version>0.0.2</version>
>   </dependency>
>
>   compile "psq.clj:psq.clj:0.0.2"
>
> Cheers,
> Michał
>
>
> Benchmark results
> =================
>
> PSQ – psq.clj
> PM – data.priority-map
> SM – clojure.core/sorted-map
>
> Results obtained using Hugo Duncan's Criterium:
>
>   https://github.com/hugoduncan/criterium
>
> NB. some extreme results are due to benchmarks hitting the best/worst
> cases for each data structure. Where Criterium reported the results in
> different units for different data structures, these have been changed
> to make results comparable at first glance.
>
> N/A is indicated where an operation cannot be reasonably
> implemented/emulated for a given data structure.
>
>
> Basic map operations
> --------------------
>
> Except where otherwise noted, these benchmarks use 5000 key/value
> pairs inserted in order of increasing keys and values:
>
>   (psq/psqueue* (range 10000))
>   (apply pm/priority-map (range 10000))
>   (apply sorted-map (range 10000))
>
>  1. (get input-map 0)
>  
>     PSQ:   1.367226 ns
>     PM:   62.910936 ns
>     SM:  159.694336 ns
>
>  2. (get input-map 5000)
>  
>     PSQ: 268.306762 ns
>     PM:   73.988311 ns
>     SM:  179.033148 ns
>
>  3. (get input-map 10000)
>  
>     PSQ: 552.692926 ns
>     PM:   56.753372 ns
>     SM:  248.005636 ns
>
>  4. (assoc input-map 0 -1)
>  
>     PSQ: 1.288109 µs
>     PM:  2.344063 µs
>     SM:  0.607151958 µs
>
>  5. (assoc input-map 5000 -1)
>  
>     PSQ: 1.822005 µs
>     PM:  2.573408 µs
>     SM:  0.646016659 µs
>
>  6. (assoc input-map 10000 -1)
>  
>     PSQ: 3.000240 µs
>     PM:  1.440096 µs
>     SM:  0.725093232 µs
>
>  7. (assoc input-map 0 10000)
>  
>     PSQ: 1.309970 µs
>     PM:  2.746458 µs
>     SM:  0.621439055 µs
>
>  8. (assoc input-map 5000 10000)
>  
>     PSQ: 1.711392 µs
>     PM:  2.834123 µs
>     SM:  0.663170561 µs
>
>  9. (assoc input-map 10000 10000)
>  
>     PSQ: 2.969847 µs
>     PM:  1.746856 µs
>     SM:  0.715915433 µs
>
> 10. (dissoc input-map 0)
>
>     PSQ: 1.398323 µs
>     PM:  1.211380 µs
>     SM:  0.653975123 µs
>
> 11. (dissoc input-map 5000)
>
>     PSQ: 1.668061 µs
>     PM:  1.367344 µs
>     SM:  0.740665097 µs
>
> 12. (dissoc input-map 10000)
>
>     PSQ: 1.509873 µs
>     PM:  0.055210586 µs
>     SM:  0.335146431 µs
>
> 13. (peek input-map)
>
>     PSQ:   4.254087 ns
>     PM:  487.922381 ns
>     SM:  N/A
>
> 14. (pop input-map)
>
>     PSQ: 0.762549879 µs
>     PM:  1.758735 µs
>     SM:  N/A
>
> 15. (last (psq/priority-seq psq))
>     (last (seq pm))
>
>     NB. data.priority-map implements seq in order of non-decreasing
>     priorities; psq.clj's seq respects key order and exposes
>     priority-ordered traversals using a separate function,
>     psq.clj/priority-seq.
>
>     PSQ: 3.603032 ms
>     PM:  3.359997 ms
>     SM:  N/A
>
> 16. (last (psq/subseq<= psq 2001 >= 2000 < 70000))
>     (last (psq/seq<= (psq/subrange psq >= 2000 < 70000) 2001))
>     (last (f sm 2001 >= 2000 < 700000))
>     (last (g sm 2001 >= 2000 < 700000))
>
>     Here f and g are defined as
>
>       (fn f [m ubound start-test start end-test end]
>         (filter #(not (pos? (.compare ^java.util.Comparator c
>                               (val %) ubound)))
>           (subseq m start-test start end-test end)))
>
>       (fn g [m ubound start-test start end-test end]
>         (filter #(<= (val %) ubound)
>           (subseq m start-test start end-test end)))
>
>     PSQ:     26.714850 µs (subseq<=)
>              25.213709 µs (seq<= + subrange)
>
>     PM:  147542.945 µs (f)
>          146825.592 µs (g)
>
>     SM:  N/A
>
> 17. Using (psq/psqueue* (range 1000000)) as input:
>
>     (psq/split psq 500000)
>
>     PSQ: 3.587069 µs
>     PM:  N/A
>     SM:  N/A
>
> 18. Using (psq/psqueue* (range 1000000)) as input:
>
>     (psq/subrange psq > 250000 <= 750000)
>
>     PSQ: 22.776483 µs
>     PM:  N/A
>     SM:  N/A
>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to