Using a set as a filter predicate is idiomatic Clojure.

If you don't know whether your filter coll is a set, you can write your 
filter-contains as

(def filter-contains (comp remove set))

For example, use it as

(into [] (filter-contains [1 2 3]) [1 2 3 4]) 
;-> [4]
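
On the question of whether the coll gets converted to a set once per item: it should not, because (set coll) runs when the transducer is created, not when it is applied to each element. A minimal sketch to check this; conversions, counting-set and remove-contained are made-up names for illustration only:

```clojure
;; Hypothetical helper: counts how often a collection is converted to a set.
(def conversions (atom 0))

(defn counting-set [coll]
  (swap! conversions inc)
  (set coll))

;; Same shape as (comp remove set), with the counting conversion swapped in.
(def remove-contained (comp remove counting-set))

(def result
  (into [] (remove-contained [1 2 3]) (range 1000000)))

;; The set was built once, although a million items went through the xf.
(println @conversions)     ; prints 1
(println (take 3 result))  ; prints (0 4 5)
```

One caveat: a set called as a function returns the matching element, so a nil or false in the filter coll will not be removed this way. (partial contains? s) does not have that problem.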

I am not up to date on optimizations in the PersistentHashSet world, but 
I'd assume that for very small filter colls a linear scan can be faster 
than contains?.
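
A linear-scan variant could look roughly like this. It is only a sketch; remove-by-scan is a name I made up, and whether it actually beats the set lookup has to be measured for your data:

```clojure
;; Hypothetical alternative for tiny filter colls: scan a vector with =
;; instead of doing a hash lookup.
(defn remove-by-scan [coll]
  (let [v (vec coll)]                      ; realized once, when the xf is built
    (remove (fn [x] (some #(= x %) v)))))  ; some returns true or nil

(into [] (remove-by-scan [1 2 3]) [1 2 3 4])
;-> [4]
```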

If you are ok with input, output and filter all being sets, I'd recommend 
looking at clojure.set for that task: clojure.set/intersection keeps the 
common elements, and clojure.set/difference removes them. Both optimize by 
iterating over the smaller set. Depending on the sizes of the collections 
you are dealing with, the speed-up may be worth the set conversion cost. 
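
For the all-sets case, a small sketch of the two clojure.set functions that apply here (the results are shown as values; the printed element order may differ):

```clojure
(require '[clojure.set :as set])

;; Elements present in both sets.
(def kept (set/intersection #{1 2 3 4} #{1 2 3}))
;-> #{1 2 3}

;; Elements of the first set that are not in the second, i.e. the
;; "filter out everything that's also in the other coll" case.
(def dropped (set/difference #{1 2 3 4} #{1 2 3}))
;-> #{4}

;; With non-set input, the conversion cost is paid up front.
(def dropped-from-vectors (set/difference (set [1 2 3 4]) (set [1 2 3])))
;-> #{4}
```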


On Wednesday, June 24, 2015 at 12:07:06 AM UTC+2, Sam Raker wrote:
>
> Let's say that, as part of an xf, I want to filter out everything in a 
> sequence that's also in some other sequence. Here are some ways of doing 
> that:
>
> (defn filter-contains1 [edn-file] (remove (partial contains? (set (read-edn-file edn-file)))))
>
> (defn filter-contains2 [coll] (remove (partial contains? (set coll))))
>
> (defn filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (partial contains? coll-as-set))))
>
> I have the strong suspicion that `filter-contains3` is the best of the 3, 
> and `filter-contains1` the worst. The internal mechanics of transduce are a 
> bit of a mystery to me, however: if `filter-contains2` were to be used on a 
> collection of, say, a million items, would `coll` be cast to a set a 
> million times, or is Clojure/the JVM smarter than that? I'm also wondering 
> if anyone has any "best practices" (or whatever) they can share relating to 
> this kind of intersection of transducers/xfs and closures. It seems to me, 
> for example, that something like
>
> (defn my-thing [coll & stuff]
>   (let [s (set coll)]
>   ...
>   (comp
>     ...
>    (map foo)
>    (filter bar)
>    (remove (partial contains? s))
>    ...
>
> is awkward, but that a lot of limited-use transducer factory functions 
> (like the ones above) aren't exactly optimal, either.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en