Hi all,
I wanted to optimize a function of mine, and after doing that, I found
out that the optimized version is 4 times slower! However, I can't see
the reason for that. Probably, I can't see the forest for trees, so I'd
be happy if someone else could have a look if I'm doing something
obviously wrong.
My function `adjs-internal` gets some object (an EObject of the Eclipse
Modeling Framework) and a vector of reference names given as keywords,
and it returns a seq of all objects that can be reached by traversing
one reference after the other. For example,
(adjs-internal person :friends :relatives)
returns the seq of the relatives of the friends of person.
Ok, the original code was this (ignore the c-* atoms for now, I've used
them for measuring only):
--8<---------------cut here---------------start------------->8---
(defn ^:private eget-ref [^EObject eo ref allow-unknown-ref single-valued]
(swap! c-eget-ref inc)
(if-let [^EStructuralFeature sf (.getEStructuralFeature (.eClass eo) (name
ref))]
(if (instance? EReference sf)
(if single-valued
(let [ub (.getUpperBound sf)]
(if (== 1 ub)
(.eGet eo sf)
(errorf "Must not call adj on EReference '%s' with upper bound %s."
sf ub)))
(.eGet eo sf))
(errorf "'%s' at %s is no EReference." sf eo))
(if allow-unknown-ref
nil
(errorf "No such structural feature '%s' at %s." ref eo))))
(defn adjs-internal [this roles]
(swap! c-recursion inc)
(if (seq roles)
(when-let [a (eget-ref this (first roles) false false)]
(mapcat #(adjs-internal % (rest roles))
(if (instance? java.util.Collection a) a [a])))
[this]))
--8<---------------cut here---------------end--------------->8---
Then I've thought that I don't need to make all these checks in
`eget-ref` on every single object, because if the checks succeed for the
first friend in the example above, they'll succeed for any other friend,
too. So my idea was to refactor `eget-ref` to a function `eref-getter`
that does the checks once, and then returns just a getter closure that I
can apply on every object without any checks.
Here's the new code:
--8<---------------cut here---------------start------------->8---
(defn ^:private eref-getter ^EReference [^EObject eo ref allow-unknown-ref
single-valued]
(swap! c-helper inc)
(if-let [^EStructuralFeature sf (.getEStructuralFeature (.eClass eo) (name
ref))]
(if (instance? EReference sf)
(if single-valued
(let [ub (.getUpperBound sf)]
(if (== 1 ub)
(fn [o] (swap! c-getter inc) [(.eGet o sf)])
(errorf "Must not call adj on EReference '%s' with upper bound %s."
sf ub)))
(fn [o] (swap! c-getter inc) (.eGet o sf)))
(errorf "'%s' at %s is no EReference." sf eo))
(if allow-unknown-ref
nil
(errorf "No such structural feature '%s' at %s." ref eo))))
(defn adjs-internal [this roles]
(loop [els [this], rs roles]
(swap! c-recursion inc)
(if (seq rs)
(if (seq els)
(if-let [getter (eref-getter (first els) (first rs) false false)]
(recur (mapcat getter els) (rest rs))
[])
[])
els)))
--8<---------------cut here---------------end--------------->8---
As you can see in the code snippets above, I added some atoms to count
the number of recursive calls (c-recursion: recursive calls in the
original version, loop-iterations in the second version), and the number
of calls to the helper function (c-helper), and for the second version
also c-getter counting the number of invocations of the getter function
returned by eref-getter.
Ok, the original version on the call
(adjs p :pret :postp :pret :postp :pret :postp)
I get these stats:
1st version 2nd version
c-recursion: 53 7
c-helper: 32 6
c-getter: # 32
Since the application of the getter (either eget-ref or the getter fn
returned by eref-getter) plus the mapcatting are the most expensive
parts, my judgement was that the second version should be slightly
faster because the getter function is cheaper than eget-ref.
However, when doing the above call 100.000 times
(time (dotimes [_ 100000]
(dorun (adjs p :pret :postp :pret :postp :pret :postp))))
the first version takes ~7secs wheras the second version takes ~27secs.
So the second version is actually nearly 4 times slower.
The obvious difference is that the first version did a depth-first
navigation whereas the second version does a breadth-first navigation.
But why does that cause such a performance hit?
Bye,
Tassilo
--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.