Optimized code => now it's 4x slower. Why?

Tassilo Horn Fri, 01 Mar 2013 04:17:36 -0800

Hi all,

I wanted to optimize a function of mine, and after doing that, I found
out that the optimized version is 4 times slower!  However, I can't see
the reason for that.  Probably, I can't see the forest for trees, so I'd
be happy if someone else could have a look if I'm doing something
obviously wrong.


My function `adjs-internal` gets some object (an EObject of the Eclipse
Modeling Framework) and a vector of reference names given as keywords,
and it returns a seq of all objects that can be reached by traversing
one reference after the other.  For example,

  (adjs-internal person :friends :relatives)

returns the seq of the relatives of the friends of person.

Ok, the original code was this (ignore the c-* atoms for now, I've used
them for measuring only):

--8<---------------cut here---------------start------------->8---
(defn ^:private eget-ref [^EObject eo ref allow-unknown-ref single-valued]
  (swap! c-eget-ref inc)
  (if-let [^EStructuralFeature sf (.getEStructuralFeature (.eClass eo) (name 
ref))]
    (if (instance? EReference sf)
      (if single-valued
        (let [ub (.getUpperBound sf)]
          (if (== 1 ub)
            (.eGet eo sf)
            (errorf "Must not call adj on EReference '%s' with upper bound %s."
                    sf ub)))
        (.eGet eo sf))
      (errorf "'%s' at %s is no EReference." sf eo))
    (if allow-unknown-ref
      nil
      (errorf "No such structural feature '%s' at %s." ref eo))))

(defn adjs-internal [this roles]
  (swap! c-recursion inc)
  (if (seq roles)
    (when-let [a (eget-ref this (first roles) false false)]
      (mapcat #(adjs-internal % (rest roles))
              (if (instance? java.util.Collection a) a [a])))
    [this]))
--8<---------------cut here---------------end--------------->8---

Then I've thought that I don't need to make all these checks in
`eget-ref` on every single object, because if the checks succeed for the
first friend in the example above, they'll succeed for any other friend,
too.  So my idea was to refactor `eget-ref` to a function `eref-getter`
that does the checks once, and then returns just a getter closure that I
can apply on every object without any checks.

Here's the new code:

--8<---------------cut here---------------start------------->8---
(defn ^:private eref-getter ^EReference [^EObject eo ref allow-unknown-ref 
single-valued]
  (swap! c-helper inc)
  (if-let [^EStructuralFeature sf (.getEStructuralFeature (.eClass eo) (name 
ref))]
    (if (instance? EReference sf)
      (if single-valued
        (let [ub (.getUpperBound sf)]
          (if (== 1 ub)
            (fn [o] (swap! c-getter inc) [(.eGet o sf)])
            (errorf "Must not call adj on EReference '%s' with upper bound %s."
                    sf ub)))
        (fn [o] (swap! c-getter inc) (.eGet o sf)))
      (errorf "'%s' at %s is no EReference." sf eo))
    (if allow-unknown-ref
      nil
      (errorf "No such structural feature '%s' at %s." ref eo))))

(defn adjs-internal [this roles]
  (loop [els [this], rs roles]
    (swap! c-recursion inc)
    (if (seq rs)
      (if (seq els)
        (if-let [getter (eref-getter (first els) (first rs) false false)]
          (recur (mapcat getter els) (rest rs))
          [])
        [])
      els)))
--8<---------------cut here---------------end--------------->8---

As you can see in the code snippets above, I added some atoms to count
the number of recursive calls (c-recursion: recursive calls in the
original version, loop-iterations in the second version), and the number
of calls to the helper function (c-helper), and for the second version
also c-getter counting the number of invocations of the getter function
returned by eref-getter.

Ok, the original version on the call

  (adjs p :pret :postp :pret :postp :pret :postp)

I get these stats:

             1st version    2nd version
  c-recursion:   53             7
  c-helper:      32             6
  c-getter:       #            32

Since the application of the getter (either eget-ref or the getter fn
returned by eref-getter) plus the mapcatting are the most expensive
parts, my judgement was that the second version should be slightly
faster because the getter function is cheaper than eget-ref.

However, when doing the above call 100.000 times

  (time (dotimes [_ 100000]
          (dorun (adjs p :pret :postp :pret :postp :pret :postp))))

the first version takes ~7secs wheras the second version takes ~27secs.
So the second version is actually nearly 4 times slower.

The obvious difference is that the first version did a depth-first
navigation whereas the second version does a breadth-first navigation.
But why does that cause such a performance hit?

Bye,
Tassilo

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

Optimized code => now it's 4x slower. Why?

Reply via email to