- Worrying about the performance of a small, pure function like this is almost certainly premature optimization.
- Avoid concurrency constructs like atoms if you don't need them. - Have you considered using group-by? > On Nov 11, 2015, at 13:25, Dave Tenny <dave.te...@gmail.com> wrote: > > A colleague and I are debating various things clojure as we were exploring > alternative ways to solve a problem. > > Here's the description of the problem that a particular function is trying to > solve, and the first implementation of it. > > (defn update-roles > "Given a vector of maps of the form {:project_id N :project_name S > :project_roles [...roles...]} > if there is already a map for the indicated project id/name, add new-role > to it and returned > a copy the updated input vector, otherwise return a vector with a new map > entry for the newly found > project and initial role. This function is basically aggregating tuples > from the database." > [projects project-id project-name new-role] > (let [updated? (atom nil) > > projects* (mapv (fn [m] > (if (= (:project_id m) project-id) > (do (reset! updated? true) > (assoc m :project_roles (conj (:project_roles > m) new-role))) > m)) > projects)] > (if @updated? > projects* > (conj projects {:project_id project-id :project_name project-name > :project_roles [new-role]})))) > > > ;; (update-roles [{:project_id 1 :project_name "One" :project_roles [:own]}] > 2 "Two" :edit) > ;; => [{:project_id 1, :project_name "One", :project_roles [:own]} > {:project_id 2, :project_name "Two", :project_roles [:edit]}] > ;; (update-roles [{:project_id 1 :project_name "One" :project_roles [:own]}] > 1 "Two" :edit) > ;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]}] > > > > Now here's another implementation: > > (defn update-or-insert-project-role > [prj-roles prj-role] > (let [to-insert-prj-id (:project_id prj-role) > by-pid (group-by :project_id prj-roles)] > (case (get by-pid to-insert-prj-id) > nil (conj prj-roles prj-role) > (->> (update-in by-pid [to-insert-prj-id 0 :project_roles] #(apply conj > % (:project_roles prj-role))) > (mapcat second) > (into []))))) > > ;; (def prj-roles [{:project_id 1, :project_name "One", :project_roles > [:own]} {:project_id 3 :project_name "Three" :project_roles [:edit]}]) > ;; (update-or-insert-project-role prj-roles {:project_id 2 :project_name > "Two" :project_roles [:edit]}) > ;; => [{:project_id 1, :project_name "One", :project_roles [:own]} > {:project_id 3, :project_name "Three", :project_roles [:edit]} {:project_id > 2, :project_name "Two", :project_roles [:edit]}] > ;; (update-or-insert-project-role prj-roles {:project_id 1 :project_name > "One" :project_roles [:edit]}) > ;; => [{:project_id 1, :project_name "One", :project_roles [:own :edit]} > {:project_id 3, :project_name "Three", :project_roles [:edit]}] > > > > The function is called in a loop to aggregate rows from a database, though it > isn't an overriding concern, we're not going millions of records in this case. > > The first thing my colleague and I disagreed on was the calling sequence, > arguing over which is more readable. > The second thing was whether efficiency in this context is really important, > or whether it's all moot in clojure. > > Finally, I'm sure there's a better way, probably with Zippers or something, > but neither of us have used them. Suggestions for the stylistic and > operational epitome of clojure expression on this routine are welcome! > > Superficially, and probably incorrect in multiple ways, here is a poor > attempt at breaking down efficiency in terms of search/traversal and memory > allocations. This was done by someone with no knowledge of clojure internals > (including the library implementations of the called functions). > > ;; Comparing the two routines per function call, for existing project case > (i.e. where roles are updated) > ;; > ;; Assuming each routine allocates new vector for new-role placement in > existing project > ;; and MapEntry for assoc of new vector on project_roles, so they aren't > included in allocations > ;; below since both routines have to do it. > ;; > ;; Note that x-element map allocates storage for map and map-entries or > clojure equivalent. > ;; (and more expensive than an x-element vector, of course). > ;; > ;; n == length of input project list. > ;; m == average length of input project list role vectors. > ;; > ;; Object Allocations > ;; Function call: > ;; update-roles: > ;; 1 atom > ;; 1 O(n) vector for mapv > ;; update-or-insert-project-role: > ;; 1 3-entry map + 1 single-element vector for prj-role argument input. > ;; 1 n-element map for group-by > ;; n vectors for group-by map values > ;; 1 n-element map for update-in > ;; 1 list/sequence for mapcat (+ n concat intermediaries?) > ;; 1 vector for into > ;; > ;; If we discard the second 'into' and first 'mapv' allocations the > update-or-insert-project-role routine allocates > ;; 3 additional maps (two of which are O(n)), n additional vectors, and 1 > additional list/sequence. > ;; > ;; Searches/traversals/copies > ;; update-roles: > ;; O(n) - mapv > ;; update-or-insert-project-role: > ;; O(n) - group-by > ;; O(n) - update-in > ;; O(n) - mapcat > ;; O(n) - into > ;; > ;; Here's what update-or-insert-project-role allocates (by way of assistance > in assessing the above) > ;;{1 [{:project_id 1, :project_name "One", :project_roles [:own]}], 3 > [{:project_id 3, :project_name "Three", :project_roles [:edit]}]} > -- group-by > ;;{1 [{:project_id 1, :project_name "One", :project_roles [:own :edit]}], 3 > [{:project_id 3, :project_name "Three", :project_roles [:edit]}]} -- > update-in > ;;({:project_id 1, :project_name "One", :project_roles [:own :edit]} > {:project_id 3, :project_name "Three", :project_roles [:edit]}) > -- mapcat > ;;[{:project_id 1, :project_name "One", :project_roles [:own :edit]} > {:project_id 3, :project_name "Three", :project_roles [:edit]}] > -- into > > Thanks for your thoughts and suggestions. > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with your > first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en > --- > You received this message because you are subscribed to the Google Groups > "Clojure" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to clojure+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.