Hey,
I got an assignment to implement an algorithm to calculate strongly connected components in a graph ( http://en.wikipedia.org/wiki/Kosaraju's_algorithm). The graph is rather big, it has ~900.000 vertices. In its first pass, it needs to do a depth-first search on the graph and calculate the finishing time for each node (the finishing time for a node is a number from 0…V-1 where V is the number of vertices). Potentially several depth-first search need to be launched (done in the finishing-times function below) to discover all components. The version I pasted below is the most performant. It discovers ~600.000 vertices (2/3 of all vertices). However, on subsequent dfs-by-finishing-times it becomes extremely slow (exploring a handful of additional nodes/second) and I'm not sure why. Here are a few things I tried to speed up the algorithm: * rewrite to a recursive function (not tail-recursive) and use lazy-seqs * define dfs-by-finishing-times in finishing-times so that the whole graph (G) does not have to be passed at each call. * use persistent data structures everywhere instead of transients (I know this would only slow things down but it did not hurt to try) * use a profiler (VisualVM) to learn where the bottlenecks are. I'm not very proficient with profilers but I could not extract any valuable information Here are the code snippets in question: (I also pasted it here: https://www.refheap.com/paste/3840cc772cc3a5b1d9c4f1db3 for better readability) (defn dfs-by-finishing-times ([G u] (dfs-by-finishing-times G u #{})) ([G u explored] (loop [[v & vs :as stack] (list u), explored (transient explored), lhalf [], rhalf [], iter-cnt 0] (if (seq stack) (let [neighbors (persistent! (reduce (fn [c u] (if (explored u) c (conj! c u))) (transient []) (G v)))] (cond (explored v) (recur vs explored lhalf rhalf (inc iter-cnt)) (empty? neighbors) (recur vs (conj! explored v) (conj lhalf v) rhalf (inc iter-cnt)) :else (recur (reduce (fn [stack e] (cons e stack)) vs neighbors) (conj! explored v) lhalf (cons v rhalf) (inc iter-cnt)))) (concat lhalf rhalf))) )) (defn finishing-times [G vertices] "The first pass of Kosaraju's algorithm. Scan the transpose graph of G, and mark the finishing time for each. G should already be the transposed graph" (loop [[u & vs :as stack] (seq vertices) explored #{}, finished []] (if (nil? u) finished (let [path (dfs-by-finishing-times G u explored) new-explored (into explored path)] (recur (remove new-explored vs) new-explored (into finished path)))))) Do you have any insights into what technique I could use to speed up my algorithm? I'm pretty sure I'm missing a key point but I'm not sure what. Presumably the whole algorithm runs under 10 seconds in C# on the same graph so this is rather embarrassing :) Appreciate your help, Balint -- -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.