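Rather than writing a new recursive function, you could use something like `(map first (partition-by :value events))`. `partition-by` splits the series into runs of consecutive events that share the same `:value`, and `map first` keeps one event from each run. Both steps are lazy, so the stack does not grow with the length of the series. Note that `first` keeps the first event of each run in sequence order, which is the most recent one in a series sorted newest-first; swap in `last` if you want the oldest event of each run, which appears to be what your test expects. You could also assemble a transducer pipeline using the transducer arities of the same functions: `(into [] (comp (partition-by :value) (map first)) events)`.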
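To make the difference between `first` and `last` concrete, here is a minimal sketch run against a trimmed copy of your test data. The `pick` argument is a hypothetical addition for illustration, not something from your code, and I read `:value` directly rather than going through `get-value-for-event`.

```clojure
;; Sample events, newest first, trimmed to the two keys that matter.
(def events
  [{:timestamp #inst "2016-03-31T19:34:27.313000000-00:00" :value 8000.0}
   {:timestamp #inst "2016-03-31T19:33:31.697000000-00:00" :value 7368.0}
   {:timestamp #inst "2016-03-31T19:33:17.100000000-00:00" :value 6316.0}
   {:timestamp #inst "2016-03-31T19:32:27.100000000-00:00" :value 6316.0}
   {:timestamp #inst "2016-03-31T19:32:17.100000000-00:00" :value 6316.0}
   {:timestamp #inst "2016-03-31T19:31:41.843000000-00:00" :value 5263.0}])

(defn consolidate-events
  "Keep one event per run of consecutive events with equal :value.
  `pick` (hypothetical parameter) chooses which event of each run
  survives, e.g. `first` or `last`. Defaults to `first`."
  ([series] (consolidate-events first series))
  ([pick series]
   ;; `sequence` applies the transducer lazily; no explicit recursion.
   (sequence (comp (partition-by :value) (map pick)) series)))

;; Either choice leaves one event per distinct run of values:
(map :value (consolidate-events events))
;; => (8000.0 7368.0 6316.0 5263.0)

;; `first` keeps the 19:33:17.100 event of the 6316.0 run,
;; `last` keeps the 19:32:17.100 event.
(:timestamp (nth (consolidate-events last events) 2))
;; => #inst "2016-03-31T19:32:17.100-00:00"
```

With `last`, the surviving 6316.0 event is the one that appears in your `s1`, so that is the variant I would expect to pass `consolidate-events-test`.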
Hope that helps,
Tom

On Tue, May 17, 2016 at 10:47 AM, 'Simon Brooke' via Clojure <[email protected]> wrote:

> I'm having trouble with writing a function
>
> 1. in idiomatic clojure
> 2. which doesn't blow the stack
>
> The problem is I have a time series of events e.g.
>
> ({:idhistory 78758272, :timestamp #inst "2016-03-31T19:34:27.313000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 8000.0}
>  {:idhistory 78756591, :timestamp #inst "2016-03-31T19:33:31.697000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 7368.0}
>  {:idhistory 78754249, :timestamp #inst "2016-03-31T19:32:17.100000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 6316.0}
>  {:idhistory 78753165, :timestamp #inst "2016-03-31T19:31:41.843000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 5263.0}
>  {:idhistory 78751187, :timestamp #inst "2016-03-31T19:30:36.213000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 4211.0}
>  {:idhistory 78749476, :timestamp #inst "2016-03-31T19:29:41.363000000-00:00",
>   :nameid 5637, :stringvalue nil, :value 3158.0} ...)
>
> which is to say, each event is a map, and each event has two critical
> keys, :timestamp and :value. The series is sorted in descending order by
> timestamp, i.e. most recent event first. These series are of up to millions
> of events; the average length of the series is about half a million events.
> However, many contain successive events at which the value does not change,
> and where the value doesn't change I want to retain only the first event.
>
> So far what I've got is:
>
> (defn consolidate-events
>   "Return a time series like this `series`, but without those events whose value is
>   identical to the value of the preceding event."
>   [series]
>   (let [[car cadr & cddr] series]
>     (cond
>       (empty? series) series
>       (=
>         (get-value-for-event car)
>         (get-value-for-event cadr)) (consolidate-events (rest series))
>       true (cons car (consolidate-events (rest series))))))
>
> Obviously, with millions of events or even merely hundreds of thousands, a
> recursive function blows the stack. Furthermore, this one isn't even tail
> call optimisable. I tried creating an inner function which I naively
> thought should be tail call optimisable, but it fails 'Can only recur from
> tail position':
>
> (defn consolidate-events
>   "Return a time series like this `series`, but without those events whose value is
>   identical to the value of the preceding event."
>   [series]
>   (remove
>     nil?
>     (let [inner (fn [series]
>                   (let [[car cadr & cddr] series]
>                     (if
>                       (not (empty? series))
>                       ;; then
>                       (cons
>                         (if
>                           (= (get-value-for-event car)
>                              (get-value-for-event cadr))
>                           ;; then
>                           nil
>                           ;; else
>                           car)
>                         (if
>                           (not (empty? series))
>                           (recur (rest series)))))))]
>       (inner series))))
>
> Test for the function is as follows:
>
> (deftest consolidate-events-test
>   (testing "consolidate-events"
>     (let [s1 [{:timestamp #inst "2016-03-31T19:34:27.313000000-00:00", :value 8000.0}
>               {:timestamp #inst "2016-03-31T19:33:31.697000000-00:00", :value 7368.0}
>               {:timestamp #inst "2016-03-31T19:32:17.100000000-00:00", :value 6316.0}
>               {:timestamp #inst "2016-03-31T19:31:41.843000000-00:00", :value 5263.0}
>               {:timestamp #inst "2016-03-31T19:30:36.213000000-00:00", :value 4211.0}
>               {:timestamp #inst "2016-03-31T19:29:41.363000000-00:00", :value 3158.0}]
>           s2 [{:timestamp #inst "2016-03-31T19:34:27.313000000-00:00", :value 8000.0}
>               {:timestamp #inst "2016-03-31T19:33:31.697000000-00:00", :value 7368.0}
>               {:timestamp #inst "2016-03-31T19:33:17.100000000-00:00", :value 6316.0}
>               {:timestamp #inst "2016-03-31T19:32:27.100000000-00:00", :value 6316.0}
>               {:timestamp #inst "2016-03-31T19:32:17.100000000-00:00", :value 6316.0}
>               {:timestamp #inst "2016-03-31T19:31:41.843000000-00:00", :value 5263.0}
>               {:timestamp #inst "2016-03-31T19:30:36.213000000-00:00", :value 4211.0}
>               {:timestamp #inst "2016-03-31T19:29:41.363000000-00:00", :value 3158.0}]]
>       (is (= s1 (consolidate-events s1)) "There are no events in s1 that can be consolidated")
>       (is (= s1 (consolidate-events s2)) "When consolidated, s2 = s1")
>       (is (not (= s2 (consolidate-events s2))) "When consolidated, s2 no longer equals s2"))))
>
> Any help gratefully accepted!
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
