ehm... this doesn't take very many ideas.
x = runif(n=10e6, min=0, max=1000)
x = round(x)
system.time( {
  y = x[-1] - x[-length(x)]
})

I get about 0.5 seconds on my old laptop.

HTH
Peter

On Fri, Jan 27, 2012 at 4:15 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
> Hi everyone,
>
> Speed is the key here.
>
> I need to find the difference between a vector and its one-period lag (i.e.
> the difference between each value and the subsequent one in the vector).
> Let's say the vector contains 10 million random integers between 0 and 1,000.
> The solution vector will have 9,999,999 values, since there is no lag for the
> 1st observation.
>
> In R we have:
>
> #Set up input vector
> x = runif(n=10e6, min=0, max=1000)
> x = round(x)
>
> #Find one-period difference
> y = diff(x)
>
> Question is: How can I get the 'diff(x)' part as fast as absolutely possible?
> I queried some colleagues who work with other languages, and they provided
> equivalent solutions in Python and Clojure that, on their machines, appear to
> be potentially much faster (I've put the code below in case anyone is
> interested). However, they mentioned that the overhead in passing the data
> between languages could kill any improvements. I don't have much experience
> integrating other languages, so I'm hoping the community has some ideas about
> how to approach this particular problem...
>
> Many thanks,
> Kevin
>
> In iPython:
>
> In [3]: import numpy as np
> In [4]: arr = np.random.randint(0, 1000, (10000000,1)).astype("int16")
> In [5]: arr1 = arr[1:].view()
> In [6]: timeit arr2 = arr1 - arr[:-1]
> 10 loops, best of 3: 20.1 ms per loop
>
> In Clojure:
>
> (defn subtract-lag
>   [n]
>   (let [v (take n (repeatedly rand))]
>     (time (dorun (map - v (cons 0 v))))))
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
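
For anyone wanting to try this, here is a minimal sketch (not from the thread itself) that checks the shifted-subtraction approach against diff(x) and times both. The smaller n and the fixed seed are arbitrary choices for illustration. The as.integer() variant is only an assumption about something worth trying, since the NumPy example in the original message uses an integer type; nobody in the thread measured it.

```r
# Illustrative only: n and the seed are arbitrary choices, not from the thread.
set.seed(1)
n <- 1e6
x <- round(runif(n = n, min = 0, max = 1000))

# Lagged difference via base R's diff()
y_diff <- diff(x)

# Same result by subtracting the vector from a shifted copy of itself
y_shift <- x[-1] - x[-length(x)]

# Both approaches should agree exactly and have n - 1 elements
stopifnot(length(y_shift) == n - 1, identical(y_diff, y_shift))

# Assumption: integer storage (4 bytes per element instead of 8) may help;
# this is untested in the thread, so measure it on your own machine.
xi <- as.integer(x)
yi_shift <- xi[-1] - xi[-length(xi)]
stopifnot(is.integer(yi_shift), all(yi_shift == y_shift))

# Timings vary by machine, so no expected values are given here
print(system.time(diff(x)))
print(system.time(x[-1] - x[-length(x)]))
print(system.time(xi[-1] - xi[-length(xi)]))
```

The stopifnot() calls are there so that a faster variant is only compared once it is known to give the same answer as diff(x).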