Sorry about that -- Gmail threaded things by arrival in my mailbox, not by timestamp (surprisingly).
Rcpp really is one of the coolest new things in the R ecosystem -- hope it works well for you.

M

On Sun, Jan 29, 2012 at 11:12 AM, Kevin Ummel <kevinum...@gmail.com> wrote:
> Sorry, guys. I'm not active on the listserve, so my last post was held by the
> moderator until after Dirk's solution was posted.
>
> Excellent stuff.
>
> thanks,
> kevin
>
> On Jan 29, 2012, at 8:37 AM, R. Michael Weylandt wrote:
>
>> Have you not followed your own thread? Dirk is Mr. Rcpp himself and he
>> gives an implementation that gives you 25x improvement here as well as
>> tips for getting even more out of it:
>>
>> http://tolstoy.newcastle.edu.au/R/e17/help/12/01/2471.html
>>
>> Michael
>>
>> On Sat, Jan 28, 2012 at 12:28 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>>> Thanks. I've played around with pure R solutions. The fastest re-write of
>>> diff (for the 1 lag case) I can seem to find is this:
>>>
>>> diff2 = function(x) {
>>>   y = c(x,NA) - c(NA,x)
>>>   y[2:length(x)]
>>> }
>>>
>>> #Compiling via 'cmpfun' doesn't seem to help (or hurt):
>>> require(compiler)
>>> diff2 = cmpfun(diff2)
>>>
>>> But that only gets ~10% improvement over default 'diff' on my machine.
>>> Still too slow for my particular application.
>>>
>>> I'm inclined towards Michael's suggestion of inline+Rcpp (or some other use
>>> of C under the hood).
>>>
>>> Could someone show me how to go about doing that?
>>>
>>> Thanks!
>>> Kevin
>>>
>>> On Jan 28, 2012, at 9:14 AM, Peter Langfelder wrote:
>>>
>>>> ehm... this doesn't take very many ideas.
>>>>
>>>>
>>>> x = runif(n=10e6, min=0, max=1000)
>>>> x = round(x)
>>>>
>>>> system.time( {
>>>>   y = x[-1] - x[-length(x)]
>>>> })
>>>>
>>>> I get about 0.5 seconds on my old laptop.
>>>>
>>>> HTH
>>>>
>>>> Peter
>>>>
>>>>
>>>> On Fri, Jan 27, 2012 at 4:15 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>>>>> Hi everyone,
>>>>>
>>>>> Speed is the key here.
>>>>>
>>>>> I need to find the difference between a vector and its one-period lag
>>>>> (i.e. the difference between each value and the subsequent one in the
>>>>> vector). Let's say the vector contains 10 million random integers between
>>>>> 0 and 1,000. The solution vector will have 9,999,999 values, since there
>>>>> is no lag for the 1st observation.
>>>>>
>>>>> In R we have:
>>>>>
>>>>> #Set up input vector
>>>>> x = runif(n=10e6, min=0, max=1000)
>>>>> x = round(x)
>>>>>
>>>>> #Find one-period difference
>>>>> y = diff(x)
>>>>>
>>>>> Question is: How can I get the 'diff(x)' part as fast as absolutely
>>>>> possible? I queried some colleagues who work with other languages, and
>>>>> they provided equivalent solutions in Python and Clojure that, on their
>>>>> machines, appear to be potentially much faster (I've put the code below
>>>>> in case anyone is interested). However, they mentioned that the overhead
>>>>> in passing the data between languages could kill any improvements. I
>>>>> don't have much experience integrating other languages, so I'm hoping the
>>>>> community has some ideas about how to approach this particular problem...
>>>>>
>>>>> Many thanks,
>>>>> Kevin
>>>>>
>>>>> In iPython:
>>>>>
>>>>> In [3]: import numpy as np
>>>>> In [4]: arr = np.random.randint(0, 1000, (10000000,1)).astype("int16")
>>>>> In [5]: arr1 = arr[1:].view()
>>>>> In [6]: timeit arr2 = arr1 - arr[:-1]
>>>>> 10 loops, best of 3: 20.1 ms per loop
>>>>>
>>>>> In Clojure:
>>>>>
>>>>> (defn subtract-lag
>>>>>   [n]
>>>>>   (let [v (take n (repeatedly rand))]
>>>>>     (time (dorun (map - v (cons 0 v))))))
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
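For anyone reading this thread in the archive: Dirk's actual Rcpp code is in the post linked above and is not reproduced here. As a rough sketch only, the core loop that a compiled lag-1 difference needs might look like the following in plain C++. The function name diff_lag1 is hypothetical, and the code assumes std::vector<double> rather than Rcpp's own vector types, so the glue needed to call it from R (via inline or Rcpp) is deliberately left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the thread or from Rcpp.
// Lag-1 difference: out[i] = x[i + 1] - x[i], so the result has
// one element fewer than the input.
std::vector<double> diff_lag1(const std::vector<double>& x) {
    // Fewer than two values means there is nothing to difference.
    if (x.size() < 2) return {};

    std::vector<double> out(x.size() - 1);
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        out[i] = x[i + 1] - x[i];
    }
    return out;
}
```

For an input of length two or more this computes the same values as x[-1] - x[-length(x)] in Peter's reply, in a single pass and without building the two temporary sub-vectors.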