Sorry about that -- Gmail threaded things by arrival in my mailbox,
not by timestamp (surprisingly).

Rcpp really is one of the coolest new things in the R ecosystem --
hope it works well for you.

M

On Sun, Jan 29, 2012 at 11:12 AM, Kevin Ummel <kevinum...@gmail.com> wrote:
> Sorry, guys. I'm not active on the listserv, so my last post was held by the
> moderator until after Dirk's solution was posted.
>
> Excellent stuff.
>
> thanks,
> kevin
>
> On Jan 29, 2012, at 8:37 AM, R. Michael Weylandt wrote:
>
>> Have you not followed your own thread? Dirk is Mr. Rcpp himself, and he
>> gives an implementation that gets you a 25x improvement here, as well as
>> tips for getting even more out of it:
>>
>> http://tolstoy.newcastle.edu.au/R/e17/help/12/01/2471.html
>>
>> Michael
>>
>> On Sat, Jan 28, 2012 at 12:28 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>>> Thanks. I've played around with pure R solutions. The fastest rewrite of
>>> diff (for the lag-1 case) that I can find is this:
>>>
>>> diff2 = function(x) {
>>>  y = c(x,NA) - c(NA,x)
>>>  y[2:length(x)]
>>> }
>>>
>>> #Compiling via 'cmpfun' doesn't seem to help (or hurt):
>>> require(compiler)
>>> diff2 = cmpfun(diff2)
>>>
>>> But that only gets ~10% improvement over default 'diff' on my machine. 
>>> Still too slow for my particular application.
>>>
>>> I'm inclined towards Michael's suggestion of inline+Rcpp (or some other use 
>>> of C under the hood).
>>>
>>> Could someone show me how to go about doing that?
>>>
>>> Thanks!
>>> Kevin
>>>
>>> On Jan 28, 2012, at 9:14 AM, Peter Langfelder wrote:
>>>
>>>> ehm... this doesn't take very many ideas.
>>>>
>>>>
>>>> x = runif(n=10e6, min=0, max=1000)
>>>> x = round(x)
>>>>
>>>> system.time( {
>>>>  y = x[-1] - x[-length(x)]
>>>> })
>>>>
>>>> I get about 0.5 seconds on my old laptop.
>>>>
>>>> HTH
>>>>
>>>> Peter
>>>>
>>>>
>>>> On Fri, Jan 27, 2012 at 4:15 PM, Kevin Ummel <kevinum...@gmail.com> wrote:
>>>>> Hi everyone,
>>>>>
>>>>> Speed is the key here.
>>>>>
>>>>> I need to find the difference between a vector and its one-period lag 
>>>>> (i.e. the difference between each value and the subsequent one in the 
>>>>> vector). Let's say the vector contains 10 million random integers between 
>>>>> 0 and 1,000. The solution vector will have 9,999,999 values, since there
>>>>> is no lag for the 1st observation.
>>>>>
>>>>> In R we have:
>>>>>
>>>>> #Set up input vector
>>>>> x = runif(n=10e6, min=0, max=1000)
>>>>> x = round(x)
>>>>>
>>>>> #Find one-period difference
>>>>> y = diff(x)
>>>>>
>>>>> Question is: How can I get the 'diff(x)' part as fast as absolutely 
>>>>> possible? I queried some colleagues who work with other languages, and 
>>>>> they provided equivalent solutions in Python and Clojure that, on their 
>>>>> machines, appear to be potentially much faster (I've put the code below 
>>>>> in case anyone is interested). However, they mentioned that the overhead 
>>>>> in passing the data between languages could kill any improvements. I 
>>>>> don't have much experience integrating other languages, so I'm hoping the 
>>>>> community has some ideas about how to approach this particular problem...
>>>>>
>>>>> Many thanks,
>>>>> Kevin
>>>>>
>>>>> In IPython:
>>>>>
>>>>> In [3]: import numpy as np
>>>>> In [4]: arr = np.random.randint(0, 1000, (10000000,1)).astype("int16")
>>>>> In [5]: arr1 = arr[1:].view()
>>>>> In [6]: timeit arr2 = arr1 - arr[:-1]
>>>>> 10 loops, best of 3: 20.1 ms per loop
>>>>>
>>>>> In Clojure:
>>>>>
>>>>> (defn subtract-lag
>>>>>  [n]
>>>>>  (let [v (take n (repeatedly rand))]
>>>>>    (time (dorun (map - v (cons 0 v))))))
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide 
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>

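Note for readers of the archive: the lag-1 difference asked about in this
thread comes down to a single pass over the vector in compiled code. Below is
a minimal standalone C++ sketch of that loop. It is not Dirk's implementation
from the linked post, which is not reproduced here. The function name
lag1_diff is hypothetical, and std::vector is only a stand-in for the R
vector types that an Rcpp wrapper would use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the thread): lag-1 difference,
// y[i] = x[i + 1] - x[i]. The result has one element fewer than x.
std::vector<double> lag1_diff(const std::vector<double>& x) {
    // Fewer than two values means there is nothing to difference.
    if (x.size() < 2) {
        return {};
    }
    std::vector<double> y(x.size() - 1);
    // One pass over the input, no temporary copies of x.
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        y[i] = x[i + 1] - x[i];
    }
    return y;
}
```

For a 10-million-element input this returns 9,999,999 values, the same length
that diff(x) gives in R for a lag of 1.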