thanks for the link! I should read it through. that said, I didn't find any
good general solution to the problem so here I post some attempts for
general input. maybe someone knows how to speed this up. both my solutions
are theoretically O(n) for creating a list of n elements. The function to
improve is O(n^2) which should suck tremendously - but the slow execution
of R probably blows up the constant factor of the smarter solutions.

Array doubling comes close in speed for large lists but it would be great
if it could be comparable for smaller lists. One hidden cost I see directly
is that allocating a list in R is O(n), not O(1) (or close), since it
always fills it with values. Is there a way around this? I guess by using
C, one could just malloc() and leave the content undefined - but is there
no better way?

thanks,
/Johan


################################
# the function we wish to improve

f<-function(dotot){
  v<-matrix(ncol=1,nrow=0)
  for(i in 1:dotot){
    v<-rbind(v,FALSE)
  }
  return(v)
}

##########################
# first attempt: linked lists

emptylist <- NA

addtolist <- function(x,prev){
  return(list(x,prev))
}

g<-function(dotot){
  v<-emptylist
  for(i in 1:dotot){
    v<-addtolist(FALSE,v)
  }
  return(v)
}

####################################
# second attempt: array doubling

emptyexpandlist<-list(nelem=0,l=matrix(ncol=1,nrow=0))

addexpandlist<-function(x,prev){
  if(nrow(prev$l)==prev$nelem){
    nextsize<-max(nrow(prev$l),1)
    prev$l<-rbind(prev$l,matrix(ncol=1,nrow=nextsize))
  }
  prev$nelem<-prev$nelem+1
  prev$l[prev$nelem]<-x
  return(prev)
}

compressexpandlist<-function(prev){
  return(as.vector(prev$l[1:prev$nelem]))
}

h<-function(dotot){
  v<-emptyexpandlist
  for(i in 1:dotot){
    v<-addexpandlist(FALSE,v)
  }
  return(compressexpandlist(v))
}

#########################################

dotot=100000
system.time(f(dotot))
#system.time(g(dotot))
system.time(h(dotot))








On Tue, Jul 17, 2012 at 8:42 PM, Patrick Burns <pbu...@pburns.seanet.com>wrote:

> Johan,
>
> If you don't know 'The R Inferno', it might
> help a little.  Circle 2 has an example of
> how to efficiently (relatively speaking) grow
> an object if you don't know the final length.
>
> http://www.burns-stat.com/**pages/Tutor/R_inferno.pdf<http://www.burns-stat.com/pages/Tutor/R_inferno.pdf>
>
> If you gave a simple example of how your code
> looks now and what you want it to do, then you
> might get some ideas of how to improve it.
>
>
> Pat
>
>
> On 17/07/2012 12:47, Johan Henriksson wrote:
>
>> Hello!
>> I am optimizing my code in R and for this I need to know a bit more about
>> the internals. It would help tremendously if someone could link me to a
>> page with O()-complexities of all the operations.
>>
>> In this particular case, I need something like a linked list with O(1)
>> insertLast/First ability. I can't preallocate a vector since I do not know
>> the final size of the list ahead of time.
>>
>> The classic array-doubling trick would give me O(1) amortized time but
>> it's
>> a bit messy. The other alternative I see would be to recursively store
>> lists (elem, (elem, (elem, (...)))), which I take also would work? But I'd
>> rather go for a standard R solution if there is one!
>>
>> cheers,
>> /Johan
>>
>>
> --
> Patrick Burns
> pbu...@pburns.seanet.com
> twitter: @portfolioprobe
> http://www.portfolioprobe.com/**blog <http://www.portfolioprobe.com/blog>
> http://www.burns-stat.com
> (home of 'Some hints for the R beginner'
> and 'The R Inferno')
>
>
>


-- 
-- 
-----------------------------------------------------------
Johan Henriksson, PhD
Karolinska Institutet
Ecobima AB - Custom solutions for life sciences
http://www.ecobima.se <http://www.ecobima.com>  http://mahogny.areta.org
http://www.endrov.net

<http://www.endrov.net>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to