I tried to hide the gory details, as the structure of my datasets is rather complicated. Basically it's a long list of lists, which in turn contain character vectors, dates, numerics and dataframes, all named. While the hierarchy is fixed, neither the number of elements nor their ordering is. But if I try to access a certain element, then I know it is there and contains sensible data. For a typical day of measurements the whole package weighs around 1 GiB. How often and what I need to extract varies, as the analysis is rather dynamic.
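
To illustrate the layout described above, here is a small made-up sketch. The names (measurements, id, when, tab, extra) and the values are invented for illustration and are not the real data; the only assumption carried over is that every inner list contains the element being asked for, at an unknown position.

```r
# Hypothetical stand-in for the real data: a list of named lists whose
# elements differ in number and order (all names here are invented).
measurements <- list(
  list(id = "run1", when = as.Date("2010-12-07"), a = 1,
       tab = data.frame(x = 1:2, y = c(0.1, 0.2))),
  list(a = 4, id = "run2", when = as.Date("2010-12-07")),
  list(when = as.Date("2010-12-06"), a = 7, extra = "note", id = "run3")
)

# Select by name, because the position of "a" is not fixed.
# vapply() declares the expected type and length of each result up front.
a_values <- vapply(measurements, function(m) m[["a"]], numeric(1))
ids <- vapply(measurements, function(m) m[["id"]], character(1))

print(a_values)
print(ids)
```

Selecting by numeric position would not work here, since "a" sits at a different index in each inner list. Whether this is fast enough on the real 1 GiB structure is untested.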
As far as I can see, a thorough refactoring of the datasets so that everything is contained in one large dataframe might be a solution. But I wouldn't be too unhappy if I could avoid this rather tedious work.

Alex

On 07.12. 18:26, William Dunlap wrote:
> To find the fastest method you need to tell more
> about the constraints on your problem.
> Do you always have a list of lists of scalars
> or are the lists buried at various depths
> or do the numeric vectors at the leaves have
> various lengths?
> If you always have a list of lists of scalars,
> do the names always come in the same order?
> (It may be faster to select by numeric position
> than by name).
> Do all the lists of numeric vectors contain an
> element by the given name?
> What is a typical size for the problem? How
> many times do you typically need to repeat
> the solution?
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org
>> [mailto:r-help-boun...@r-project.org] On Behalf Of Alexander Senger
>> Sent: Tuesday, December 07, 2010 9:12 AM
>> To: r-help@r-project.org
>> Subject: Re: [R] fast subsetting of lists in lists
>>
>> Hello Gerrit, Gabor,
>>
>> thank you for your suggestion.
>>
>> Unfortunately unlist seems to be rather expensive. A short test with one
>> of my datasets gives 0.01s for an extraction based on my approach and
>> 5.6s for unlist alone. The reason seems to be that unlist relies on
>> lapply internally and does so recursively?
>>
>> Maybe there is still another way to go?
>>
>> Alex
>>
>> On 07.12.2010 15:59, Gerrit Eichner wrote:
>>> Hello, Alexander,
>>>
>>> does
>>>
>>> utest <- unlist(test)
>>> utest[ names( utest) == "a"]
>>>
>>> come close to what you need?
>>>
>>> Hth,
>>>
>>> Gerrit
>>>
>>>
>>> On Tue, 7 Dec 2010, Alexander Senger wrote:
>>>
>>>> Hello,
>>>>
>>>> my data is contained in nested lists (which seems not necessarily to be
>>>> the best approach). What I need is a fast way to get subsets from the
>>>> data.
>>>>
>>>> An example:
>>>>
>>>> test <- list(list(a = 1, b = 2, c = 3), list(a = 4, b = 5, c = 6),
>>>> list(a = 7, b = 8, c = 9))
>>>>
>>>> Now I would like to have all values in the named variables "a", that is
>>>> the vector c(1, 4, 7). The best I could come up with is:
>>>>
>>>> val <- sapply(1:3, function (i) {test[[i]]$a})
>>>>
>>>> which is unfortunately not very fast. According to R-inferno this is due
>>>> to the fact that apply and its derivatives do looping in R rather than
>>>> rely on C-subroutines as the common [-operator.
>>>>
>>>> Does someone know a trick to do the same as above with the faster
>>>> built-in subsetting? Something like:
>>>>
>>>> test[<somesubsettingmagic>]
>>>>
>>>>
>>>> Thank you for your advice
>>>>
>>>>
>>>> Alex
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.