On 12/30/2010 02:30 PM, Paul Rigor wrote: > Thanks gang, > I'll work with named vectors and concatenate as needed.
This might be ok for small problems, but concatenation is an inefficient R pattern -- the objects being concatenated are copied in full, so becomes longer, and the concatenation slower, with each new key. With a <- integer(); t0 <- Sys.time() for (i in seq_len(1e6)) { a <- c(a, i) if (0 == i %% 10000) print(i / as.numeric(Sys.time() - t0)) } we have, in 'appends per second' [1] 3236.76 [1] 2425.111 [1] 1757.52 [1] 1331.846 We don't really have a dictionary here, either, as the 'key' values are not stored. Phil's suggest suffers from the same type of issue, where the addition of new keys implies growing (reallocating) the vector. a <- integer(); t0 <- Sys.time() for (i in seq_len(1e6)) { key <- as.character(i) a[[key]] <- i if (0 == i %% 10000) print(i / as.numeric(Sys.time() - t0)) } [1] 12659.18 [1] 9516.288 [1] 6821.47 [1] 5907.782 Better to use an environment (and live with reference semantics) e <- new.env(parent=emptyenv()); t0 <- Sys.time() for (i in seq_len(1e6)) { key <- as.character(i) e[[key]] <- i if (0 == i %% 10000) print(i / as.numeric(Sys.time() - t0)) } with [1] 20916.56 [1] 21421.85 [1] 21762.04 [1] 21207.69 [1] 21239.19 The usual alternative to the concatenation pattern is 'pre-allocate-and-fill' x <- integer(1e6); t0 <- Sys.time() for (i in seq_len(1e6)) { ??? } but this doesn't work with key/value pairs because there is no sense (or is there?) in which the keys can be 'pre-allocated'. Creating the dictionary in one go is very efficient > system.time(d <- + structure(seq_len(1e6), .Names=as.character(seq_len(1e6)))) user system elapsed 0.417 0.002 0.419 Martin > Paul > > On Thu, Dec 23, 2010 at 7:39 AM, Seth Falcon <s...@userprimary.net > <mailto:s...@userprimary.net>> wrote: > > On Wed, Dec 22, 2010 at 7:05 PM, Martin Morgan <mtmor...@fhcrc.org > <mailto:mtmor...@fhcrc.org>> wrote: > > On 12/22/2010 05:49 PM, Paul Rigor wrote: > >> Hi, > >> > >> I was wondering if anyone has played around this this package called > >> "rdict"? It attempts to implement a hash table in R using skip > lists. Just > >> came across it while trying to look for simpler text manipulation > methods: > >> > >> > http://userprimary.net/posts/2010/05/29/rdict-skip-list-hash-table-for-R/ > > > > kind of an odd question, so kind of an odd answer. > > > > I'd say this was an implementation of skip lists in C with an R > > interface. > > I had to play around with the rdict package in order to write it, but > haven't used it much since :-P > Be sure to look at R's native environment objects which provide a hash > table structure and are suitable for many uses. > > + seth > > -- > Seth Falcon | @sfalcon | http://userprimary.net/ > > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.