I dont understand your code. But I do have suggestion. Run the functions in the profiler, maybe differences will point at the enemy.
Know what I mean? Rprof('check.out') #run code Rprof(NULL) summaryRprof('check.out') Do that for each method. That may be uninformative. I wondered if you tried to compile your functions? In some cases it helps erase differences like this. Norman Matloff has examples like that in Art of R Programming. I keep a list of things that are slow, if we can put finger on problem, I will add to list. I suspect slow here is in runtime object lookup. The environment ones have info located more quickly by the runtime, I expect. Also, passing info back and forth from the R runtime system using [ is a common cause of slow. It is why everybody yells 'vectorize' and 'use lapply' all the time. Then again, I'm guessing because I dont understand your code:) Good luck, PJ On Apr 11, 2017 7:44 PM, "Thomas Mailund" <thomas.mail...@gmail.com> wrote: Hi y’all, I’m working on a book on how to implement functional data structures in R, and in particular on a chapter on implementing queues. You get get the current version here https://www.dropbox.com/s/9c2yk3a67p1ypmr/book.pdf?dl=0 and the relevant pages are 50-59. I’ve implemented three versions of the same idea, implementing a queue using two linked lists. One list contains the elements you add to the end of a list, the other contains the elements at the front of the list, and when you try to get an element from a list and the front-list is empty you move elements from the back-list to the front. The asymptotic analysis is explained in this figure https://www.dropbox.com/s/tzi84zmyq16hdx0/queue- amortized-linear-bound.png?dl=0 and all my implementations do get a linear time complexity when I evaluate them on a linear number of operations. However, the two implementations that uses environments seem to be almost twice as fast as the implementation that gives me a persistent data structure (see https://www.dropbox.com/s/i9dyab9ordkm0xj/queue- comparisons.png?dl=0), and I cannot figure out why. The code below contains the implementation of all three versions of the queue plus the code I use to measure their performances. I’m sorry it is a little long, but it is a minimal implementation of all three variants, the comments just make it look longer than it really is. Since the three implementations are doing basically the same things, I am a little stumped about why the performance is so consistently different. Can anyone shed some light on this, or help me figure out how to explore this further? Cheers Thomas ## Implementations of queues ################## #' Test if a data structure is empty #' @param x The data structure #' @return TRUE if x is empty. #' @export is_empty <- function(x) UseMethod("is_empty") #' Add an element to a queue #' @param x A queue #' @param elm An element #' @return an updated queue where the element has been added #' @export enqueue <- function(x, elm) UseMethod("enqueue") #' Get the front element of a queue #' @param x A queue #' @return the front element of the queue #' @export front <- function(x) UseMethod("front") #' Remove the front element of a queue #' @param x The queue #' @return The updated queue #' @export dequeue <- function(x) UseMethod("dequeue") ## Linked lists ######################### #' Add a head item to a linked list. #' @param elem The item to put at the head of the list. #' @param lst The list -- it will become the tail of the new list. #' @return a new linked list. #' @export list_cons <- function(elem, lst) structure(list(head = elem, tail = lst), class = "linked_list") list_nil <- list_cons(NA, NULL) #' @method is_empty linked_list #' @export is_empty.linked_list <- function(x) identical(x, list_nil) #' Create an empty linked list. #' @return an empty linked list. #' @export empty_list <- function() list_nil #' Get the item at the head of a linked list. #' @param lst The list #' @return The element at the head of the list. #' @export list_head <- function(lst) lst$head #' Get the tail of a linked list. #' @param lst The list #' @return The tail of the list #' @export list_tail <- function(lst) lst$tail #' Reverse a list #' @param lst A list #' @return the reverse of lst #' @export list_reverse <- function(lst) { acc <- empty_list() while (!is_empty(lst)) { acc <- list_cons(list_head(lst), acc) lst <- list_tail(lst) } acc } ## Environment queues ################################################# queue_environment <- function(front, back) { e <- new.env(parent = emptyenv()) e$front <- front e$back <- back class(e) <- c("env_queue", "environment") e } #' Construct an empty closure based queue #' @return an empty queue #' @export empty_env_queue <- function() queue_environment(empty_list(), empty_list()) #' @method is_empty env_queue #' @export is_empty.env_queue <- function(x) is_empty(x$front) && is_empty(x$back) #' @method enqueue env_queue #' @export enqueue.env_queue <- function(x, elm) { x$back <- list_cons(elm, x$back) x } #' @method front env_queue #' @export front.env_queue <- function(x) { if (is_empty(x$front)) { x$front <- list_reverse(x$back) x$back <- empty_list() } list_head(x$front) } #' @method dequeue env_queue #' @export dequeue.env_queue <- function(x) { if (is_empty(x$front)) { x$front <- list_reverse(x$back) x$back <- empty_list() } x$front <- list_tail(x$front) x } ## Closure queues ##################################################### queue <- function(front, back) list(front = front, back = back) queue_closure <- function() { q <- queue(empty_list(), empty_list()) get_queue <- function() q queue_is_empty <- function() is_empty(q$front) && is_empty(q$back) enqueue <- function(elm) { q <<- queue(q$front, list_cons(elm, q$back)) } front <- function() { if (queue_is_empty()) stop("Taking the front of an empty list") if (is_empty(q$front)) { q <<- queue(list_reverse(q$back), empty_list()) } list_head(q$front) } dequeue <- function() { if (queue_is_empty()) stop("Taking the front of an empty list") if (is_empty(q$front)) { q <<- queue(list_tail(list_reverse(q$back)), empty_list()) } else { q <<- queue(list_tail(q$front), q$back) } } structure(list(is_empty = queue_is_empty, get_queue = get_queue, enqueue = enqueue, front = front, dequeue = dequeue), class = "closure_queue") } #' Construct an empty closure based queue #' @return an empty queue #' @export empty_closure_queue <- function() queue_closure() #' @method is_empty closure_queue #' @export is_empty.closure_queue <- function(x) x$is_empty() #' @method enqueue closure_queue #' @export enqueue.closure_queue <- function(x, elm) { x$enqueue(elm) x } #' @method front closure_queue #' @export front.closure_queue <- function(x) x$front() #' @method dequeue closure_queue #' @export dequeue.closure_queue <- function(x) { x$dequeue() x } ## Extended (purely functional) queues ################################ queue_extended <- function(x, front, back) structure(list(x = x, front = front, back = back), class = "extended_queue") #' Construct an empty extended queue #' #' This is just a queue that doesn't use a closure to be able to update #' the data structure when front is called. #' #' @return an empty queue #' @export empty_extended_queue <- function() queue_extended(NA, empty_list(), empty_list()) #' @method is_empty extended_queue #' @export is_empty.extended_queue <- function(x) is_empty(x$front) && is_empty(x$back) #' @method enqueue extended_queue #' @export enqueue.extended_queue <- function(x, elm) queue_extended(ifelse(is_empty(x$back), elm, x$x), x$front, list_cons(elm, x$back)) #' @method front extended_queue #' @export front.extended_queue <- function(x) { if (is_empty(x)) stop("Taking the front of an empty list") if (is_empty(x$front)) x$x else list_head(x$front) } #' @method dequeue extended_queue #' @export dequeue.extended_queue <- function(x) { if (is_empty(x)) stop("Taking the front of an empty list") if (is_empty(x$front)) x <- queue_extended(NA, list_reverse(x$back), empty_list()) queue_extended(x$x, list_tail(x$front), x$back) } ## Performance experiments ###################### library(microbenchmark) library(tibble) library(ggplot2) get_performance_n <- function( algo , n , setup , evaluate , times , ...) { config <- setup(n) benchmarks <- microbenchmark(evaluate(n, config), times = times) tibble(algo = algo, n = n, time = benchmarks$time / 1e9) # time in sec } get_performance <- function( algo , ns , setup , evaluate , times = 10 , ...) { f <- function(n) get_performance_n(algo, n, setup, evaluate, times = times, ...) results <- Map(f, ns) do.call('rbind', results) } setup <- function(n) n evaluate <- function(empty) function(n, x) { elements <- 1:n queue <- empty for (elm in elements) { queue <- enqueue(queue, elm) } for (i in seq_along(elements)) { queue <- dequeue(queue) } } ns <- seq(5000, 10000, by = 1000) performance <- rbind(get_performance("explicity environment", ns, setup, evaluate(empty_env_queue())), get_performance("closure environment", ns, setup, evaluate(empty_closure_queue())), get_performance("functional queue", ns, setup, evaluate(empty_extended_queue()))) ggplot(performance, aes(x = as.factor(n), y = time / n, fill = algo)) + geom_boxplot() + scale_fill_grey("Data structure") + xlab(quote(n)) + ylab(expression(Time / n)) + theme_minimal() [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.