On 20/11/2012 12:39 PM, Omphalodes Verna wrote:
Dear list!
I have a question about 'correct function formation'. Which function (fun1 or fun2; see below) is written more correctly: using ''structure'' as output, or creating an empty ''data.frame'' and then filling it in as output? (fun1 and fun2 are just for illustration.)
Thanks a lot, OV
code:
input <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
fun1 <- function(x) {
ID <- NULL; minimum <- NULL; maximum <- NULL
for(i in seq_along(names(x))) {
ID[i] <- names(x)[i]
minimum[i] <- min(x[, names(x)[i]])
maximum[i] <- max(x[, names(x)[i]])
}
output <- structure(list(ID, minimum, maximum), row.names = seq_along(names(x)), .Names = c("ID",
"minimum", "maximum"), class = "data.frame")
return(output)
}
fun1 above relies on the internal implementation of the data.frame
class. That's really unlikely to change, but you still shouldn't rely
on it.
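To illustrate the point, here is a minimal sketch (the names bad and msg are made up for this example). It assumes that structure() only attaches the attributes it is given and does not check them, whereas the data.frame() constructor validates its arguments:

```r
# structure() attaches names, row.names and class without checking that
# the columns have the same length, so this call creates an inconsistent object.
bad <- structure(list(a = 1:3, b = 1:2),
                 row.names = 1:3, class = "data.frame")
inherits(bad, "data.frame")

# The public constructor refuses the same input; the error is captured
# here as a string so the script keeps running.
msg <- tryCatch(data.frame(a = 1:3, b = 1:2),
                error = function(e) conditionMessage(e))
msg
```

Going through data.frame() means any such mistake is caught when the object is built rather than later, when it is used.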
fun2 <- function(x) {
output <- data.frame(ID = character(), minimum = numeric(), maximum =
numeric(), stringsAsFactors = FALSE)
for(i in seq_along(names(x))) {
output[i, "ID"] <-names(x)[i]
output[i, "minimum"] <- min(x[, names(x)[i]])
output[i, "maximum"] <- max(x[, names(x)[i]])
}
return(output)
}
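This one is going to be really slow, because every pass through the loop
makes three indexed assignments into the output dataframe, and each of
those is far more expensive than assigning into a plain vector.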
I would combine the approaches: assign to local variables in the loop
the way fun1 does, then construct a dataframe at the end. That is,
output <- data.frame(ID, minimum, maximum)
return(output)
One other change: don't initialize the local variables to NULL,
initialize them to their final size, e.g.
ID <- character(ncol(x))
minimum <- numeric(ncol(x))
maximum <- numeric(ncol(x))
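Put together, the suggestion might look like the sketch below. The name fun3 is hypothetical (it is not in the original post), and the code assumes every column of x is numeric, as in the example input:

```r
fun3 <- function(x) {
  n <- ncol(x)
  # Allocate each result vector at its final size before the loop.
  ID <- character(n)
  minimum <- numeric(n)
  maximum <- numeric(n)
  for (i in seq_len(n)) {
    ID[i] <- names(x)[i]
    minimum[i] <- min(x[[i]])
    maximum[i] <- max(x[[i]])
  }
  # Build the dataframe once, at the end, through the public constructor.
  data.frame(ID, minimum, maximum, stringsAsFactors = FALSE)
}

input <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
fun3(input)
```

The loop only touches plain vectors, and the dataframe is created in a single call.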
(And if the contents are as simple as in the example, you don't need the
loop, but I assume the real case is more complicated.)
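For a case as simple as the example, a loop-free version could be sketched as follows. The name fun4 is hypothetical, and the code again assumes all columns of x are numeric:

```r
fun4 <- function(x) {
  data.frame(
    ID = names(x),
    # vapply() applies the function to each column and checks that every
    # result is a single number; unname() drops the column names so they
    # do not end up as row names.
    minimum = unname(vapply(x, min, numeric(1))),
    maximum = unname(vapply(x, max, numeric(1))),
    stringsAsFactors = FALSE
  )
}

input <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
fun4(input)
```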
Duncan Murdoch
fun1(input)
fun2(input)
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.