Hello David,

On Sat, 23 Nov 2019 11:58:42 -0500
David Disabato <ddisa...@gmail.com> wrote:
> For example, if I want to add a new column to a data.frame, I can do
> something like `myDataFrame[, "newColumn"] <- NA`.

<Opinion>
Arguably, iterative growth of data structures is not the "R style",
since it may lead to costly reallocations, resulting in the worst-case
scenario of quadratic behaviour for linear operations. If iterative
processing is unavoidable, it might help to store partial results in a
list, then build the final matrix with a single call to
do.call(cbind, results).
</Opinion>

> However, with a matrix, this syntax does not work and I have to use a
> call to `cbind` and create a new object. For example, `mymatrix2 <-
> cbind(mymatrix, "newColumn" = NA)`.

> Is there a programming reason that base R does not have a matrix
> method for `[<-` or is it something that arguably should be added?

A data frame is a list of columns, so adding a new column is relatively
cheap: allocate enough memory for one column and append (roughly
speaking) a pointer to the list of pointers-to-column-data. This
results in reallocation of the *latter* list, but, since that list is
small in comparison to the whole data frame, it's okay. Note that this
operation does not affect any of the other columns belonging to the
same data frame.

A matrix, on the other hand, is a vector containing the whole matrix
with array dimensions stored as an attribute. Since R matrices are
stored by column [*], adding a new column to the matrix means resizing
the buffer to hold length(matrix) + nrow(matrix) elements, then
appending the new column to the end of the buffer. If the allocator
cannot enlarge the buffer in place (because the buffer is followed in
memory by another buffer), it has to allocate the new buffer elsewhere,
copy the memory, then free the old buffer.

To build a matrix by appending columns, one needs to perform this O(n)
operation O(n) times, resulting in O(n^2) performance. Adding rows is
even worse because memory has to be copied in parts, not as a whole.
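
In case it helps, here is a minimal sketch of the points above. The
names make_column, grow, collect and prealloc, as well as the sizes
n_rows and n_cols, are made up for illustration and are not part of
base R. The timings will differ between machines and R versions, so no
numbers are quoted here.

```r
# A matrix is a plain vector plus a "dim" attribute, stored by column.
m <- matrix(1:6, nrow = 2)
names(attributes(m))   # the only attribute is "dim"
as.vector(m)           # elements of column 1, then column 2, then column 3

# Appending a column yields a new object holding
# length(m) + nrow(m) elements; m itself is unchanged.
m2 <- cbind(m, newColumn = NA)
stopifnot(length(m2) == length(m) + nrow(m))

# Hypothetical sizes and a stand-in for the real per-column work.
n_rows <- 1000
n_cols <- 200
make_column <- function(j) rep(as.numeric(j), n_rows)

# 1. Iterative growth: every cbind() copies all columns gathered so far.
grow <- function() {
  out <- matrix(numeric(0), nrow = n_rows, ncol = 0)
  for (j in seq_len(n_cols)) out <- cbind(out, make_column(j))
  out
}

# 2. Store partial results in a list, then bind them with a single call.
collect <- function() {
  results <- lapply(seq_len(n_cols), make_column)
  do.call(cbind, results)
}

# 3. If the final size is known in advance, allocate once and fill.
prealloc <- function() {
  out <- matrix(NA_real_, nrow = n_rows, ncol = n_cols)
  for (j in seq_len(n_cols)) out[, j] <- make_column(j)
  out
}

# All three approaches produce the same values.
stopifnot(all(grow() == collect()), all(collect() == prealloc()))

# Compare the cost on your own machine.
system.time(grow())
system.time(collect())
system.time(prealloc())
```

With sizes this small the difference may be hard to see; increasing
n_cols should make the growth of the first variant more noticeable.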

Disclaimer: this is one reason I can think of why R doesn't offer
subassignment to non-existent matrix columns by default. The actual
reason might be different.

-- 
Best regards,
Ivan

[*]
https://github.com/wch/r-source/blob/bac4cd3013ead1379e20127d056ee036278b47ff/src/main/duplicate.c#L443

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.