Justin Talbot <jtalbot <at> stanford.edu> writes:
>
> > Because that's inconsistent with pmin and pmax when two NAs are summed.
> >
> > x = c(1,3,NA,NA,5)
> > y = c(2,NA,4,NA,1)
> > colSums(rbind(x, y), na.rm = TRUE)
> > [1] 3 3 4 0 6   # actual
> > [1] 3 3 4 NA 6  # desired
>
> But your desired result would be inconsistent with sum:
> sum(NA,NA,na.rm=TRUE)
> [1] 0
>
> From a language definition perspective I think having psum return 0
> here is right choice.

Ok, you've sold me. psum(NA,NA,na.rm=TRUE) returning 0 sounds good, and
pprod(NA,NA,na.rm=TRUE) returning 1, consistent with prod. The case for
psum is then more about convenience and speed versus
colSums(rbind(x,y), na.rm=TRUE), since rbind will copy x and y into a new
matrix. The case for pprod is similar, plus colProds doesn't exist.

> Thus, + should have the signature: `+`(..., na.rm=FALSE), which would
> allow you to do things like:
>
> `+`(c(1,2),c(1,2),c(1,2),NA, na.rm=TRUE) = c(3,6)
>
> If you don't like typing `+`, you could always alias psum to `+`.

But there would be a cost, wouldn't there? `+` is a dyadic .Primitive.
Changing that to take `...` and `na.rm` could slow it down (iiuc), and any
changes to the existing language are risky. For example, `+`(1,2,3) is
currently an error. Changing that to do something might have implications
for some of the 4,000 packages (some might rely on that being an error),
with a possible speed cost too.

In contrast, adding two functions that didn't exist before, psum and
pprod, seems to be a safer and simpler proposition.

Matthew

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
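
For illustration only, here is a minimal sketch in plain R of the semantics discussed in this thread. psum and pprod do not exist in base R; the definitions below, and the helper name pcombine, are assumptions made up for this example. With na.rm=TRUE, missing values are replaced by the identity of the operation (0 for a sum, 1 for a product), which is what makes the all-NA case agree with sum and prod. The sketch does not address the speed argument: it still allocates intermediate vectors, and a real implementation would presumably be written in C.

```r
# Hypothetical helper: combine vectors elementwise with `op`.
# When na.rm is TRUE, NA (and NaN) entries are replaced by `identity`
# before combining, so they do not affect the result.
pcombine <- function(args, op, identity, na.rm) {
  if (na.rm) {
    args <- lapply(args, function(v) {
      v[is.na(v)] <- identity
      v
    })
  }
  Reduce(op, args)
}

# Hypothetical psum/pprod, named after the functions proposed in the thread.
psum  <- function(..., na.rm = FALSE) pcombine(list(...), `+`, 0, na.rm)
pprod <- function(..., na.rm = FALSE) pcombine(list(...), `*`, 1, na.rm)

x <- c(1, 3, NA, NA, 5)
y <- c(2, NA, 4, NA, 1)

psum(x, y, na.rm = TRUE)     # 3 3 4 0 6, same as colSums(rbind(x, y), na.rm = TRUE)
pprod(x, y, na.rm = TRUE)    # 2 3 4 1 5
psum(x, y)                   # 3 NA NA NA 6
psum(NA, NA, na.rm = TRUE)   # 0, consistent with sum(NA, NA, na.rm = TRUE)
pprod(NA, NA, na.rm = TRUE)  # 1, consistent with prod(NA, NA, na.rm = TRUE)
```

Because these are new functions, `+` itself is left untouched, and shorter arguments are recycled by the ordinary rules, so psum(c(1,2), c(1,2), c(1,2), NA, na.rm=TRUE) gives c(3,6) as in the quoted example.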