On 10 Nov 2007, at 07:32, Prof Brian Ripley wrote:

> On Fri, 9 Nov 2007, Robin Hankin wrote:
>
>> Hi
>>
>> [R-2.6.0, macOSX 10.4.10].
>>
>> The helppage says that rowSums() and colSums()
>> are equivalent to 'apply' with 'FUN = sum'.
>>
>> But I came across this:
>>
>> > a <- matrix(1:30,5,6)
>> > is.integer(apply(a,1,sum))
>> [1] TRUE
>> > is.integer(rowSums(a))
>> [1] FALSE
>>
>
> 'equivalent' does not mean 'identical': the wording was deliberate.
>
>> so rowSums() returns a float.
>
> And that is what the help page says it does (albeit more
> accurately: there is no 'float' type, but there is numeric aka
> double and the result could be complex).
>
>> Why is this?
>
> You seem to be asking why R works as documented!
>

Yes, that's exactly what I was asking [perhaps this should have been R-devel?].

What is the thinking behind converting to double?

I expect that part of the answer is speed:

# First define an integer matrix:
a <- matrix(as.integer(rpois(1e6,3)),1000,1000)

> system.time(rowSums(a))
   user  system elapsed
  0.049   0.000   0.050
> system.time(rowSums(a))
   user  system elapsed
  0.050   0.000   0.051
> system.time(rowSums(a))
   user  system elapsed
  0.050   0.001   0.052
> system.time(colSums(a))
   user  system elapsed
  0.043   0.001   0.046
> system.time(colSums(a))
   user  system elapsed
  0.043   0.000   0.044

About the same speed. Now use apply() to see whether integer summation
is faster than double summation for this kind of problem:

> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.085   0.009   0.094
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.086   0.010   0.095
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.089   0.010   0.104
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.071   0.008   0.078
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.069   0.007   0.076
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.070   0.008   0.081

# Now convert to double:
> a <- a+0

> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.127   0.019   0.151
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.121   0.017   0.139
> system.time(ignore <- apply(a,1,sum))
   user  system elapsed
  0.130   0.022   0.175
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.084   0.015   0.098
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.085   0.015   0.105
> system.time(ignore <- apply(a,2,sum))
   user  system elapsed
  0.087   0.016   0.107

[can anyone comment on the difference between the first three and
the last three double precision summations?]

perhaps a little bit faster for the integers, but there's not much in it.

So, why does rowSums() coerce to double (behaviour that is
undesirable for me)?

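For anyone who wants integer results in the meantime, the following is a minimal sketch of the behaviour discussed above together with one possible workaround. The helper name int_row_sums is hypothetical (it is not part of base R), and the range check is an assumption about what a cautious conversion ought to do, not something taken from the help page.

```r
# Small integer matrix from the original post.
a <- matrix(1:30, 5, 6)

typeof(apply(a, 1, sum))  # "integer": sum() of an integer vector stays integer
typeof(rowSums(a))        # "double":  rowSums() returns a double vector

# Hypothetical helper (not part of base R): row sums of an integer matrix,
# returned as integer.
int_row_sums <- function(m) {
  stopifnot(is.matrix(m), is.integer(m))
  s <- rowSums(m)
  # Assumption: refuse to convert if any sum falls outside the integer range,
  # rather than silently producing NA.
  stopifnot(all(abs(s) <= .Machine$integer.max, na.rm = TRUE))
  out <- as.integer(s)   # as.integer() drops attributes ...
  names(out) <- names(s) # ... so restore any row names
  out
}

r <- int_row_sums(a)
is.integer(r)
```

This keeps the speed of rowSums() and pays only for one conversion at the end; whether that is acceptable depends on whether the sums are guaranteed to fit in an integer.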
--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743