>>>>> Dirk Eddelbuettel
>>>>>     on Sun, 24 Aug 2025 08:33:58 -0500 writes:

> In SVN commit r88444, Martin made a change following Mikael's PR #18918. The
> one-line synopsis is 'subassignment <complex>[i] <- NA should only touch the
> real part' and you can see it all at [1].

> Imaginary parts now get a zero.

Indeed, and this in itself is a bit dubious:
From the commit message you cite above one could/would expect that the
imaginary parts should *stay* unchanged, i.e., remain '2' in your example
below (and remain '0' when they are, as in the PR#18919...)

> I am wondering if that cause rowMeans and colMeans to be off?

Well, I can argue they are not off ...
{but I will eventually agree with you that we *have* a problem!}

One clue is that the complex 'x' matrices now *differ* between R-release
(*and* R-patched) and R-devel *but* they print identically ...
and that is part of the confusion in this case
(or *was* adding to the confusion at least).

format() and print() should and do go hand-in-hand, here as well, and for
(probably mostly historical) reasons, R&R and then R-core had decided to
format/print all complex NAs the same ... the reasoning being that
'NA means "Not Available"', and for complex data (one complex seen as
"one complex number" rather than "two real numbers") why should one be
bothered about the Re/Im representation of a complex NA?
... [We have been on that topic before, notably in bugzilla, as well].

So in your example below,

>> x <- matrix(1:9 + 2i, 3)
>> x[c(2,4,6,8)] <- NA
>>
>> x
>      [,1] [,2] [,3]
> [1,] 1+2i   NA 7+2i
> [2,]   NA 5+2i   NA
> [3,] 3+2i   NA 9+2i
>>

in R 4.5.1,

  > Im(x)
       [,1] [,2] [,3]
  [1,]    2   NA    2
  [2,]   NA    2   NA
  [3,]    2   NA    2

whereas in R-devel

  > Im(x)
       [,1] [,2] [,3]
  [1,]    2    0    2
  [2,]    0    2    0
  [3,]    2    0    2

.... and indeed, you *did* implicitly acknowledge this difference, above.

Consequently, of course, rowMeans(x) or colMeans(x) and many other matrix
functions/functionals of 'x' will differ between R-release (& -patched)
and R-devel ... as the 'x' differ ... in their imaginary parts.

... but hang on ...
rowMeans() and colMeans() work "separately" for the real and imaginary
parts, and (as seen above) the imaginary part has no NA's and the number
of obs per row/column in the imaginary part is always 3, such that the
Im() parts of the rowSums()/colSums() result are divided by 3, here:

>> rowMeans(x, TRUE)  # this now differs from R-release
> [1] 4+1.333333i 5+0.666667i 6+1.333333i
>>

> But in R 4.5.1 we get the (here constant) imaginary part as constant just as
> we do when we do this 'by hand' as rowSums() appears fine:

>> rowSums(x, TRUE)
> [1]  8+4i  5+2i 12+4i
>> apply(x, 1, \(x) sum(is.finite(x)))  # row count of finite elems
> [1] 2 1 2
>>
>> rowSums(x, TRUE) / apply(x, 1, \(x) sum(is.finite(x)))
> [1] 4+2i 5+2i 6+2i
>>

> I could be off my rocker here as I don't use complex variables much and am a
> little rustic but a rudimentary check suggests my reasoning applies: means of
> real and imaginary parts (taken across rows or columns) should be the sum
> divided by the number of non-NA elements. Right now they aren't.

Well, see above, they *are* __if__ you look at "number of non-NA elements"
"coordinate-wise", i.e., separately for Re() and Im().

I still agree we should address this:
We do have a discrepancy with mean(), i.e., mean.default(), which does
"exactly" what you do "by hand" above, and hence uses is.na() on the full
complex vector, and not *separately* on the Re() and Im() parts;
... and I do tend to agree that colMeans(*, na.rm=TRUE) etc. probably
should be adapted to *not* work coordinate-wise but to drop all
"complex NAs", both for Re and Im.

In addition, back to the original PR #18918
(--> https://bugs.r-project.org/show_bug.cgi?id=18918 ),
I will *also* take up my "is a bit dubious" from above.

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel