>>>>> "MH" == Michael Hahsler <mhahs...@lyle.smu.edu> >>>>> on Thu, 19 Mar 2015 20:15:37 -0500 writes:
MH> Hi Martin, MH> package arules heavily relies on ngCMatrix and uses multiplication and MH> addition for logical operations. I think it makes sense that in a mixed MH> operation with one dgCMatrix and one ngCMatrix the ngCMatrix should be MH> "promoted" to a dgCMatrix. MH> The current behavior of %*% and friends is in deed confusing: >> m <- matrix(sample(c(0,1), 5*5, replace=TRUE), nrow=5) >> x <- as(m, "dgCMatrix") >> y <- as(m, "ngCMatrix") >> x %*% y MH> 5 x 5 sparse Matrix of class "ngCMatrix" MH> [1,] | | | . | MH> [2,] | | | . | MH> [3,] . . | | . MH> [4,] . . . | . MH> [5,] | | | | | >> x %*% x MH> 5 x 5 sparse Matrix of class "dgCMatrix" MH> [1,] 1 2 1 . 2 MH> [2,] 1 3 1 . 3 MH> [3,] . . 1 2 . MH> [4,] . . . 1 . MH> [5,] 1 2 2 1 2 Indeed, that is not what one should expect. MH> We even explicitly coerce in our code ngCMatrix to dgCMatrix to avoid MH> this behavior. I think all these operations probably should result MH> consistently in a dgCMatrix. Eventually. As I said, it *is* useful to work with boolean arithmetic in some cases here, so I do want to provide that .. hopefully entirely consistently as well in the future, but longer term not via '%*%' MH> I would love to see | and & for position-wise AND and OR for ngCMatrix. Well, why don't you look? ;-) These have worked for a long time already! (I checked a version from 2008) Thanks a lot, Michael, for your valuable feedback. Martin MH> Thanks, MH> -Michael MH> On 03/19/2015 05:02 PM, Martin Maechler wrote: >> This is a Request For Comment, also BCCed to 390 package maintainers >> of reverse dependencies of the Matrix package. >> >> Most users and package authors working with our 'Matrix' package will >> be using it for numerical computations, and so will be using >> "dMatrix" (d : double precision) matrix objects M, and indirectly, e.g., for >> M >= c will also use "lMatrix" (l: logical i.e. TRUE/FALSE/NA). >> All the following is **not** affecting those numerical / logical >> computations. >> >> A few others will know that we also have "pattern" matrices (purely >> binary: TRUE/FALSE, no NA) notably sparse ones, those "ngCMatrix" etc, >> all starting with "n" (from ``patter[n]``) which do play a prominent >> role in the internal sparse matrix algorithms, notably of the >> (underlying C code) CHOLMOD library in the so-called "symbolic" >> cholesky decomposition and other such operations. Another reason you >> may use them because they are equivalent to incidence matrices of >> unweighted (directed or undirected) graphs. >> >> Now, as the subject says, I'm bringing up the topic of what should >> happen when these matrices appear in matrix multiplications. >> Somewhat by design, but also partly by coincidence, the *sparse* >> pattern matrices multiplication in the Matrix package mostly builds on >> the CHOLMOD library `cholmod_ssmult()` function which implements >> "Boolean arithmetic" for them, instead of regular arithmetic: >> "+" is logical "or" >> "*" is logical "and". >> Once we map TRUE <-> 1 and FALSE <-> 0, the only difference between >> boolean and regular arithmetic is that "1+1 = 1" in the (mapped) >> boolean arithmetic, because "TRUE | TRUE" is TRUE in original logic. >> >> The drawback of using the boolean arithmetic here is the "clash" with >> the usual numeric arithmetic, and arithmetic in R where logical is >> coerced to integer (and that to "double") when certain numerical >> functions/operations are used. >> >> A more severe problem --- which I had not been aware of until >> relatively recently -- is the fact that the CHOLMD function >> cholmod_ssdmult(A, B) >> treats *both* A and B as "pattern" as soon as one of them is a >> (sparse) pattern matrix. >> And this is - I say - in clear contrast to what R users would expect: >> If you multiply a numeric with a "kind of logical" matrix (a pattern >> one), you will expect that the >> TRUE/FALSE matrix will be treated as a 1/0 matrix because it is >> combined with a numeric matrix. >> So we could say that in this case, the Matrix package behavior is >> clearly bugous .... but still it has been the behavior for the last 10 >> years or so. >> >> RFC 1: "Change 1": >> I currently propose to change this behavior for the upcoming release >> of Matrix (version 1.2-0), though I have no idea if dependent >> packages would partly fail their checks or otherwise have changed >> behavior subsequently. >> The change seems sensible, since I think if your package relied on >> this behavior, it was inadvertent and accidental. >> Still you may differ in your opinion about this change nr.1 >> >> RFC 2: "Change 2": >> This change would be more radical, and something I would not plan for >> the upcoming release of Matrix, but possibly for an update say one or >> two months later or so: It concerns the matrix products when *both* >> matrices are pattern. A situation where the boolean arithmetic may >> really make sense and where indeed packages may have depended on the >> current behavior ("T + T |--> T"). ... although that is currently >> only used for *sparse* pattern matrices, not for dense ones. >> >> Further, it may still seem surprising that matrix multiplication does >> not behave numerically for a pair of such matrices, and by the >> principle of "least surprise" we should provide the boolean arithmetic >> matrix products in another way than by the standard %*%, >> crossprod() and tcrossprod() functions. >> So one possibility could be to change the standard functions to behave >> numerically, >> and e.g., use %&% (replace the numeric "*" by a logical "&") and >> crossprod(A,B, boolean=TRUE), tcrossprod(A,B, boolean=TRUE) >> for the three boolean arithmetic version of matrix multiplications. >> >> What do you think about this? I'm particularly interested to hear >> from authors and users of packages such as 'arules' which IIRC >> explicitly work with sparse pattern matrices. >> >> Thank you for your thoughts and creative ideas, >> Martin Maechler, ETH Zurich >> MH> -- MH> Michael Hahsler, Assistant Professor MH> Department of Engineering Management, Information, and Systems MH> Department of Computer Science and Engineering (by courtesy) MH> Bobby B. Lyle School of Engineering MH> Southern Methodist University, Dallas, Texas MH> office: Caruth Hall, suite 337, room 311 MH> email: mhahs...@lyle.smu.edu MH> web: http://lyle.smu.edu/~mhahsler ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel