Sent from my iPhone
> On Dec 5, 2024, at 4:11 PM, Sokol Serguei <serguei.so...@gmail.com> wrote: > > Luc, > > There can be many reasons explaining the difference in compiled code > performances. Tuning such code to achieve a pick performance is generally a > fine art. > Optimizations techniques can include but are not limited to: > - SIMD instructions (and memory alignment for their optimal use); > - instruction level parallelism; > - unrolling loops; > - cache level (mis-)hits; > - multi-thread parallelism; > - ... > Approaches in optimization are not the same depending on kind of application: > CPU-bound, memory-bound or IO-bound. > Many of this techniques can be directly used (or not) by compiler depending > on chosen options. Are you sure to use the same options and compiler that > were used during R compilation? > And finally, the compared code could be plainly not the same. R can use BLAS > call, e.g. OpenBLAS to multiply two matrices. This latter is heavily > optimized for such operations and can achieve x10 acceleration compared to > plain "naive" BLAS. > The R code you cite can be just the code for a fallback in case no BLAS was > found during R compilation. > Look at what your sessionInfo() says about used BLAS. That doesn’t always work. I build R on Windows (10) linking to a pre-compiled static OpenBLAS (3.28) and my sessionInfo has an empty string for BLAS. I reckon that is because I’m using Rblas.dll, it’s just that my Rblas isn’t vanilla. Avi > > Best, > Serguei. > >> Le 05/12/2024 à 14:21, Luc De Wilde a écrit : >> Dear package developers, >> >> in creating a package lavaanC for use in lavaan, I need to perform some >> matrix computations involving matrix products and crossproducts. As far as I >> see I cannot directly call the C code in the R core. So I copied the code in >> the R core, but the same C/C++ code in a package is 2.5 à 3 times slower >> than executed directly in R : >> >> C code in package : >> SEXP prod0(SEXP mat1, SEXP mat2) { >> SEXP u1 = Rf_getAttrib(mat1, R_DimSymbol); >> int m1 = INTEGER(u1)[0]; >> int n1 = INTEGER(u1)[1]; >> SEXP u2 = Rf_getAttrib(mat2, R_DimSymbol); >> int m2 = INTEGER(u2)[0]; >> int n2 = INTEGER(u2)[1]; >> if (n1 != m2) Rf_error("matrices not conforming"); >> SEXP retval = PROTECT(Rf_allocMatrix(REALSXP, m1, n2)); >> double* left = REAL(mat1); >> double* right = REAL(mat2); >> double* ret = REAL(retval); >> double werk = 0.0; >> for (int j = 0; j < n2; j++) { >> for (int i = 0; i < m1; i++) { >> werk = 0.0; >> for (int k = 0; k < n1; k++) werk += (left[i + m1 * k] * right[k + >> m2 * j]); >> ret[j * m1 + i] = werk; >> } >> } >> UNPROTECT(1); >> return retval; >> } >> >> Test script : >> m1 <- matrix(rnorm(300000), nrow = 60) >> m2 <- matrix(rnorm(300000), ncol = 60) >> print(microbenchmark::microbenchmark( >> m1 %*% m2, .Call("prod0", m1, m2), times = 100 >> )) >> >> Result on my pc: >> Unit: milliseconds >> expr min lq mean median uq max >> neval >> m1 %*% m2 10.5650 10.8967 11.13434 10.9449 11.02965 15.8397 >> 100 >> .Call("prod0", m1, m2) 29.3336 30.7868 32.05114 31.0408 33.85935 45.5321 >> 100 >> >> >> Can anyone explain why the compiled code in the package is so much slower >> than in R core? >> >> and >> >> Is there a way to improve the performance in R package? >> >> >> Best regards, >> >> Luc De Wilde >> >> >> >> ______________________________________________ >> R-package-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-package-devel > > ______________________________________________ > R-package-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-package-devel ______________________________________________ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel