Sent from my iPhone

> On Dec 5, 2024, at 4:11 PM, Sokol Serguei <serguei.so...@gmail.com> wrote:
> 
> Luc,
> 
> There can be many reasons explaining the difference in compiled code 
> performances. Tuning such code to achieve a pick performance is generally a 
> fine art.
> Optimizations techniques can include but are not limited to:
>  - SIMD instructions (and memory alignment for their optimal use);
>  - instruction level parallelism;
>  - unrolling loops;
>  - cache level (mis-)hits;
>  - multi-thread parallelism;
>  - ...
> Approaches in optimization are not the same depending on kind of application: 
> CPU-bound, memory-bound or IO-bound.
> Many of this techniques can be directly used (or not) by compiler depending 
> on chosen options. Are you sure to use the same options and compiler that 
> were used during R compilation?
> And finally, the compared code could be plainly not the same. R can use BLAS 
> call, e.g. OpenBLAS to multiply two matrices. This latter is heavily 
> optimized for such operations and can achieve x10 acceleration compared to 
> plain "naive" BLAS.
> The R code you cite can be just the code for a fallback in case no BLAS was 
> found during R compilation.
> Look at what your sessionInfo() says about used BLAS.

That doesn’t always work. I build R on Windows (10) linking to a pre-compiled 
static OpenBLAS (3.28) and my sessionInfo has an empty string for BLAS. I 
reckon that is because I’m using Rblas.dll, it’s just that my Rblas isn’t 
vanilla. 

Avi

> 
> Best,
> Serguei.
> 
>> Le 05/12/2024 à 14:21, Luc De Wilde a écrit :
>> Dear package developers,
>> 
>> in creating a package lavaanC for use in lavaan, I need to perform some 
>> matrix computations involving matrix products and crossproducts. As far as I 
>> see I cannot directly call the C code in the R core. So I copied the code in 
>> the R core, but the same C/C++ code in a package is 2.5 à 3 times slower 
>> than executed directly in R :
>> 
>> C code in package :
>>   SEXP prod0(SEXP mat1, SEXP mat2) {
>>     SEXP u1 = Rf_getAttrib(mat1, R_DimSymbol);
>>     int m1 = INTEGER(u1)[0];
>>     int n1 = INTEGER(u1)[1];
>>     SEXP u2 = Rf_getAttrib(mat2, R_DimSymbol);
>>     int m2 = INTEGER(u2)[0];
>>     int n2 = INTEGER(u2)[1];
>>     if (n1 != m2) Rf_error("matrices not conforming");
>>     SEXP retval = PROTECT(Rf_allocMatrix(REALSXP, m1, n2));
>>     double* left = REAL(mat1);
>>     double* right = REAL(mat2);
>>     double* ret = REAL(retval);
>>     double werk = 0.0;
>>     for (int j = 0; j < n2; j++) {
>>       for (int i = 0; i < m1; i++) {
>>           werk = 0.0;
>>         for (int k = 0; k < n1; k++) werk += (left[i + m1 * k] * right[k + 
>> m2 * j]);
>>         ret[j * m1 + i] =  werk;
>>       }
>>     }
>>     UNPROTECT(1);
>>     return retval;
>>   }
>> 
>> Test script :
>> m1 <- matrix(rnorm(300000), nrow = 60)
>> m2 <- matrix(rnorm(300000), ncol = 60)
>> print(microbenchmark::microbenchmark(
>>   m1 %*% m2, .Call("prod0", m1, m2), times = 100
>> ))
>> 
>> Result on my pc:
>> Unit: milliseconds
>>                    expr     min      lq     mean  median       uq     max 
>> neval
>>               m1 %*% m2 10.5650 10.8967 11.13434 10.9449 11.02965 15.8397   
>> 100
>>  .Call("prod0", m1, m2) 29.3336 30.7868 32.05114 31.0408 33.85935 45.5321   
>> 100
>> 
>> 
>> Can anyone explain why the compiled code in the package is so much slower 
>> than in R core?
>> 
>> and
>> 
>> Is there a way to improve the performance in R package?
>> 
>> 
>> Best regards,
>> 
>> Luc De Wilde
>> 
>> 
>> 
>> ______________________________________________
>> R-package-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> 
> ______________________________________________
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Reply via email to