On 8/10/2010 9:21 PM, Ralf W. Grosse-Kunstleve wrote:
Most of the time is spent in this function...

   str_cref side,
   str_cref pivot,
   str_cref direct,
   int const&  m,
   int const&  n,
   arr_cref<double>  c,
   arr_cref<double>  s,
   arr_ref<double, 2>  a,
   int const&  lda)

in this loop:

         FEM_DOSTEP(j, n - 1, 1, -1) {
           ctemp = c(j);
           stemp = s(j);
           if ((ctemp != one) || (stemp != zero)) {
             FEM_DO(i, 1, m) {
               temp = a(i, j + 1);
               a(i, j + 1) = ctemp * temp - stemp * a(i, j);
               a(i, j) = stemp * temp + ctemp * a(i, j);

a(i, j) is implemented as

   T* elems_; // member

     T const&
       ssize_t i1,
       ssize_t i2) const
       return elems_[dims_.index_1d(i1, i2)];


   ssize_t all[Ndims]; // member
   ssize_t origin[Ndims]; // member

       ssize_t i1,
       ssize_t i2) const
           (i2 - origin[1]) * all[0]
         + (i1 - origin[0]);

The array pointer is buried as elems_ member in the arr_ref<>  class template.
How can I apply __restrict in this case?

Do you mean that you are adding an extra level of function calls and hoping for efficient inlining? Your programming style is elusive, and your insistence on top-posting will make this thread difficult to follow.

The conditional inside the loop is likely even more difficult for C++ to optimize than for Fortran. As already discussed, if you don't optimize otherwise, you will need __restrict to overcome aliasing concerns among a, c, and s. If you want efficient C++, you will need a lot of hand optimization, and you will need to verify the effect of each level of obscurity you add.

How is this topic appropriate for the gcc mailing list?
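To make the __restrict point concrete, here is a minimal sketch of one way it could be applied when the pointer is hidden inside a wrapper class: pull the raw pointers out once, outside the loop, and run the hot loop on __restrict-qualified locals. The function name apply_rotations_backward and the plain-pointer interface are hypothetical, not part of the code quoted above. The sketch assumes column-major storage with 1-based Fortran indexing, as the quoted index_1d suggests, and that a, c, and s really do not overlap. __restrict is a compiler extension (accepted by GCC, Clang, and MSVC), not standard C++.

```cpp
#include <cstddef>

// Hypothetical helper, not from the original code: same arithmetic as the
// quoted loop, but on raw pointers so that __restrict can be applied.
// a is column-major, m rows by n columns, with leading dimension lda.
// c and s hold the rotation cosines and sines (at least n - 1 entries each).
void apply_rotations_backward(
  int m,
  int n,
  double const* __restrict c,
  double const* __restrict s,
  double* __restrict a,
  int lda)
{
  // j runs from n - 1 down to 1, as in FEM_DOSTEP(j, n - 1, 1, -1).
  for (int j = n - 1; j >= 1; j--) {
    double const ctemp = c[j - 1]; // c(j) with 1-based indexing
    double const stemp = s[j - 1]; // s(j) with 1-based indexing
    if (ctemp != 1.0 || stemp != 0.0) {
      // Column pointers computed once per j instead of once per element.
      double* __restrict col_j =
        a + static_cast<std::ptrdiff_t>(j - 1) * lda; // column j
      double* __restrict col_j1 = col_j + lda;        // column j + 1
      for (int i = 0; i < m; i++) {
        double const temp = col_j1[i];
        col_j1[i] = ctemp * temp - stemp * col_j[i];
        col_j[i] = stemp * temp + ctemp * col_j[i];
      }
    }
  }
}
```

Whether this actually helps has to be checked in the generated code or with timings for each compiler and option set. The qualifier is a promise to the compiler, so passing overlapping arrays gives undefined results.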

Tim Prince
