https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91875

            Bug ID: 91875
           Summary: Performance drop with mt19937 with -O2/-O3/-Ofast
                    compared to -O1
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hehaochen at hotmail dot com
  Target Milestone: ---

Compile the following code and run:

   > g++ -O0/-O1/-O2/-O3/-Ofast test.cpp
   > ./a.out

---------------------------------------------------
#include <iostream>
#include <vector>
#include <chrono>
#include <random>

template< typename T_rng, typename T_dist>
double timer(T_rng& rng, T_dist& dist, int n){
    std::vector< typename T_dist::result_type > vec(n, 0);
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < n; ++i)
        vec[i] = dist(rng);
    auto t2 = std::chrono::high_resolution_clock::now();
    auto runtime =
        std::chrono::duration_cast<std::chrono::microseconds>(t2-t1).count()/1000.0;
    return runtime;
}

int main(){
    const int n = 10000000;
    std::default_random_engine rng_default(1);
    std::mt19937 rng_mt (1);
    std::mt19937_64 rng_mt_64 (1);
    std::uniform_int_distribution<int> dist_int(0,1000);
    std::uniform_real_distribution<float> dist_float(0.0, 1.0);

    std::cout << "float_default: " << timer(rng_default, dist_float, n)
              << std::endl;
    std::cout << "float_mt: " << timer(rng_mt, dist_float, n) << std::endl;
    std::cout << "float_mt_64: " << timer(rng_mt_64, dist_float, n)
              << std::endl;
}
---------------------------------------------------

We get the following results (runtime in milliseconds, lower is better):

=========+===============+===========+===========+==========================
 Compile |               | gcc 6.5.0 | gcc 7.1.0 | Comments                   
 With    |               |           |           |                            
=========+===============+===========+===========+==========================
         | float_default | 1680.99   | 470.338   |                            
         +---------------+-----------+-----------+      Good to see the       
   -O0   | float_mt      | 1777.31   | 559.226   |  performance improvement   
         +---------------+-----------+-----------+                            
         | float_mt_64   | 1886.36   | 649.42    |                            
---------+---------------+-----------+-----------+-------------------------
         | float_default | 67.532    | 69.127    |                         
         +---------------+-----------+-----------+                         
   -O1   | float_mt      | 60.976    | 60.864    |  Always good and stable 
         +---------------+-----------+-----------+                         
         | float_mt_64   | 126.056   | 128.175   |                         
---------+---------------+-----------+-----------+-------------------------
         | float_default | 69.31     | 67.628    |                        
         +---------------+-----------+-----------+                        
   -O2   | float_mt      | 54.214    | 101.032   |                        
         +---------------+-----------+-----------+                        
         | float_mt_64   | 124.479   | 167.792   |  float_mt drops by ~85%
---------+---------------+-----------+-----------+                        
         | float_default | 67.513    | 58.234    |float_mt_64 drops by ~35%
         +---------------+-----------+-----------+                        
   -O3   | float_mt      | 57.296    | 100.051   |   *********************
         +---------------+-----------+-----------+   ***  Even slower  ***
         | float_mt_64   | 125.185   | 168.787   |   ***  than '-O1'!  ***
---------+---------------+-----------+-----------+   *********************
         | float_default | 58.442    | 60.228    |                        
         +---------------+-----------+-----------+                        
 -Ofast  | float_mt      | 47.885    | 92.448    |                        
         +---------------+-----------+-----------+                        
         | float_mt_64   | 126.016   | 170.85    |                        
=========+===============+===========+===========+=========================
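
The percentages in the Comments column appear to be the relative increase in
runtime from gcc 6.5.0 to gcc 7.1.0 at -O2. A small sketch of that arithmetic
follows; the helper name slowdown_percent is made up for illustration and is
not part of the test case.

```cpp
// Hypothetical helper: relative increase in runtime, in percent.
constexpr double slowdown_percent(double before_ms, double after_ms) {
    return (after_ms / before_ms - 1.0) * 100.0;
}

// -O2 rows of the table above (runtime in ms, gcc 6.5.0 vs gcc 7.1.0).
constexpr double float_mt_slowdown    = slowdown_percent(54.214, 101.032);   // about 86
constexpr double float_mt_64_slowdown = slowdown_percent(124.479, 167.792);  // about 35
```

So at -O2 the float_mt loop takes roughly 1.86x as long with gcc 7.1.0, and
the float_mt_64 loop roughly 1.35x as long.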

gcc versions earlier than 7 give almost the same results as gcc 6.5.0.

I think performance with -O2/-O3/-Ofast should not be worse than with -O1;
this result is counter-intuitive.
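
One way to narrow this down might be to time the engine on its own, without
uniform_real_distribution, to see whether the slowdown follows the engine or
the distribution. A minimal sketch, assuming nothing beyond the standard
library; the helper name time_engine_ms is made up, and I have not collected
numbers with it:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <utility>

// Hypothetical helper (not part of the original test case): times n raw
// engine calls with no distribution involved.
// Returns {elapsed milliseconds, checksum of all generated values}.
template <typename Rng>
std::pair<double, std::uint64_t> time_engine_ms(Rng& rng, int n) {
    assert(n >= 0);
    std::uint64_t checksum = 0;
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        checksum += rng();  // accumulate so the calls have an observable result
    auto t2 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t2 - t1).count();
    return {ms, checksum};
}
```

Calling time_engine_ms(rng_mt, n) and printing both members of the returned
pair keeps the checksum live, so the compiler cannot drop the loop.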

Thank you.
