https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91875
            Bug ID: 91875
           Summary: Performance drop with mt19937 with -O2/-O3/-Ofast
                    compared to -O1
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hehaochen at hotmail dot com
  Target Milestone: ---

Compile the following code with each optimization level and run it:

> g++ -O0/-O1/-O2/-O3/-Ofast test.cpp
> ./a.out

---------------------------------------------------
#include <iostream>
#include <vector>
#include <chrono>
#include <random>

// Fill a vector with n random draws and return the elapsed time in milliseconds.
template< typename T_rng, typename T_dist>
double timer(T_rng& rng, T_dist& dist, int n){
    std::vector< typename T_dist::result_type > vec(n, 0);
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < n; ++i)
        vec[i] = dist(rng);
    auto t2 = std::chrono::high_resolution_clock::now();
    // microseconds / 1000.0 -> milliseconds
    auto runtime = std::chrono::duration_cast<std::chrono::microseconds>(t2-t1).count()/1000.0;
    return runtime;
}

int main(){
    const int n = 10000000;

    std::default_random_engine rng_default(1);
    std::mt19937 rng_mt (1);
    std::mt19937_64 rng_mt_64 (1);

    std::uniform_int_distribution<int> dist_int(0,1000);
    std::uniform_real_distribution<float> dist_float(0.0, 1.0);

    std::cout << "float_default: " << timer(rng_default, dist_float, n) << std::endl;
    std::cout << "float_mt: " << timer(rng_mt, dist_float, n) << std::endl;
    std::cout << "float_mt_64: " << timer(rng_mt_64, dist_float, n) << std::endl;
}
---------------------------------------------------

We get the following results (runtime in milliseconds, as printed by the program):

=========+===============+===========+===========+============================
Compile  |               | gcc 6.5.0 | gcc 7.1.0 | Comments
with     |               |           |           |
=========+===============+===========+===========+============================
         | float_default | 1680.99   | 470.338   |
         +---------------+-----------+-----------+ Good to see the
-O0      | float_mt      | 1777.31   | 559.226   | performance improvement
         +---------------+-----------+-----------+
         | float_mt_64   | 1886.36   | 649.42    |
---------+---------------+-----------+-----------+----------------------------
         | float_default | 67.532    | 69.127    |
         +---------------+-----------+-----------+
-O1      | float_mt      | 60.976    | 60.864    | Always good and stable
         +---------------+-----------+-----------+
         | float_mt_64   | 126.056   | 128.175   |
---------+---------------+-----------+-----------+----------------------------
         | float_default | 69.31     | 67.628    |
         +---------------+-----------+-----------+ gcc 7.1.0 vs. gcc 6.5.0:
-O2      | float_mt      | 54.214    | 101.032   | float_mt is ~85% slower
         +---------------+-----------+-----------+ float_mt_64 is ~35% slower
         | float_mt_64   | 124.479   | 167.792   |
---------+---------------+-----------+-----------+
         | float_default | 67.513    | 58.234    | *********************
         +---------------+-----------+-----------+ ***  Even slower  ***
-O3      | float_mt      | 57.296    | 100.051   | ***  than '-O1'!  ***
         +---------------+-----------+-----------+ *********************
         | float_mt_64   | 125.185   | 168.787   |
---------+---------------+-----------+-----------+
         | float_default | 58.442    | 60.228    |
         +---------------+-----------+-----------+
-Ofast   | float_mt      | 47.885    | 92.448    |
         +---------------+-----------+-----------+
         | float_mt_64   | 126.016   | 170.85    |
=========+===============+===========+===========+============================

GCC versions earlier than 7 give almost the same results as gcc 6.5.0.

I would not expect the code built with -O2/-O3/-Ofast to run slower than the
code built with -O1; that is counter-intuitive.

Thank you.