[Some time ago] Jeff Hammond <jeff.scie...@gmail.com> writes:

> If you want to keep long-waiting MPI processes from clogging your CPU
> pipeline and heating up your machines, you can turn blocking MPI
> collectives into nicer ones by implementing them in terms of MPI-3
> nonblocking collectives using something like the following.

I see sleeping for ‘0s’ typically taking ≳50μs on Linux (measured on
RHEL 6 or 7, without specific tuning, on recent Intel).  It doesn't look
like something you want in paths that should be low latency, but maybe
there's something you can do to improve that?  (sched_yield takes <1μs.)

> I typed this code straight into this email, so you should validate it
> carefully.
...
> #elif USE_CPU_RELAX
>     cpu_relax(); /* http://linux-kernel.2935.n7.nabble.com/x86-cpu-relax-why-nop-vs-pause-td398656.html */

Is cpu_relax available to userland?  (GCC has an x86-specific intrinsic
__builtin_ia32_pause in fairly recent versions, but it's not in RHEL6's
gcc-4.4.)

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
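
For anyone who wants to check figures like these on their own machine, here is a minimal sketch of how the three ways of waiting could be timed. It assumes Linux with GCC or a compatible compiler. The names relax_cpu, wait_sleep0, wait_yield, wait_pause, ns_per_call and report are made up for illustration and are not from the quoted code or from any library. The inline "rep; nop" is only one possible userland stand-in for the kernel's cpu_relax on x86 when __builtin_ia32_pause isn't available; on other architectures the sketch falls back to a plain compiler barrier. Validate it before relying on it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical userland stand-in for the kernel's cpu_relax(). */
static inline void relax_cpu(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "rep; nop" is the encoding of PAUSE; old assemblers accept it too. */
    __asm__ __volatile__("rep; nop" ::: "memory");
#elif defined(__GNUC__)
    /* Assumption: no spin-wait hint known here, so only a compiler barrier. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

typedef void (*wait_fn)(void);

/* The three candidate ways of backing off inside a polling loop. */
static void wait_sleep0(void)
{
    struct timespec ts = {0, 0};
    nanosleep(&ts, NULL);   /* "sleep for 0s" */
}

static void wait_yield(void) { sched_yield(); }

static void wait_pause(void) { relax_cpu(); }

/* Average wall-clock cost of one call to fn, in nanoseconds. */
static double ns_per_call(wait_fn fn, long iters)
{
    struct timespec t0, t1;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);   /* may need -lrt on older glibc */
    for (long i = 0; i < iters; i++)
        fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9
            + (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}

/* Print one line per method; the numbers depend heavily on the system. */
static void report(long iters)
{
    printf("nanosleep(0): %.0f ns/call\n", ns_per_call(wait_sleep0, iters));
    printf("sched_yield:  %.0f ns/call\n", ns_per_call(wait_yield, iters));
    printf("pause:        %.0f ns/call\n", ns_per_call(wait_pause, iters));
}
```

Calling report(10000) from main prints the three averages. The loop overhead and the indirect call are included in each figure, so the pause number in particular is an upper bound rather than the cost of the instruction itself.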