[Some time ago]
Jeff Hammond <jeff.scie...@gmail.com> writes:

> If you want to keep long-waiting MPI processes from clogging your CPU
> pipeline and heating up your machines, you can turn blocking MPI
> collectives into nicer ones by implementing them in terms of MPI-3
> nonblocking collectives using something like the following.

I see sleeping for ‘0s’ typically taking ≳50μs on Linux (measured on
RHEL 6 or 7, without specific tuning, on recent Intel).  That doesn't
look like something you want on paths that should be low latency, but
maybe there's a way to improve it?  (sched_yield takes <1μs.)
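
For anyone who wants to check this on their own system, here is a rough
sketch of how such numbers can be measured. It assumes Linux with glibc
and a monotonic clock. The helper names (now_ns, mean_sleep0_us,
mean_yield_us) are made up for illustration, and this is not the exact
harness behind the figures above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Hypothetical helper: mean wall-clock cost, in microseconds, of one
 * zero-length nanosleep() call, averaged over iters calls. */
static double mean_sleep0_us(int iters)
{
    const struct timespec zero = {0, 0};
    assert(iters > 0);
    double t0 = now_ns();
    for (int i = 0; i < iters; i++)
        nanosleep(&zero, NULL);
    return (now_ns() - t0) / iters / 1e3;
}

/* Hypothetical helper: the same measurement for sched_yield(). */
static double mean_yield_us(int iters)
{
    assert(iters > 0);
    double t0 = now_ns();
    for (int i = 0; i < iters; i++)
        sched_yield();
    return (now_ns() - t0) / iters / 1e3;
}
```

Calling both helpers from a small main with a large iteration count and
printing the two means gives a direct comparison. The results depend on
kernel version, scheduler settings, and machine load, so treat them as
indicative only.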

> I typed this code straight into this email, so you should validate it
> carefully.

...

> #elif USE_CPU_RELAX
>     cpu_relax(); /*
> http://linux-kernel.2935.n7.nabble.com/x86-cpu-relax-why-nop-vs-pause-td398656.html
> */

Is cpu_relax available to userland?  (GCC has an x86-specific intrinsic
__builtin_ia32_pause in fairly recent versions, but it's not in RHEL6's
gcc-4.4.)
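
One way around the missing builtin is a small inline-asm wrapper, which
older GCC accepts. The sketch below assumes GCC-style inline assembly.
my_cpu_relax, poll_with_relax, and countdown_done are hypothetical names
for illustration, not the kernel's cpu_relax or anything from the quoted
code. The predicate callback stands in for a polling call such as
MPI_Test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-in for the kernel's cpu_relax(). */
static inline void my_cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* "rep; nop" is the encoding of the PAUSE instruction. */
    __asm__ __volatile__("rep; nop" ::: "memory");
#else
    /* Fallback assumption: compiler barrier only, no CPU hint. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Poll done(arg) up to max_polls times, relaxing the CPU between
 * polls. Returns 1 if done() reported completion, 0 otherwise. */
static int poll_with_relax(int (*done)(void *), void *arg, long max_polls)
{
    assert(done != NULL);
    for (long i = 0; i < max_polls; i++) {
        if (done(arg))
            return 1;
        my_cpu_relax();
    }
    return 0;
}

/* Example predicate: decrements an int counter on each poll and
 * reports completion once it reaches zero. */
static int countdown_done(void *arg)
{
    int *remaining = arg;
    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}
```

With a counter of 3 and a budget of 10 polls, poll_with_relax returns 1
on the third poll. With a counter of 100 and a budget of 5 it gives up
and returns 0.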