Jeff Hammond <jeff.scie...@gmail.com> writes:

>> I see sleeping for ‘0s’ typically taking ≳50μs on Linux (measured on
>> RHEL 6 or 7, without specific tuning, on recent Intel).  It doesn't look
>> like something you want in paths that should be low latency, but maybe
>> there's something you can do to improve that?  (sched_yield takes <1μs.)
>
> I demonstrated a bunch of different implementations with the instruction to
> "pick one of these...", where establishing the relationship between
> implementation and performance was left as an exercise for the reader :-)

The point was that only the one seemed available on RHEL6 to this
exercised reader.  No complaints about the useful list of possibilities.

> Note that MPI implementations may be interested in taking advantage of
> https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait.

Is that really useful if it's KNL-specific and MSR-based, with a setup
that implementations couldn't assume?

>> Is cpu_relax available to userland?  (GCC has an x86-specific intrinsic
>> __builtin_ia32_pause in fairly recent versions, but it's not in RHEL6's
>> gcc-4.4.)
>
> The pause instruction is available in ring3.  Just use that if cpu_relax
> wrapper is not implemented.

[OK; I meant in a userland library.]

Are there published measurements of the typical effects of spinning and
ameliorations on some sort of "representative" system?
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
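
For illustration only, here is a minimal sketch of what "just use the pause instruction" could look like in userland when __builtin_ia32_pause is not available. It assumes a GCC-compatible compiler (inline asm and the __sync builtins); the names cpu_relax and spin_wait_flag and their parameters are hypothetical helpers made up for this sketch, not part of Open MPI or any library. It spins a bounded number of times with pause, then falls back to sched_yield, and makes no claim about the resulting latency.

```c
#include <sched.h>

/* Hypothetical userland cpu_relax(); not an existing library API. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* PAUSE is usable in ring 3; emitted via inline asm so that no
       compiler intrinsic is required. */
    __asm__ __volatile__("pause" ::: "memory");
#else
    /* Assumption: on other architectures fall back to a compiler barrier. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical helper: wait for *flag to become nonzero.
   Spins spins_per_yield times with cpu_relax(), then calls sched_yield(),
   repeating for at most max_yields yields.
   Returns 1 if the flag was seen set, 0 if it gave up. */
static int spin_wait_flag(int *flag, unsigned spins_per_yield,
                          unsigned max_yields)
{
    unsigned y, s;

    for (y = 0; ; y++) {
        for (s = 0; s < spins_per_yield; s++) {
            /* Atomic read using a GCC __sync builtin (adds zero). */
            if (__sync_fetch_and_add(flag, 0) != 0)
                return 1;
            cpu_relax();
        }
        if (__sync_fetch_and_add(flag, 0) != 0)
            return 1;
        if (y == max_yields)
            return 0;
        sched_yield();
    }
}
```

The spin and yield counts are left as parameters because suitable values would have to come from measurement on the target system.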