Re: [OMPI users] busy waiting and oversubscriptions

Tim Prince Wed, 26 Mar 2014 08:04:21 -0400 (EDT)


On 3/26/2014 6:45 AM, Andreas Schäfer wrote:

On 10:27 Wed 26 Mar     , Jeff Squyres (jsquyres) wrote:

Be aware of a few facts, though:


1. There is a fundamental difference between disabling
hyperthreading in the BIOS at power-on time and simply running one
MPI process per core.  Disabling HT at power-on allocates more
hardware resources to the remaining HT that is left in each core
(e.g., deeper queues).

Oh, I didn't know that. That's interesting! Do you have any links with
in-depth info on that?

On certain Intel CPUs, the full size instruction TLB was available to a
process when HyperThreading was disabled on the BIOS setup menu, and
that was the only way to make all the Write Combine buffers available to
a single process. Those CPUs are no longer in widespread use.

At one time, at Intel, we did a study to evaluate the net effect (on a
later CPU where this did not recover ITLB size). The result was buried
afterwards; possibly it didn't meet an unspecified marketing goal.
Typical applications ran 1% faster with HyperThreading disabled in the
BIOS menu, even compared with runs that left HT enabled but had
affinities carefully set to use just one process per core. Not all
applications showed a loss on all data sets when leaving HT enabled.
There are a few MPI applications with specialized threading which could
gain 10% or more by use of HT.

In my personal opinion, SMT becomes less interesting as the number of
independent cores increases.
Intel(r) Xeon Phi(tm) is an exception, as the vector processing unit
issues instructions from a single thread only on alternate cycles. This
capability is used more effectively by running OpenMP threads under MPI,
e.g. 6 ranks per coprocessor of 30 threads each, spread across 10 cores
per rank (exact optimum depending on the application; MKL libraries use
all available hardware threads for sufficiently large data sets).
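
As a rough illustration of the arithmetic behind that example layout,
here is a minimal sketch. The helper name hybrid_layout is made up for
this note and is not part of any MPI or OpenMP API; it only multiplies
out the figures quoted above.

```python
# Illustrative arithmetic only: hybrid_layout is a hypothetical helper,
# not an MPI/OpenMP call. It checks how a ranks-by-threads split maps
# onto cores.
def hybrid_layout(ranks, cores_per_rank, threads_per_rank):
    """Return (cores used, total threads, threads per core)."""
    cores_used = ranks * cores_per_rank
    total_threads = ranks * threads_per_rank
    threads_per_core = threads_per_rank / cores_per_rank
    return cores_used, total_threads, threads_per_core


# The example from the text: 6 ranks, 30 threads each, 10 cores per rank.
cores, threads, per_core = hybrid_layout(
    ranks=6, cores_per_rank=10, threads_per_rank=30
)
print(cores, threads, per_core)  # 60 180 3.0
```

So that split occupies 60 cores with 3 threads on each, 180 threads in
total. Other splits can be compared the same way before trying them on
the application.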


--
Tim Prince
