Douglas Roberts wrote:
> I forgot to mention:  you have to tinker the snot out of a NUMA 
> application to get optimal performance.  NUMA means that you have to 
> pay close attention to what parts of your calculation are using which 
> memory, location-wise.  Non-uniform means different latency/bandwidth 
> for different memory locations relative to any cpu in the system.  IMO 
> it actually takes longer to develop an effective NUMA app than it does 
> to field a distributed memory app.
Making almost any interesting operation run fast on a modern CPU 
means paying attention to which memory is accessed and in what 
order.  That's unavoidable whether or not you admit defeat by using 
message passing for the sake of scaling.  (I'm avoiding the term 
"distributed memory app" so it isn't confused with "distributed shared 
memory".)


============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
lectures, archives, unsubscribe, maps at http://www.friam.org
