Hi,

I am getting strange performance results for the allgatherv operation with
the same number of procs and the same data, but with varying binding width.
For example, here are two cases with roughly a 185x difference in performance.

Each machine has 4 sockets, each with 6 cores, totaling 24 cores per node
(topology attached).

Case 1
----
12 procs per node, each bound to 1 core, on 30 nodes --> 1929 ms

Case 2
----
12 procs per node, each bound to 2 cores, on 30 nodes --> 357209 ms
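
To make the numbers in the two cases easier to check, here is a minimal
sketch of the arithmetic only. It is not the benchmark itself. The helper
names (cores_in_use, slowdown) are made up for illustration, and the only
inputs are the figures quoted above.

```python
# Figures taken from the two cases above; nothing here is measured by this script.
SOCKETS_PER_NODE = 4
CORES_PER_SOCKET = 6
CORES_PER_NODE = SOCKETS_PER_NODE * CORES_PER_SOCKET  # 24 cores per node
PROCS_PER_NODE = 12
NODES = 30


def cores_in_use(procs_per_node, cores_per_proc):
    """Cores occupied on one node for a given binding width (hypothetical helper)."""
    return procs_per_node * cores_per_proc


def slowdown(slow_ms, fast_ms):
    """Ratio of the slower time to the faster time (hypothetical helper)."""
    return slow_ms / fast_ms


total_ranks = PROCS_PER_NODE * NODES            # same in both cases
case1_cores = cores_in_use(PROCS_PER_NODE, 1)   # binding width of 1 core
case2_cores = cores_in_use(PROCS_PER_NODE, 2)   # binding width of 2 cores
ratio = slowdown(357209, 1929)

print(f"total ranks: {total_ranks}")
print(f"case 1 uses {case1_cores} of {CORES_PER_NODE} cores per node")
print(f"case 2 uses {case2_cores} of {CORES_PER_NODE} cores per node")
print(f"case 2 / case 1 time ratio: {ratio:.1f}")
```

Both cases run the same number of ranks. With a binding width of 2 cores,
the 12 procs cover all 24 cores of a node, whereas with 1 core each they
cover half of them.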


Another set of variations for 2 procs per node and 4 procs per node is
given below in the chart. Is such variation expected with binding width? I
am a bit puzzled and would appreciate any help to understand this.

[image: Inline image 1]

Thank you,
Saliya

-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
