On Oct 4, 2010, at 1:48 PM, Storm Zhang wrote:

> Thanks a lot, Ralph. As I said, I also tried SGE (which also shows 1024
> slots available for parallel tasks), but it assigned only 34-38 compute
> nodes, i.e., only 272-304 real cores, for the 500-proc run. The running
> time is consistent with the 100-proc case, and there is not a lot of
> fluctuation as the number of machines changes.
Afraid I don't understand your statement. If you have 500 procs running on
fewer than 500 cores, then performance relative to a fully subscribed job
(#procs <= #cores) will be worse. We deliberately dial down the performance
when oversubscribed to ensure that procs "play nice" in situations where the
node is oversubscribed.

> So I guess it is not related to hyperthreading. Correct me if I'm wrong.

Has nothing to do with hyperthreading; OMPI has no knowledge of hyperthreads
at this time.

> BTW, how do I bind a proc to a core? I tried --bind-to-core or
> -bind-to-core but neither works. Is it for OpenMP, not Open MPI?

Those should work. You might try --report-bindings to see what OMPI thought
it did; there is an example command at the bottom of this mail.

> Linbao
>
> On Mon, Oct 4, 2010 at 12:27 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> Some of what you are seeing is the natural result of context switching.
> Some thoughts regarding the results:
>
> 1. You didn't bind your procs to cores when running with #procs < #cores,
> so your performance in those scenarios will also be less than the maximum.
>
> 2. Once the number of procs exceeds the number of cores, you guarantee a
> lot of context switching, so performance will definitely take a hit.
>
> 3. Sometime in the not-too-distant future, OMPI will (hopefully) become
> hyperthread-aware. For now, we don't see hyperthreads as separate
> processing units. So as far as OMPI is concerned, you only have 512
> computing units to work with, not 1024.
>
> Bottom line is that you are running oversubscribed, so OMPI turns down
> your performance so that the machine doesn't hemorrhage as it context
> switches.
>
> On Oct 4, 2010, at 11:06 AM, Doug Reeder wrote:
>
>> In my experience, hyperthreading can't really deliver two cores' worth
>> of processing simultaneously for processes expecting sole use of a core.
>> Since you really have 512 cores, I'm not surprised that you see a
>> performance hit when requesting > 512 compute units. We should really
>> get input from a hyperthreading expert, preferably from Intel.
>>
>> Doug Reeder
>>
>> On Oct 4, 2010, at 9:53 AM, Storm Zhang wrote:
>>
>>> We have 64 compute nodes, each with dual quad-core, hyperthreaded CPUs,
>>> so 1024 compute units show up in the ROCKS 5.3 system. I'm trying to
>>> scatter an array from the master node to the compute nodes using mpiCC
>>> and mpirun, in C++.
>>>
>>> Here is my test: the array size is 18 KB * (number of procs), and it is
>>> scattered to the procs 5000 times repeatedly.
>>>
>>> The average running time (seconds):
>>>
>>> 100 procs: 170
>>> 400 procs: 690
>>> 500 procs: 855
>>> 600 procs: 2550
>>> 700 procs: 2720
>>> 800 procs: 2900
>>>
>>> There is a big jump in running time from 500 procs to 600 procs, and I
>>> don't know what the problem is. I tried both OMPI 1.3.2 and OMPI 1.4.2;
>>> the running time is a little faster for all tests in 1.4.2, but the jump
>>> still exists. Using either the Bcast function or simple Send/Recv gives
>>> very close results, and running directly or through SGE gives the same
>>> results.
>>>
>>> The code and ompi_info output are attached to this email.
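>>>
>>> For reference, the core of the test is essentially the following (a
>>> simplified sketch of the pattern, not the attached scatttest.cpp; the
>>> char buffers and MPI_Wtime timing are just illustrative):
>>>
>>> #include <mpi.h>
>>> #include <cstdio>
>>> #include <vector>
>>>
>>> int main(int argc, char** argv)
>>> {
>>>     MPI_Init(&argc, &argv);
>>>     int rank, nprocs;
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>>
>>>     const int chunk = 18 * 1024;            // 18 KB per proc
>>>     std::vector<char> sendbuf;
>>>     if (rank == 0)                          // root holds 18 KB * nprocs
>>>         sendbuf.resize((size_t)chunk * nprocs);
>>>     std::vector<char> recvbuf(chunk);
>>>
>>>     double t0 = MPI_Wtime();
>>>     for (int i = 0; i < 5000; ++i)          // scatter 5000 times
>>>         MPI_Scatter(rank == 0 ? &sendbuf[0] : NULL, chunk, MPI_CHAR,
>>>                     &recvbuf[0], chunk, MPI_CHAR, 0, MPI_COMM_WORLD);
>>>     double t1 = MPI_Wtime();
>>>
>>>     if (rank == 0)
>>>         printf("total time: %.1f s\n", t1 - t0);
>>>     MPI_Finalize();
>>>     return 0;
>>> }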
>>> The direct running command is:
>>>
>>> /opt/openmpi/bin/mpirun --mca btl_tcp_if_include eth0 --machinefile ../machines -np 600 scatttest
>>>
>>> The ifconfig of the head node for eth0 is:
>>>
>>> eth0    Link encap:Ethernet  HWaddr 00:26:B9:56:8B:44
>>>         inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
>>>         inet6 addr: fe80::226:b9ff:fe56:8b44/64 Scope:Link
>>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>         RX packets:1096060373 errors:0 dropped:2512622 overruns:0 frame:0
>>>         TX packets:513387679 errors:0 dropped:0 overruns:0 carrier:0
>>>         collisions:0 txqueuelen:1000
>>>         RX bytes:832328807459 (775.1 GiB)  TX bytes:250824621959 (233.5 GiB)
>>>         Interrupt:106 Memory:d6000000-d6012800
>>>
>>> A typical ifconfig of a compute node is:
>>>
>>> eth0    Link encap:Ethernet  HWaddr 00:21:9B:9A:15:AC
>>>         inet addr:192.168.1.253  Bcast:192.168.1.255  Mask:255.255.255.0
>>>         inet6 addr: fe80::221:9bff:fe9a:15ac/64 Scope:Link
>>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>         RX packets:362716422 errors:0 dropped:0 overruns:0 frame:0
>>>         TX packets:349967746 errors:0 dropped:0 overruns:0 carrier:0
>>>         collisions:0 txqueuelen:1000
>>>         RX bytes:139699954685 (130.1 GiB)  TX bytes:338207741480 (314.9 GiB)
>>>         Interrupt:82 Memory:d6000000-d6012800
>>>
>>> Can anyone help me out with this? It bothers me a lot.
>>>
>>> Thank you very much.
>>>
>>> Linbao
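FWIW, here is what the binding flags look like on your 1.4.2 install, using
your own command line as the template (a sketch; adjust -np and paths as
needed):

/opt/openmpi/bin/mpirun --mca btl_tcp_if_include eth0 --machinefile ../machines --bind-to-core --report-bindings -np 500 scatttest

With --report-bindings, each daemon reports the binding it actually applied,
so you can confirm that the procs really landed on separate cores.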
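One caveat, IIRC: --bind-to-core did not exist in the early 1.3 series, so it
may simply be unrecognized under your 1.3.2 install. On releases that lack
it, the closest equivalent is --mca mpi_paffinity_alone 1, which tells OMPI
it has the node to itself and to bind each proc to a processor.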
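Also, the "dial down" I mentioned is Open MPI's degraded mode: when OMPI sees
more procs than cores, it sets mpi_yield_when_idle so that procs yield the
processor while waiting for messages. You can force the mode explicitly as a
diagnostic, e.g.:

/opt/openmpi/bin/mpirun --mca mpi_yield_when_idle 0 ...

where 0 forces aggressive progression and 1 forces the polite, yielding
behavior. Forcing 0 while oversubscribed usually just makes the thrashing
worse, so treat it purely as an experiment. And under SGE, you can print
$PE_HOSTFILE from the job script to see exactly which hosts, and how many
slots on each, were actually granted.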