On Oct 4, 2010, at 1:48 PM, Storm Zhang wrote:

> Thanks a lot, Ralph. As I said, I also tried SGE (which also shows 1024
> slots available for parallel tasks), but it assigned only 34-38 compute
> nodes, i.e., only 272-304 real cores, for the 500-proc run. The running
> time is consistent with the 100-proc case, and there is not a lot of
> fluctuation as the number of machines changes.
Afraid I don't understand your statement. If you have 500 procs running on
fewer than 500 cores, then performance relative to a fully subscribed job
(#procs <= #cores) will be worse. We deliberately dial down the performance
when oversubscribed to ensure that procs "play nice" in situations where the
node is oversubscribed.

> So I guess it is not related to hyperthreading. Correct me if I'm wrong.

Has nothing to do with hyperthreading; OMPI has no knowledge of hyperthreads
at this time.

> BTW, how do I bind a proc to a core? I tried --bind-to-core or
> -bind-to-core but neither works. Is it for OpenMP, not Open MPI?

Those should work. You might try --report-bindings to see what OMPI thought
it did; there is an example command at the bottom of this mail.

> Linbao
>
> On Mon, Oct 4, 2010 at 12:27 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> Some of what you are seeing is the natural result of context switching.
> Some thoughts regarding the results:
>
> 1. You didn't bind your procs to cores when running with #procs < #cores,
> so your performance in those scenarios will also be less than the maximum.
>
> 2. Once the number of procs exceeds the number of cores, you guarantee a
> lot of context switching, so performance will definitely take a hit.
>
> 3. Sometime in the not-too-distant future, OMPI will (hopefully) become
> hyperthread-aware. For now, we don't see hyperthreads as separate
> processing units. So as far as OMPI is concerned, you only have 512
> computing units to work with, not 1024.
>
> Bottom line is that you are running oversubscribed, so OMPI turns down
> your performance so that the machine doesn't hemorrhage as it context
> switches.
>
> On Oct 4, 2010, at 11:06 AM, Doug Reeder wrote:
>
>> In my experience, hyperthreading can't really deliver two cores' worth
>> of processing simultaneously for processes expecting sole use of a core.
>> Since you really have 512 cores, I'm not surprised that you see a
>> performance hit when requesting > 512 compute units. We should really
>> get input from a hyperthreading expert, preferably from Intel.
>>
>> Doug Reeder
>>
>> On Oct 4, 2010, at 9:53 AM, Storm Zhang wrote:
>>
>>> We have 64 compute nodes, each with dual quad-core, hyperthreaded CPUs,
>>> so 1024 compute units show up in the ROCKS 5.3 system. I'm trying to
>>> scatter an array from the master node to the compute nodes using mpiCC
>>> and mpirun, in C++.
>>>
>>> Here is my test: the array size is 18 KB * (number of procs), and it is
>>> scattered to the procs 5000 times repeatedly.
>>>
>>> The average running time (seconds):
>>>
>>> 100 procs: 170
>>> 400 procs: 690
>>> 500 procs: 855
>>> 600 procs: 2550
>>> 700 procs: 2720
>>> 800 procs: 2900
>>>
>>> There is a big jump in running time from 500 procs to 600 procs, and I
>>> don't know what the problem is. I tried both OMPI 1.3.2 and OMPI 1.4.2;
>>> the running time is a little faster for all tests in 1.4.2, but the jump
>>> still exists. Using either the Bcast function or simple Send/Recv gives
>>> very close results, and running directly or through SGE gives the same
>>> results.
>>>
>>> The code and ompi_info output are attached to this email.
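>>>
>>> For reference, the core of the test is essentially the following (a
>>> simplified sketch of the pattern, not the attached scatttest.cpp; the
>>> char buffers and MPI_Wtime timing are just illustrative):
>>>
>>> #include <mpi.h>
>>> #include <cstdio>
>>> #include <vector>
>>>
>>> int main(int argc, char** argv)
>>> {
>>>     MPI_Init(&argc, &argv);
>>>     int rank, nprocs;
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>>
>>>     const int chunk = 18 * 1024;            // 18 KB per proc
>>>     std::vector<char> sendbuf;
>>>     if (rank == 0)                          // root holds 18 KB * nprocs
>>>         sendbuf.resize((size_t)chunk * nprocs);
>>>     std::vector<char> recvbuf(chunk);
>>>
>>>     double t0 = MPI_Wtime();
>>>     for (int i = 0; i < 5000; ++i)          // scatter 5000 times
>>>         MPI_Scatter(rank == 0 ? &sendbuf[0] : NULL, chunk, MPI_CHAR,
>>>                     &recvbuf[0], chunk, MPI_CHAR, 0, MPI_COMM_WORLD);
>>>     double t1 = MPI_Wtime();
>>>
>>>     if (rank == 0)
>>>         printf("total time: %.1f s\n", t1 - t0);
>>>     MPI_Finalize();
>>>     return 0;
>>> }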
>>> The direct running command is:
>>>
>>> /opt/openmpi/bin/mpirun --mca btl_tcp_if_include eth0 --machinefile ../machines -np 600 scatttest
>>>
>>> The ifconfig of the head node for eth0 is:
>>>
>>> eth0    Link encap:Ethernet  HWaddr 00:26:B9:56:8B:44
>>>         inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
>>>         inet6 addr: fe80::226:b9ff:fe56:8b44/64 Scope:Link
>>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>         RX packets:1096060373 errors:0 dropped:2512622 overruns:0 frame:0
>>>         TX packets:513387679 errors:0 dropped:0 overruns:0 carrier:0
>>>         collisions:0 txqueuelen:1000
>>>         RX bytes:832328807459 (775.1 GiB)  TX bytes:250824621959 (233.5 GiB)
>>>         Interrupt:106 Memory:d6000000-d6012800
>>>
>>> A typical ifconfig of a compute node is:
>>>
>>> eth0    Link encap:Ethernet  HWaddr 00:21:9B:9A:15:AC
>>>         inet addr:192.168.1.253  Bcast:192.168.1.255  Mask:255.255.255.0
>>>         inet6 addr: fe80::221:9bff:fe9a:15ac/64 Scope:Link
>>>         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>         RX packets:362716422 errors:0 dropped:0 overruns:0 frame:0
>>>         TX packets:349967746 errors:0 dropped:0 overruns:0 carrier:0
>>>         collisions:0 txqueuelen:1000
>>>         RX bytes:139699954685 (130.1 GiB)  TX bytes:338207741480 (314.9 GiB)
>>>         Interrupt:82 Memory:d6000000-d6012800
>>>
>>> Can anyone help me out with this? It bothers me a lot.
>>>
>>> Thank you very much.
>>>
>>> Linbao
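FWIW, here is what the binding flags look like on your 1.4.2 install, using
your own command line as the template (a sketch; adjust -np and paths as
needed):

/opt/openmpi/bin/mpirun --mca btl_tcp_if_include eth0 --machinefile ../machines --bind-to-core --report-bindings -np 500 scatttest

With --report-bindings, each daemon reports the binding it actually applied,
so you can confirm that the procs really landed on separate cores.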
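One caveat, IIRC: --bind-to-core did not exist in the early 1.3 series, so it
may simply be unrecognized under your 1.3.2 install. On releases that lack
it, the closest equivalent is --mca mpi_paffinity_alone 1, which tells OMPI
it has the node to itself and to bind each proc to a processor.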
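Also, the "dial down" I mentioned is Open MPI's degraded mode: when OMPI sees
more procs than cores, it sets mpi_yield_when_idle so that procs yield the
processor while waiting for messages. You can force the mode explicitly as a
diagnostic, e.g.:

/opt/openmpi/bin/mpirun --mca mpi_yield_when_idle 0 ...

where 0 forces aggressive progression and 1 forces the polite, yielding
behavior. Forcing 0 while oversubscribed usually just makes the thrashing
worse, so treat it purely as an experiment. And under SGE, you can print
$PE_HOSTFILE from the job script to see exactly which hosts, and how many
slots on each, were actually granted.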