My best guess is that you are seeing differences in scheduling behavior with respect to memory locale. I notice that you are not binding your processes, and so they are free to move around the various processors on the node. I would guess that your thread is winding up on a processor that is non-local to your memory in one case, but local to your memory in the other. This is an OS-related scheduler decision.
You might try binding your processes to see if it helps. With threads, you don't really want to bind to a core, but binding to a socket should help. Try adding --bind-to-socket to your mpirun cmd line (you can't do this if you run it as a singleton - you have to use mpirun).

On Oct 25, 2011, at 2:45 AM, 吕慧伟 wrote:

> Thanks, Ralph. Yes, I have taken that into account. The problem is not comparing two procs with one proc, but the "multi-threading effect": multi-threading helps on the first machine for both one and two procs, but on the second machine the benefit disappears with two procs.
>
> To narrow down the problem, I reinstalled the operating system on the second machine, going from SUSE 11 (kernel 2.6.32.12, gcc 4.3.4) to Red Hat 5.4 (kernel 2.6.18, gcc 4.1.2), which is similar to the first machine (CentOS 5.3, kernel 2.6.18, gcc 4.1.2). The problem then disappeared, so it must lie somewhere in the OS kernel or the GCC version. Any suggestions? Thanks.
>
> --
> Huiwei Lv
>
> On Tue, Oct 25, 2011 at 3:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Okay - thanks for testing it.
>
> Of course, one obvious difference is that there isn't any communication when you run only one proc, but there is when you run two or more, assuming your application has MPI send/recv calls in it (or calls collectives and other functions that communicate). Communication to yourself is very fast, as no bits actually move - sending messages to another proc is considerably slower.
>
> Are you taking that into account?
>
> On Oct 24, 2011, at 8:47 PM, 吕慧伟 wrote:
>
>> No. There's a difference between "mpirun -np 1 ./my_hybrid_app..." and "mpirun -np 2 ./...".
>>
>> Running "mpirun -np 1 ./my_hybrid_app..." improves performance with more threads, but running "mpirun -np 2 ./..." degrades it.
>>
>> --
>> Huiwei Lv
>>
>> On Tue, Oct 25, 2011 at 12:00 AM, <users-requ...@open-mpi.org> wrote:
>>
>> Date: Mon, 24 Oct 2011 07:14:21 -0600
>> From: Ralph Castain <r...@open-mpi.org>
>> Subject: Re: [OMPI users] Hybrid MPI/Pthreads program behaves differently on two different machines with same hardware
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <42c53d0b-1586-4001-b9d2-d77af0033...@open-mpi.org>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Does the difference persist if you run the single process using mpirun? In other words, does "mpirun -np 1 ./my_hybrid_app..." behave the same as "mpirun -np 2 ./..."?
>>
>> There is a slight difference in the way procs start when run as singletons. It shouldn't make a difference here, but it's worth testing.
>>
>> On Oct 24, 2011, at 12:37 AM, 吕慧伟 wrote:
>>
>> > Dear List,
>> >
>> > I have a hybrid MPI/Pthreads program named "my_hybrid_app". This program is memory-intensive and takes advantage of multi-threading to improve memory throughput. I run "my_hybrid_app" on two machines, which have the same hardware configuration but different OS and GCC. The problem is: when I run "my_hybrid_app" with one process, the two machines behave the same - the more threads, the better the performance. However, when I run "my_hybrid_app" with two or more processes, the first machine still gains performance with more threads, while the second machine loses performance with more threads.
>> >
>> > Since running "my_hybrid_app" with one process behaves correctly, I suspect my linking to the MPI library has some problem. Would somebody point me in the right direction? Thanks in advance.
>> >
>> > Attached are the command line used, my machine information and link information.
>> >
>> > p.s. 1: Command line
>> > single process: ./my_hybrid_app <number of threads>
>> > multiple processes: mpirun -np 2 ./my_hybrid_app <number of threads>
>> >
>> > p.s. 2: Machine Information
>> > The first machine is CentOS 5.3 with GCC 4.1.2:
>> > Target: x86_64-redhat-linux
>> > Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
>> > Thread model: posix
>> > gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)
>> > The second machine is SUSE Enterprise Server 11 with GCC 4.3.4:
>> > Target: x86_64-suse-linux
>> > Configured with: ../configure --prefix=/usr --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3 --enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --enable-version-specific-runtime-libs --program-suffix=-4.3 --enable-linux-futex --without-system-libunwind --with-cpu=generic --build=x86_64-suse-linux
>> > Thread model: posix
>> > gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux)
>> >
>> > p.s. 3: ldd Information
>> > The first machine:
>> > $ ldd my_hybrid_app
>> >     libm.so.6 => /lib64/libm.so.6 (0x000000358d400000)
>> >     libmpi.so.0 => /usr/local/openmpi/lib/libmpi.so.0 (0x00002af0d53a7000)
>> >     libopen-rte.so.0 => /usr/local/openmpi/lib/libopen-rte.so.0 (0x00002af0d564a000)
>> >     libopen-pal.so.0 => /usr/local/openmpi/lib/libopen-pal.so.0 (0x00002af0d5895000)
>> >     libdl.so.2 => /lib64/libdl.so.2 (0x000000358d000000)
>> >     libnsl.so.1 => /lib64/libnsl.so.1 (0x000000358f000000)
>> >     libutil.so.1 => /lib64/libutil.so.1 (0x000000359a600000)
>> >     libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00002af0d5b07000)
>> >     libpthread.so.0 => /lib64/libpthread.so.0 (0x000000358d800000)
>> >     libc.so.6 => /lib64/libc.so.6 (0x000000358cc00000)
>> >     /lib64/ld-linux-x86-64.so.2 (0x000000358c800000)
>> >     librt.so.1 => /lib64/librt.so.1 (0x000000358dc00000)
>> > The second machine:
>> > $ ldd my_hybrid_app
>> >     linux-vdso.so.1 => (0x00007fff3eb5f000)
>> >     libmpi.so.0 => /root/opt/openmpi/lib/libmpi.so.0 (0x00007f68627a1000)
>> >     libm.so.6 => /lib64/libm.so.6 (0x00007f686254b000)
>> >     libopen-rte.so.0 => /root/opt/openmpi/lib/libopen-rte.so.0 (0x00007f68622fc000)
>> >     libopen-pal.so.0 => /root/opt/openmpi/lib/libopen-pal.so.0 (0x00007f68620a5000)
>> >     libdl.so.2 => /lib64/libdl.so.2 (0x00007f6861ea1000)
>> >     libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f6861c89000)
>> >     libutil.so.1 => /lib64/libutil.so.1 (0x00007f6861a86000)
>> >     libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00007f686187d000)
>> >     libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6861660000)
>> >     libc.so.6 => /lib64/libc.so.6 (0x00007f6861302000)
>> >     /lib64/ld-linux-x86-64.so.2 (0x00007f6862a58000)
>> >     librt.so.1 => /lib64/librt.so.1 (0x00007f68610f9000)
>> > I installed openmpi-1.4.2 to a user directory /root/opt/openmpi and used "-L/root/opt/openmpi -Wl,-rpath,/root/opt/openmpi" when linking.
>> >
>> > --
>> > Huiwei Lv
>> > PhD. student at Institute of Computing Technology, Beijing, China
>> > http://asg.ict.ac.cn/lhw

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users