Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores

2014-06-16 Thread Zehan Cui
Hi Yuping, Maybe using multiple threads inside a socket, and MPI among sockets, is a better choice for such a NUMA platform. Multiple threads can exploit the benefit of shared memory, and MPI can alleviate the cost of non-uniform memory access. Regards, Zehan On Tue, Jun 17, 2014 at 6:19 AM, Yuping Sun
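A hybrid launch along these lines might look like the following sketch. This is illustrative only: it assumes an OpenMP-threaded build of the application, a 64-core box with 8 sockets of 8 cores, and Open MPI 1.6.x option names; `./app`, the rank count, and the thread count are placeholders, not from the thread.

```
# One MPI rank per socket, OpenMP threads filling each socket's cores
# (counts are illustrative; adjust to the actual socket/core layout)
export OMP_NUM_THREADS=8
mpirun -np 8 --bysocket --bind-to-socket ./app
```

On Open MPI 1.8.x the equivalent mapping is expressed as `--map-by socket --bind-to socket`.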

Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores

2014-06-16 Thread Yuping Sun
Hi Ralph: Is the following the correct command, in your view: mpirun -np 32 --bysocket --bycore ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi --time_timestep_loop --animation_freq -1 I ran the above command, and performance still does not improve. Would you give me a detailed command with options? Thank you. Be
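Note that `--bysocket` and `--bycore` are both mapping policies in Open MPI 1.6.x, so passing both is contradictory; a self-consistent alternative (a sketch, not a guaranteed fix for the performance problem) would pick one mapping policy and add an explicit binding:

```
# Map ranks round-robin by socket and bind each rank to its socket
# (Open MPI 1.6.x syntax; the fun3d path is taken from the original post)
mpirun -np 32 --bysocket --bind-to-socket \
  ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi \
  --time_timestep_loop --animation_freq -1
```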

Re: [OMPI users] deprecated cuptiActivityEnqueueBuffer

2014-06-16 Thread jcabello
OK, that works. Thanks!! > Do you need the vampire support in your build? If not, you could add this > to configure. > --disable-vt > >>-Original Message- >>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of >>jcabe...@computacion.cs.cinvestav.mx >>Sent: Monday, June 16, 2014 1:4

Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores

2014-06-16 Thread Ralph Castain
Well, for one, there is never any guarantee of linear scaling with the number of procs - that is very application dependent. You can actually see performance decrease with the number of procs if the application doesn't know how to exploit them. One thing that stands out is your mapping and binding
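The scaling limit Ralph describes can be made concrete with Amdahl's law (a standard model, added here for illustration; it is not part of the original thread). If a fraction $p$ of the work parallelizes and the rest is serial, the speedup on $n$ cores is bounded by:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}
```

For example, even with $p = 0.9$, $S(64) = 1/(0.1 + 0.9/64) \approx 8.8$, far below the 64x one might hope for - and memory-bandwidth contention on a single workstation can push real results lower still.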

[OMPI users] how to get mpirun to scale from 16 to 64 cores

2014-06-16 Thread Yuping Sun
Dear All: I bought a 64-core workstation and installed NASA fun3d with Open MPI 1.6.5. Then I started test runs of fun3d using 16, 32, and 48 cores. However, the performance of the fun3d runs is bad. I got the data below; the run command is (for 32 cores, as an example) mpiexec -np 32 --bysocket --bi

Re: [OMPI users] deprecated cuptiActivityEnqueueBuffer

2014-06-16 Thread Rolf vandeVaart
Do you need the vampire support in your build? If not, you could add this to configure. --disable-vt >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of >jcabe...@computacion.cs.cinvestav.mx >Sent: Monday, June 16, 2014 1:40 PM >To: us...@open-mpi.org >Sub
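Rebuilding with VampirTrace disabled skips the CUPTI-based tracing code entirely, which sidesteps the deprecated-API failure. A sketch of the rebuild (the prefix and `-j` value are illustrative; `--disable-vt` is the flag suggested above):

```
# Reconfigure Open MPI without VampirTrace support, then rebuild
./configure --disable-vt --prefix=/opt/openmpi
make -j8 all
make install
```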

[OMPI users] deprecated cuptiActivityEnqueueBuffer

2014-06-16 Thread jcabello
Hi all: I'm having trouble compiling OMPI from SVN trunk with the new NVIDIA CUDA 6.0 SDK because of the deprecated cuptiActivityEnqueueBuffer. This is the problem: CC libvt_la-vt_cupti_activity.lo CC libvt_la-vt_iowrap_helper.lo CC libvt_la-vt_libwrap.lo CC libvt_la-vt_mallo

Re: [OMPI users] Bind multiple cores to rank - OpenMPI 1.8.1

2014-06-16 Thread Ralph Castain
Just to wrap this up for the user list: this has now been fixed and added to 1.8.2 in the nightly tarball. The problem proved to be an edge case when partial allocations were combined with coprocessor existence (hit a slightly different code path). On Jun 12, 2014, at 9:04 AM, Dan Dietz wrote