Thanks for the advice. Our jobs vary in size, from just a few MPI processes to about 64. Jobs are submitted at random, which is why I want to map by socket. If the cluster is empty, and someone submits a job with 16 MPI processes, I would think it would run most efficiently if it used 8 nodes, 2 processes per node. If we just fill up two nodes as you suggest, we overload the RAM on those two nodes.
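Concretely, this is a sketch of the kind of submission I have in mind (assuming two sockets per node and an Open MPI version that supports ppr mapping; the executable name is just a placeholder):

#PBS -l nodes=8:ppn=2
# one rank per socket, i.e. two ranks per node, so 16 ranks spread over 8 nodes
# (./my_app is a placeholder for the real executable)
mpirun --report-bindings -np 16 --map-by ppr:1:socket --bind-to core ./my_app

That way each job only takes a slice of each node's RAM instead of saturating two nodes.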
-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of tmish...@jcity.maeda.co.jp
Sent: Friday, August 29, 2014 5:24 PM
To: Open MPI Users
Subject: Re: [OMPI users] How does binding option affect network traffic?

Hi,

Your cluster is very similar to ours, where Torque and Open MPI are installed. I would use this command line:

#PBS -l nodes=2:ppn=12
mpirun --report-bindings -np 16 <executable file name>

Here --map-by socket:pe=1 and -bind-to core are assumed as the default settings. Then you can run 10 jobs independently and simultaneously, because you have 20 nodes in total.

While each node in your cluster has 12 cores, only 8 processes run on each node, which means 66.7 % utilization, not 100 %. I think this loss cannot be avoided as long as you use 16*N MPI processes per job; it is a mismatch with your cluster, which has 12 cores per node. If you can use 12*N MPI processes per job, that would be most efficient. Is there any reason why you use 16*N MPI processes per job?

Tetsuya

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2014/08/25201.php