Hi Farid,

I’m not sure I understand what you are asking here. If your point is that OMPI isn’t placing and binding procs per the LSF directives, then you are quite correct. The LSF folks never provided that level of integration, nor the information from which we might have derived it (e.g., how the geometry pattern is communicated).
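In the meantime, a possible workaround is to translate the task geometry into an Open MPI rankfile yourself and hand it to mpirun with -rf. Below is a minimal, untested sketch; it assumes the geometry groups map, in order, onto the unique hosts listed in LSB_DJOB_HOSTFILE (an assumption on my part, not a documented LSF contract), and reporter_MPI stands in for your binary:

#!/bin/bash
# Sketch: build an Open MPI rankfile from LSF's task-geometry string.
# ASSUMPTION: geometry groups correspond, in order, to the unique hosts
# in LSB_DJOB_HOSTFILE (which repeats each host once per slot).

# Unique hosts, in allocation order.
hosts=($(awk '!seen[$1]++ {print $1}' "${LSB_DJOB_HOSTFILE}"))

rankfile=rankfile.${LSB_JOBID}
: > "${rankfile}"

# Strip the outer braces, then pull out each "(...)" group.
geometry=${LSB_PJL_TASK_GEOMETRY#\{}
geometry=${geometry%\}}

node=0
for group in $(echo "${geometry}" | grep -o '([0-9,]*)'); do
    tasks=${group#\(}
    tasks=${tasks%\)}
    slot=0
    for rank in ${tasks//,/ }; do
        # e.g. "rank 5=p10a30 slot=0"
        echo "rank ${rank}=${hosts[$node]} slot=${slot}" >> "${rankfile}"
        slot=$((slot + 1))
    done
    node=$((node + 1))
done

mpirun -rf "${rankfile}" ./reporter_MPI

With LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}" and three unique hosts, this would place rank 5 alone on the first host and ranks 2, 1, 0 on the third. Adding --report-bindings to the mpirun line would show whether the resulting placement matches what you expect.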
If someone from IBM would like to provide that code, we’d be happy to help answer questions about how to perform the integration.

> On Apr 18, 2016, at 10:13 AM, Farid Parpia <par...@us.ibm.com> wrote:
>
> Greetings!
>
> The following batch script will successfully demonstrate the use of LSF's
> task geometry feature using IBM Parallel Environment:
>
> #BSUB -J "task_geometry"
> #BSUB -n 9
> #BSUB -R "span[ptile=3]"
> #BSUB -network "type=sn_single:mode=us"
> #BSUB -R "affinity[core]"
> #BSUB -e "task_geometry.stderr.%J"
> #BSUB -o "task_geometry.stdout.%J"
> #BSUB -q "normal"
> #BSUB -M "800"
> #BSUB -R "rusage[mem=800]"
> #BSUB -x
>
> export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"
>
> ldd /gpfs/gpfs_stage1/parpia/PE_tests/reporter/bin/reporter_MPI
>
> /gpfs/gpfs_stage1/parpia/PE_tests/reporter/bin/reporter_MPI
>
> The reporter_MPI utility simply reports the hostname and affinitization of
> each MPI process; it is what I use to verify that the job is distributed to
> the allocated nodes and affinitized on them as expected. Typical output is
> attached.
>
> To adapt the above batch script to use OpenMPI, I modify it to
>
> #BSUB -J "task_geometry"
> #BSUB -n 9
> #BSUB -R "span[ptile=3]"
> #BSUB -m "p10a30 p10a33 p10a35 p10a55 p10a58"
> #BSUB -R "affinity[core]"
> #BSUB -e "task_geometry.stderr.%J"
> #BSUB -o "task_geometry.stdout.%J"
> #BSUB -q "normal"
> #BSUB -M "800"
> #BSUB -R "rusage[mem=800]"
> #BSUB -x
>
> export PATH=/usr/local/OpenMPI/1.10.2/bin:${PATH}
> export LD_LIBRARY_PATH=/usr/local/OpenMPI/1.10.2/lib:${LD_LIBRARY_PATH}
>
> export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"
>
> echo "=== LSB_DJOB_HOSTFILE ==="
> cat ${LSB_DJOB_HOSTFILE}
> echo "=== LSB_AFFINITY_HOSTFILE ==="
> cat ${LSB_AFFINITY_HOSTFILE}
> echo "=== LSB_DJOB_RANKFILE ==="
> cat ${LSB_DJOB_RANKFILE}
> echo "========================="
>
> ldd /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI
>
> mpirun /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI
>
> I have inserted additional lines of scripting to help with debugging this
> failing job. The output files from the job are attached.
>
> If I change the last line of the job script immediately above to
>
> mpirun -bind-to core:overload-allowed /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI
>
> the job runs through, but the host selection and affinitization are
> completely wrong (you can extract the relevant information with
> grep "can be sched" *.stdout.* | sort -n -k 9); that output is also attached.
>
> OpenMPI 1.10.2 was built using the attached script (build_OpenMPI.sh) and
> installed with
>
> make install
>
> executed from the top of the build tree. The attached ompi_info--all.gz
> contains the output of
>
> ompi_info --all
>
> Regards,
>
> Farid Parpia   IBM Corporation: 710-2-RF28, 2455 South Road,
> Poughkeepsie, NY 12601, USA; Telephone: (845) 433-8420 = Tie Line 293-8420
>
> <task_geometry.stdout.43915.gz><task_geometry.stderr.43915.gz><task_geometry.stderr.43918.gz><task_geometry.stdout.43918.gz><task_geometry.stderr.43953.gz><task_geometry.stdout.43953.gz><build_OpenMPI.sh><ompi_info--all.gz>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/04/28955.php