Bingo - you said the magic word. This is a terminology issue. When we say "core", we mean the old definition of "core", not "hyperthreads". If you want to use HTs as your base processing unit and bind to them, then you need to specify --bind-to hwthread. That warning should then go away.
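For example, taking the command line from the original post quoted below and swapping only the binding directive (a sketch - the environment variables and paths are the ones from that post):

    # original invocation, with -bind-to-core replaced by --bind-to hwthread
    $MPI_DIR/bin/mpirun -np $NSLOTS --bind-to hwthread --report-bindings \
        --hostfile hostfile --mca btl_tcp_if_include eth0 --prefix $MPI_DIR \
        $BENCH_DIR/$APP_DIR/$APP_BIN

Alternatively, if you want to keep counting only physical cores, the "overload-allowed" qualifier mentioned in the error text below (e.g. --bind-to core:overload-allowed) should let the job start, at the cost of placing more than one process on a core.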
We don't require that a swap region be mounted. I didn't see anything in your original message indicating that OMPI had actually crashed - it just wasn't launching due to the above issue. Were you actually seeing crashes as well? A sketch of the numactl-devel install for the memory-binding warning follows the quoted thread below.

On Wed, Apr 1, 2015 at 8:31 AM, Lane, William <william.l...@cshs.org> wrote:

> Ralph,
>
> Here's the associated hostfile:
>
> #openMPI hostfile for csclprd3
> #max slots prevents oversubscribing csclprd3-0-9
> csclprd3-0-0 slots=12 max-slots=12
> csclprd3-0-1 slots=6 max-slots=6
> csclprd3-0-2 slots=6 max-slots=6
> csclprd3-0-3 slots=6 max-slots=6
> csclprd3-0-4 slots=6 max-slots=6
> csclprd3-0-5 slots=6 max-slots=6
> csclprd3-0-6 slots=6 max-slots=6
> csclprd3-0-7 slots=32 max-slots=32
> csclprd3-0-8 slots=32 max-slots=32
> csclprd3-0-9 slots=32 max-slots=32
> csclprd3-0-10 slots=32 max-slots=32
> csclprd3-0-11 slots=32 max-slots=32
> csclprd3-0-12 slots=12 max-slots=12
> csclprd3-0-13 slots=24 max-slots=24
> csclprd3-0-14 slots=16 max-slots=16
> csclprd3-0-15 slots=16 max-slots=16
> csclprd3-0-16 slots=24 max-slots=24
> csclprd3-0-17 slots=24 max-slots=24
> csclprd3-6-1 slots=4 max-slots=4
> csclprd3-6-5 slots=4 max-slots=4
>
> The number of slots also includes hyperthreading cores.
>
> One more question: would not having swap partitions defined on all the
> nodes in the ring cause OpenMPI to crash? No swap partitions are defined
> for any of the above systems.
>
> -Bill L.
>
> ------------------------------
> *From:* users [users-boun...@open-mpi.org] on behalf of Ralph Castain
> [r...@open-mpi.org]
> *Sent:* Wednesday, April 01, 2015 5:04 AM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>
> The warning about binding to memory is due to not having numactl-devel
> installed on the system. The job would still run, but we are warning you
> that we cannot bind memory to the same domain as the core where we bind
> the process. That can cause poor performance, but it isn't fatal. I forget
> the name of the param, but you can tell us to "shut up" :-)
>
> The other warning/error indicates that we aren't seeing enough cores in
> the allocation you gave us via the hostfile to support one proc/core -
> i.e., we didn't see at least 128 cores in the sum of the nodes you told
> us about. I take it you were expecting that there were that many or more?
>
> Ralph
>
>
> On Wed, Apr 1, 2015 at 12:54 AM, Lane, William <william.l...@cshs.org>
> wrote:
>
>> I'm having problems running OpenMPI jobs (using a hostfile) on an HPC
>> cluster running ROCKS on CentOS 6.3. I'm running OpenMPI outside of Sun
>> Grid Engine (i.e. it is not submitted as a job to SGE). The program being
>> run is a LAPACK benchmark. The command line I'm using to run the jobs is:
>>
>> $MPI_DIR/bin/mpirun -np $NSLOTS -bind-to-core -report-bindings --hostfile
>> hostfile --mca btl_tcp_if_include eth0 --prefix $MPI_DIR
>> $BENCH_DIR/$APP_DIR/$APP_BIN
>>
>> where MPI_DIR=/hpc/apps/mpi/openmpi/1.8.2/
>> NSLOTS=128
>>
>> I'm getting errors of the following form, and OpenMPI never runs the
>> LAPACK benchmark:
>>
>> --------------------------------------------------------------------------
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>>
>> Node: csclprd3-0-11
>>
>> This usually is due to not having the required NUMA support installed
>> on the node.
>> In some Linux distributions, the required support is contained in the
>> libnumactl and libnumactl-devel packages. This is a warning only; your
>> job will continue, though performance may be degraded.
>> --------------------------------------------------------------------------
>>
>> --------------------------------------------------------------------------
>> A request was made to bind to that would result in binding more
>> processes than cpus on a resource:
>>
>> Bind to: CORE
>> Node: csclprd3-0-11
>> #processes: 2
>> #cpus: 1
>>
>> You can override this protection by adding the "overload-allowed"
>> option to your binding directive.
>> --------------------------------------------------------------------------
>>
>> The only installed numa packages are:
>> numactl.x86_64   2.0.7-3.el6   @centos6.3-x86_64-0/$
>>
>> When I search for the available NUMA packages I find:
>>
>> yum search numa | less
>>
>> Loaded plugins: fastestmirror
>> Loading mirror speeds from cached hostfile
>> ============================== N/S Matched: numa ===============================
>> numactl-devel.i686 : Development package for building Applications that use numa
>> numactl-devel.x86_64 : Development package for building Applications that use numa
>> numad.x86_64 : NUMA user daemon
>> numactl.i686 : Library for tuning for Non Uniform Memory Access machines
>> numactl.x86_64 : Library for tuning for Non Uniform Memory Access machines
>>
>> Do I need to install additional and/or different NUMA packages in order
>> to get OpenMPI to work on this cluster?
>>
>> -Bill Lane
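For the memory-binding warning, here is a minimal sketch of the fix on CentOS 6, assuming root access on the affected nodes and using the package names from the "yum search numa" output above:

    # install the NUMA library and headers on each node that shows the warning
    yum install numactl.x86_64 numactl-devel.x86_64
    # if Open MPI was built from source before numactl-devel was present, it may
    # need to be reconfigured and rebuilt so its embedded hwloc detects libnuma

This addresses only the memory-binding warning (a performance concern); it is separate from the bind-to-core error, which is the slot/core counting issue discussed above.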