-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of users-requ...@open-mpi.org
Sent: Wednesday, September 25, 2013 11:00 AM
To: us...@open-mpi.org
Subject: users Digest, Vol 2689, Issue 1

Send users mailing list submissions to
        us...@open-mpi.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.open-mpi.org/mailman/listinfo.cgi/users
or, via email, send a message with subject or body 'help' to
        users-requ...@open-mpi.org

You can reach the person managing the list at
        users-ow...@open-mpi.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."

Today's Topics:

   1. OpenMPI 1.6.3 problem
      (Bryan, Clifton W ERDC-RDE-MSRC-MS Contractor)
   2. Re: OpenMPI 1.6.3 problem (Ralph Castain)
   3. Re: OpenMPI 1.6.3 problem (Jeff Squyres (jsquyres))

----------------------------------------------------------------------

Message: 1
List-Post: users@lists.open-mpi.org
Date: Tue, 24 Sep 2013 19:20:36 +0000
From: "Bryan, Clifton W ERDC-RDE-MSRC-MS Contractor" <clifton.w.br...@erdc.dren.mil>
To: "'us...@open-mpi.org'" <us...@open-mpi.org>
Subject: [OMPI users] OpenMPI 1.6.3 problem
Message-ID: <8cccc747fd74954ab8e26b1f2efba6e2078e7...@ms-ex2vks.erdc.dren.mil>
Content-Type: text/plain; charset="us-ascii"

Hi,

We are having problems with OpenMPI 1.6.3 - it gives the below error message
when trying to run:

$ mpirun -np 32 ./mpi_test.x
--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory. This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered. You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:              akutilm-0006.ors.hpc.mil
  Registerable memory:     131072 MiB
  Total memory:            258542 MiB

Your MPI job will continue, but may be behave poorly and/or hang.
--------------------------------------------------------------------------
akutilm-0006.ors.hpc.mil
akutilm-0006.ors.hpc.mil
[akutilm-0006.ors.hpc.mil:10970] 31 more processes have sent help message help-mpi-btl-openib.txt / reg mem limit low
[akutilm-0006.ors.hpc.mil:10970] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Openmpi 1.4.3 works fine.

Any help would be greatly appreciated.

Thanks,
Clif

------------------------------

Message: 2
List-Post: users@lists.open-mpi.org
Date: Tue, 24 Sep 2013 12:36:46 -0700
From: Ralph Castain <r...@open-mpi.org>
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] OpenMPI 1.6.3 problem
Message-ID: <b4dd6235-b7fd-42de-9d9d-d15d82460...@open-mpi.org>
Content-Type: text/plain; charset="windows-1252"

Just to be clear - are you saying the job fails to run? Or just that it
emits this warning (not error) and then runs to completion?

This is a warning we added at some point because jobs were hanging due to
exhausting registered memory, and people didn't know why. If you check out
the link, I believe we tell you how to turn off the warning if you are sure
your system is correctly configured.

On Sep 24, 2013, at 12:20 PM, "Bryan, Clifton W ERDC-RDE-MSRC-MS Contractor" <clifton.w.br...@erdc.dren.mil> wrote:

> Hi,
>
> We are having problems with OpenMPI 1.6.3 -
> it gives the below error message when trying to run:
>
> $ mpirun -np 32 ./mpi_test.x
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
>
>     http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
>   Local host:              akutilm-0006.ors.hpc.mil
>   Registerable memory:     131072 MiB
>   Total memory:            258542 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --------------------------------------------------------------------------
> akutilm-0006.ors.hpc.mil
> akutilm-0006.ors.hpc.mil
> [akutilm-0006.ors.hpc.mil:10970] 31 more processes have sent help message help-mpi-btl-openib.txt / reg mem limit low
> [akutilm-0006.ors.hpc.mil:10970] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Openmpi 1.4.3 works fine.
>
> Any help would be greatly appreciated.
>
> Thanks,
> Clif
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

------------------------------

Message: 3
List-Post: users@lists.open-mpi.org
Date: Tue, 24 Sep 2013 19:38:50 +0000
From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] OpenMPI 1.6.3 problem
Message-ID: <ef66bbeb19badc41ac8ccf5f684f07fc4f8c5...@xmb-rcd-x01.cisco.com>
Content-Type: text/plain; charset="Windows-1252"

Have you visited the URL that is cited? :-)

It talks all about the issue, and describes how to fix it. Let us know if
there's something unclear in that FAQ text.

On Sep 24, 2013, at 3:20 PM, "Bryan, Clifton W ERDC-RDE-MSRC-MS Contractor" <clifton.w.br...@erdc.dren.mil> wrote:

> Hi,
>
> We are having problems with OpenMPI 1.6.3 - it gives the below error message
> when trying to run:
>
> $ mpirun -np 32 ./mpi_test.x
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
>
>     http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
>   Local host:              akutilm-0006.ors.hpc.mil
>   Registerable memory:     131072 MiB
>   Total memory:            258542 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --------------------------------------------------------------------------
> akutilm-0006.ors.hpc.mil
> akutilm-0006.ors.hpc.mil
> [akutilm-0006.ors.hpc.mil:10970] 31 more processes have sent help message help-mpi-btl-openib.txt / reg mem limit low
> [akutilm-0006.ors.hpc.mil:10970] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Openmpi 1.4.3 works fine.
>
> Any help would be greatly appreciated.
>
> Thanks,
> Clif
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

------------------------------

Subject: Digest Footer

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

------------------------------

End of users Digest, Vol 2689, Issue 1
**************************************
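
For readers puzzling over the "Registerable memory: 131072 MiB" figure in the warning above, here is a rough sketch of the arithmetic, not a definitive diagnosis. It assumes the node uses a Mellanox mlx4 adapter, which the thread never states. The parameter names log_num_mtt and log_mtts_per_seg and the formula come from the linked FAQ item as I understand it, not from this thread. The 4 KiB page size, the log_mtts_per_seg value of 3, and the factor-of-two headroom are illustrative assumptions; the helper names are hypothetical.

```python
# Sketch only: assumes an mlx4-style driver where, per the Open MPI FAQ,
#   max_reg_mem = (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * PAGE_SIZE
# The values passed in below are hypothetical, not read from the poster's system.

MIB = 1024 * 1024


def registerable_mib(log_num_mtt: int, log_mtts_per_seg: int,
                     page_size: int = 4096) -> int:
    """Registerable memory in MiB implied by the two module parameters."""
    return (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size // MIB


def min_log_num_mtt(total_mib: int, log_mtts_per_seg: int,
                    page_size: int = 4096, factor: int = 2) -> int:
    """Smallest log_num_mtt whose limit covers factor * total_mib."""
    target = factor * total_mib
    n = 0
    while registerable_mib(n, log_mtts_per_seg, page_size) < target:
        n += 1
    return n


if __name__ == "__main__":
    # One parameter pair that reproduces the limit reported in the warning.
    print(registerable_mib(22, 3))       # 131072
    # Setting needed to cover twice the reported 258542 MiB of total memory.
    print(min_log_num_mtt(258542, 3))    # 24
    print(registerable_mib(24, 3))       # 524288
```

If those assumptions hold, the change would be made through the driver's module options and verified by rerunning the mpirun command; the FAQ remains the authoritative source for the exact procedure.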