How can I fix the error if it comes from all processes opening their backing files for mmap on NFS, as you said?
Vincent

On Thu, Oct 23, 2014 at 10:35 PM, Joshua Ladd <[email protected]> wrote:

> It's not coming from OSHMEM but from the OPAL "shmem" framework. You are
> going to get terrible performance - possibly slowing to a crawl having all
> processes open their backing files for mmap on NFS. I think that's the
> error that he's getting.
>
>
> Josh
>
> On Thu, Oct 23, 2014 at 6:06 AM, Vinson Leung <[email protected]>
> wrote:
>
>> Hi, Thanks for your reply:)
>> I really am running an MPI program (compiled with OpenMPI and run with
>> "mpirun -n 8 ......"). My OpenMPI version is 1.8.3 and my program is
>> Gromacs. BTW, what is OSHMEM ?
>>
>> Best
>> Vincent
>>
>> On Thu, Oct 23, 2014 at 12:21 PM, Ralph Castain <[email protected]> wrote:
>>
>>> From your error message, I gather you are not running an MPI program,
>>> but rather an OSHMEM one? Otherwise, I find the message strange as it only
>>> would be emitted from an OSHMEM program.
>>>
>>> What version of OMPI are you trying to use?
>>>
>>> On Oct 22, 2014, at 7:12 PM, Vinson Leung <[email protected]>
>>> wrote:
>>>
>>> Thanks for your reply:)
>>> Following your advice, I tried setting TMPDIR to /var/tmp and /dev/shm,
>>> and even reset it to /tmp (I got the system permission), but the problem
>>> still occurs (CPU utilization still lower than 20%). I have no idea why,
>>> and I am ready to give up OpenMPI and use another MPI library instead.
>>>
>>> --------Old Message-------------
>>>
>>> Date: Tue, 21 Oct 2014 22:21:31 -0400
>>> From: Brock Palen <[email protected]>
>>> To: Open MPI Users <[email protected]>
>>> Subject: Re: [OMPI users] low CPU utilization with OpenMPI
>>> Message-ID: <[email protected]>
>>> Content-Type: text/plain; charset=us-ascii
>>>
>>> Doing special files on NFS can be weird, try the other /tmp/ locations:
>>>
>>> /var/tmp/
>>> /dev/shm (ram disk careful!)
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> CAEN Advanced Computing
>>> XSEDE Campus Champion
>>> [email protected]
>>> (734)936-1985
>>>
>>>
>>>
>>> > On Oct 21, 2014, at 10:18 PM, Vinson Leung <[email protected]>
>>> > wrote:
>>> >
>>> > For permission reasons (OpenMPI cannot write temporary files to the
>>> > default /tmp directory), I changed TMPDIR to my local directory
>>> > (export TMPDIR=/home/user/tmp ) and then the MPI program can run. But
>>> > the CPU utilization is very low, under 20% (8 MPI ranks running on an
>>> > Intel Xeon 8-core CPU).
>>> >
>>> > And I also got some messages when I run with OpenMPI:
>>> > [cn3:28072] 9 more processes have sent help message
>>> > help-opal-shmem-mmap.txt / mmap on nfs
>>> > [cn3:28072] Set MCA parameter "orte_base_help_aggregate" to 0 to see
>>> > all help / error messages
>>> >
>>> > Any idea?
>>> > Thanks
>>> >
>>> > Vincent
>>> > _______________________________________________
>>> > users mailing list
>>> > [email protected]
>>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> > Link to this post:
>>> > http://www.open-mpi.org/community/lists/users/2014/10/25548.php
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2014/10/25555.php
>>>
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2014/10/25556.php
>>>
>>
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/10/25558.php
>>
>
>
> _______________________________________________
> users mailing list
> [email protected]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/10/25560.php
>
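
Since the thread turns on whether the directory TMPDIR points at really lives on NFS, here is a minimal sketch of one way to check that on a Linux node. It is not part of Open MPI. The helper names (filesystem_type, is_on_nfs) are hypothetical, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, ...). It does not resolve symlinks, so a symlinked directory may be reported under the wrong mount.

```python
import os

# Filesystem type names treated as NFS (an assumption; extend if needed).
NFS_TYPES = {"nfs", "nfs4"}


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format, one mount per line:
    device mount_point fs_type options dump pass
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so /home does not match /homex.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # Keep the longest matching mount point; on a tie, the later
            # entry wins because it was mounted on top of the earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path, mounts_text):
    """True if `path` sits on a mount whose type is listed in NFS_TYPES."""
    return filesystem_type(path, mounts_text) in NFS_TYPES
```

On a compute node, one would read /proc/mounts into a string and call is_on_nfs(os.environ.get("TMPDIR", "/tmp"), text). If it returns True for the directory in use, that would be consistent with the "mmap on nfs" help message quoted above.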
