Thanks very much, exactly what I wanted to hear. How big is /tmp?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway. On our cluster, all the compute nodes are completely diskless. The root file system, including /tmp, resides in memory (ramdisk). OpenMPI puts its session directories there. All our jobs run through a batch system (torque). At the conclusion of each batch job, an epilogue process runs that removes all files belonging to the owner of the current batch job from /tmp (and also looks for and kills orphan processes belonging to the user). This epilogue had to be written by our systems staff.

I believe this is a fairly common configuration for diskless clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
> Thanks for the help. A couple of follow-up questions, maybe this starts to go outside OpenMPI:
>
> What's wrong with using /dev/shm? I think you said earlier in this thread that this was not a safe place.
>
> If the NFS-mount point is moved from /tmp to /work, would a /tmp magically appear in the filesystem for a stateless node? How big would it be, given that there is no local disk, right? That may be something I have to ask the vendor, which I've tried, but they don't quite seem to get the question.
>
> Thanks
>
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Thursday, November 03, 2011 5:22 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>
>
> On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:
>
>> I might be missing something here. Is there a side-effect or performance loss if you don't use the sm btl? Why would it exist if there is a wholly equivalent alternative? What happens to traffic that is intended for another process on the same node?
>
> There is a definite performance impact, and we wouldn't recommend doing what Eugene suggested if you care about performance.
>
> The correct solution here is to get your sys admin to make /tmp local.
> Making /tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux world - it should never be done, for the reasons stated by Jeff.
>
>
>>
>> Thanks
>>
>>
>> -----Original Message-----
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 1:23 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>>
>> Right. Actually "--mca btl ^sm". (Was missing "btl".)
>>
>> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session file on /tmp, which is NFS-mounted and thus not a good choice.
>>>
>>> Are you suggesting something like --mca ^sm?
>>>
>>>
>>> -----Original Message-----
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Eugene Loh
>>> Sent: Thursday, November 03, 2011 12:54 PM
>>> To: us...@open-mpi.org
>>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>>>
>>> I've not been following closely. Why must one use shared-memory
>>> communications? How about using other BTLs in a "loopback" fashion?
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Best regards,

David Turner
User Services Group        email: dptur...@lbl.gov
NERSC Division             phone: (510) 486-4027
Lawrence Berkeley Lab      fax:   (510) 486-4316
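
For anyone wanting to try the epilogue approach David describes, here is a rough Python sketch of the two pieces: removing a user's leftover files from a scratch directory and finding that user's remaining processes. This is not the NERSC epilogue, which is not shown in this thread. The function names `remove_user_files` and `find_user_pids` are made up for illustration, the process lookup assumes a Linux-style /proc, and how the batch system hands the job owner to the epilogue is site-specific and not covered here.

```python
import os
import stat


def remove_user_files(root, uid):
    """Delete entries under root that are owned by uid; return removed paths.

    Hypothetical helper: a sketch of the cleanup step, not a real epilogue.
    """
    removed = []
    # Walk bottom-up so a directory is visited after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are never followed
                if st.st_uid != uid:
                    continue
                if stat.S_ISDIR(st.st_mode):
                    os.rmdir(path)  # succeeds only if the directory is empty
                else:
                    os.unlink(path)
                removed.append(path)
            except OSError:
                # Entry vanished, directory still holds other users' files,
                # or we lack permission: skip it and carry on.
                continue
    return removed


def find_user_pids(uid, proc_root="/proc"):
    """Return PIDs whose /proc entry is owned by uid (assumes Linux /proc)."""
    pids = []
    if not os.path.isdir(proc_root):
        return pids
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            if os.stat(os.path.join(proc_root, entry)).st_uid == uid:
                pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return pids
```

An epilogue along these lines would call `remove_user_files("/tmp", uid)` for the job owner and then decide which of the PIDs from `find_user_pids(uid)` are really orphans before signalling them. The sketch deliberately stops short of killing anything, since telling an orphan apart from a process of another job owned by the same user depends on the batch system. On the "How big is /tmp?" question, `shutil.disk_usage("/tmp").total` reports the size in bytes of whatever filesystem backs /tmp on a given node.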