You are right, Ralph. There is no surprise behavior. I had forgotten that I had been testing --mca orte_tmpdir_base /dev/shm to see if it worked (and obviously it doesn't). Before that, without any MCA options, OpenMPI had tried /tmp, and gave me the warning about /tmp being NFS mounted, and so I had been exploring options.
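A rough way to see which location would be picked up, and what kind of filesystem it sits on, is sketched below in Python. The lookup order is the one Ralph lists in the quoted message further down; the function names, the /tmp fallback, the set of filesystem types counted as network mounts, and the use of /proc/mounts (Linux only) are assumptions made for illustration, not Open MPI's actual implementation.

```python
import os

# Order taken from Ralph's message. The MCA parameter is passed in by the
# caller because this sketch does not parse the mpirun command line.
ENV_ORDER = ["OMPI_PREFIX_ENV", "TMPDIR", "TEMP", "TMP"]

# Assumption: filesystem types treated as "network mounted".
NETWORK_FS = {"nfs", "nfs4", "lustre", "gpfs", "panfs", "cifs"}


def pick_tmpdir_base(mca_value=None, environ=None):
    """Return (source, path) for the first candidate that is set."""
    environ = os.environ if environ is None else environ
    if mca_value:
        return "orte_tmpdir_base", mca_value
    for name in ENV_ORDER:
        value = environ.get(name)
        if value:
            return name, value
    # Assumption: fall back to /tmp when nothing is set.
    return "default", "/tmp"


def fs_type(path, mounts_text):
    """Filesystem type of the longest mount point that contains path."""
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > best_len:
            best_len, best_type = len(mount_point), fstype
    return best_type


def report(mca_value=None):
    """Print the chosen directory and flag the problems raised in this thread."""
    source, path = pick_tmpdir_base(mca_value)
    real = os.path.realpath(path)
    with open("/proc/mounts") as fh:
        fstype = fs_type(real, fh.read())
    print(f"{source} -> {path} (filesystem: {fstype})")
    if real == "/dev/shm" or real.startswith("/dev/shm/"):
        print("  under /dev/shm: advised against in this thread")
    if fstype in NETWORK_FS:
        print("  network mounted: not recommended for performance reasons")
    if not os.access(real, os.W_OK):
        print("  not writable by this user")


if __name__ == "__main__":
    report()
```

Run on a compute node with no arguments, it reports what the environment alone would select; calling report("/some/local/dir") mimics setting orte_tmpdir_base to that (hypothetical) path.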
I accept your point - I need "a good local directory - anything you have permission to write in will work fine". How would one do this on a stateless node? And can I beat the vendor over the head for not knowing how to set up the node image so that OpenMPI could function properly?

Thanks

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Thursday, November 03, 2011 11:33 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Shared-memory problems

I'm afraid this isn't correct. You definitely don't want the session directory in /dev/shm as this will almost always cause problems.

We look thru a progression of envars to find where to put the session directory:

1. the MCA param orte_tmpdir_base
2. the envar OMPI_PREFIX_ENV
3. the envar TMPDIR
4. the envar TEMP
5. the envar TMP

Check all those to see if one is set to /dev/shm. If so, you have a problem to resolve.

For performance reasons, you probably don't want the session directory sitting on a network mounted location. What you need is a good local directory - anything you have permission to write in will work fine. Just set one of the above to point to it.

On Nov 3, 2011, at 10:04 AM, Durga Choudhury wrote:

> Since /tmp is mounted across a network and /dev/shm is (always) local,
> /dev/shm seems to be the right place for shared memory transactions.
> If you create temporary files using mktemp is it being created in
> /dev/shm or /tmp?
>
>
> On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu <bcoste...@gmail.com> wrote:
>> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L <edwin.l.blo...@lmco.com>
>> wrote:
>>> - /dev/shm is 12 GB and has 755 permissions
>>> ...
>>> % ls -l output:
>>>
>>> drwxr-xr-x 2 root root 40 Oct 28 09:14 shm
>>
>> This is your problem: it should be something like drwxrwxrwt. It might
>> depend on the distribution, f.e. the following show this to be a bug:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=533897
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>>
>> and surely you can find some more on the subject with your favorite
>> search engine. Another source could be a paranoid sysadmin who has
>> changed the default (most likely correct) setting the distribution
>> came with - not only OpenMPI but any application using shmem would be
>> affected..
>>
>> Cheers,
>> Bogdan
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users