On Mon, Oct 27, 2014 at 02:15:45PM +0000, michael.rach...@dlr.de wrote:
> Dear Gilles,
>
> This is the system response on the login node of cluster5:
>
> cluster5:~/dat> mpirun -np 1 df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda31      228G  5.6G  211G   3% /
> udev             32G  232K   32G   1% /dev
> tmpfs            32G     0   32G   0% /dev/shm
> /dev/sda11      291M   39M  237M  15% /boot
> /dev/gpfs10     495T  280T  216T  57% /gpfs10
> /dev/loop1      3.2G  3.2G     0 100% /media
> cluster5:~/dat> mpirun -np 1 df -hi
> Filesystem     Inodes IUsed IFree IUse% Mounted on
> /dev/sda31        15M  253K   15M    2% /
> udev                0     0     0     - /dev
> tmpfs            7.9M     3  7.9M    1% /dev/shm
> /dev/sda11        76K    41   76K    1% /boot
> /dev/gpfs10      128M   67M   62M   53% /gpfs10
> /dev/loop1          0     0     0     - /media
> cluster5:~/dat>
>
>
> And this the system response on the compute node of cluster5:
>
> rachner@r5i5n13:~> mpirun -np 1 df -h
> Filesystem      Size  Used Avail Use% Mounted on
> tmpfs            63G  1.4G   62G   3% /
> udev             63G   92K   63G   1% /dev
> tmpfs            63G     0   63G   0% /dev/shm
> tmpfs           150M   12M  139M   8% /tmp

This is the problem right here. On the compute node, /tmp can only back a
total of 139M of shared memory, while /dev/shm can back up to 63G, so using
/dev/shm will solve your problem.

Try adding

  -mca shmem_mmap_relocate_backing_file true

to your mpirun line, or add

  shmem_mmap_relocate_backing_file = true

to your installation's <openmpi_prefix>/etc/openmpi-mca-params.conf

-Nathan
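
P.S. If you want to check other nodes for the same condition before
launching a job, the df comparison above can be scripted. This is only a
minimal sketch and not part of Open MPI: the function name
pick_backing_dir and the 1 GiB requirement are hypothetical, and it
assumes free space as reported by the filesystem is a reasonable proxy for
how much shared memory a directory can back.

```python
import shutil


def pick_backing_dir(candidates, required_bytes, usage=shutil.disk_usage):
    """Return the first candidate directory with enough free space.

    candidates     -- directories to try, in order of preference
    required_bytes -- shared memory the job is expected to need (assumed)
    usage          -- callable returning an object with a .free attribute;
                      defaults to shutil.disk_usage, replaceable for testing
    Returns None if no candidate has enough free space.
    """
    for path in candidates:
        try:
            free = usage(path).free
        except OSError:
            # Directory is missing or not accessible on this node.
            continue
        if free >= required_bytes:
            return path
    return None


if __name__ == "__main__":
    needed = 1 * 1024**3  # hypothetical requirement: 1 GiB
    print(pick_backing_dir(["/tmp", "/dev/shm"], needed))
```

On the compute node shown above, /tmp (139M available) would be skipped
for anything larger than that and /dev/shm would be picked instead.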