I think we bumped up a default value in Open MPI 1.8.5.  To go back to the old 
64Mbyte value try running with:

--mca mpool_sm_min_size 67108864

Rolf

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Aurélien Bouteiller
Sent: Tuesday, May 26, 2015 10:10 AM
To: Open MPI Users
Subject: Re: [OMPI users] Problems running linpack benchmark on old Sunfire 
opteron nodes

* PGP Signed by an unknown key
You can also change the location of tmp files with the following mca option:
-mca orte_tmpdir_base /some/place

ompi_info --param all all -l 9 | grep tmp
                MCA orte: parameter "orte_tmpdir_base" (current value: "", data 
source: default, level: 9 dev/all, type: string)
                MCA orte: parameter "orte_local_tmpdir_base" (current value: 
"", data source: default, level: 9 dev/all, type: string)
                MCA orte: parameter "orte_remote_tmpdir_base" (current value: 
"", data source: default, level: 9 dev/all, type: string)

--
Aurélien Bouteiller ~~ https://icl.cs.utk.edu/~bouteill/

Le 23 mai 2015 à 03:55, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com<mailto:gilles.gouaillar...@gmail.com>> a écrit :

Bill,

the root cause is likely there is not enough free space in /tmp.

the simplest, but slowest, option is to run mpirun --mac btl tcp ...
if you cannot make enough space under /tmp (maybe you run diskless)
there are some options to create these kind of files under /dev/shm

Cheers,

Gilles


On Saturday, May 23, 2015, Lane, William 
<william.l...@cshs.org<mailto:william.l...@cshs.org>> wrote:
I've compiled the linpack benchmark using openMPI 1.8.5 libraries
and include files on CentOS 6.4.

I've tested the binary on the one Intel node (some
sort of 4-core Xeon) and it runs, but when I try to run it on any of
the old Sunfire opteron compute nodes it appears to hang (although
top indicates CPU and memory usage) and eventually terminates
by itself. I'm also getting the following openMPI error messages/warnings:

mpirun -np 16 --report-bindings --hostfile hostfile --prefix 
/hpc/apps/mpi/openmpi/1.8.5-dev --mca btl_tcp_if_include eth0 xhpl

[cscld1-0-6:24370] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-3:24734] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-7:25152] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-4:18079] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-8:21443] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-2:19704] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-5:13481] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1-0-0:21884] create_and_attach: unable to create shared memory BTL 
coordinating structure :: size 134217728
[cscld1:24240] 7 more processes have sent help message help-opal-shmem-mmap.txt 
/ target full

Note these errors also occur when I try to run the linpack benchmark on a single
node as well.

Does anyone know what's going on here? Google came up w/nothing and I have no
idea what a BTL coordinating structure is.

-Bill L.
IMPORTANT WARNING: This message is intended for the use of the person or entity 
to which it is addressed and may contain information that is privileged and 
confidential, the disclosure of which is governed by applicable law. If the 
reader of this message is not the intended recipient, or the employee or agent 
responsible for delivering it to the intended recipient, you are hereby 
notified that any dissemination, distribution or copying of this information is 
strictly prohibited. Thank you for your cooperation.
_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/05/26907.php

* Unknown Key
* 0xBF250A1F

* PGP Unprotected
* text/plain body


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

Reply via email to