[OMPI users] How to override default hostfile to specify host

2011-10-28 Thread Saurabh T
Hi, If I use "orterun -H <host>" and <host> does not belong in the default hostfile ("etc/openmpi-default-hostfile"), openmpi gives an error. Is there an easy way to get the aforementioned command to work without specifying a different hostfile with <host> in it? Thank you.

[OMPI users] OpenMPI w valgrind: need to recompile?

2010-01-06 Thread Saurabh T
Hi, I am building libraries against OpenMPI, and then applications using those libraries. It was unclear from the FAQ at http://www.open-mpi.org/faq/?category=debugging#memchecker_how whether the libraries need to be recompiled and the application relinked using valgrind-enabled mpicc etc,

[OMPI users] Problems running 1.8.8 and compiling 1.10.1 on Redhat EL7

2015-11-06 Thread Saurabh T
Hi, On Redhat Enterprise Linux 7, I am facing the following problems. 1. With OpenMPI 1.8.8, everything builds, but the following error appears on running: orterun -np 2 hello_cxx hello_cxx: route/tc.c:973: rtnl_tc_register: Assertion `0' failed. hello_cxx: route/tc.c:973: rtnl_tc_register: Asse

Re: [OMPI users] Problems running 1.8.8 and compiling 1.10.1 on Redhat EL7

2015-11-06 Thread Saurabh T
> From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden]) > Date: 2015-11-06 18:02:42 > > Both of these seem to be issues with libnl, which is a dependent library > that Open MPI uses. Based on your email, I found this message and thread: https://www.open-mpi.org/community/lists/devel/2015/08/17812.p

[OMPI users] Propagate current shell's environment

2015-11-09 Thread Saurabh T
Hi, Is there any way with OpenMPI to propagate the current shell's environment to the parallel program? I am looking for an equivalent way to how MPICH handles environment variables (https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_How_do_I_pass_environment_variables_to_the_

Re: [OMPI users] Propagate current shell's environment

2015-11-09 Thread Saurabh T
I meant different from the current shell, not different for different processes, sorry. Also I am aware of -x but it's not the right solution in this case because (a) it's manual (b) it appears that anything set in bashrc that was unset in the shell would be set for the program which I do not wa

Re: [OMPI users] Propagate current shell's environment

2015-11-13 Thread Saurabh T
I'd appreciate a response, even a simple no if this is not possible. Thank you. saurabh > From: saur...@hotmail.com > To: us...@open-mpi.org > Subject: RE: Propagate current shell's environment > Date: Mon, 9 Nov 2015 11:45:07 -0500 > > I meant different from the current shell, not different for

[OMPI users] OpenMPI 1.10.1 crashes with file size limit <= 131072

2015-11-19 Thread Saurabh T
Here's what I find: > cd examples > make hello_cxx > ulimit -f 131073 > orterun -np 3 hello_cxx Hello, world! [Etc] > ulimit -f 131072 > orterun -np 3 hello_cxx

[OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-19 Thread Saurabh T
Hi, Sorry my previous email was garbled, sending it again. > cd examples > make hello_cxx > ulimit -f 131073 > orterun -np 3 hello_cxx Hello, world (etc) > ulimit -f 131072 > orterun -np 3 hello_cxx -- orterun noticed that

Re: [OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-19 Thread Saurabh T
An "strace" showed something related to shared memory use was causing the signal. Sticking btl = ^sm into the openmpi-mca-params.conf file fixed this issue. saurabh From: saur...@hotmail.com To: us...@open-mpi.org Subject: Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072 List-Post: us

Re: [OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-19 Thread Saurabh T
> Could you please provide a little more info regarding the environment you > are running under (which resource mgr or not, etc), how many nodes you had > in the allocation, etc? > There is no reason why something should behave that way. So it would help > if we could understand the setup.

Re: [OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-19 Thread Saurabh T
I apologize, I have the wrong lines from strace for the initial file there (of course). The file with fd = 11 which causes the problem is called shared_mem_pool.[host] and ftruncate(11, 134217736) is called on it. (This is exactly 1024 times the ulimit of 131072 which makes sense as the ulimit is
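
The arithmetic behind this observation can be checked with a small sketch. It assumes the shell reports `ulimit -f` in 1024-byte blocks (bash's default outside POSIX mode) and takes the ftruncate size quoted above at face value; the helper name `fits_under_limit` is hypothetical and not part of Open MPI.

```python
# Assumption: `ulimit -f` counts 1024-byte blocks (bash default, non-POSIX mode).
BLOCK_BYTES = 1024

# Size passed to ftruncate on shared_mem_pool.[host], as quoted from strace.
REQUEST_BYTES = 134217736


def fits_under_limit(request_bytes: int, ulimit_blocks: int) -> bool:
    """Return True if a file of request_bytes stays within the file size limit."""
    return request_bytes <= ulimit_blocks * BLOCK_BYTES


for blocks in (131072, 131073):
    limit_bytes = blocks * BLOCK_BYTES
    print(blocks, limit_bytes, fits_under_limit(REQUEST_BYTES, blocks))
```

Under these assumptions the requested size is 8 bytes above the 131072-block limit (134217728 bytes) and fits under the 131073-block limit, which would be consistent with the run succeeding at 131073 and receiving SIGXFSZ at 131072.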

Re: [OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-20 Thread Saurabh T
> For what it's worth, that's Open MPI creating a chunk of shared memory for use with on-server > communication. It shows up as a "file", but it's really shared memory. > You can disable sm and/or Vader, but your on-server message passing > performance will be significantly > lower. > Is

[OMPI users] Possible to exclude a hwloc_base_binding_policy?

2018-04-20 Thread Saurabh T
Hi, Switching to OpenMPI 3, I was getting error messages of the form "No objects of the specified type were found on at least one node: Type: NUMANode ... ORTE has lost communication with a remote daemon. ..." After some research, I found that hwloc_base_binding_policy (for np > 2) switched to

[OMPI users] Memory leak with pmix_finalize not being called

2018-05-04 Thread Saurabh T
This is with valgrind 3.0.1 on a CentOS 6 system. It appears pmix_finalize isn't called, and valgrind reports leaks despite the provided suppression file being used. A cursory check reveals MPI_Finalize calls pmix_rte_finalize which decrements pmix_initialized to 0 before calling pmix_cl

[OMPI users] Re: Avoiding localhost as rank 0 with openmpi-default-hostfile

2025-02-27 Thread Saurabh T
by machine order in hostfile? Thanks. From: Saurabh T Sent: Monday, November 6, 2023 10:44 AM To: Openmpi Subject: Avoiding localhost as rank 0 with openmpi-default-hostfile My openmpi-default-hostfile has host1 slots=4 host2 slots=4 host0 slots=4 and my openmpi

[OMPI users] Avoiding localhost as rank 0 with openmpi-default-hostfile

2023-11-06 Thread Saurabh T via users
My openmpi-default-hostfile has host1 slots=4 host2 slots=4 host0 slots=4 and my openmpi-mca-params.conf has rmaps_base_mapping_policy = node rmaps_base_oversubscribe = 1 If I invoke orterun -np 3 on host0, it puts rank0 on host0, rank1 on host1, rank2 on host2. I want it to put rank0 on host1,