+1 to everything so far. Also, look in your shell startup files (e.g., $HOME/.bashrc) to see whether parts of them are skipped for non-interactive logins. A common mistake we see is a shell startup file like this:

----
# ... do setup for all logins ...

if (this is a non-interactive login)
    exit 0

# ... do setup for interactive logins, including Open MPI setup ...
----

> On May 16, 2016, at 10:34 AM, David Shrader <dshra...@lanl.gov> wrote:
>
> Hey Rob,
>
> I don't know if this is what is going on, but in general, when a package is
> installed via a distro's package management system, it ends up in system
> locations such as /usr/bin and /usr/lib that are automatically searched when
> looking for executables and libraries. So, it isn't necessarily that the
> package maintainers did much of anything different when putting together the
> package; instead, they may have put files in locations that are more
> accessible from a system-tool point of view. For example, the runtime linker
> knows to search in several system-defined directories such as /usr/lib. This
> might explain why everything worked after installing openmpi-bin: the
> binaries and libraries all ended up in system locations that are
> automatically a part of the environment on the remote node, so remote
> execution worked as it could find everything.
>
> Thanks,
>
> David
>
>
> On 05/14/2016 05:37 AM, Rob Malpass wrote:
>> Hi all
>>
>> I posted about a fortnight ago to this list as I was having some trouble
>> getting my nodes to be controlled by my master node. Perceived wisdom at
>> the time was to compile with the --enable-orterun-prefix-by-default.
>>
>> For some time I'd been getting cannot open libopen-rte.so.7 which points to
>> a problem with LD_LIBRARY_PATH. I had been able to run it on nodes 3 and 4
>> even though (from headnode) if I do
>>
>> ssh node4 'echo $LD_LIBRARY_PATH'
>>
>> returns a blank line. However - as I say it's working on nodes 3 and 4.
>>
>> I had been hacking for ages on nodes 1 and 2 getting the same error but
>> still with LD_LIBRARY_PATH apparently not set for an interactive login.
>>
>> Almost in desperation, I cheated:
>>
>> sudo apt-get install openmpi-bin
>>
>> and hey presto. I can now do (from head node)
>>
>> mpirun -H node2,node3,node4 -n 10 foo
>>
>> and it works fine. So clearly apt-get install has set something that I'd
>> not done (and it's seemingly not LD_LIBRARY_PATH) as ssh node2 'echo
>> $LD_LIBRARY_PATH' still returns a blank line.
>>
>> Can anyone tell me what might be in the install script so I can get a clue?
>>
>> Thanks
>>
>>
>> _______________________________________________
>> users mailing list
>>
>> us...@open-mpi.org
>>
>> Subscription:
>> https://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/05/29201.php
>
> --
> David Shrader
> HPC-ENV High Performance Computer Systems
> Los Alamos National Lab
> Email: dshrader <at> lanl.gov
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/05/29209.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
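
P.S. For reference, here is a minimal sketch of one way to restructure the kind of startup file described at the top of this message, so that the Open MPI setup runs for every login and only the truly interactive parts are guarded. The OMPI_PREFIX variable and its default of $HOME/openmpi are assumptions for illustration (not taken from Rob's setup), as is the example alias; substitute your actual install prefix.

```shell
# Hypothetical install prefix (an assumption); point this at your real Open MPI install.
OMPI_PREFIX="${OMPI_PREFIX:-$HOME/openmpi}"

# ... setup for ALL logins, interactive or not ...
# Prepend the Open MPI directories; the ${VAR:+...} form avoids a stray
# leading or trailing colon when the variable was previously empty or unset.
export PATH="$OMPI_PREFIX/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# ... setup for interactive logins only ...
# $- contains "i" when the shell is interactive.
case $- in
    *i*)
        alias ll='ls -l'   # example interactive-only setting
        ;;
esac
```

Assuming the remote shell reads this file for non-interactive ssh commands, the check from Rob's message (ssh node4 'echo $LD_LIBRARY_PATH') should then print the library directory rather than a blank line. Guarding only the interactive part with a case block, instead of exiting early, keeps the environment setup from being skipped by accident.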