Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 7:00 PM, Vineet Rawat wrote: > We actually do ship the /share and /etc directories. We set > OPAL_PREFIX to a sub-directory of our installation and make sure those things > are in our PATH/LD_LIBRARY_PATH. > > I can try adding the additional shared libs but it doesn't sound

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:40 PM, Jeff Squyres (jsquyres) wrote: > On Jun 9, 2014, at 6:36 PM, Vineet Rawat wrote: > > > No, we only included what seemed necessary (from ldd output and > experience on other clusters). The only things in my /lib/openmpi > are libompi_dbg_msgq*. Is that what you're

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:31 PM, Jeff Squyres (jsquyres) wrote: > On Jun 9, 2014, at 5:41 PM, Vineet Rawat wrote: > > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug > information is very limited as the cluster is at a remote customer site. > They have a network card wi

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Ralph Castain
There is one new "feature" in 1.8 - it now checks to see if the version on the backend matches the version on the frontend. In other words, mpirun checks to see if the orted connecting to it is from the same version - if not, the orted will die. Shouldn't segfault, though - just abort. You cou

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 6:36 PM, Vineet Rawat wrote: > No, we only included what seemed necessary (from ldd output and experience on > other clusters). The only things in my /lib/openmpi are > libompi_dbg_msgq*. Is that what you're referring to? In /lib for > 1.8.1 (ignoring the VampirTrace libs)

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:21 PM, Ralph Castain wrote: > > On Jun 9, 2014, at 2:41 PM, Vineet Rawat wrote: > > Hi, > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug > information is very limited as the cluster is at a remote customer site. > They have a network card with

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 5:41 PM, Vineet Rawat wrote: > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug > information is very limited as the cluster is at a remote customer site. They > have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E > PCIe Ethern

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Ralph Castain
On Jun 9, 2014, at 2:41 PM, Vineet Rawat wrote: > Hi, > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug > information is very limited as the cluster is at a remote customer site. They > have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E > P

[OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
Hi, We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. They have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E PCIe Ethernet NIC) and it seems capable of using the usNIC BTL. I'm