Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-14 Thread Steve Kargl
On Wed, Jul 13, 2011 at 08:27:13AM -0400, Jeff Squyres wrote: > On Jul 12, 2011, at 1:37 PM, Steve Kargl wrote: > > > (many lines removed) > > checking prefix for function in .type... @ > > checking if .size is needed... yes > > checking if .align directive takes logarithmic value... no > > config

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-13 Thread Jeff Squyres
On Jul 12, 2011, at 3:26 PM, Steve Kargl wrote: > % /usr/local/ompi/bin/mpiexec -machinefile mf --mca btl self,tcp \ > --mca btl_base_verbose 30 ./z > > with mf containing > > node11 slots=1 (node11 contains a single bge0=168.192.0.11) > node16 slots=1 (node16 contains a single bge0=168.19

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-13 Thread Jeff Squyres
On Jul 12, 2011, at 1:37 PM, Steve Kargl wrote: > (many lines removed) > checking prefix for function in .type... @ > checking if .size is needed... yes > checking if .align directive takes logarithmic value... no > configure: error: No atomic primitives available for amd64-unknown-freebsd9.0 Hmm

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-12 Thread Steve Kargl
On Tue, Jul 12, 2011 at 11:03:42AM -0700, Steve Kargl wrote: > On Tue, Jul 12, 2011 at 10:37:14AM -0700, Steve Kargl wrote: > > On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote: > > > Sorry -- I got distracted all afternoon... > > > > > > In addition to what Ralph said (i.e., I'm not s

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-12 Thread Steve Kargl
On Tue, Jul 12, 2011 at 10:37:14AM -0700, Steve Kargl wrote: > On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote: > > Sorry -- I got distracted all afternoon... > > > > In addition to what Ralph said (i.e., I'm not sure if the CIDR > > notation stuff made it over to the v1.5 branch or n

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-12 Thread Steve Kargl
On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote: > Sorry -- I got distracted all afternoon... > > In addition to what Ralph said (i.e., I'm not sure if the CIDR > notation stuff made it over to the v1.5 branch or not, but it > is available from the nightly SVN trunk tarballs: > http:/

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
On Jul 8, 2011, at 7:34 PM, Steve Kargl wrote: >> We unfortunately don't have access to any BSD machines to test this >> on, ourselves. It works on other OS's, so I'm curious as to why it >> doesn't seem to work for you. :-( > > I can arrange access on the cluster in question. ;-) Actually, we

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote: > Sorry -- I got distracted all afternoon... No problem. We all have obligations that we prioritize. > In addition to what Ralph said (i.e., I'm not sure if the > CIDR notation stuff made it over to the v1.5 branch or not, > but it is

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
Sorry -- I got distracted all afternoon... In addition to what Ralph said (i.e., I'm not sure if the CIDR notation stuff made it over to the v1.5 branch or not, but it is available from the nightly SVN trunk tarballs: http://www.open-mpi.org/nightly/trunk/), here's a few points from other mails

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Ralph Castain
We've been moving to provide support for including values as CIDR notation instead of names - e.g., 192.168.0/16 instead of bge0 or bge1 - but I don't think that has been put into the 1.4 release series. If you need it now, you might try using the developer's trunk - I know it works there. On
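
A minimal sketch of what selecting by CIDR range rather than by interface name amounts to. It uses Python's standard `ipaddress` module, not anything from Open MPI itself. The two addresses are the `hpc` and `node10` entries from the /etc/hosts excerpt quoted elsewhere in this thread. The network is written out in full as `192.168.0.0/16` because `ipaddress` does not accept the abbreviated `192.168.0/16` form; the variable names are illustrative only.

```python
import ipaddress

# Cluster-private range, written in full CIDR form (assumption: this is
# the range meant by "192.168.0/16" in the message above).
private_net = ipaddress.ip_network("192.168.0.0/16")

# Host entries taken from the /etc/hosts excerpt in this thread.
hosts = {
    "hpc": "10.208.78.111",
    "node10": "192.168.0.10",
}

# True when the address falls inside the private range.
in_private = {
    name: ipaddress.ip_address(addr) in private_net
    for name, addr in hosts.items()
}

for name, matched in in_private.items():
    print(name, hosts[name], "matches" if matched else "does not match")
```

Matching on a range like this would pick the private interface on every node regardless of whether it is named bge0 or bge1 locally, which seems to be the motivation for the CIDR support described above.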

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 04:26:35PM -0400, Gus Correa wrote: > Steve Kargl wrote: > >On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > >>The easiest way to fix this is likely to use the btl_tcp_if_include > >>or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > >>which int

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Gus Correa
Steve Kargl wrote: On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: The easiest way to fix this is likely to use the btl_tcp_if_include or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly which interfaces to use: http://www.open-mpi.org/faq/?category=tcp#tcp-selecti

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 12:09:09PM -0700, Steve Kargl wrote: > On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > > > > The easiest way to fix this is likely to use the btl_tcp_if_include > > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > > which interfaces to use:

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > > The easiest way to fix this is likely to use the btl_tcp_if_include > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > which interfaces to use: > > http://www.open-mpi.org/faq/?category=tcp#tcp-selection > Pe
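
The suggestion quoted above could be tried along the following lines. This is only a sketch: it assembles and prints a command line without running it. It reuses the mpiexec path, the machinefile `mf`, and the program `./z` from the command shown elsewhere in this thread, and it assumes `bge0` is the interface on the cluster-private network. Where the extra `--mca` pair sits in the argument list is also an assumption.

```python
import shlex

# Assumption: bge0 is the interface that carries the private network.
iface = "bge0"

cmd = [
    "/usr/local/ompi/bin/mpiexec",
    "-machinefile", "mf",
    "--mca", "btl", "self,tcp",
    # Restrict the TCP BTL to the named interface.
    "--mca", "btl_tcp_if_include", iface,
    "./z",
]

# Print a shell-safe rendering of the command for copy and paste.
print(shlex.join(cmd))
```

Using `btl_tcp_if_exclude` instead would list the interfaces to avoid; the two parameters are normally treated as alternatives rather than combined.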

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > On Jul 8, 2011, at 1:31 PM, Steve Kargl wrote: > > > It seems that openmpi-1.4.4 compiled code is trying to use the > > wrong nic. My /etc/hosts file has > > > > 10.208.78.111 hpc.apl.washington.edu hpc > > 192.168.0.10

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
On Jul 8, 2011, at 1:31 PM, Steve Kargl wrote: > It seems that openmpi-1.4.4 compiled code is trying to use the > wrong nic. My /etc/hosts file has > > 10.208.78.111 hpc.apl.washington.edu hpc > 192.168.0.10 node10.cimu.org node10 n10 master > 192.168.0.11 node11.

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Thu, Jul 07, 2011 at 08:38:56PM -0400, Jeff Squyres wrote: > On Jul 5, 2011, at 4:24 PM, Steve Kargl wrote: > > On Tue, Jul 05, 2011 at 01:14:06PM -0700, Steve Kargl wrote: > >> I have an application that appears to function as I expect > >> when compiled with openmpi-1.4.2 on FreeBSD 9.0. But,

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-07 Thread Jeff Squyres
Are you able to run simple MPI applications with 1.4.3 or 1.4.4 on your OS? E.g., the "ring_c" program in the examples/ directory? This might be a good test to see if OMPI's TCP is working at all. Assuming that works... Have you tried attaching debuggers to see where your process is hanging?

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-05 Thread Steve Kargl
On Tue, Jul 05, 2011 at 01:14:06PM -0700, Steve Kargl wrote: > I have an application that appears to function as I expect > when compiled with openmpi-1.4.2 on FreeBSD 9.0. But, it > appears to hang during communication between nodes. What > follows is the long version. Argh I messed up. It sho

[OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-05 Thread Steve Kargl
I have an application that appears to function as I expect when compiled with openmpi-1.4.2 on FreeBSD 9.0. But, it appears to hang during communication between nodes. What follows is the long version. I configure 1.4.2 with ./configure --prefix=/usr/local/openmpi-1.4.2 \ --enable-mpirun-prefi