Re: [OMPI users] IB to some nodes but TCP for others

2015-07-01 Thread Tim Miller
is still interested in testing this and, if so, try it out. Thanks, Tim On Tue, Jun 16, 2015 at 7:15 PM, Jeff Squyres (jsquyres) wrote: > Do you have different IB subnet IDs? That would be the only way for Open > MPI to tell the two IB subnets apart. > > > > > On Jun 16,

[OMPI users] IB to some nodes but TCP for others

2015-06-16 Thread Tim Miller
Hi All, We have a set of nodes which are all connected via InfiniBand, but not all are mutually connected. For example, nodes 1-32 are connected to IB switch A and 33-64 are connected to switch B, but there is no IB connection between switches A and B. However, all nodes are mutually routable over TCP

Re: [OMPI users] OPENIB unknown transport errors

2014-06-12 Thread Tim Miller
On Sat, Jun 7, 2014 at 2:21 AM, Mike Dubman wrote: > could you please attach output of "ibv_devinfo -v" and "ofed_info -s" > Thx > > > On Sat, Jun 7, 2014 at 12:53 AM, Tim Miller wrote: > >> Hi Josh, >> >> I asked one of our more advance

Re: [OMPI users] OPENIB unknown transport errors

2014-06-06 Thread Tim Miller
5, 2014 at 7:32 PM, Tim Miller wrote: > Hi Josh, > > Thanks for attempting to sort this out. In answer to your questions: > > 1. Node allocation is done by TORQUE, however we don't use the TM API to > launch jobs (long story). Instead, we just pass a hostfile to mpirun,

Re: [OMPI users] OPENIB unknown transport errors

2014-06-05 Thread Tim Miller
x4_0:1" > (assuming you have a ConnectX-3 HCA and port 1 is configured to run over > IB.) > > Josh > > > On Wed, Jun 4, 2014 at 12:47 PM, Tim Miller wrote: > >> Hi, >> >> I'd like to revive this thread, since I am still periodically getting >

Re: [OMPI users] OPENIB unknown transport errors

2014-06-04 Thread Tim Miller
"ibstat" on each host: >> >> 1. Make sure the adapters are alive and active. >> >> 2. Look at the Link Layer settings for host w34. Does it match host w4's? >> >> >> Josh >> >> >> On Fri, May 9, 2014 at 1:18 PM, Tim Miller

Re: [OMPI users] OPENIB unknown transport errors

2014-05-09 Thread Tim Miller
On Fri, May 9, 2014 at 6:26 PM, Joshua Ladd wrote: > Hi, Tim > > Run "ibstat" on each host: > > 1. Make sure the adapters are alive and active. > > 2. Look at the Link Layer settings for host w34. Does it match host w4's? > > > Josh > > > On Fr

[OMPI users] OPENIB unknown transport errors

2014-05-09 Thread Tim Miller
Hi All, We're using OpenMPI 1.7.3 with Mellanox ConnectX InfiniBand adapters, and periodically our jobs abort at start-up with the following error: === Open MPI detected two different OpenFabrics transport types in the same Infiniband network. Such mixed network trasport configuration is not supp

Re: [OMPI users] MPI_Comm_spawn and exported variables

2013-12-19 Thread Tim Miller
Hi Ralph, That's correct. All of the original processes see the -x values, but spawned ones do not. Regards, Tim On Thu, Dec 19, 2013 at 6:09 PM, Ralph Castain wrote: > > On Dec 19, 2013, at 2:57 PM, Tim Miller wrote: > > > Hi All, > > > > I have a questi

[OMPI users] MPI_Comm_spawn and exported variables

2013-12-19 Thread Tim Miller
Hi All, I have a question similar (but not identical) to the one asked by Tom Fogel a week or so back... I have a code that uses MPI_Comm_spawn to launch different processes. The executables for these use libraries in non-standard locations, so what I've done is add the directories containing the

Re: [OMPI users] OpenMPI fails to run with -np larger than 10

2012-04-14 Thread Tim Miller
This may or may not be related, but I've had similar issues on RHEL 6.x and clones when using the SSH job launcher and running more than 10 processes per node. It sounds like you're only distributing 6 processes per node, so it doesn't sound like your problem, but you might want to check your hostf

Re: [OMPI users] Problems compiling OpenMPI 1.4 with PGI 9.0-3

2010-01-07 Thread Tim Miller
CPPFLAGS) should make it work for you. > > > On Jan 7, 2010, at 6:17 AM, Ake Sandgren wrote: > > > On Thu, 2010-01-07 at 11:57 +0100, Peter Kjellstrom wrote: > > > On Wednesday 06 January 2010, Tim Miller wrote: > > > > Hi All, > > > > > >

[OMPI users] Problems compiling OpenMPI 1.4 with PGI 9.0-3

2010-01-06 Thread Tim Miller
All Rights Reserved. Copyright 2000-2009, STMicroelectronics, Inc. All Rights Reserved. I'm not sure what's wrong here as other people have reported being able to build OpenMPI with PGI 9. Does anyone have any ideas? Thanks, Tim Miller

Re: [OMPI users] Problem with repeatedly spawning a few processes

2009-08-31 Thread Tim Miller
't know if the fix propagated to the 1.3 branch. > > > > On Aug 26, 2009, at 3:40 PM, Tim Miller wrote: > > Hello Everyone, >> >> I have a problem that I can't seem to figure out from searching the >> mailing list archive. I have a code that repeated

[OMPI users] Problem with repeatedly spawning a few processes

2009-08-26 Thread Tim Miller
Hello Everyone, I have a problem that I can't seem to figure out from searching the mailing list archive. I have a code that repeatedly spawns (via MPI_COMM_SPAWN) a group of 8 processes and then waits for them to finish. The problem is that OpenMPI (I've tried 1.3.1 and 1.3.3) opens a pipe each t