have a user whose code, at scale, reliably dies with the errors (new hosts each
time):
For this code we have been using:
-mca btl_openib_receive_queues X,4096,128:X,12288,128:X,65536,12
Without that option it reliably dies with an out-of-memory message.
Note this code runs fine at the same scale
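
To make the receive_queues value above easier to read: it is a colon-separated list of queues, and each entry starts with a type letter (X here, i.e. XRC) followed by the buffer size in bytes and the number of buffers. The sketch below only splits the string and multiplies the two numbers. The helper name parse_receive_queues is made up for illustration, any optional trailing fields are ignored, and the totals are plain arithmetic on the option string, not a measurement of what the openib BTL actually allocates.

```python
def parse_receive_queues(spec):
    """Split a btl_openib_receive_queues value into one dict per queue.

    Hypothetical helper for illustration only: it reads the first three
    comma-separated fields (type, buffer size, buffer count) and ignores
    any optional fields that may follow.
    """
    queues = []
    for entry in spec.split(":"):
        fields = entry.split(",")
        size = int(fields[1])
        count = int(fields[2])
        queues.append({
            "type": fields[0],
            "buffer_bytes": size,
            "num_buffers": count,
            "total_bytes": size * count,  # plain product, not measured usage
        })
    return queues


if __name__ == "__main__":
    spec = "X,4096,128:X,12288,128:X,65536,12"
    for q in parse_receive_queues(spec):
        print(q["type"], q["buffer_bytes"], q["num_buffers"], q["total_bytes"])
```

For the value quoted above this gives 524288, 1572864 and 786432 bytes for the three queues respectively.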
>>> You sound like our vendors, "what is your app"
>>
>> ;-) I used to be one.
>>
>> Ideally OMPI should switch between MXM/RC/XRC internally in the
>> transport layer. Unfortunately, we don't have such smart selection logic.
>> Hopefully IB vendors will fix this some day.
>
> I actua
I suspect the problem is that the rsh/ssh launcher is attempting to use a tree
pattern for launching the apps - i.e., mpirun launches a daemon on the first
couple of nodes, and then those daemons launch daemons on the next level. If
rsh/ssh isn't supported on those backend nodes, then this won't
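
A rough sketch of the tree launch idea described above, to show why the backend nodes need working rsh/ssh. The fan-out of 2, the host names and both function names are assumptions chosen for illustration; the real launch tree used by Open MPI may be shaped differently.

```python
def launch_parents(hosts, fanout=2):
    """Map each host to the host that would start its daemon.

    hosts[0] is where mpirun runs. This is a simple k-ary tree picked
    for illustration, not necessarily the tree Open MPI builds.
    """
    parents = {}
    for i, host in enumerate(hosts):
        if i == 0:
            continue  # the mpirun host is not launched by anyone
        parents[host] = hosts[(i - 1) // fanout]
    return parents


def hosts_needing_ssh(hosts, fanout=2):
    """Hosts that must be able to rsh/ssh into some other host."""
    launchers = set(launch_parents(hosts, fanout).values())
    return sorted(launchers, key=hosts.index)


if __name__ == "__main__":
    small = ["master", "slave1", "slave2"]
    large = small + ["slave3"]
    print(hosts_needing_ssh(small))  # only the mpirun host launches daemons
    print(hosts_needing_ssh(large))  # a backend node now launches one too
```

Under these assumptions, only the mpirun host opens connections when there are two other nodes; adding a third means one of the backend nodes has to open a connection itself.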
Yes, I can, but only with at most two machines as slaves and one machine as
master. If I try to add another slave, I get those errors.
On 23 Jan 2013 at 14:38, "Jeff Squyres (jsquyres)"
wrote:
> I'm not sure I understand you. Does Open MPI work across multiple
> machines? I.e., can you d
I'm not sure I understand you. Does Open MPI work across multiple machines?
I.e., can you do all three of those steps across multiple machines?
On Jan 23, 2013, at 8:16 AM, Ada Mancuso
wrote:
> I'm sure that openmpi works, moreover my problem happens only with more than 2
> slaves (on differ
I'm sure that Open MPI works; moreover, my problem happens only with more than
2 slaves (on different machines, whereas locally it works fine with any
number of slaves).
Thanks
Ada
On 23 Jan 2013 at 14:04, "Jeff Squyres (jsquyres)"
wrote:
> Are you able to run the C examples in the example
Are you able to run the C examples in the examples/ directory from the tarball?
Our README suggests the following:
-
When verifying a new Open MPI installation, we recommend running three
tests:
1. Use "mpirun" to launch a non-MPI program (e.g., hostname or uptime)
across multiple nodes.
Hi,
I've installed the latest snapshot taken from the svn developer's trunk, but I
had the same problems. This is my configuration:
- Ubuntu, kernel 2.6.38-8
- OpenSSH 5.8p1, OpenSSL 0.9.8o
- Libtool version 2.4
- Open MPI 1.7rc5 and latest snapshots.
Do you think my problem could be relate
Some more info:
The MOFED that you will download includes MXM, but an older version of it
(v1.1). A newer version of MXM (v1.5) is available.
So, after installing MOFED, please remove the bundled MXM (rpm -e mxm) and
download the new MXM (v1.5) from:
http://www.mellanox.com/page/products_
Hello Francesco,
Please download and install MOFED from:
http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers
(the one that matches your OS).
Then MXM will be compatible with your OS.
Thanks,
Alina.
On Mon, Jan 21, 2013 at 5:00 PM, Francesco Simula <
francesco.sim.