at first I recommend you test 7 cases
- one network only (3 cases)
- two networks only (3 cases)
- three networks (1 case)
and see when things hang
you might also want to
mpirun --mca oob_tcp_if_include 10.1.10.0/24 ...
to ensure no hang will happen in oob
as usual, double check no firewall is running.
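For concreteness, a rough sketch of what those seven runs might look like, assuming three subnets such as 10.1.10.0/24, 10.1.11.0/24 and 10.1.12.0/24 and a simple ./hello test program (the node names, subnets and program are placeholders, not taken from this thread):

  # one network only (repeat for each of the three subnets)
  mpirun --mca btl tcp,self --mca btl_tcp_if_include 10.1.10.0/24 \
         --mca oob_tcp_if_include 10.1.10.0/24 -np 2 --host nodeA,nodeB ./hello

  # two networks only (repeat for each of the three pairs)
  mpirun --mca btl tcp,self --mca btl_tcp_if_include 10.1.10.0/24,10.1.11.0/24 \
         --mca oob_tcp_if_include 10.1.10.0/24 -np 2 --host nodeA,nodeB ./hello

  # all three networks
  mpirun --mca btl tcp,self \
         --mca btl_tcp_if_include 10.1.10.0/24,10.1.11.0/24,10.1.12.0/24 \
         --mca oob_tcp_if_include 10.1.10.0/24 -np 2 --host nodeA,nodeB ./hello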
Hello Gilles
Thanks for your prompt follow-up. It looks like this issue is somehow
specific to the Broadcom NIC. If I take it out, the rest of them work in
any combination. On further investigation, I found that the name that
'ifconfig' shows for this interface is different from what it is named ...
iirc, ompi internally uses networks and not interface names.
what did you use in your tests ?
can you try with networks ?
Cheers,
Gilles
On Saturday, May 14, 2016, dpchoudh . wrote:
> Hello Gilles
>
> Thanks for your prompt follow-up. It looks like this issue is somehow
> specific to the Broadcom NIC. ...
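To make the networks-vs-interface-names point concrete: btl_tcp_if_include accepts either interface names or CIDR networks, so the same restriction can be written both ways (eth0, the subnet and the hosts below are only placeholders):

  # select by interface name
  mpirun --mca btl_tcp_if_include eth0 -np 2 --host nodeA,nodeB ./hello

  # select by network, as Gilles suggests trying
  mpirun --mca btl_tcp_if_include 10.1.10.0/24 -np 2 --host nodeA,nodeB ./hello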
No, I used IP addresses in all my tests. What I found is that if I used the IP
address of the Broadcom NIC in the hostfile and used that network exclusively
(btl_tcp_if_include), the mpirun command hung silently. If I used the IP
address of another NIC in the hostfile (and the Broadcom NIC exclusively in
btl_tcp_if_include), mpirun ...
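For reference, a sketch of the kind of setup being described, with made-up addresses (10.1.10.x standing in for the Broadcom subnet) and a hypothetical hostfile named 'hosts':

  # hosts: one line per node, using the Broadcom NIC's addresses
  10.1.10.11 slots=4
  10.1.10.12 slots=4

  # restricting the TCP BTL to that same subnet is the combination that hung
  mpirun --hostfile hosts --mca btl_tcp_if_include 10.1.10.0/24 -np 8 ./hello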
You might want to try a pure TCP benchmark across this problematic NIC (e.g.,
NetpipeTCP or iperf).
That will take MPI out of the equation and show whether you are able to pass
TCP traffic correctly. Make sure to test sizes both smaller and larger than
your MTU.
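A sketch of such a test, assuming iperf is available on both nodes and the NIC's MTU is the usual 1500 bytes (hostnames and sizes are placeholders; NetPIPE's NPtcp would serve the same purpose):

  # on the receiving node
  iperf -s

  # on the sending node: payloads smaller and larger than the MTU
  iperf -c nodeB -l 512
  iperf -c nodeB -l 64K

  # a quick fragmentation check with Linux ping (1472 + 28 header bytes = 1500)
  ping -M do -s 1472 nodeB
  ping -M do -s 1600 nodeB   # expected to fail if the path MTU is 1500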
> On May 14, 2016, at 1:25 AM, dpchoudh . wrote:
Hi all
I posted about a fortnight ago to this list as I was having some trouble
getting my nodes to be controlled by my master node. Perceived wisdom at
the time was to compile with the --enable-orterun-prefix-by-default option.
For some time I'd been getting 'cannot open libopen-rte.so.7' errors, which
pointed ...
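For reference, the kind of build that advice refers to would look roughly like this for a from-source install (the /opt/openmpi prefix is only an example):

  ./configure --prefix=/opt/openmpi --enable-orterun-prefix-by-default
  make -j4 && sudo make install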
Rob,
I do not know how Debian packaged openmpi, so the Debian maintainers should
be asked about this rather than the openmpi folks.
another option to get things working is to add the path to the openmpi
libraries to the ld.so configuration.
for example, append
/opt/openmpi/lib
to /etc/ld.so.conf
(or into a new file called /etc/ld.so.conf.d/openmpi), and then run ldconfig.
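A minimal sketch of that approach, run as root on each node and assuming the libraries really are in /opt/openmpi/lib (note that most distros only read *.conf files under /etc/ld.so.conf.d):

  echo /opt/openmpi/lib > /etc/ld.so.conf.d/openmpi.conf
  ldconfig
  # verify the runtime library is now found
  ldconfig -p | grep libopen-rte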
Hi all --
I am having a weird problem on a cluster of Raspberry Pi model 2 machines
running the Debian/Raspbian version of OpenMPI, 1.6.5.
I apologize for the length of this message; I am trying to include all the
pertinent details, but of course can't reliably discriminate between the
pertinent and the irrelevant ones.
I think I might have fixed this, but I still don't really understand it.
In setting up the RPi machines, I followed a config guide that suggested
switching the SSH service in systemd to "ssh.socket" instead of
"ssh.service". It's supposed to be lighter weight and get you cleaner
shut-downs, and I ...
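In case it helps anyone else, a sketch of the systemd commands involved in flipping between the two setups, run as root and assuming the stock Raspbian ssh units:

  # switch from the always-running daemon to socket activation
  systemctl stop ssh.service
  systemctl disable ssh.service
  systemctl enable ssh.socket
  systemctl start ssh.socket

  # and back again
  systemctl stop ssh.socket
  systemctl disable ssh.socket
  systemctl enable ssh.service
  systemctl start ssh.service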
> On May 7, 2016, at 1:13 AM, Siegmar Gross
> wrote:
>
> Hi,
>
> yesterday I installed openmpi-v1.10.2-176-g9d45e07 on my "SUSE Linux
> Enterprise Server 12 (x86_64)" with Sun C 5.13 and gcc-5.3.0. The
> following programs don't run anymore.
>
>
> loki hello_2 112 ompi_info | grep -e "OPAL