Thanks Adrian - that's a useful suggestion, I'll explore that.
Jonathan.
On Tue, Jun 12, 2007 at 08:37:38PM +0100, Jonathan Underwood wrote:
> > > Presumably switching the two interfaces on the frontend (eth0<->eth1)
> > > would also solve this problem?
> > If you have root privileges this seems to be another good approach.
> I don't, but will explain the issue to sy
On 12/06/07, George Bosilca wrote:
Jonathan Underwood wrote:
> Presumably switching the two interfaces on the frontend (eth0<->eth1)
> would also solve this problem?
>
If you have root privileges this seems to be another good approach.
I don't, but will explain the issue to sysadmin. Thanks
Jonathan Underwood wrote:
Presumably switching the two interfaces on the frontend (eth0<->eth1)
would also solve this problem?
If you have root privileges this seems to be another good approach.
george.
On 12/06/07, George Bosilca wrote:
Jonathan,
It will be difficult to make it work in this configuration. The problem
is that on the head node the network interface that has to be used is
eth1, while on the compute nodes it is eth0. Therefore, the tcp_if_include
will not help ...
Now, if you only
Jonathan,
It will be difficult to make it work in this configuration. The problem
is that on the head node the network interface that has to be used is
eth1, while on the compute nodes it is eth0. Therefore, the tcp_if_include
will not help ...
Now, if you only start processes on the compute n
On 11/06/07, Adrian Knoth wrote:
What's the exact problem? compute-node -> frontend? I don't think you
have two processes on the frontend node, and even if you do, they should
use shared memory.
I stopped there being more than a single process on the frontend node
- this had no effect on the
Hi Adrian,
On 11/06/07, Adrian Knoth wrote:
Which OMPI version?
1.2.2
> $ perl -e 'die$!=110'
> Connection timed out at -e line 1.
Looks pretty much like a routing issue. Can you sniff on eth1 on the
frontend node?
I don't have root access, so I'm afraid not.
> This error message occu
On Mon, Jun 11, 2007 at 10:55:17PM +0100, Jonathan Underwood wrote:
> Hi,
Hi!
> I am seeing problems with a small linux cluster when running OpenMPI
> jobs. The error message I get is:
Which OMPI version?
> $ perl -e 'die$!=110'
> Connection timed out at -e line 1.
Looks pretty much like a ro
Hi,
I am seeing problems with a small linux cluster when running OpenMPI
jobs. The error message I get is:
[frontend][0,1,0][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect]
connect() failed with errno=110
Following the FAQ, I looked to see what this error code corresponds to:
$ p
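
The errno lookup quoted in this thread (perl -e 'die$!=110', which printed "Connection timed out") can also be sketched in Python. This is only an illustrative sketch: the helper name describe_errno is hypothetical and not part of Open MPI or the FAQ, and errno numbering is platform-specific, so 110 is only assumed to map to a timeout on Linux systems like the cluster discussed here.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno on the current platform.

    Hypothetical helper for illustration; the mapping from number to
    name depends on the operating system.
    """
    # errno.errorcode maps numeric codes to symbolic names such as 'ETIMEDOUT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable message for the code.
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 110 is the errno reported by mca_btl_tcp_endpoint_complete_connect above.
    print(describe_errno(110))
```

Looking up the symbolic name as well as the message makes it easier to search for the error in documentation, since the name is stable across platforms while the number is not.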