On May 28, 2010, at 3:29 PM, Rahul Nabar wrote:
> Each of our servers has two Ethernet cards: 1GigE and 10GigE. How does
> Open MPI decide which card to use when sending messages? One of the
> cards is on a 10.0. IP address subnet, whereas the other is on a
> 192.168 address subnet. Can I sel
Open MPI is very aggressive about looking for and using any TCP
communications device it can find. In your case it will use both the
10.0.. network and the 192.168.. network at the same time. Open MPI
does not pay attention to the host names when choosing the communications
channel. You want to do somet
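
One way to restrict Open MPI to a single network is the TCP BTL's
btl_tcp_if_include MCA parameter. The sketch below assumes the 10GigE card
shows up as eth1 and the 1GigE card as eth0 (check with ifconfig); the
interface names, the process count, and ./my_mpi_app are placeholders, not
values taken from this thread.

```shell
# Use only the interface assumed to be the 10GigE card (eth1 is hypothetical).
mpirun --mca btl_tcp_if_include eth1 -np 16 ./my_mpi_app

# Alternatively, exclude the 1GigE card (assumed to be eth0) and loopback.
mpirun --mca btl_tcp_if_exclude lo,eth0 -np 16 ./my_mpi_app
```

The include and exclude parameters are mutually exclusive, so set only one
of them on a given command line.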
Each of our servers has two Ethernet cards: 1GigE and 10GigE. How does
Open MPI decide which card to use when sending messages? One of the
cards is on a 10.0. IP address subnet, whereas the other is on a
192.168 address subnet. Can I select one or the other by specifying the
--host option with
On Fri, May 28, 2010 at 3:53 PM, Ralph Castain wrote:
> What environment are you running on the cluster, and what version of OMPI?
> Not sure that error message is coming from us.
openmpi-1.4.1
The cluster runs PBS-Torque, so I guess that could be the other source of the error.
--
Rahul
What environment are you running on the cluster, and what version of OMPI? Not
sure that error message is coming from us.
On May 28, 2010, at 1:18 PM, Rahul Nabar wrote:
> Often when I try to run larger jobs on our cluster, I get an error of
> this sort from some of the compute servers:
>
>
Often when I try to run larger jobs on our cluster, I get an error of
this sort from some of the compute servers:
eu260 - daemon did not report back when launched
It does not happen every time, but pretty often. Any ideas what could
be wrong? The node seems pingable and I could log in suc