So you are saying the test worked, but you are still encountering an error when
executing an MPI job? Or are you saying things now work?
> On Dec 28, 2014, at 5:58 PM, Saliya Ekanayake wrote:
Thank you Ralph. This produced the warning on memory limits similar to [1]
and setting ulimit -l unlimited worked.
[1] http://lists.openfabrics.org/pipermail/general/2007-June/036941.html
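For anyone hitting the same warning, the locked-memory fix above can be checked and made persistent roughly as follows (the limits.conf path and wildcard entries are the usual defaults, not details from this thread):

```shell
# Show the current locked-memory limit; Open MPI over InfiniBand wants
# "unlimited" (or at least a value large enough to register its buffers).
ulimit -l

# Raise it for the current shell (requires a permissive hard limit):
#   ulimit -l unlimited
#
# To make it persistent, add entries like these to /etc/security/limits.conf
# (assumed default path; some distros use /etc/security/limits.d/ instead):
#   *    soft    memlock    unlimited
#   *    hard    memlock    unlimited
```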
Saliya
On Sun, Dec 28, 2014 at 5:57 PM, Ralph Castain wrote:
Have the admin try running the ibv_ud_pingpong test - that will exercise the
portion of the system under discussion.
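A rough sketch of how ibv_ud_pingpong is typically invoked between two nodes — "node01" is a placeholder hostname, and the tool usually ships in the libibverbs-utils package:

```shell
# Guarded sketch: prints the two-node invocation pattern, or a hint if the
# tool is missing. "node01" is a placeholder for the real server hostname.
if command -v ibv_ud_pingpong >/dev/null 2>&1; then
    echo "on the server node:  ibv_ud_pingpong"
    echo "on the client node:  ibv_ud_pingpong node01"
    # A successful run reports bytes exchanged and round-trip time; a hang
    # or error here points at the UD path Open MPI uses for wireup.
else
    echo "ibv_ud_pingpong not found; try installing libibverbs-utils"
fi
```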
> On Dec 28, 2014, at 2:31 PM, Saliya Ekanayake wrote:
What I heard from the administrator is that,
"The tests that work are the simple utilities ib_read_lat and ib_read_bw
that measure latency and bandwidth between two nodes. They are part of
the "perftest" repo package."
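For completeness, the perftest utilities the admin ran follow the same server/client pattern (again, "node01" is a placeholder hostname):

```shell
# ib_read_lat and ib_read_bw each run as a server on one node and a
# client on the other; shown as echoes so the sketch runs anywhere.
if command -v ib_read_bw >/dev/null 2>&1; then
    echo "on the server node:  ib_read_bw        # likewise: ib_read_lat"
    echo "on the client node:  ib_read_bw node01"
else
    echo "perftest utilities not installed"
fi
```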
On Dec 28, 2014 10:20 AM, "Saliya Ekanayake" wrote:
This happens at MPI_Init. I've attached the full error message.
The sys admin mentioned the InfiniBand utility tests ran OK. I'll contact him
for more details and let you know.
Thank you,
Saliya
On Sun, Dec 28, 2014 at 3:18 AM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
I have a bunch of 8 GB memory nodes in a cluster that were recently
upgraded to 16 GB. When I run any jobs I get the following warning:
--
WARNING: It appears that your OpenFabrics subsystem is configured to
only allow registering [...]
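When this warning appears, one way to see how much memory the HCA can register — assuming a Mellanox mlx4 adapter, which is a guess on my part; other drivers expose different parameters — is:

```shell
# On mlx4, registerable memory is roughly
#   2^log_num_mtt * 2^log_mtts_per_seg * page_size
for p in log_num_mtt log_mtts_per_seg; do
    f=/sys/module/mlx4_core/parameters/$p
    if [ -r "$f" ]; then
        echo "$p = $(cat "$f")"
    else
        echo "$p: not readable (mlx4_core not loaded?)"
    fi
done
# The per-process locked-memory cap bounds registration as well:
ulimit -l
```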
Might also be worth checking to ensure that UD is enabled on your IB
installation as we depend upon it for wireup of IB connections.
> On Dec 28, 2014, at 12:18 AM, Gilles Gouaillardet wrote:
Where does the error occur?
MPI_Init?
MPI_Finalize?
In between?
In the first case, the bug is likely a mishandled error case,
which means Open MPI is unlikely to be the root cause of the crash.
Did you check that InfiniBand is up and running on your cluster?
Cheers,
Gilles
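A quick way to answer that question — tool names are from libibverbs-utils; adjust for your install — is to look for an active port:

```shell
# Look for at least one port in state PORT_ACTIVE. The trailing echo keeps
# the output meaningful on hosts with no IB devices at all.
ibv_devinfo 2>&1 | grep -E 'hca_id|state:' \
    || echo "no IB device/port information found (is libibverbs-utils installed?)"
```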
Saliya Ekanayake wrote: