I am willing to do, but in more than two months of
testing/trying/hoping/praying I have accumulated so much material and
information that if I post everything in this e-mail I am likely to
confuse a potential helper rather than help him understand the problem.
Thank you in advance,
Biagio L
Hi Dorian,
thank you for your message.
doriankrause wrote:
The trouble is with an MPI code that runs fine with an openmpi 1.1.2
library compiled without infiniband support (I have tested the
scalability of the code up to 64 cores, the nodes are 4 or 8 cores,
the results are exactly what I
Pavel Shamis (Pasha) wrote:
Biagio Lucini wrote:
Hello,
I am new to this list, where I hope to find a solution for a problem
that I have been having for quite a long time.
I run various versions of openmpi (from 1.1.2 to 1.2.8) on a cluster
with Infiniband interconnects that I use and
since the installation directory is
non-standard (/opt/ompi128-intel/bin for the path and
/opt/ompi128-intel/lib for the libs).
I hope I have provided all the required info. If you need more, or need
any of it in more detail, please let me know.
Many thanks,
Biagio Lucini
Open
Jeff Squyres wrote:
Another thing to try is a change that we made late in the Open MPI
v1.2 series with regard to IB:
http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion
Thanks, this is something worth investigating. What would be the exact
syntax to use to tu
goes back
to (a) above
This implementation assumes that you do not need the data in any
particular order.
Hope it works for you.
Biagio
--
=
Dr. Biagio Lucini
Department of Physics, Swansea University
know.
Biagio
--
=
Dr. Biagio Lucini
Department of Physics, Swansea University
Singleton Park, SA2 8PP Swansea (UK)
Tel. +44 (0)1792 602284
=
Pavel Shamis (Pasha) wrote:
Another thing to try is a change that we made late in the Open MPI
v1.2 series with regard to IB:
http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion
Thanks, this is something worth investigating. What would be the
exact syntax to
nce you have 4 and 8 core machines, this test could
be run on the same 8 core machine over shared memory and not over
Infiniband, as you suspected.
You can rerun the IMB-MPI1 test with -mca btl self,openib to be sure
that the test does not use shared memory or tcp.
Lenny.
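
A minimal sketch of the rerun suggested above, assuming the benchmark binary sits at ./IMB-MPI1 and that 16 processes are wanted (both values are placeholders, not taken from the thread). The helper name build_mpirun_command is hypothetical. It only assembles and prints the command line with the -mca btl self,openib restriction, so the command can be checked before it is launched on the cluster.

```python
import shlex


def build_mpirun_command(num_procs, benchmark="./IMB-MPI1", btls=("self", "openib")):
    """Return the argv list for an mpirun call restricted to the given BTLs.

    The benchmark path and process count are assumptions for illustration;
    the "-mca btl self,openib" selection is the one quoted in the message.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return ["mpirun", "-np", str(num_procs),
            "-mca", "btl", ",".join(btls),
            benchmark]


if __name__ == "__main__":
    # Print the command rather than executing it, so it can be inspected first.
    print(shlex.join(build_mpirun_command(16)))
```

With only self and openib listed, the shared-memory and tcp transports are left out of the selection, which is the point of the check: processes on the same node can no longer fall back to shared memory.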
On 12/24/08, Biagio
Jeff Squyres wrote:
On Jan 7, 2009, at 6:28 PM, Biagio Lucini wrote:
[[5963,1],13][btl_openib_component.c:2893:handle_wc] from node24 to:
node11 error polling LP CQ with status RECEIVER NOT READY RETRY
EXCEEDED ERROR status number 13 for wr_id 37779456 opcode 0 qp_idx 0
Ah! If we're de
ade of the
firmware, although once again the OFED drivers were complaining about
the firmware being too old) fixed the problem. We did both upgrades at
once, so, as in Brett's case, I am not sure which one played the major role.
Biagio
--
==
main?
Many thanks,
Biagio Lucini
-
[node20:04178] *** Process received signal ***
[node20:04178] Signal: Segmentation fault (11)
[node20:04178] Signal code: Addres
messing up the memory). I suggest using a memory checker such as
valgrind to check the memory consistency of your application.
george.
On Mar 5, 2009, at 17:37 , Biagio Lucini wrote:
We have an application that runs for a very long time with 16
processes (the time is order a few