Well, mpich2 and mvapich2 are working smoothly for my app. mpich2 under
GigE is also giving ~2X the performance of openmpi in the cases where
openmpi works. After the paper deadline, I'll attempt to package up
a simple test case and send it to the list.
Thanks!
-Mike
...similar to what I was seeing, so hopefully I can make some progress on a real solution.
Brian
On Mar 20, 2007, at 8:54 PM, Mike Houston wrote:
Well, I've managed to get a working solution, but I'm not sure how I got
there. I built a test case that looked like a nice simple version
Also make sure that /tmp is user writable. By default, that is where
openmpi likes to stick some files.
-Mike
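A trivial sketch of the check being suggested, assuming a Linux-style /tmp (the path and messages are placeholders, not part of the original mail's test):

    /* Check whether /tmp is writable and searchable by the current user;
     * Open MPI's session files default to living under /tmp. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        if (access("/tmp", W_OK | X_OK) == 0)
            puts("/tmp is writable by this user");
        else
            perror("access /tmp");
        return 0;
    }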
David Burns wrote:
Could also be a firewall problem. Make sure all nodes in the cluster
accept tcp packets from all others.
Dave
Walker, David T. wrote:
I am presently trying t
Marcus G. Daniels wrote:
Mike Houston wrote:
That's pretty cool. The main issue with this, and addressed at the end
of the report, is that the code size is going to be a problem as data
and code must live in the same 256KB in each SPE. They mention dynamic
overlay loading, which is also how we deal with large code size, but
things get t
I did notice that single sided transfers seem to be a
little slower than explicit send/recv, at least on GigE. Once I do some
more testing, I'll bring things up on IB and see how things are going.
-Mike
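A rough sketch of the two paths being compared above (one-sided MPI_Put inside fence epochs versus explicit MPI_Send/MPI_Recv), assuming a simple fence-synchronized window; this is not Mike's benchmark, and MSG_BYTES and ITERS are arbitrary placeholders. Build with mpicc and run with two ranks.

    /* Time ITERS transfers of MSG_BYTES bytes from rank 0 to rank 1,
     * once with MPI_Put + MPI_Win_fence and once with MPI_Send/MPI_Recv. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MSG_BYTES (1 << 20)
    #define ITERS     100

    int main(int argc, char **argv)
    {
        int rank, size, i;
        char *buf;
        MPI_Win win;
        double t0, t_put, t_sendrecv;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0) fprintf(stderr, "run with at least 2 ranks\n");
            MPI_Finalize();
            return 1;
        }
        buf = malloc(MSG_BYTES);

        /* One-sided: rank 0 puts into rank 1's window each iteration. */
        MPI_Win_create(buf, MSG_BYTES, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
        MPI_Win_fence(0, win);
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0)
                MPI_Put(buf, MSG_BYTES, MPI_BYTE, 1, 0, MSG_BYTES, MPI_BYTE, win);
            MPI_Win_fence(0, win);
        }
        t_put = MPI_Wtime() - t0;

        /* Two-sided: the same payload with explicit send/recv. */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0)
                MPI_Send(buf, MSG_BYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(buf, MSG_BYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }
        t_sendrecv = MPI_Wtime() - t0;

        if (rank == 0)
            printf("put: %.3f s   send/recv: %.3f s\n", t_put, t_sendrecv);

        MPI_Win_free(&win);
        free(buf);
        MPI_Finalize();
        return 0;
    }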
Mike Houston wrote:
Brian Barrett wrote:
On Mar 20, 2007, at 3:15 PM, Mike Houston wrote:
If I only do gets/puts, things seem to be working correctly with version
1.2. However, if I have a posted Irecv on the target node and issue an
MPI_Get against that target, MPI_Test on the posted Irecv causes a segfault:
[expose:21249] *** Process received signal ***
[expose:21249] Signal: Segmentation fault
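For reference, a minimal sketch of the pattern Mike describes (not his actual test case; the buffer sizes, tag, and fence-based access epoch are assumptions): the target rank posts an MPI_Irecv that is never matched, the other rank issues an MPI_Get against it, and the target then calls MPI_Test on the pending receive, which is where the segfault was reported.

    /* Rank 0: expose a window and keep an unmatched MPI_Irecv pending.
     * Rank 1: MPI_Get from rank 0's window inside a fence epoch.
     * Rank 0 then polls the pending receive with MPI_Test. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, flag = 0;
        int winbuf[64], recvbuf[64], getbuf[64];
        MPI_Win win;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) { MPI_Finalize(); return 1; }

        MPI_Win_create(winbuf, sizeof(winbuf), sizeof(int),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        if (rank == 0)  /* posted receive that the one-sided traffic never matches */
            MPI_Irecv(recvbuf, 64, MPI_INT, 1, 99, MPI_COMM_WORLD, &req);

        MPI_Win_fence(0, win);
        if (rank == 1)
            MPI_Get(getbuf, 64, MPI_INT, 0, 0, 64, MPI_INT, win);
        MPI_Win_fence(0, win);

        if (rank == 0) {
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);   /* reported crash point */
            if (!flag) {
                MPI_Cancel(&req);
                MPI_Wait(&req, MPI_STATUS_IGNORE);
            }
        }

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }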
I've been having similar issues with brand new FC5/6 and RHEL5 machines,
but our FC4/RHEL4 machines are just fine. On the FC5/6 and RHEL5 machines,
I can get things to run as root. There must be some ACL or security
setting issue that's enabled by default on the newer distros. If I
figure it out
At least with 1.1.4, I'm having a heck of a time with enabling
multi-threading. Configuring with --with-threads=posix
--enable-mpi-threads --enable-progress-threads leads to mpirun just
hanging, even when not launching MPI apps, e.g. mpirun -np 1 hostname,
and I can't ctrl-c to kill it, I have
/u/twoodall> orterun -np 2 -mca mpi_leave_pinned 1 -mca btl_mvapi_flags 2 ./bw
25 131072
131072 801.580272 (MillionBytes/sec) 764.446518(MegaBytes/sec)
I have things working now. I needed to limit OpenMPI to actual working
interfaces (thanks for the tip). It still seems that should be figured
out correctly... Now I've moved on to stress testing with the bandwidth
testing app I posted earlier in the Infiniband thread:
mpirun -mca btl_tcp_if_
What's the ETA, or should I try grabbing from cvs?
-Mike
Tim S. Woodall wrote:
Mike,
I believe this was probably corrected today and should be in the
next release candidate.
Thanks,
Tim
Mike Houston wrote:
:mca_btl_mvapi_component_progress] Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73412fc
repeated until it eventually hangs.
-Mike
Mike Houston wrote:
Woops, spoke too soon. The performance quoted was not actually going
between nodes. Actually using the network with the pinned option gives:
Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb74a1c18
Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73e1720
repeated many times.
-Mike
Mike Houston wrote:
That seems to work with the pinning option enabled. THANKS!
Now I'll go back to testing my real code. I'm getting 7
e to tweak up performance. Now if I can get the tcp layer working,
I'm pretty much good to go.
Any word on an SDP layer? I can probably modify the tcp layer quickly
to do SDP, but I thought I would ask.
-Mike
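For what it's worth, the usual way an SDP path gets bolted onto TCP-style code, which is presumably what "modify the tcp layer quickly to do SDP" means: keep the stream-socket logic and swap the address family passed to socket(). The AF_INET_SDP value below is the OFED convention and is an assumption; check the headers on the system in question.

    /* Open a stream socket, optionally over SDP instead of TCP. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>

    #ifndef AF_INET_SDP
    #define AF_INET_SDP 27   /* OFED convention; verify against local headers */
    #endif

    int open_stream_socket(int use_sdp)
    {
        int family = use_sdp ? AF_INET_SDP : AF_INET;
        /* Everything else (bind/connect/listen) follows the existing tcp path. */
        return socket(family, SOCK_STREAM, 0);
    }

    int main(void)
    {
        int fd = open_stream_socket(1);
        if (fd < 0)
            perror("socket(AF_INET_SDP)");   /* expected without an SDP stack loaded */
        else
            close(fd);
        return 0;
    }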
I'll give it a go. Attached is the code.
Thanks!
-Mike
Tim S. Woodall wrote:
Hello Mike,
Mike Houston wrote:
When only sending a few messages, we get reasonably good IB performance,
~500MB/s (MVAPICH is 850MB/s). However, if I crank the number of
messages up, we drop to
IB.
Thanks,
george.
On Oct 31, 2005, at 10:50 AM, Mike Houston wrote:
When only sending a few messages, we get reasonably good IB performance,
~500MB/s (MVAPICH is 850MB/s). However, if I crank the number of
messages up, we drop to 3MB/s(!!!). This is with the OSU NBCL
mpi_bandwidth test. We are running Mellanox IB Gold 1.8 with 3.3.3
firmware on PCI-X (Couger
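Not the OSU mpi_bandwidth code, but a sketch of the same windowed pattern it uses, to make concrete what "cranking the number of messages up" means: a batch of back-to-back non-blocking sends, matched by a batch of pre-posted receives, drained with MPI_Waitall. WINDOW and MSG_BYTES are arbitrary placeholders; run with two ranks.

    /* Send WINDOW messages of MSG_BYTES bytes from rank 0 to rank 1 without
     * waiting in between, then report the achieved bandwidth. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define WINDOW    64
    #define MSG_BYTES (64 * 1024)

    int main(int argc, char **argv)
    {
        int rank, size, i;
        char *buf;
        MPI_Request req[WINDOW];
        double t0, secs;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) { MPI_Finalize(); return 1; }
        buf = malloc((size_t)WINDOW * MSG_BYTES);  /* one slot per message */

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();

        if (rank == 0) {
            for (i = 0; i < WINDOW; i++)
                MPI_Isend(buf + (size_t)i * MSG_BYTES, MSG_BYTES, MPI_BYTE,
                          1, 0, MPI_COMM_WORLD, &req[i]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            /* zero-byte ack so the whole batch is known to be delivered */
            MPI_Recv(NULL, 0, MPI_BYTE, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            secs = MPI_Wtime() - t0;
            printf("%.2f MB/s\n", (double)WINDOW * MSG_BYTES / secs / 1e6);
        } else if (rank == 1) {
            for (i = 0; i < WINDOW; i++)
                MPI_Irecv(buf + (size_t)i * MSG_BYTES, MSG_BYTES, MPI_BYTE,
                          0, 0, MPI_COMM_WORLD, &req[i]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            MPI_Send(NULL, 0, MPI_BYTE, 0, 1, MPI_COMM_WORLD);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }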
We can't seem to run across TCP. We did a default 'configure'. Shared
memory seems to work, but trying tcp gives us:
[0,1,1][btl_tcp_endpoint.c:557:mca_btl_tcp_endpoint_complete_connect]
connect() failed with errno=113
I'm assuming that the tcp backend is the most thoroughly tested, so I
th
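As an aside, errno numbers are platform-specific, but a one-liner shows what 113 maps to on the failing machine; on Linux it is EHOSTUNREACH ("No route to host"), which usually means a routing or firewall problem between the nodes, as suggested earlier in the thread.

    /* Print the system's description of errno 113
     * (EHOSTUNREACH, "No route to host", on Linux). */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        printf("errno 113: %s\n", strerror(113));
        return 0;
    }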