Troy,
I've been able to reproduce this. Should have this
corrected shortly.
Thanks,
Tim
> On Mon, 14 Nov 2005 10:38:03 -0700, Troy Telford wrote:
>
>> My mvapi config is using the Mellanox IB Gold 1.8 IB software release.
>> Kernel 2.6.5-7.201 (SLES 9 SP2)
>>
>> When I ran IMB using mvapi, I
On Mon, 14 Nov 2005 17:28:15 -0700, Troy Telford wrote:
I've just finished a build of RC7, so I'll go give that a whirl and
report.
RC7:
With *both* mvapi and openib, I receive the following when using IMB-MPI1:
***mvapi***
[0,1,3][btl_mvapi_component.c:637:mca_btl_mvapi_component_progress]
On Mon, 14 Nov 2005 10:38:03 -0700, Troy Telford wrote:
My mvapi config is using the Mellanox IB Gold 1.8 IB software release.
Kernel 2.6.5-7.201 (SLES 9 SP2)
When I ran IMB using mvapi, I received the following error:
***
[0,1,2][btl_mvapi_component.c:637:mca_btl_mvapi_component_progress] e
Thus far, it appears that moving to MX 1.1.0 didn't change the error
message I've been getting about parts being 'not implemented.'
I also re-provisioned four of the IB nodes (leaving me with 3 four-node
clusters: One using mvapi, one using openib, and one using myrinet)
My mvapi config is
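For reference, an IMB run over the mvapi btl would be launched along these lines (the process count, btl list, and binary path here are illustrative, not taken from the thread):
  mpirun -np 4 --mca btl mvapi,self ./IMB-MPI1   # np and binary path are example values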
On Sun, 13 Nov 2005 17:53:40 -0700, Jeff Squyres wrote:
I can't believe I missed that, sorry. :-(
None of the btl's are capable of doing loopback communication except
"self." Hence, you really can't run "--mca btl foo" if your app ever
sends to itself -- you really need to run "--mca btl foo,self" at a minimum.
1.0rc6 is now available; we made some minor fixes in gm, the datatype
engine, and the shared memory btl. I'm not sure if your MX problem
will be fixed, but please give it a whirl. Let us know exactly which
version of MX you are using, too.
http://www.open-mpi.org/software/v1.0/
Thanks.
I can't believe I missed that, sorry. :-(
None of the btl's are capable of doing loopback communication except
"self." Hence, you really can't run "--mca btl foo" if your app ever
sends to itself -- you really need to run "--mca btl foo,self" at a
minimum.
This is not so much an optimizati
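As a concrete instance of the "foo,self" rule above (the btl name, process count, and binary are placeholders, not from this thread):
  mpirun -np 2 --mca btl openib,self ./IMB-MPI1   # btl list must include "self" for send-to-self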
One other thing I noticed... You specify -mca btl openib. Try
specifying -mca btl openib,self. The self component is used for
"send to self" operations. This could be the cause of your failures.
Brian
On Nov 13, 2005, at 3:02 PM, Jeff Squyres wrote:
Troy --
Were you perchance using multiple processes per node?
Troy --
Were you perchance using multiple processes per node? If so, we
literally just fixed some sm btl bugs that could have been affecting
you (they could have caused hangs). They're fixed in the nightly
snapshots from today (both trunk and v1.0): r8140. If you were using
the sm btl and
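To run multiple processes per node (and so exercise the sm btl), one common approach is a hostfile with slot counts; the hostnames and counts below are only an example:
  # hostfile (example hosts)
  node01 slots=2
  node02 slots=2
  mpirun -np 4 --hostfile hostfile --mca btl sm,openib,self ./IMB-MPI1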
We have very limited openib resources for testing at the moment. Can
you provide details on how to reproduce?
My bad; I must've been in a bigger hurry to go home for the weekend
than I thought.
I'm going to start with the assumption you're interested in the steps
to reproduce it in OpenMPI
Hello Troy,
We have very limited openib resources for testing at the moment. Can
you provide details on how to reproduce?
Thanks,
Tim
> On Fri, 11 Nov 2005 13:12:13 -0700, Jeff Squyres wrote:
>
>> At long last, 1.0rc5 is available for download. It fixes all known
>> issues reported here on the mailing list.
The bad:
OpenIB frequently crashes with the error:
***
[0,1,2][btl_openib_endpoint.c:135:mca_btl_openib_endpoint_post_send] error posting send request errno says Operation now in progress
[0,1,3][btl_openib_endpoint.c:135:mca_btl_openib_endpoint_post_send] error posting send request errno says Operation now in progress
On Fri, 11 Nov 2005 13:12:13 -0700, Jeff Squyres wrote:
At long last, 1.0rc5 is available for download. It fixes all known
issues reported here on the mailing list. We still have a few minor
issues to work out, but things appear to generally be working now.
Please try to break it:
At long last, 1.0rc5 is available for download. It fixes all known
issues reported here on the mailing list. We still have a few minor
issues to work out, but things appear to generally be working now.
Please try to break it:
http://www.open-mpi.org/software/v1.0/
--
{+} Jeff Squyres