On Mon, 14 Nov 2005 17:28:15 -0700, Troy Telford <ttelf...@linuxnetworx.com> wrote:

I've just finished a build of RC7, so I'll go give that a whirl and report.

RC7:


With *both* mvapi and openib, I receive the following when using IMB-MPI1:

***mvapi***
[0,1,3][btl_mvapi_component.c:637:mca_btl_mvapi_component_progress] error in posting pending send
[0,1,3][btl_mvapi_component.c:637:mca_btl_mvapi_component_progress] error in posting pending send
[0,1,2][btl_mvapi_component.c:637:mca_btl_mvapi_component_progress] error in posting pending send
***openib***
[0,1,3][btl_openib_endpoint.c:134:mca_btl_openib_endpoint_post_send] error posting send request errno says Resource temporarily unavailable
[0,1,3][btl_openib_component.c:655:mca_btl_openib_component_progress] error in posting pending send
[0,1,2][btl_openib_endpoint.c:134:mca_btl_openib_endpoint_post_send] error posting send request errno says Resource temporarily unavailable
[0,1,2][btl_openib_component.c:655:mca_btl_openib_component_progress] error in posting pending send
[0,1,3][btl_openib_endpoint.c:134:mca_btl_openib_endpoint_post_send] error posting send request errno says Resource temporarily unavailable
[0,1,3][btl_openib_component.c:655:mca_btl_openib_component_progress] error in posting pending send
***********

Notably, they both fail in pretty much the same place (every time):
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.04         0.04         0.04
<insert error here>

(Sometimes it will finish one more item -- i.e., byte size of 4 -- before erroring out.)
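In case it helps isolate this, below is a stripped-down Reduce_scatter loop along the lines of what IMB exercises at those sizes. It's only a sketch for reference, not the actual IMB code, and the message sizes and repetition count are arbitrary placeholders:

/* reduce_scatter_min.c -- a stripped-down loop in the spirit of the IMB
 * Reduce_scatter test.  NOT the IMB source; buffer sizes and the repetition
 * count are arbitrary placeholders.
 * Build:  mpicc reduce_scatter_min.c -o reduce_scatter_min
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, bytes, count, total, i, rep;
    float *sendbuf, *recvbuf;
    int *recvcounts;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Sweep the small message sizes where IMB dies (0, 4, 8, ... bytes). */
    for (bytes = 0; bytes <= 64; bytes += 4) {
        count = bytes / (int)sizeof(float);       /* floats received per rank */
        total = (count > 0 ? count : 1) * nprocs; /* floats contributed       */

        sendbuf    = malloc(total * sizeof(float));
        recvbuf    = malloc((count > 0 ? count : 1) * sizeof(float));
        recvcounts = malloc(nprocs * sizeof(int));
        for (i = 0; i < nprocs; i++)
            recvcounts[i] = count;                /* equal share per rank */
        for (i = 0; i < total; i++)
            sendbuf[i] = (float)rank;

        /* 1000 repetitions, roughly mirroring IMB's inner loop */
        for (rep = 0; rep < 1000; rep++)
            MPI_Reduce_scatter(sendbuf, recvbuf, recvcounts,
                               MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0)
            printf("completed %d bytes\n", bytes);

        free(sendbuf);
        free(recvbuf);
        free(recvcounts);
    }

    MPI_Finalize();
    return 0;
}

(It can be run over the same nodes, picking the interconnect with the usual --mca btl mvapi,self or --mca btl openib,self.)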



HPL will run on mvapi, but on openib it segfaults before completing the first problem size with:
  mpirun noticed that job rank 0 with PID 25662 on node "n57" exited on signal 11.

HPCC also segfaults with OpenIB when it reaches its HPL section (with no additional info).
HPCC is still running on mvapi... so far, so good...

The Presta tests still seem to error out (similarly to IMB), as previously reported; however, it happens less frequently. (Meaning I've been able to complete a particular test successfully, then had it fail when run again -- something like a 50% success rate.) This is with the 'com' and 'allred' tests; 'globalop' has refused to run since RC5, and that has not changed with RC7.
