Re: [OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-27 Thread Gary Draving
Thanks for the advice. We tried "-mca btl_openib_ib_min_rnr_timer 25 -mca btl_openib_ib_timeout 20", but we are still getting errors as we increase the Ns value in HPL.dat into the thousands. Is it OK to just add these values to .openmpi/mca-params.conf for the user running the test or should w
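
For reference, a minimal sketch of how the two settings quoted above might look in a per-user parameter file, assuming the usual location $HOME/.openmpi/mca-params.conf and Open MPI's one "name = value" pair per line syntax. The values are simply the ones suggested in this thread, not tuned recommendations:

```
# $HOME/.openmpi/mca-params.conf (assumed per-user location)
# Values taken from the suggestion in this thread; adjust as needed.
btl_openib_ib_min_rnr_timer = 25
btl_openib_ib_timeout = 20
```

Parameters passed to mpirun with -mca on the command line take precedence over values read from this file.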

Re: [OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-26 Thread Ralph Castain
The default retry values are wrong and will be corrected in the next OMPI release. For now, try running with:

-mca btl_openib_ib_min_rnr_timer 25 -mca btl_openib_ib_timeout 20

Should work.

Ralph

On Mar 26, 2009, at 2:16 PM, Gary Draving wrote:
Hi Everyone, I'm doing some performance testin

[OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-26 Thread Gary Draving
Hi Everyone, I'm doing some performance testing using HPL with TCP turned off. My HPL.dat file looks like the following: It seems to work well for lower Ns values, but as I increase that value it inevitably fails with "[[13535,1],169][btl_openib_component.c:2905:handle_wc] from compute-0-0.lo