Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-04 Thread Terry Dontje
Allen Barnett wrote: Thanks for the pointer! Do you know if these sizes are dependent on the hardware? They can be; the following file sets up the defaults for some known cards: ompi/mca/btl/openib/mca-btl-openib-device-params.ini --td Thanks, Allen On Tue, 2010-08-03 at 10:29 -0400, Ter

Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-04 Thread Allen Barnett
Thanks for the pointer! Do you know if these sizes are dependent on the hardware? Thanks, Allen On Tue, 2010-08-03 at 10:29 -0400, Terry Dontje wrote: > Sorry, I didn't see your prior question; glad you found the > btl_openib_receive_queues parameter. There is no FAQ entry for > this but I fo

Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-03 Thread Terry Dontje
Sorry, I didn't see your prior question; glad you found the btl_openib_receive_queues parameter. There is no FAQ entry for this, but I found the following in the openib BTL help file that spells out the parameters when using a per-peer receive queue (i.e., a receive queue setting with "P" as the fir

Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-03 Thread Allen Barnett
Hi: In response to my own question, by studying the file mca-btl-openib-device-params.ini, I discovered that this option in OMPI-1.4.2: -mca btl_openib_receive_queues P,65536,256,192,128 was sufficient to prevent OMPI from trying to create shared receive queues and allowed my application to run t
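To make the setting quoted above easier to inspect, here is a minimal Python sketch that splits a btl_openib_receive_queues value into its queue specifications and reports whether any of them asks for a shared receive queue. It is not part of Open MPI. The function names are hypothetical, and the sketch assumes that multiple specifications are separated by colons and that a leading "S" marks a shared receive queue; the messages in this thread only confirm that "P" marks a per-peer queue. It does not attempt to name or validate the individual numeric fields.

```python
# Sketch only: helper names are hypothetical, not an Open MPI API.
# Assumption: queue specs are colon-separated; each spec is a type
# letter followed by comma-separated integers.

def parse_receive_queues(value):
    """Return a list of (queue_type, [int, ...]) tuples."""
    queues = []
    for spec in value.split(":"):
        kind, *fields = spec.split(",")
        queues.append((kind.strip().upper(), [int(f) for f in fields]))
    return queues


def uses_shared_queues(value):
    """True if any spec is typed "S" (assumed: shared receive queue)."""
    return any(kind == "S" for kind, _ in parse_receive_queues(value))


if __name__ == "__main__":
    # Value taken from the message above.
    setting = "P,65536,256,192,128"
    print(parse_receive_queues(setting))
    print("requests shared receive queues:", uses_shared_queues(setting))
```

A check like this could be run over a candidate value before passing it to mpirun on hardware where ibv_create_srq fails, to confirm that only per-peer queues are requested.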

Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-02 Thread Allen Barnett
Hi Terry: It is indeed the case that the openib BTL has not been initialized. I ran with your tcp-disabled MCA option and it aborted in MPI_Init. The OFED stack is what's included in RHEL4. It appears to be made up of the RPMs: openib-1.4-1.el4 opensm-3.2.5-1.el4 libibverbs-1.1.2-1.el4 How can I

Re: [OMPI users] OpenIB Error in ibv_create_srq

2010-08-02 Thread Terry Dontje
Judging from the message below saying "(openib) BTL failed to initialize", my guess is that the code is probably running over TCP. To prove this conclusively, you can restrict the run to the openib, sm and self BTLs, which eliminates the tcp BTL. To do that, add the following to the mpirun line "-mca btl

[OMPI users] OpenIB Error in ibv_create_srq

2010-07-30 Thread Allen Barnett
Hi: A customer is attempting to run our OpenMPI 1.4.2-based application on a cluster of machines running RHEL4 with the standard OFED stack. The HCAs are identified as: 03:01.0 PCI bridge: Mellanox Technologies MT23108 PCI Bridge (rev a1) 04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHos