Well son of a gun. I just compiled the code with pgcc (version 16.5.0) instead of gcc and lo and behold:
# pgcc -libverbs ./ib_verbs_q.c -o ib_verbs_q
# ./ib_verbs_q
error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

# gcc -libverbs ./ib_verbs_q.c -o ib_verbs_q
# ./ib_verbs_q
hca_id: mlx5_0
fw ver: 10.12.1100
node guid: f452:1403:006c:99e0

I'm pretty stumped, but I'm gonna do some more digging. Suffice it to say I'm pretty sure this isn't an OpenMPI problem.

-Aaron

On Wed, Jul 13, 2016 at 9:28 PM, Aaron Knister <aaron.knis...@gmail.com> wrote:

> Matt, you're far too kind :) I put together a test program that uses the
> block of code in question and... it works for me? I've attached the
> reproducer here. A compile should be just a "gcc -libverbs ib_verbs_q.c".
> I'm a little perplexed. I truthfully didn't expect it to work given that
> the same block called from inside of openmpi on the same node(s) where Matt
> had it fail earlier.
>
> -Aaron
>
> On Wed, Jul 13, 2016 at 9:17 PM, Aaron Knister <aaron.s.knis...@nasa.gov>
> wrote:
>
>> On Wed, Jul 13, 2016 at 9:50 AM, Nathan Hjelm <hje...@me.com> wrote:
>>
>>> As of 2.0.0 we now support experimental verbs. It looks like one of the
>>> calls is failing:
>>>
>>> #if HAVE_DECL_IBV_EXP_QUERY_DEVICE
>>>     device->ib_exp_dev_attr.comp_mask = IBV_EXP_DEVICE_ATTR_RESERVED - 1;
>>>     if(ibv_exp_query_device(device->ib_dev_context,
>>>                             &device->ib_exp_dev_attr)){
>>>         BTL_ERROR(("error obtaining device attributes for %s errno says %s",
>>>                    ibv_get_device_name(device->ib_dev),
>>>                    strerror(errno)));
>>>         goto error;
>>>     }
>>> #endif
>>>
>>> Do you know what OFED or MOFED version you are running?
>>>
>>
>> Per one of our gurus, answers from your IB page:
>>
>> 1. Which OpenFabrics version are you running? Please specify where you
>> got the software from (e.g., from the OpenFabrics community web site, from
>> a vendor, or it was already included in your Linux distribution).
>> Mellanox OFED 3.1-1.0.3 (soon to be 3.3-1.0.0)
>>
>> 2. What distro and version of Linux are you running? What is your kernel
>> version?
>> SLES11 SP3 (LTSS); 3.0.101-0.47.71-default (soon to be
>> 3.0.101-0.47.79-default)
>>
>> 3. Which subnet manager are you running? (e.g., OpenSM, a vendor-specific
>> subnet manager, etc.)
>> Mellanox UFM (OpenSM under the covers)
>>
>> --
>> Matt Thompson
>>
>> Man Among Men
>> Fulcrum of History
>>
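
A side note on the quoted snippet: the "comp_mask = IBV_EXP_DEVICE_ATTR_RESERVED - 1" line looks like the usual C idiom of subtracting one from a single-bit sentinel to get a mask with every lower flag set, i.e. "give me all attributes". Below is a minimal sketch of that idiom only. The enum names and values are hypothetical stand-ins, not the real MOFED definitions, and the sketch assumes the sentinel is a single bit sitting just above the defined flags. It does not touch libibverbs and says nothing about why the pgcc build fails.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag set for illustration; NOT the real MOFED enum. */
enum demo_attr_mask {
    DEMO_ATTR_A        = 1u << 0,
    DEMO_ATTR_B        = 1u << 1,
    DEMO_ATTR_C        = 1u << 2,
    DEMO_ATTR_RESERVED = 1u << 3  /* sentinel: one bit above the last flag */
};

/*
 * Return a mask with every bit below the sentinel set.
 * Only meaningful when the sentinel is a single bit (a power of two),
 * which is the assumption this sketch makes.
 */
uint32_t all_attrs_below(uint32_t sentinel)
{
    assert(sentinel != 0 && (sentinel & (sentinel - 1)) == 0);
    return sentinel - 1;
}
```

With the hypothetical values above, all_attrs_below(DEMO_ATTR_RESERVED) covers DEMO_ATTR_A, DEMO_ATTR_B and DEMO_ATTR_C and excludes the sentinel bit itself.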