Hi - I’m trying to get OpenMPI working on a newly configured CentOS 7 system, 
and I’m not even sure what information would be useful to provide.  I’m using 
the CentOS built-in libibverbs and/or libfabric, and I configure OpenMPI with 
just
        --with-verbs --with-ofi --prefix=$DEST
I also tried --without-ofi, with no change.  Basically, I can run with “--mca btl 
self,vader”, but if I try “--mca btl,openib” I get an error from each process:
[compute-0-0][[24658,1],5][connect/btl_openib_connect_udcm.c:1245:udcm_rc_qp_to_rtr]
 error modifing QP to RTR errno says Invalid argument
If I don’t specify the btl, it appears to try to set up openib with the same 
errors, then crashes with a free()-related segfault, presumably when it tries 
to actually use vader.

The machine seems to be able to see its IB interface, as reported by things 
like ibstatus or ibv_devinfo.  I’m not sure what else to look for.  I also 
confirmed that “ulimit -l” reports unlimited.
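
For reference, the checks above amount to roughly the following.  This is only 
a sketch: run_if_present is a helper name I made up for illustration, and I’m 
assuming the tools are on the PATH of the compute node where the job fails.

```shell
# Sketch of the sanity checks described above.
# run_if_present is a hypothetical helper, not part of any package.
run_if_present() {
    # Run the command only if it is installed; otherwise say it is missing.
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "$1: not found"
    fi
}

# Locked-memory limit as seen by this shell (reported as unlimited here).
echo "max locked memory: $(ulimit -l)"

# Port and device information as seen by the IB stack.
run_if_present ibstatus
run_if_present ibv_devinfo
```

The limit that matters is the one the MPI processes themselves inherit, which 
may differ from an interactive login shell, so it may be worth running the same 
thing through mpirun on the compute node.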

Does anyone have any suggestions as to how to diagnose this issue?

                                                                thanks,
                                                                Noam
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
