[OMPI users] handle_wc() in openib and IBV_WC_DRIVER2/MLX5DV_WC_RAW_WQE completion code

2022-02-22 Thread Crni Gorac via users
We've encountered OpenMPI crashing in handle_wc(), with the following error message: [.../opal/mca/btl/openib/btl_openib_component.c:3610:handle_wc] Unhandled work completion opcode is 136. Our setup is admittedly a little tricky, but I'm still worried that it may be a genuine problem, so please bear with me ...
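
For readers hitting the same abort: the failure mode is a switch over ibv_wc.opcode written before newer vendor-specific opcodes existed. Below is a minimal, hypothetical sketch in C of such a completion-polling loop against libibverbs - not Open MPI's actual handle_wc(); the function name poll_and_handle and the abort-on-unknown policy are illustrative only:

    #include <stdio.h>
    #include <stdlib.h>
    #include <infiniband/verbs.h>

    /* Illustrative completion-polling loop; Open MPI's real handle_wc()
     * in btl_openib_component.c is more involved.  The point is that a
     * switch over ibv_wc.opcode written against an older verbs API falls
     * into the default branch when the driver reports a newer opcode
     * such as IBV_WC_DRIVER2 (136 per the report above). */
    static void poll_and_handle(struct ibv_cq *cq)
    {
        struct ibv_wc wc;
        while (ibv_poll_cq(cq, 1, &wc) > 0) {
            if (wc.status != IBV_WC_SUCCESS) {
                fprintf(stderr, "work completion failed: %s\n",
                        ibv_wc_status_str(wc.status));
                continue;
            }
            switch (wc.opcode) {
            case IBV_WC_SEND:
            case IBV_WC_RDMA_WRITE:
            case IBV_WC_RDMA_READ:
                /* handle send-side completions */
                break;
            case IBV_WC_RECV:
                /* handle receive completions */
                break;
            default:
                /* vendor-specific opcodes end up here */
                fprintf(stderr,
                        "Unhandled work completion opcode is %d\n",
                        (int)wc.opcode);
                abort();
            }
        }
    }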

Re: [OMPI users] OMPI_COMM_WORLD_LOCAL_SIZE problem between PBS and MLNX_OFED

2022-01-18 Thread Crni Gorac via users
i.e., "-host node1:5" assigns 5 slots > to node1. > > If tm support is included, then we read the PBS allocation and see one slot > on each node - and launch accordingly. > > > > On Jan 18, 2022, at 2:44 PM, Crni Gorac via users > > wrote: > > >

Re: [OMPI users] OMPI_COMM_WORLD_LOCAL_SIZE problem between PBS and MLNX_OFED

2022-01-18 Thread Crni Gorac via users
... line. My guess is that you are using the ssh launcher - what is odd is that you should wind up with two procs on the first node, in which case those envars are correct. If you are seeing one proc on each node, then something is wrong. On Jan 18, 2022, at 1:33 PM, ...

Re: [OMPI users] OMPI_COMM_WORLD_LOCAL_SIZE problem between PBS and MLNX_OFED

2022-01-18 Thread Crni Gorac via users
... cmd line. My guess is that you are using the ssh launcher - what is odd is that you should wind up with two procs on the first node, in which case those envars are correct. If you are seeing one proc on each node, then something is wrong. On Jan 18, 2022, at 1:33 ...

Re: [OMPI users] OMPI_COMM_WORLD_LOCAL_SIZE problem between PBS and MLNX_OFED

2022-01-18 Thread Crni Gorac via users
: > > Afraid I can't understand your scenario - when you say you "submit a job" to > run on two nodes, how many processes are you running on each node?? > > > > On Jan 18, 2022, at 1:07 PM, Crni Gorac via users > > wrote: > > > > Using Ope

[OMPI users] OMPI_COMM_WORLD_LOCAL_SIZE problem between PBS and MLNX_OFED

2022-01-18 Thread Crni Gorac via users
Using OpenMPI 4.1.2 from the MLNX_OFED_LINUX-5.5-1.0.3.2 distribution, and I have PBS 18.1.4 installed on my cluster (cluster nodes are running CentOS 7.9). When I try to submit a job that will run on two nodes in the cluster, both ranks get OMPI_COMM_WORLD_LOCAL_SIZE set to 2, instead of 1, and OMPI_CO...
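
A minimal reproducer sketch (not from the original post; the output layout is illustrative) that prints the environment variable next to the local size MPI computes itself via MPI_Comm_split_type, so the two can be compared under PBS:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* With one rank per node, both values printed below should be 1. */
    int main(int argc, char **argv)
    {
        int rank, local_size;
        MPI_Comm local;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* local communicator of ranks sharing this node */
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &local);
        MPI_Comm_size(local, &local_size);

        const char *env = getenv("OMPI_COMM_WORLD_LOCAL_SIZE");
        printf("rank %d: OMPI_COMM_WORLD_LOCAL_SIZE=%s, split_type size=%d\n",
               rank, env ? env : "(unset)", local_size);

        MPI_Comm_free(&local);
        MPI_Finalize();
        return 0;
    }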

[OMPI users] problem with MPI datatypes not defined as constants in OpenMPI

2013-01-08 Thread Crni Gorac
Most MPI implementations (MPICH, Intel MPI) define MPI datatypes (MPI_INT, MPI_FLOAT, etc.) as constants; in OpenMPI, these are in practice pointers to corresponding internal structures (for example, MPI_FLOAT is defined as a pointer to the mpi_float structure, etc.). In trying to employ some C++ templates ...
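
To make the distinction concrete, here is a hedged sketch in C (not from the original post; OPEN_MPI is the macro Open MPI's mpi.h defines, and the switch branch assumes integer constant handles as in MPICH or Intel MPI). A case label, like a C++ non-type template argument, needs a compile-time constant, which a pointer handle cannot provide:

    #include <mpi.h>

    const char *type_name(MPI_Datatype t)
    {
    #if defined(OPEN_MPI)
        /* Open MPI: MPI_FLOAT expands to the address of an internal
         * object, so it is not usable as a case label or template
         * argument; only run-time comparison works. */
        if (t == MPI_INT)   return "int";
        if (t == MPI_FLOAT) return "float";
        return "other";
    #else
        switch (t) {    /* assumes integer constant handles */
        case MPI_INT:   return "int";
        case MPI_FLOAT: return "float";
        default:        return "other";
        }
    #endif
    }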

[OMPI users] gpudirect p2p (again)?

2012-07-09 Thread Crni Gorac
Trying to examine CUDA support in OpenMPI, using the OpenMPI current feature series (v1.7). There was a question on this mailing list back in October 2011 (http://www.open-mpi.org/community/lists/users/2011/10/17539.php) about OpenMPI being able to use P2P transfers in the case when two MPI processes involved ...
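
For context, the usage pattern in question looks roughly like the following sketch (buffer size and device mapping hypothetical; error checking omitted): device pointers are handed straight to MPI, and a CUDA-aware build decides internally whether a P2P GPU-to-GPU transfer applies.

    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        int rank;
        float *dev_buf;
        const int n = 1 << 20;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        cudaSetDevice(rank);   /* one GPU per rank, illustrative */
        cudaMalloc((void **)&dev_buf, n * sizeof(float));

        /* device pointers passed directly to MPI */
        if (rank == 0)
            MPI_Send(dev_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(dev_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);

        cudaFree(dev_buf);
        MPI_Finalize();
        return 0;
    }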

Re: [OMPI users] mpicc -showme:compile output (possible bug report)

2008-04-17 Thread Crni Gorac
On Thu, Apr 17, 2008 at 6:36 AM, Terry Frankcombe wrote: Given that discussion, might I suggest an (untested) workaround would be to --prefix OpenMPI into a non-standard location? It is a possible approach, but there are others - it is also possible to provide a specific CMake variable value on ...

Re: [OMPI users] mpicc -showme:compile output (possible bug report)

2008-04-16 Thread Crni Gorac
On Wed, Apr 16, 2008 at 2:18 PM, Jeff Squyres wrote: What exactly does mpicc --showme:compile output? mpicc (and friends) typically omit -I only for "special" directories, such as /usr/include, because adding -I/usr/include may subvert the compiler's normal include directory ...
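
An illustrative transcript of that behavior (prefix paths hypothetical, and the exact extra flags vary by Open MPI version and build); the point is only that the -I flag appears for a non-standard prefix and disappears for a system one:

    $ /opt/openmpi/bin/mpicc -showme:compile    # installed under /opt/openmpi
    -I/opt/openmpi/include -pthread
    $ mpicc -showme:compile                     # installed under /usr
    -pthread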

[OMPI users] mpicc -showme:compile output (possible bug report)

2008-04-16 Thread Crni Gorac
I am using the CMake build system with an OpenMPI-based project. CMake uses mpicc's -showme:compile and -showme:link output to build compile and link flags; however, it expects -showme:compile to dump at least some "-I" flags, which it then parses in order to build the list of include directories ...