Hi,

Thank you so much for your help. I don’t have permission to update the software on the system I am using, but I will let the administrators know about the release.

Esthela Gallardo

From: users <users-boun...@lists.open-mpi.org> on behalf of "Cabral, Matias A" <matias.a.cab...@intel.com>
Reply-To: Open MPI Users <users@lists.open-mpi.org>
Date: Friday, April 28, 2017 at 1:16 PM
To: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] Received eager message(s) from an unknown process error on KNL

Hi Esthela,

As George mentions, this is indeed libpsm2 printing this error. Opcode=0xCC is a disconnect retry. There are a few scenarios that could be happening, but put simply, a message from an already disconnected endpoint is arriving late.

What version of Intel Omni-Path Software or libpsm2 do you have on your system? We have not seen this error since the release of IFS 10.3.0. I suggest updating and testing again.

https://downloadcenter.intel.com/download/26567/Intel-Omni-Path-Fabric-Software-Including-Intel-Omni-Path-Host-Fabric-Interface-Driver-?v=t

Thanks,
_MAC

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of George Bosilca
Sent: Thursday, April 27, 2017 7:46 PM
To: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] Received eager message(s) from an unknown process error on KNL

Esthela,

This error message is generated internally by the PSM2 library, so you will not be able to get rid of it simply by recompiling Open MPI.

George.

On Thu, Apr 27, 2017 at 8:21 PM, Gallardo, Esthela <egallar...@miners.utep.edu> wrote:

Hello,

I am currently running a couple of benchmarks on two Intel Xeon Phi 7250 second-generation KNL MIC compute nodes using Open MPI 2.1.0.
While trying to run the osu_bcast benchmark with 8 MPI tasks (4 on each node), I noticed the following error in my output:

Received eager message(s) ptype=0x1 opcode=0xcc from an unknown process (err=49)

I have tried running the benchmark in the following ways:

mpirun -np 8 ./osu_bcast
mpirun -np 8 -hostfile hosti --npernode 4 ./osu_bcast
mpirun -np 8 -hostfile hosti --npernode 4 --mca mtl psm2 ./osu_bcast

But nothing changes the error message at the end. Note that the error does not really impact the results of the benchmark, so it’s possible that the error is occurring in MPI_Finalize. Also, to try to avoid this error, I built the library with both of these configurations:

./configure --prefix=<path_to_build_folder> CC=icc CXX=icpc FC=ifort CFLAGS="-xCORE-AVX2 -axMIC-AVX512" CXXFLAGS="-xCORE-AVX2 -axMIC-AVX512" FFLAGS="-xCORE-AVX2 -axMIC-AVX512" LDFLAGS="-xCORE-AVX2 -axMIC-AVX512"

./configure --prefix=<path_to_build_folder> --enable-orterun-prefix-by-default --with-cma=yes --with-psm2 CC=icc CXX=icpc FC=ifort --disable-shared --enable-static --without-slurm

However, this did not prevent the error either. I was wondering if anyone has encountered this issue before, and what can be done to get rid of the error message.

Thank you,
Esthela Gallardo

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
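Since the message is printed by libpsm2 and reportedly does not affect the benchmark numbers, one stopgap until the fabric software can be updated is to count and filter the message in captured output. The following is only a minimal sketch: the function names are hypothetical, and it assumes the warning always contains the phrase "from an unknown process", as in the message quoted in this thread.

```shell
#!/bin/sh
# Sketch: separate the PSM2 "unknown process" warning from benchmark output.
# Assumption: every warning line contains the phrase below, as in the
# message quoted in this thread. The function names are hypothetical.
PSM2_PATTERN='from an unknown process'

# Print how many lines of the given file contain the warning.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_psm2_warnings() {
    grep -cF -- "$PSM2_PATTERN" "$1" || true
}

# Print the given file with the warning lines removed.
strip_psm2_warnings() {
    grep -vF -- "$PSM2_PATTERN" "$1" || true
}
```

For example, after sourcing the functions and capturing both output streams with "mpirun -np 8 -hostfile hosti --npernode 4 ./osu_bcast > bcast.log 2>&1", "count_psm2_warnings bcast.log" reports how often the message appeared and "strip_psm2_warnings bcast.log" prints the remaining output. This only hides the message; it does not address the late disconnect message itself, which would still require the libpsm2 update suggested above.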