Do we know if this was definitely fixed in v4.1.x?
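
Gilles' two suggestions below can be combined with Martín's original
invocation into a single debug run; this is only a sketch, assuming the
two hosts really do share a 192.168.0.x network (substitute whatever
subnet they actually have in common):

    mpirun --mca btl_tcp_if_include 192.168.0.0/24 \
           --mca btl_tcp_base_verbose 20 \
           -np 1 -hostfile ./hostfile ./spawner.exe 8
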
> On Feb 4, 2021, at 7:46 AM, Gilles Gouaillardet via users
> <users@lists.open-mpi.org> wrote:
>
> Martin,
>
> this is a connectivity issue reported by the btl/tcp component.
>
> You can try restricting the IP interfaces to a subnet known to work
> (and with no firewall) between both hosts:
>
> mpirun --mca btl_tcp_if_include 192.168.0.0/24 ...
>
> If the error persists, you can
>
> mpirun --mca btl_tcp_base_verbose 20 ...
>
> and then compress and post the logs so we can have a look.
>
> Cheers,
>
> Gilles
>
> On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users
> <users@lists.open-mpi.org> wrote:
>>
>> Hi Marco,
>>
>> Yes, I have a problem with spawning to a "worker" host (spawning on
>> localhost works). There are just two machines: "master" and "worker".
>> I'm using Windows 10 on both, with the same Cygwin and packages.
>> Some details are pasted below.
>>
>> Thanks for your help. Regards,
>>
>> Martín
>>
>> ----
>>
>> Running:
>>
>> mpirun -np 1 -hostfile ./hostfile ./spawner.exe 8
>>
>> hostfile:
>>
>> master slots=5
>> worker slots=5
>>
>> Error:
>>
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications. This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes. This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other. This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>>   Process 1 ([[31598,1],0]) is on host: DESKTOP-C0G4680
>>   Process 2 ([[31598,2],2]) is on host: worker
>>   BTLs attempted: self tcp
>>
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>> [DESKTOP-C0G4680:02828] [[31598,1],0] ORTE_ERROR_LOG: Unreachable in file
>> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>> at line 493
>> [DESKTOP-C0G4680:02828] *** An error occurred in MPI_Comm_spawn
>> [DESKTOP-C0G4680:02828] *** reported by process [2070806529,0]
>> [DESKTOP-C0G4680:02828] *** on communicator MPI_COMM_SELF
>> [DESKTOP-C0G4680:02828] *** MPI_ERR_INTERN: internal error
>> [DESKTOP-C0G4680:02828] *** MPI_ERRORS_ARE_FATAL (processes in this
>> communicator will now abort,
>> [DESKTOP-C0G4680:02828] *** and potentially your MPI job)
>>
>> USER_SSH@DESKTOP-C0G4680 ~
>> $ [WinDev2012Eval:00120] [[31598,2],2] ORTE_ERROR_LOG: Unreachable in file
>> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>> at line 493
>> [WinDev2012Eval:00121] [[31598,2],3] ORTE_ERROR_LOG: Unreachable in file
>> /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c
>> at line 493
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.
>> This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>>   ompi_dpm_dyn_init() failed
>>   --> Returned "Unreachable" (-12) instead of "Success" (0)
>> --------------------------------------------------------------------------
>> [WinDev2012Eval:00121] *** An error occurred in MPI_Init
>> [WinDev2012Eval:00121] *** reported by process
>> [15289389101093879810,12884901891]
>> [WinDev2012Eval:00121] *** on a NULL communicator
>> [WinDev2012Eval:00121] *** Unknown error
>> [WinDev2012Eval:00121] *** MPI_ERRORS_ARE_FATAL (processes in this
>> communicator will now abort,
>> [WinDev2012Eval:00121] *** and potentially your MPI job)
>> [DESKTOP-C0G4680:02831] 2 more processes have sent help message
>> help-mca-bml-r2.txt / unreachable proc
>> [DESKTOP-C0G4680:02831] Set MCA parameter "orte_base_help_aggregate" to 0 to
>> see all help / error messages
>> [DESKTOP-C0G4680:02831] 1 more process has sent help message
>> help-mpi-runtime.txt / mpi_init:startup:internal-failure
>> [DESKTOP-C0G4680:02831] 1 more process has sent help message
>> help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>>
>> Spawner program:
>>
>> #include "mpi.h"
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>     int processesToRun;
>>     MPI_Comm intercomm;
>>
>>     if (argc < 2) {
>>         printf("Number of processes needed!\n");
>>         return 0;
>>     }
>>     processesToRun = atoi(argv[1]);
>>
>>     MPI_Init(NULL, NULL);
>>     printf("Spawning from parent:...\n");
>>     MPI_Comm_spawn("./spawned.exe", MPI_ARGV_NULL, processesToRun,
>>                    MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm,
>>                    MPI_ERRCODES_IGNORE);
>>
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> Spawned program:
>>
>> #include "mpi.h"
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> int main(int argc, char **argv)
>> {
>>     int hostName_len, rank, size;
>>     MPI_Comm parentcomm;
>>     char hostName[200];
>>
>>     MPI_Init(NULL, NULL);
>>     MPI_Comm_get_parent(&parentcomm);
>>     MPI_Get_processor_name(hostName, &hostName_len);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>>     if (parentcomm != MPI_COMM_NULL) {
>>         printf("I'm the spawned h: %s r/s: %i/%i\n", hostName, rank, size);
>>     }
>>
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> From: Marco Atzeri via users
>> Sent: Wednesday, February 3, 2021 17:58
>> To: users@lists.open-mpi.org
>> Cc: Marco Atzeri
>> Subject: Re: [OMPI users] OMPI 4.1 in Cygwin packages?
>>
>> On 03.02.2021 21:35, Martín Morales via users wrote:
>>> Hello,
>>>
>>> I would like to know if any OMPI 4.1.* is going to be available in the
>>> Cygwin packages.
>>>
>>> Thanks and regards,
>>>
>>> Martín
>>>
>>
>> Hi Martin,
>>
>> is there anything in it that is absolutely needed short term?
>>
>> Any problem with the current 4.0.5 package?
>>
>> The build is usually very time consuming, and I am busy with other
>> Cygwin stuff.
>>
>> Regards
>> Marco

--
Jeff Squyres
jsquy...@cisco.com
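
P.S. A possibly useful variant of the quoted spawner: switching
MPI_COMM_SELF to MPI_ERRORS_RETURN makes a failed MPI_Comm_spawn come
back as an error code that can be printed, instead of tripping
MPI_ERRORS_ARE_FATAL as in the log above. This is only a sketch against
the standard MPI API, untested on the Cygwin 4.0.5 build discussed here,
and the runtime may still abort on its own:

    #include "mpi.h"
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int processesToRun, rc, len;
        char msg[MPI_MAX_ERROR_STRING];
        MPI_Comm intercomm;

        if (argc < 2) {
            printf("Number of processes needed!\n");
            return 0;
        }
        processesToRun = atoi(argv[1]);

        MPI_Init(NULL, NULL);

        /* Return errors from calls on MPI_COMM_SELF to the caller
           instead of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_SELF, MPI_ERRORS_RETURN);

        rc = MPI_Comm_spawn("./spawned.exe", MPI_ARGV_NULL, processesToRun,
                            MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm,
                            MPI_ERRCODES_IGNORE);
        if (rc != MPI_SUCCESS) {
            /* Translate the error code into Open MPI's message text. */
            MPI_Error_string(rc, msg, &len);
            fprintf(stderr, "MPI_Comm_spawn failed: %s\n", msg);
        }

        MPI_Finalize();
        return 0;
    }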