The buffer being overrun has nothing to do with your code - it's an internal buffer that OMPI uses while creating the connection. The error indicates a problem in OMPI itself.
The 1.10 series is out of the support window, but if you want to stick with it you should at least update to the last release in that series - I believe that is 1.10.7. The OMPI v2.x series had problems with dynamics (operations such as MPI_Comm_accept), so you should skip that one. If you want to come all the way forward, you should take the OMPI v3.x series.

Ralph

> On Aug 3, 2018, at 3:40 AM, Florian Lindner <mailingli...@xgm.de> wrote:
>
> Hello,
>
> I have this piece of code:
>
> MPI_Comm icomm;
> INFO << "Accepting connection on " << portName;
> MPI_Comm_accept(portName.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, &icomm);
>
> and sometimes (like in 1 of 5 runs), I get:
>
> [helium:33883] [[32673,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file dpm_orte.c at line 406
> [helium:33883] *** An error occurred in MPI_Comm_accept
> [helium:33883] *** reported by process [2141257729,0]
> [helium:33883] *** on communicator MPI_COMM_SELF
> [helium:33883] *** MPI_ERR_UNKNOWN: unknown error
> [helium:33883] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [helium:33883] *** and potentially your MPI job)
> [helium:33883] [0] func:/usr/lib/libopen-pal.so.13(opal_backtrace_buffer+0x33) [0x7fc1ad0ac6e3]
> [helium:33883] [1] func:/usr/lib/libmpi.so.12(ompi_mpi_abort+0x365) [0x7fc1af4955e5]
> [helium:33883] [2] func:/usr/lib/libmpi.so.12(ompi_mpi_errors_are_fatal_comm_handler+0xe2) [0x7fc1af487e72]
> [helium:33883] [3] func:/usr/lib/libmpi.so.12(ompi_errhandler_invoke+0x145) [0x7fc1af4874b5]
> [helium:33883] [4] func:/usr/lib/libmpi.so.12(MPI_Comm_accept+0x262) [0x7fc1af4a90e2]
> [helium:33883] [5] func:./mpiports() [0x41e43d]
> [helium:33883] [6] func:/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fc1ad7a1830]
> [helium:33883] [7] func:./mpiports() [0x41b249]
>
> Before that I check for the length of portName
>
> DEBUG << "COMM ACCEPT portName.size() = " << portName.size();
> DEBUG << "MPI_MAX_PORT_NAME = " << MPI_MAX_PORT_NAME;
>
> which both return 1024.
>
> I am completely puzzled, how I can get a buffer issue, except something faulty with std::string portName.
>
> Any clues?
>
> Launch command: mpirun -n 4 -mca opal_abort_print_stack 1
> OpenMPI 1.10.2 @ Ubuntu 16.
>
> Thanks,
> Florian
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users