[OMPI users] Received data is different than sent data after Allgatherv() call
Hi, I will use a small part of C++ code to demonstrate my problem during shuffling. Assume that each slave has to shuffle some unsigned char array, defined as unsigned char* data, within some intracommunicator:

    unsigned lineSize = 100;
    unsigned long long no_keys = 10;
    int bytes_send_count = (int)no_keys*lineSize;

    unsigned int commSize = (unsigned)comm.Get_size();
    int* recv_counts = new int[commSize];
    int* displs = new int[commSize];

    // Shuffle amount of data
    comm.Allgather(&bytes_send_count, 1, MPI::INT, recv_counts, 1, MPI::INT);

    unsigned long long total = 0;
    for(unsigned int i = 0; i < commSize; i++){
        // Update the displacements...
        displs[i] = total;
        // ...and the total count
        total += recv_counts[i];
    }

    unsigned char* recv_buf = new unsigned char[total];

    // Print the data to be sent from rank == 1
    if(rank == 1){
        // (this loop was garbled in the archive; it prints the bytes rank 1 is about to send)
        for(int l = 0; l < bytes_send_count; l++)
            printf("%u ", data[l]);
    }

    comm.Allgatherv(&data, bytes_send_count, MPI::UNSIGNED_CHAR,
                    recv_buf, recv_counts, displs, MPI::UNSIGNED_CHAR);

    // Check the first portion of the received data
    if(rank == 1){
        // (this loop was garbled in the archive; it prints the block received from rank 1)
        for(int l = 0; l < recv_counts[1]; l++)
            printf("%u ", recv_buf[displs[1] + l]);
    }

My problem is that the printf() that shows what rank 1 is about to send and the printf() that shows what rank 1 receives from itself print values that don't match. Based on my study of Allgatherv(), I think the sizes of the received blocks and the displacements are computed correctly. I don't think I need MPI_IN_PLACE, since the input and output buffers are supposed to be different (for contrast, a minimal sketch of the in-place form follows the attached ompi_info output below). Can you help me identify the problem?

I am using Open MPI 2.1.2 and testing on a single computer with 7 MPI processes. The ompi_info output is attached below.

Attachment (ompi_info output):

    Package: Open MPI ubuntu@ip-172-31-30-250 Distribution
    Open MPI: 2.1.2
    Open MPI repo revision: v2.1.1-188-g6157ed8
    Open MPI release date: Sep 20, 2017
    Open RTE: 2.1.2
    Open RTE repo revision: v2.1.1-188-g6157ed8
    Open RTE release date: Sep 20, 2017
    OPAL: 2.1.2
    OPAL repo revision: v2.1.1-188-g6157ed8
    OPAL release date: Sep 20, 2017
    MPI API: 3.1.0
    Ident string: 2.1.2
    Prefix: /usr/local
    Configured architecture: x86_64-unknown-linux-gnu
    Configure host: ip-172-31-30-250
    Configured by: ubuntu
    Configured on: Sun Nov 19 22:08:57 UTC 2017
    Configure host: ip-172-31-30-250
    Built by: root
    Built on: Sun Nov 19 22:18:49 UTC 2017
    Built host: ip-172-31-30-250
    C bindings: yes
    C++ bindings: yes
    Fort mpif.h: yes (all)
    Fort use mpi: yes (full: ignore TKR)
    Fort use mpi size: deprecated-ompi-info-value
    Fort use mpi_f08: yes
    Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality
    Fort mpi_f08 subarrays: no
    Java bindings: no
    Wrapper compiler rpath: runpath
    C compiler: gcc
    C compiler absolute: /usr/bin/gcc
    C compiler family name: GNU
    C compiler version: 5.4.0
    C++ compiler: g++
    C++ compiler absolute: /usr/bin/g++
    Fort compiler: gfortran
    Fort compiler abs: /usr/bin/gfortran
    Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
    Fort 08 assumed shape: yes
    Fort optional args: yes
    Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
    Fort STORAGE_SIZE: yes
    Fort BIND(C) (all): yes
    Fort ISO_C_BINDING: yes
    Fort SUBROUTINE BIND(C): yes
    Fort TYPE,BIND(C): yes
    Fort T,BIND(C,name="a"): yes
    Fort PRIVATE: yes
    Fort PROTECTED: yes
    Fort ABSTRACT: yes
    Fort ASYNCHRONOUS: yes
    Fort PROCEDURE: yes
    Fort USE...ONLY: yes
    Fort C_FUNLOC: yes
    Fort f08 using wrappers: yes
    Fort MPI_SIZEOF: yes
    C profiling: yes
    C++ profiling: yes
    Fort mpif.h profiling: yes
    Fort use mpi profiling: yes
    Fort use mpi_f08 prof: yes
    C++ exceptions: no
    Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
    Sparse Groups: no
    Internal debug support: yes
    MPI interface warnings: yes
    MPI parameter check: runtime
    Memory profiling support: no
    Memory debugging support: no
    dl support: yes
    Heterogeneous support: no
    mpirun default --prefix: no
    MPI I/O support: yes
    MPI_WTIME support: native
    Symbol vis. support: yes
    Host topology support: yes
    MPI extensions: affinity, cuda
    MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
    MPI_MAX_OBJECT_NAME: 64
    MPI_MAX_INFO_KEY: 36
    MPI_MAX_INFO_VAL: 256
    MPI_MAX_PORT_NAME: 1024
    MPI_MAX_DATAREP_STRING: 128
    MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v2.1.2)
    MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v2.1.2)
    MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v2.1.2)
    MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v2.1.2)
    MCA btl: sm (MCA v2.1.0, API v3.0.0, Component v2.1.2)
    MCA btl: openib (MCA v2.1.0, API v3.0.0, Component v2.1.2)
    MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v2.1.2)
    MCA btl: self (MCA v2.1.0, API v3.0.0, Component v2.1.2)
    MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v2.1.2)
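For contrast with the MPI_IN_PLACE remark above: with Allgatherv, the in-place form applies only when each rank's contribution already sits inside the receive buffer at its own displacement, so there is no separate send buffer at all. A minimal, self-contained sketch using the plain C API (hypothetical sizes and fill pattern, not the poster's code) could look like this:

    // Sketch only: MPI_Allgatherv with MPI_IN_PLACE. Each rank writes its own
    // block directly into recv_buf at its displacement before the call; the
    // send count and send type are ignored in this form.
    #include <mpi.h>
    #include <vector>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int count = 4;                        // bytes contributed per rank (hypothetical)
        std::vector<int> recv_counts(size, count);
        std::vector<int> displs(size);
        for (int i = 0; i < size; i++) displs[i] = i * count;

        std::vector<unsigned char> recv_buf(size * count);
        // Place this rank's own block into the receive buffer first...
        for (int l = 0; l < count; l++) recv_buf[displs[rank] + l] = (unsigned char)rank;

        // ...then gather in place: MPI_IN_PLACE is passed as the send buffer.
        MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                       recv_buf.data(), recv_counts.data(), displs.data(),
                       MPI_UNSIGNED_CHAR, MPI_COMM_WORLD);

        if (rank == 0) {
            for (int i = 0; i < size * count; i++) printf("%u ", recv_buf[i]);
            printf("\n");
        }
        MPI_Finalize();
        return 0;
    }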
Re: [OMPI users] Received data is different than sent data after Allgatherv() call
Should the send buffer for MPI_Allgatherv() be data instead of &data?

BTW, is this issue specific to Open MPI? If this is a general MPI issue, forums such as https://stackoverflow.com are a better place for this.

Cheers,

Gilles

On 11/30/2017 5:02 PM, Konstantinos Konstantinidis wrote:
> Hi, I will use a small part of C++ code to demonstrate my problem during shuffling. [...]
>
>     comm.Allgatherv(&data, bytes_send_count, MPI::UNSIGNED_CHAR,
>                     recv_buf, recv_counts, displs, MPI::UNSIGNED_CHAR);
>
> [...]
>
> My problem is that the printf() that shows what rank 1 is about to send and the printf()
> that shows what rank 1 receives from itself print values that don't match. [...]
> Can you help me identify the problem? I am using Open MPI 2.1.2 and testing on a single
> computer with 7 MPI processes.

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
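To make the suggestion concrete: data is already declared as unsigned char*, so &data passes the address of the pointer variable itself, and the collective gathers whatever bytes happen to sit at that address instead of the intended payload. A self-contained sketch of the corrected pattern (plain C API rather than the deprecated MPI:: bindings; the sizes mirror the post, while the fill pattern and the final check are invented for the demo) could look like this:

    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank, comm_size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &comm_size);

        const unsigned lineSize = 100;              // mirrors the post
        const unsigned long long no_keys = 10;      // mirrors the post
        const int bytes_send_count = (int)(no_keys * lineSize);

        // Each rank's payload, filled with the rank number so mismatches are obvious.
        std::vector<unsigned char> data(bytes_send_count, (unsigned char)rank);

        // Exchange the per-rank byte counts, then compute displacements.
        std::vector<int> recv_counts(comm_size), displs(comm_size);
        MPI_Allgather(&bytes_send_count, 1, MPI_INT,
                      recv_counts.data(), 1, MPI_INT, MPI_COMM_WORLD);

        long long total = 0;
        for (int i = 0; i < comm_size; i++) {
            displs[i] = (int)total;     // displacements are in recv-type elements (bytes here)
            total += recv_counts[i];
        }

        std::vector<unsigned char> recv_buf(total);

        // The send buffer is the bytes themselves (data.data()), not the address
        // of the pointer variable.
        MPI_Allgatherv(data.data(), bytes_send_count, MPI_UNSIGNED_CHAR,
                       recv_buf.data(), recv_counts.data(), displs.data(),
                       MPI_UNSIGNED_CHAR, MPI_COMM_WORLD);

        if (rank == 1) {
            bool ok = true;
            for (int l = 0; l < recv_counts[1]; l++)
                ok = ok && (recv_buf[displs[1] + l] == (unsigned char)1);
            printf("rank 1's own block %s after the gather\n",
                   ok ? "matches" : "does NOT match");
        }

        MPI_Finalize();
        return 0;
    }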
Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes
Hi everyone,

I have managed to solve the first part of this problem. It was caused by the quota on /tmp, which is where the Open MPI session directory was stored. There's an XFS default quota of 100MB to prevent users from filling up /tmp. Instead of an over-quota message, the result was the Open MPI crash from a bus error.

After setting TMPDIR in Slurm, I was finally able to run IMB-MPI1 with 1024 cores and Open MPI 1.10.6.

But now for the new problem: with Open MPI 3.0.0, the same test (IMB-MPI1, 1024 cores, 32 nodes) hangs after about 30 minutes of runtime. Any idea on this?

Regards, Götz Waschk

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes
Can you upgrade to 1.10.7? That's the last release in the v1.10 series, and has all the latest bug fixes.

> On Nov 30, 2017, at 9:53 AM, Götz Waschk wrote:
>
> Hi everyone,
>
> I have managed to solve the first part of this problem. It was caused
> by the quota on /tmp, which is where the Open MPI session directory was
> stored. There's an XFS default quota of 100MB to prevent users from
> filling up /tmp. Instead of an over-quota message, the result was the
> Open MPI crash from a bus error.
>
> After setting TMPDIR in Slurm, I was finally able to run IMB-MPI1 with
> 1024 cores and Open MPI 1.10.6.
>
> But now for the new problem: with Open MPI 3.0.0, the same test (IMB-MPI1,
> 1024 cores, 32 nodes) hangs after about 30 minutes of runtime. Any
> idea on this?
>
> Regards, Götz Waschk

--
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes
Dear Jeff,

I'm using openmpi as shipped by OpenHPC, so I'll upgrade 1.10 to 1.10.7 when they do. But it isn't 1.10 that is failing for me, it is Open MPI 3.0.0.

Regards, Götz

On Thu, Nov 30, 2017 at 4:24 PM, Jeff Squyres (jsquyres) wrote:
> Can you upgrade to 1.10.7? That's the last release in the v1.10 series,
> and has all the latest bug fixes.

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
[OMPI users] IMB-MPI1 hangs after 30 minutes with Open MPI 3.0.0 (was: Openmpi 1.10.4 crashes with 1024 processes)
Ah, I was misled by the subject. Can you provide more information about "hangs", and your environment? You previously cited:

- E5-2697A v4 CPUs and Mellanox ConnectX-3 FDR InfiniBand
- Slurm
- Open MPI v3.0.0
- IMB-MPI1

Can you send the information listed here: https://www.open-mpi.org/community/help/

BTW, given that you fixed the last error by growing the tmpdir size (admittedly: we should probably have a better error message here, and shouldn't just segv like you were seeing -- I'll open a bug on that), you can probably remove "--mca btl ^vader" or other similar CLI options. vader and sm were [probably?] failing due to the memory-mapped files on the filesystem running out of space and Open MPI not handling it well. Meaning: in general, you don't want to turn off shared memory support, because that will likely always be the fastest for on-node communication.

> On Nov 30, 2017, at 11:10 AM, Götz Waschk wrote:
>
> Dear Jeff,
>
> I'm using openmpi as shipped by OpenHPC, so I'll upgrade 1.10 to
> 1.10.7 when they do. But it isn't 1.10 that is failing for me, it is
> Open MPI 3.0.0.
>
> Regards, Götz

--
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users