Hi, I installed and tried version 1.8.1, but it also fails. I see the error when some processes have no matrix block. This is not a common situation, but it leaves me unsure whether I am doing something wrong. The error I get is:
*** Error in `./binary': free(): invalid size: 0x0000000000a34c00 ***
[oriol-VirtualBox:13975] *** Process received signal ***
[oriol-VirtualBox:13975] Signal: Aborted (6)
[oriol-VirtualBox:13975] Signal code: (-6)
[oriol-VirtualBox:13969] *** Process received signal ***
[oriol-VirtualBox:13969] Signal: Aborted (6)
[oriol-VirtualBox:13969] Signal code: (-6)
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80996)[0x7f5844a8d996]
[oriol-VirtualBox:13969] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36ff0)[0x7f06a50a7ff0]
[oriol-VirtualBox:13969] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f06a50a7f77]
[oriol-VirtualBox:13969] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f06a50ab5e8]
[oriol-VirtualBox:13969] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x744fb)[0x7f06a50e54fb]
[oriol-VirtualBox:13969] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x80996)[0x7f06a50f1996]
[oriol-VirtualBox:13969] [ 5] /usr/local/lib/openmpi/mca_io_romio.so(ADIOI_Delete_flattened+0x62)[0x7f0691e12c02]
[oriol-VirtualBox:13969] [ 6] /usr/local/lib/openmpi/mca_io_romio.so(ADIO_Close+0x1f9)[0x7f0691df7189]
[oriol-VirtualBox:13969] [ 7] /usr/local/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_close+0xe8)[0x7f0691de9dd8]
[oriol-VirtualBox:13969] [ 8] /usr/local/lib/libmpi.so.1(+0x3a2c6)[0x7f06a5ea02c6]
[oriol-VirtualBox:13969] [ 9] /usr/local/lib/libmpi.so.1(ompi_file_close+0x41)[0x7f06a5ea0811]
[oriol-VirtualBox:13969] [10] /usr/local/lib/libmpi.so.1(PMPI_File_close+0x78)[0x7f06a5edc118]
[oriol-VirtualBox:13969] [11] ./binary[0x42099e]
[oriol-VirtualBox:13969] [12] ./binary[0x48ed86]
[oriol-VirtualBox:13969] [13] ./binary[0x40e49f]
[oriol-VirtualBox:13969] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f06a5092de5]
[oriol-VirtualBox:13969] [15] ./binary[0x40d679]
[oriol-VirtualBox:13969] *** End of error message ***
[oriol-VirtualBox:13975] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36ff0)[0x7f1857201ff0]
[oriol-VirtualBox:13975] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f1857201f77]
[oriol-VirtualBox:13975] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f18572055e8]
[oriol-VirtualBox:13975] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x744fb)[0x7f185723f4fb]
[oriol-VirtualBox:13975] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x80996)[0x7f185724b996]
[oriol-VirtualBox:13975] [ 5] /usr/local/lib/openmpi/mca_io_romio.so(ADIOI_Delete_flattened+0x62)[0x7f18459d2c02]
[oriol-VirtualBox:13975] [ 6] /usr/local/lib/openmpi/mca_io_romio.so(ADIO_Close+0x1f9)[0x7f18459b7189]
[oriol-VirtualBox:13975] [ 7] /usr/local/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_close+0xe8)[0x7f18459a9dd8]
[oriol-VirtualBox:13975] [ 8] /usr/local/lib/libmpi.so.1(+0x3a2c6)[0x7f1857ffa2c6]
[oriol-VirtualBox:13975] [ 9] /usr/local/lib/libmpi.so.1(ompi_file_close+0x41)[0x7f1857ffa811]
[oriol-VirtualBox:13975] [10] /usr/local/lib/libmpi.so.1(PMPI_File_close+0x78)[0x7f1858036118]
[oriol-VirtualBox:13975] [11] ./binary[0x42099e]
[oriol-VirtualBox:13975] [12] ./binary[0x48ed86]
[oriol-VirtualBox:13975] [13] ./binary[0x40e49f]
[oriol-VirtualBox:13975] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f18571ecde5]
[oriol-VirtualBox:13975] [15] ./binary[0x40d679]
[oriol-VirtualBox:13975] *** End of error message ***
[oriol-VirtualBox:13972] *** Process received signal ***
[oriol-VirtualBox:13972] Signal: Aborted (6)
[oriol-VirtualBox:13972] Signal code: (-6)
[oriol-VirtualBox:13972] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36ff0)[0x7f5844a43ff0]
[oriol-VirtualBox:13972] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f5844a43f77]
[oriol-VirtualBox:13972] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f5844a475e8]
[oriol-VirtualBox:13972] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x744fb)[0x7f5844a814fb]
[oriol-VirtualBox:13972] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x80996)[0x7f5844a8d996]
[oriol-VirtualBox:13972] [ 5] /usr/local/lib/openmpi/mca_io_romio.so(ADIOI_Delete_flattened+0x62)[0x7f58315f2c02]
[oriol-VirtualBox:13972] [ 6] /usr/local/lib/openmpi/mca_io_romio.so(ADIO_Close+0x1f9)[0x7f58315d7189]
[oriol-VirtualBox:13972] [ 7] /usr/local/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_close+0xe8)[0x7f58315c9dd8]
[oriol-VirtualBox:13972] [ 8] /usr/local/lib/libmpi.so.1(+0x3a2c6)[0x7f584583c2c6]
[oriol-VirtualBox:13972] [ 9] /usr/local/lib/libmpi.so.1(ompi_file_close+0x41)[0x7f584583c811]
[oriol-VirtualBox:13972] [10] /usr/local/lib/libmpi.so.1(PMPI_File_close+0x78)[0x7f5845878118]
[oriol-VirtualBox:13972] [11] ./binary[0x42099e]
[oriol-VirtualBox:13972] [12] ./binary[0x48ed86]
[oriol-VirtualBox:13972] [13] ./binary[0x40e49f]
[oriol-VirtualBox:13972] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f5844a2ede5]
[oriol-VirtualBox:13972] [15] ./binary[0x40d679]
[oriol-VirtualBox:13972] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 13969 on node oriol-VirtualBox exited on signal 6 (Aborted).
--------------------------------------------------------------------------

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: 14 May 2014 16:24
To: Open MPI Users
Subject: Re: [OMPI users] bug in MPI_File_set_view?

Our initial thinking was first half of June, but that is subject to change depending on severity of reported errors.

FWIW: I don't believe we made any romio changes between 1.8.1 and the current 1.8.2 state, so using 1.8.1 should be a valid test.

On May 14, 2014, at 8:16 AM, Bennet Fauber <ben...@umich.edu> wrote:

> Is there an ETA for 1.8.2 general release instead of snapshot?
>
> Thanks, -- bennet
>
> On Wed, May 14, 2014 at 10:17 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> You might give it a try with 1.8.1 or the nightly snapshot from 1.8.2
>> - we updated ROMIO since the 1.6 series, and whatever fix is required
>> may be in the newer version
>>
>>
>> On May 14, 2014, at 6:52 AM, CANELA-XANDRI Oriol
>> <oriol.canela-xan...@roslin.ed.ac.uk> wrote:
>>
>>> Hello,
>>>
>>> I am using MPI IO for writing/reading a block cyclic distribution matrix
>>> into a file.
>>>
>>> It works fine except when there are some MPI processes with no data (i.e.
>>> when the matrix is small enough, or the block size is big enough, that
>>> some processes in the grid do not have any matrix block). In this case, I
>>> receive an error when calling MPI_File_set_view saying that the data
>>> cannot be freed. I tried with versions 1.3 and 1.6. When I try with MPICH
>>> it works without errors. Could this be a bug?
>>>
>>> My function is (where nBlockRows/nBlockCols define the size of the blocks,
>>> nGlobRows/nGlobCols define the global size of the matrix,
>>> nProcRows/nProcCols define the dimensions of the process grid, and fname
>>> is the name of the file):
>>>
>>> void Matrix::writeMatrixMPI(std::string fname) {
>>>   int dims[] = {this->nGlobRows, this->nGlobCols};
>>>   int dargs[] = {this->nBlockRows, this->nBlockCols};
>>>   int distribs[] = {MPI_DISTRIBUTE_CYCLIC, MPI_DISTRIBUTE_CYCLIC};
>>>   int dim[] = {communicator->nProcRows, communicator->nProcCols};
>>>   char nat[] = "native";
>>>   int rc;
>>>   MPI_Datatype dcarray;
>>>   MPI_File cFile;
>>>   MPI_Status status;
>>>
>>>   MPI_Type_create_darray(communicator->mpiNumTasks,
>>>                          communicator->mpiRank, 2, dims, distribs, dargs,
>>>                          dim, MPI_ORDER_FORTRAN, MPI_DOUBLE, &dcarray);
>>>   MPI_Type_commit(&dcarray);
>>>
>>>   std::vector<char> fn(fname.begin(), fname.end());
>>>   fn.push_back('\0');
>>>   rc = MPI_File_open(MPI_COMM_WORLD, &fn[0],
>>>                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL,
>>>                      &cFile);
>>>   if(rc){
>>>     std::stringstream ss;
>>>     ss << "Error: Failed to open file: " << rc;
>>>     misc.error(ss.str(), 0);
>>>   }
>>>   else
>>>   {
>>>     MPI_File_set_view(cFile, 0, MPI_DOUBLE, dcarray, nat, MPI_INFO_NULL);
>>>     MPI_File_write_all(cFile, this->m, this->nRows*this->nCols,
>>>                        MPI_DOUBLE, &status);
>>>   }
>>>   MPI_File_close(&cFile);
>>>   MPI_Type_free(&dcarray);
>>> }
>>>
>>> Best regards,
>>>
>>> Oriol
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
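
For anyone trying to reproduce this, a minimal sketch (not part of the original code) of how to tell which positions in the process grid end up with no matrix block. It follows the usual block-cyclic counting rule, the same idea as ScaLAPACK's NUMROC, and assumes the first block sits on process coordinate 0. The names localCount and ownsNoBlock are hypothetical; the parameters mirror nGlobRows/nGlobCols, nBlockRows/nBlockCols and nProcRows/nProcCols from the function above.

```cpp
#include <cassert>

// Number of rows (or columns) of one block-cyclically distributed dimension
// that land on process coordinate iProc.
// Assumption of this sketch: the first block is owned by coordinate 0.
//   nGlob  : global extent of the dimension
//   nBlock : block size along the dimension
//   iProc  : coordinate of the process along the dimension
//   nProcs : number of processes along the dimension
int localCount(int nGlob, int nBlock, int iProc, int nProcs) {
    assert(nGlob >= 0 && nBlock > 0 && nProcs > 0);
    assert(iProc >= 0 && iProc < nProcs);

    int nFullBlocks = nGlob / nBlock;             // complete blocks overall
    int count = (nFullBlocks / nProcs) * nBlock;  // complete rounds of the cycle
    int extraBlocks = nFullBlocks % nProcs;       // full blocks in the last, partial round

    if (iProc < extraBlocks) {
        count += nBlock;                          // one more full block
    } else if (iProc == extraBlocks) {
        count += nGlob % nBlock;                  // trailing partial block, possibly empty
    }
    return count;
}

// True when the process at grid position (procRow, procCol) holds no element.
bool ownsNoBlock(int nGlobRows, int nGlobCols,
                 int nBlockRows, int nBlockCols,
                 int procRow, int procCol,
                 int nProcRows, int nProcCols) {
    int rows = localCount(nGlobRows, nBlockRows, procRow, nProcRows);
    int cols = localCount(nGlobCols, nBlockCols, procCol, nProcCols);
    return rows == 0 || cols == 0;
}
```

As an illustration with made-up sizes: a 4x4 matrix with 2x2 blocks on a 3x3 grid leaves every process in grid row 2 or grid column 2 without data, since localCount(4, 2, 2, 3) is 0. Those are the ranks for which the count passed to MPI_File_write_all would presumably be zero.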