If you have the time, it would be helpful. You might also configure with --enable-debug.
Meantime, I can take another gander to see how it could happen - looking at the code, it sure seems impossible, but maybe there is some strange path that would break it.

> On Jul 29, 2015, at 6:29 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
>
> If it is helpful, I can try to compile OpenMPI with debug information and get
> more details on the reported error. However, it would be good if someone
> could tell me the necessary compile flags (on top of -O0 -g) and it would
> take me probably 1-2 weeks to do it.
>
> Michael
>
> -------- Original message --------
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> Date: 29/07/2015 14:17 (GMT+01:00)
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Invalid read of size 4 (Valgrind error) with
> OpenMPI 1.8.7
>
> Thomas,
>
> can you please elaborate ?
> I checked the code of opal_os_dirpath_create and could not find where such a
> thing can happen
>
> Thanks,
>
> Gilles
>
> On Wednesday, July 29, 2015, Thomas Jahns <ja...@dkrz.de> wrote:
>
> Hello,
>
> On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote:
>
> That’s what I suspected. Thank you for your confirmation.
>
> you are mistaken, the allocation is 51 bytes long, i.e. valid bytes are at
> offsets 0 to 50. But since the read of 4 bytes starts at offset 48, the bytes
> at offsets 48, 49, 50 and 51 get read, the last of which is illegal. It
> probably does no harm at the moment in practice, because virtually all
> allocators always add some padding to the next multiple of some power of 2.
> But still this means the program is incorrect in terms of any programming
> language definition involved (might be C, C++ or Fortran).
>
> Regards, Thomas
>
> On 25 Jul 2015, at 16:10 , Ralph Castain <r...@open-mpi.org> wrote:
>
> Looks to me like a false positive - we do malloc some space, and do access
> different parts of it.
> However, it looks like we are inside the space at all times.
>
> I’d suppress it
>
> On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
>
> Hi folks,
>
> recently we’ve been getting a Valgrind error in PMPI_Init for our suite of
> regression tests:
>
> ==5922== Invalid read of size 4
> ==5922==    at 0x61CC5C0: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==  Address 0x710f670 is 48 bytes inside a block of size 51 alloc'd
> ==5922==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==5922==    by 0x61CC572: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==
>
> What is weird is that it seems to depend on the pbs/torque session we’re in:
> sometimes the error does not occur at all and all tests run fine (this is in
> fact the only Valgrind error we’re having at the moment). Other times every
> single test we’re running has this error.
>
> Has anyone seen this or might be able to offer an explanation? If it is a
> false-positive, I’d be happy to suppress it :)
>
> Thanks a lot in advance
>
> Michael
>
> P.S.: This error is not covered/suppressed by the default ompi suppression
> file in $PREFIX/share/openmpi.
>
> --
> Michael Schlottke-Lakemper
>
> SimLab Highly Scalable Fluids & Solids Engineering
> Jülich Aachen Research Alliance (JARA-HPC)
> RWTH Aachen University
> Wüllnerstraße 5a
> 52062 Aachen
> Germany
>
> Phone: +49 (241) 80 95188
> Fax: +49 (241) 80 92257
> Mail: m.schlottke-lakem...@aia.rwth-aachen.de
> Web: http://www.jara.org/jara-hpc
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/07/27303.php
>
> --
> Thomas Jahns
> HD(CP)^2
> Abteilung Anwendungssoftware
>
> Deutsches Klimarechenzentrum GmbH
> Bundesstraße 45a • D-20146 Hamburg • Germany
>
> Phone: +49 40 460094-151
> Fax: +49 40 460094-270
> Email: Thomas Jahns <ja...@dkrz.de>
> URL: www.dkrz.de
>
> Geschäftsführer: Prof. Dr. Thomas Ludwig
> Sitz der Gesellschaft: Hamburg
> Amtsgericht Hamburg HRB 39784
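
For what it's worth, Thomas's arithmetic on the Valgrind report ("48 bytes inside a block of size 51", read of size 4) can be sketched in a few lines of C. This is only an illustration of the offset calculation, not code from Open MPI or Valgrind; the helper name `bytes_past_end` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Open MPI or Valgrind): for a heap
 * block of block_size bytes, return how many bytes of a read of
 * read_width bytes starting at offset lie beyond the end of the block.
 * Valid offsets are 0 .. block_size - 1. */
static size_t bytes_past_end(size_t block_size, size_t offset, size_t read_width)
{
    /* The read has to start inside the block for a report of the form
     * "N bytes inside a block of size M" to apply. */
    assert(offset < block_size);

    /* Number of valid bytes from offset up to the end of the block. */
    size_t remaining = block_size - offset;

    return read_width > remaining ? read_width - remaining : 0;
}
```

With the numbers from the report, `bytes_past_end(51, 48, 4)` gives 1: offsets 48, 49 and 50 are valid and offset 51 is not, which is the one byte Memcheck objects to. A 2-byte read at the same offset would give 0.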