I agree with Ralph. Please configure Open MPI with --enable-debug and run again. That will give more information (a line number) on where the error is occurring.
Looking at the function in question, the only place I see that could be causing this warning is the call to strlen. Some implementations of strlen operate on larger chunks (4 or 8 bytes) at a time. This will make valgrind unhappy but does not make the implementation invalid, as no read will cross a page boundary (so no SEGV). One example of such a strlen implementation is the one used by icc, which uses vector operations on 8-byte chunks of the string.

-Nathan

On Wed, Jul 29, 2015 at 07:58:09AM -0700, Ralph Castain wrote:
> If you have the time, it would be helpful. You might also configure
> -enable-debug.
> Meantime, I can take another gander to see how it could happen - looking
> at the code, it sure seems impossible, but maybe there is some strange
> path that would break it.
>
> On Jul 29, 2015, at 6:29 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
> If it is helpful, I can try to compile OpenMPI with debug information
> and get more details on the reported error. However, it would be good if
> someone could tell me the necessary compile flags (on top of -O0 -g) and
> it would take me probably 1-2 weeks to do it.
> Michael
>
> -------- Original message --------
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> Date: 29/07/2015 14:17 (GMT+01:00)
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Invalid read of size 4 (Valgrind error) with
> OpenMPI 1.8.7
>
> Thomas,
> can you please elaborate ?
> I checked the code of opal_os_dirpath_create and could not find where
> such a thing can happen
> Thanks,
> Gilles
> On Wednesday, July 29, 2015, Thomas Jahns <ja...@dkrz.de> wrote:
>
> Hello,
>
> On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote:
>
> That's what I suspected. Thank you for your confirmation.
>
> you are mistaken, the allocation is 51 bytes long, i.e. valid bytes
> are at offsets 0 to 50.
> But since the read of 4 bytes starts at offset 48, the bytes at offsets
> 48, 49, 50 and 51 get read, the last of which is illegal. It probably
> does no harm at the moment in practice, because virtually all allocators
> always add some padding to the next multiple of some power of 2. But
> still this means the program is incorrect in terms of any programming
> language definition involved (might be C, C++ or Fortran).
>
> Regards, Thomas
>
> On 25 Jul 2015, at 16:10 , Ralph Castain <r...@open-mpi.org> wrote:
>
> Looks to me like a false positive - we do malloc some space, and do
> access different parts of it. However, it looks like we are inside the
> space at all times.
>
> I'd suppress it
>
> On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
>
> Hi folks,
>
> recently we've been getting a Valgrind error in PMPI_Init for our suite
> of regression tests:
>
> ==5922== Invalid read of size 4
> ==5922==    at 0x61CC5C0: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==  Address 0x710f670 is 48 bytes inside a block of size 51 alloc'd
> ==5922==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==5922==    by 0x61CC572: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==
>
> What is weird is that it seems to depend on the pbs/torque session we're
> in: sometimes the error does not occur and all and all tests run fine
> (this is in fact the only Valgrind error we're having at the moment).
> Other times every single test we're running has this error.
>
> Has anyone seen this or might be able to offer an explanation? If it is
> a false-positive, I'd be happy to suppress it :)
>
> Thanks a lot in advance
>
> Michael
>
> P.S.: This error is not covered/suppressed by the default ompi
> suppression file in $PREFIX/share/openmpi.
>
> --
> Michael Schlottke-Lakemper
>
> SimLab Highly Scalable Fluids & Solids Engineering
> Jülich Aachen Research Alliance (JARA-HPC)
> RWTH Aachen University
> Wüllnerstrasse 5a
> 52062 Aachen
> Germany
>
> Phone: +49 (241) 80 95188
> Fax: +49 (241) 80 92257
> Mail: m.schlottke-lakem...@aia.rwth-aachen.de
> Web: http://www.jara.org/jara-hpc
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/07/27303.php
>
> --
> Thomas Jahns
> HD(CP)^2
> Abteilung Anwendungssoftware
>
> Deutsches Klimarechenzentrum GmbH
> Bundesstrasse 45a • D-20146 Hamburg • Germany
>
> Phone: +49 40 460094-151
> Fax: +49 40 460094-270
> Email: Thomas Jahns <ja...@dkrz.de>
> URL: www.dkrz.de
>
> Geschäftsführer: Prof. Dr. Thomas Ludwig
> Sitz der Gesellschaft: Hamburg
> Amtsgericht Hamburg HRB 39784
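
To make the arithmetic in the thread concrete, here is a minimal sketch in C of a strlen-style scan that reads 4 bytes at a time. It is not the Open MPI code and not any real libc or icc implementation; chunked_strlen and bytes_read_past_block are hypothetical helpers written only for this illustration. The demo buffer is deliberately padded to a multiple of 4 bytes so the sketch itself never reads out of bounds; it only reports how far past the logical block such a scan would reach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): finds the terminating NUL by
 * reading 4 bytes per step, and records the highest byte offset it read.
 * 'buf' must be NUL-terminated and padded to a multiple of 4 bytes. */
size_t chunked_strlen(const char *buf, size_t *last_offset_read)
{
    size_t off = 0;
    for (;;) {
        unsigned char word[4];
        memcpy(word, buf + off, sizeof word);      /* one 4-byte read */
        *last_offset_read = off + sizeof word - 1;
        for (size_t i = 0; i < sizeof word; i++) {
            if (word[i] == '\0') {
                return off + i;
            }
        }
        off += sizeof word;
    }
}

/* Hypothetical helper: for a string that exactly fills a block of
 * 'block_size' bytes (block_size - 1 characters plus the NUL), returns how
 * many bytes beyond the block the 4-byte scan above touches. */
size_t bytes_read_past_block(size_t block_size)
{
    assert(block_size > 0);
    size_t padded = (block_size + 3) / 4 * 4;      /* keep the demo in bounds */
    char *buf = calloc(padded, 1);
    assert(buf != NULL);
    memset(buf, 'x', block_size - 1);              /* NUL sits at block_size - 1 */

    size_t last = 0;
    size_t len = chunked_strlen(buf, &last);
    assert(len == block_size - 1);
    free(buf);

    return last >= block_size ? last - block_size + 1 : 0;
}
```

Under these assumptions, bytes_read_past_block(51) evaluates to 1: the final 4-byte read starts at offset 48 and covers offsets 48 to 51, one byte beyond the valid range 0 to 50, which matches the "48 bytes inside a block of size 51" report. For a 52-byte block the same scan stays inside the block.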