If you have the time, it would be helpful. You might also configure with
--enable-debug.

In the meantime, I can take another gander to see how it could happen. Looking
at the code, it sure seems impossible, but maybe there is some strange path that
would break it.


> On Jul 29, 2015, at 6:29 AM, Schlottke-Lakemper, Michael 
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
> 
> If it is helpful, I can try to compile OpenMPI with debug information and get 
> more details on the reported error. However, it would be good if someone 
> could tell me the necessary compile flags (on top of -O0 -g), and it would 
> probably take me 1-2 weeks to do it. 
> 
> Michael 
> 
> 
> -------- Original message --------
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com> 
> Date: 29/07/2015 14:17 (GMT+01:00) 
> To: Open MPI Users <us...@open-mpi.org> 
> Subject: Re: [OMPI users] Invalid read of size 4 (Valgrind error) with 
> OpenMPI 1.8.7 
> 
> Thomas,
> 
> Can you please elaborate?
> I checked the code of opal_os_dirpath_create and could not find where such a 
> thing could happen.
> 
> Thanks,
> 
> Gilles
> 
> On Wednesday, July 29, 2015, Thomas Jahns <ja...@dkrz.de> wrote:
> Hello,
> 
> On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote:
> > That’s what I suspected. Thank you for your confirmation.
> 
> You are mistaken: the allocation is 51 bytes long, i.e. valid bytes are at 
> offsets 0 to 50. Since the 4-byte read starts at offset 48, the bytes at 
> offsets 48, 49, 50 and 51 get read, the last of which is outside the block. It 
> probably does no harm in practice at the moment, because virtually all 
> allocators pad each allocation up to the next multiple of some power of 2. 
> Still, this means the program is incorrect under the definition of any 
> programming language involved (might be C, C++ or Fortran).
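
The offset arithmetic above can be checked with a minimal C sketch. The helper
names last_byte_touched and read_in_bounds are hypothetical and exist only for
illustration; nothing here is taken from the Open MPI source, and the sketch
deliberately does not perform the out-of-bounds read itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of the last byte touched by a read of
 * `width` bytes that starts at `offset`. Requires width >= 1. */
static size_t last_byte_touched(size_t offset, size_t width)
{
    assert(width >= 1);
    return offset + width - 1;
}

/* Hypothetical helper: returns 1 if a read of `width` bytes at `offset`
 * stays inside a block of `size` bytes (valid offsets 0..size-1), else 0.
 * The comparison is arranged so that the subtraction cannot wrap around. */
static int read_in_bounds(size_t size, size_t offset, size_t width)
{
    return offset <= size && width <= size - offset;
}
```

With the numbers from the Valgrind report, last_byte_touched(48, 4) is 51,
while the last valid offset of a 51-byte block is 50, so
read_in_bounds(51, 48, 4) returns 0. A 3-byte read at the same offset would
return 1.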
> 
> Regards, Thomas
> 
> On 25 Jul 2015, at 16:10, Ralph Castain <r...@open-mpi.org> wrote:
> 
> Looks to me like a false positive - we do malloc some space, and do access
> different parts of it. However, it looks like we are inside the space at all
> times.
> 
> I’d suppress it
> 
> 
> On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakem...@aia.rwth-aachen.de> wrote:
> 
> Hi folks,
> 
> recently we’ve been getting a Valgrind error in PMPI_Init for our suite of
> regression tests:
> 
> ==5922== Invalid read of size 4
> ==5922==    at 0x61CC5C0: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==  Address 0x710f670 is 48 bytes inside a block of size 51 alloc'd
> ==5922==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==5922==    by 0x61CC572: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
> ==5922==
> 
> What is weird is that it seems to depend on the pbs/torque session we’re in:
> sometimes the error does not occur at all and all tests run fine (this is in
> fact the only Valgrind error we’re having at the moment). Other times every
> single test we’re running has this error.
> 
> Has anyone seen this or might be able to offer an explanation? If it is a
> false-positive, I’d be happy to suppress it :)
> 
> Thanks a lot in advance
> 
> Michael
> 
> P.S.: This error is not covered/suppressed by the default ompi suppression
> file in $PREFIX/share/openmpi.
> 
> 
> --
> Michael Schlottke-Lakemper
> 
> SimLab Highly Scalable Fluids & Solids Engineering
> Jülich Aachen Research Alliance (JARA-HPC)
> RWTH Aachen University
> Wüllnerstraße 5a
> 52062 Aachen
> Germany
> 
> Phone: +49 (241) 80 95188
> Fax: +49 (241) 80 92257
> Mail: m.schlottke-lakem...@aia.rwth-aachen.de
> Web: http://www.jara.org/jara-hpc
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/07/27303.php
> 
> 
> -- 
> Thomas Jahns
> HD(CP)^2
> Abteilung Anwendungssoftware
> 
> Deutsches Klimarechenzentrum GmbH
> Bundesstraße 45a • D-20146 Hamburg • Germany
> 
> Phone:  +49 40 460094-151
> Fax:    +49 40 460094-270
> Email:  Thomas Jahns <ja...@dkrz.de>
> URL:    www.dkrz.de
> 
> Geschäftsführer: Prof. Dr. Thomas Ludwig
> Sitz der Gesellschaft: Hamburg
> Amtsgericht Hamburg HRB 39784
> 
