Could be a bug in 1.8. I rewrote a huge chunk of the osc code in master. If I get a C reproducer I can spend some time next week tracking down the problem.

-Nathan
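(For illustration, a minimal C sketch of the kind of reproducer being asked for might look like the following. This is not Stephan's attached Fortran test; the structure is only assumed from his description: a struct type built from n single-element MPI_DOUBLE blocks at contiguous displacements, used as both origin and target datatype in an MPI_Get inside a fence epoch. All names here are made up, it needs at least 2 ranks, and n is taken from the command line.)

    /* Assumed C reproducer sketch (not the original Fortran test):
     * a struct type that should be equivalent to n contiguous doubles,
     * used on both sides of an MPI_Get.  100 reportedly works, 1000 does not. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, nproc, i;
        int n = (argc > 1) ? atoi(argv[1]) : 1000;
        MPI_Datatype newtype;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nproc);
        if (nproc < 2) MPI_Abort(MPI_COMM_WORLD, 1);

        /* n blocks of one MPI_DOUBLE each, at contiguous displacements */
        int          *blocklens = malloc(n * sizeof(int));
        MPI_Aint     *displs    = malloc(n * sizeof(MPI_Aint));
        MPI_Datatype *types     = malloc(n * sizeof(MPI_Datatype));
        for (i = 0; i < n; i++) {
            blocklens[i] = 1;
            displs[i]    = (MPI_Aint)(i * sizeof(double));
            types[i]     = MPI_DOUBLE;
        }
        MPI_Type_create_struct(n, blocklens, displs, types, &newtype);
        MPI_Type_commit(&newtype);
        free(blocklens); free(displs); free(types);

        double *win_buf = calloc(n, sizeof(double));
        double *loc_buf = calloc(n, sizeof(double));
        MPI_Win_create(win_buf, n * sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0)   /* fetch rank 1's exposed buffer with the struct type */
            MPI_Get(loc_buf, 1, newtype, 1, 0, 1, newtype, win);
        MPI_Win_fence(0, win);

        if (rank == 0)
            printf("MPI_Get with a %d-element struct type completed\n", n);

        MPI_Win_free(&win);
        MPI_Type_free(&newtype);
        MPI_Finalize();
        return 0;
    }
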
On Wed, Apr 15, 2015 at 10:11:51PM -0600, Howard Pritchard wrote:
> Hi Stephan,
>
> I'm not able to reproduce your problem using master. How are you running
> the test and with how many ranks? Are you running on a single node?
> It may be that the problem is only present for a particular transport,
> i.e. btl:sm, btl:vader, etc.
>
> Howard
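(One way to check the transport dependence Howard mentions is to pin the BTL explicitly on the mpirun command line; "./reproducer" below is just a placeholder for the test binary, and the self BTL always has to be included:)

    mpirun -np 2 --mca btl self,vader ./reproducer
    mpirun -np 2 --mca btl self,sm    ./reproducer
    mpirun -np 2 --mca btl self,tcp   ./reproducer
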
> 2015-04-15 0:32 GMT-06:00 MOHR STEPHAN 239883 <stephan.m...@cea.fr>:
>
> > Hi George,
> >
> > I think you forgot the ierror argument in the call to mpi_irecv, but
> > after correcting this it works fine. Thanks a lot for pointing out the
> > issue of the eager limit!
> >
> > But as you said, this does not directly solve my one-sided problem...
> >
> > Thanks,
> > Stephan
> >
> > ----------------------------------------------------------------------
> > From: users [users-boun...@open-mpi.org] on behalf of George Bosilca
> > [bosi...@icl.utk.edu]
> > Sent: Tuesday, April 14, 2015 17:49
> > To: Open MPI Users
> > Subject: Re: [OMPI users] mpi_type_create_struct not working for large counts
> >
> > This is one of the most classical bugs in point-to-point applications.
> > Sends behave as non-blocking as long as the amount of data is below the
> > eager limit. Once the eager limit is exceeded, sends block until the
> > peer has posted a matching receive. This explains why it works for 100
> > but not for 1000.
> >
> > The corrected application is attached below. This doesn't validate the
> > one-sided run, though.
> >
> > George.
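(A bare-bones C illustration of the pattern George describes, not his attached corrected program: an exchange in which both ranks send first only completes while the message still fits in the eager limit, whereas posting the receive before the send removes that dependence. Function and buffer names are invented for the example.)

    #include <mpi.h>

    /* Deadlock-prone exchange: both ranks send first, then receive.
     * This only completes while the message is small enough to be sent
     * eagerly; above the eager limit MPI_Send blocks until the peer has
     * posted a matching receive, and both ranks end up waiting on each other. */
    void exchange_fragile(double *sendbuf, double *recvbuf, int n, int peer)
    {
        MPI_Send(sendbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(recvbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* Safe version: post the receive first, then send, then wait. */
    void exchange_safe(double *sendbuf, double *recvbuf, int n, int peer)
    {
        MPI_Request req;
        MPI_Irecv(recvbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req);
        MPI_Send(sendbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }
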
> > On Apr 14, 2015, at 11:07, MOHR STEPHAN 239883 <stephan.m...@cea.fr> wrote:
> >
> > > Hi Nathan,
> > >
> > > I tried with send/recv, but the outcome is the same. It works for
> > > small counts (e.g. n=100), but hangs for larger ones (e.g. n=1000).
> > > I attached my modified program.
> > >
> > > Thanks,
> > > Stephan
> > >
> > > ________________________________________
> > > From: users [users-boun...@open-mpi.org] on behalf of Nathan Hjelm
> > > [hje...@lanl.gov]
> > > Sent: Tuesday, April 14, 2015 16:44
> > > To: Open MPI Users
> > > Subject: Re: [OMPI users] mpi_type_create_struct not working for large counts
> > >
> > > Can you try using send/recv with the datatype in question? It could be
> > > a problem with either the one-sided code or the datatype code. Could
> > > you also give master a try?
> > >
> > > -Nathan
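(The point-to-point check Nathan asks for amounts to something like the sketch below: exercise the same committed derived type with plain send/recv so a datatype-engine problem can be told apart from a one-sided problem. "dtype" and "buf" are placeholders, not names from Stephan's program, and exactly two ranks are assumed.)

    #include <mpi.h>

    /* Exchange the same derived datatype point-to-point instead of via
     * MPI_Get, to separate datatype problems from osc problems.
     * Assumes two ranks and a committed datatype "dtype" describing "buf". */
    static void p2p_check(void *buf, MPI_Datatype dtype, int rank)
    {
        if (rank == 0)
            MPI_Recv(buf, 1, dtype, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        else if (rank == 1)
            MPI_Send(buf, 1, dtype, 0, 0, MPI_COMM_WORLD);
    }
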
> > > On Tue, Apr 14, 2015 at 06:43:31AM +0000, MOHR STEPHAN 239883 wrote:
> > >
> > > > Hi Howard,
> > > >
> > > > I tried with 1.8.5rc1, but it doesn't work either.
> > > >
> > > > The output of ompi_info is attached.
> > > >
> > > > Thanks,
> > > > Stephan
> > > >
> > > > ----------------------------------------------------------------------
> > > > From: users [users-boun...@open-mpi.org] on behalf of Howard Pritchard
> > > > [hpprit...@gmail.com]
> > > > Sent: Monday, April 13, 2015 19:41
> > > > To: Open MPI Users
> > > > Subject: Re: [OMPI users] mpi_type_create_struct not working for large counts
> > > >
> > > > Hi Stephan,
> > > >
> > > > For starters, would you mind sending the output you get when you run
> > > > the ompi_info command?
> > > > If you could, it would be great if you could also try running the test
> > > > against the latest 1.8.5rc1.
> > > > The test appears to work without problems using MPICH, at least with
> > > > 4 ranks.
> > > >
> > > > Thanks,
> > > > Howard
> > > >
> > > > 2015-04-13 10:40 GMT-06:00 MOHR STEPHAN 239883 <stephan.m...@cea.fr>:
> > > >
> > > > Hi there,
> > > >
> > > > I've got an issue when using a derived data type created by
> > > > mpi_type_create_struct in a one-sided communication.
> > > >
> > > > The problem can be reproduced using the small standalone program which
> > > > I attached. It just creates a type which should be equivalent to n
> > > > contiguous elements. This type is then used in an mpi_get. With a value
> > > > of n=100 it works fine, but with n=1000 it either hangs (version 1.8.1)
> > > > or crashes (version 1.6.5).
> > > >
> > > > Any help is appreciated.
> > > >
> > > > Best regards,
> > > > Stephan
> > > >
> > > > [attached ompi_info output:]
> > > >
> > > > Package: Open MPI stephanm@girofle Distribution
> > > > Open MPI: 1.8.5rc1
> > > > Open MPI repo revision: v1.8.4-184-g481d751
> > > > Open MPI release date: Apr 05, 2015
> > > > Open RTE: 1.8.5rc1
> > > > Open RTE repo revision: v1.8.4-184-g481d751
> > > > Open RTE release date: Apr 05, 2015
> > > > OPAL: 1.8.5rc1
> > > > OPAL repo revision: v1.8.4-184-g481d751
> > > > OPAL release date: Apr 05, 2015
> > > > MPI API: 3.0
> > > > Ident string: 1.8.5rc1
> > > > Prefix: /local/stephanm/openmpi-1.8.5rc1_intel
> > > > Configured architecture: x86_64-unknown-linux-gnu
> > > > Configure host: girofle
> > > > Configured by: stephanm
> > > > Configured on: Tue Apr 14 07:32:10 CEST 2015
> > > > Configure host: girofle
> > > > Built by: stephanm
> > > > Built on: Tue Apr 14 08:05:43 CEST 2015
> > > > Built host: girofle
> > > > C bindings: yes
> > > > C++ bindings: yes
> > > > Fort mpif.h: yes (all)
> > > > Fort use mpi: yes (full: ignore TKR)
> > > > Fort use mpi size: deprecated-ompi-info-value
> > > > Fort use mpi_f08: yes
> > > > Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
> > > > limitations in the ifort compiler, does not support the following:
> > > > array subsections, direct passthru (where possible) to underlying
> > > > Open MPI's C functionality
> > > > Fort mpi_f08 subarrays: no
> > > > Java bindings: no
> > > > Wrapper compiler rpath: runpath
> > > > C compiler: icc
> > > > C compiler absolute: /local/stephanm/composer_xe_2013_sp1.0.080/bin/intel64/icc
> > > > C compiler family name: INTEL
> > > > C compiler version: 1400.20130728
> > > > C++ compiler: icpc
> > > > C++ compiler absolute: /local/stephanm/composer_xe_2013_sp1.0.080/bin/intel64/icpc
> > > > Fort compiler: ifort
> > > > Fort compiler abs: /local/stephanm/composer_xe_2013_sp1.0.080/bin/intel64/ifort
> > > > Fort ignore TKR: yes (!DEC$ ATTRIBUTES NO_ARG_CHECK ::)
> > > > Fort 08 assumed shape: no
> > > > Fort optional args: yes
> > > > Fort INTERFACE: yes
> > > > Fort ISO_FORTRAN_ENV: yes
> > > > Fort STORAGE_SIZE: yes
> > > > Fort BIND(C) (all): yes
> > > > Fort ISO_C_BINDING: yes
> > > > Fort SUBROUTINE BIND(C): yes
> > > > Fort TYPE,BIND(C): yes
> > > > Fort T,BIND(C,name="a"): yes
> > > > Fort PRIVATE: yes
> > > > Fort PROTECTED: yes
> > > > Fort ABSTRACT: yes
> > > > Fort ASYNCHRONOUS: yes
> > > > Fort PROCEDURE: yes
> > > > Fort C_FUNLOC: yes
> > > > Fort f08 using wrappers: yes
> > > > Fort MPI_SIZEOF: yes
> > > > C profiling: yes
> > > > C++ profiling: yes
> > > > Fort mpif.h profiling: yes
> > > > Fort use mpi profiling: yes
> > > > Fort use mpi_f08 prof: yes
> > > > C++ exceptions: no
> > > > Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes,
> > > > OMPI progress: no, ORTE progress: yes, Event lib: yes)
> > > > Sparse Groups: no
> > > > Internal debug support: no
> > > > MPI interface warnings: yes
> > > > MPI parameter check: runtime
> > > > Memory profiling support: no
> > > > Memory debugging support: no
> > > > libltdl support: yes
> > > > Heterogeneous support: no
> > > > mpirun default --prefix: no
> > > > MPI I/O support: yes
> > > > MPI_WTIME support: gettimeofday
> > > > Symbol vis. support: yes
> > > > Host topology support: yes
> > > > MPI extensions:
> > > > FT Checkpoint support: no (checkpoint thread: no)
> > > > C/R Enabled Debugging: no
> > > > VampirTrace support: yes
> > > > MPI_MAX_PROCESSOR_NAME: 256
> > > > MPI_MAX_ERROR_STRING: 256
> > > > MPI_MAX_OBJECT_NAME: 64
> > > > MPI_MAX_INFO_KEY: 36
> > > > MPI_MAX_INFO_VAL: 256
> > > > MPI_MAX_PORT_NAME: 1024
> > > > MPI_MAX_DATAREP_STRING: 128
> > > > MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA crs: none (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA db: hash (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA db: print (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA event: libevent2021 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA hwloc: hwloc191 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA errmgr: default_app (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA errmgr: default_orted (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA errmgr: default_tool (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA ess: env (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA ess: singleton (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA ess: slurm (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA iof: mr_orted (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA odls: default (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA plm: isolated (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA plm: slurm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA ras: simulator (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA ras: slurm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA routed: binomial (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA routed: debruijn (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA state: app (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: novm (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: orted (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: staged_hnp (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: staged_orted (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA state: tool (MCA v2.0, API v1.0, Component v1.8.5)
> > > > MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA bcol: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA btl: openib (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA btl: self (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: self (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fcoll: dynamic (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fcoll: individual (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fcoll: two_phase (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA io: romio (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.5)
> > > > MCA pml: v (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sharedfp: individual (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.5)
> > > > MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.5)
> > > > MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component v1.8.5)
> > >
> > > <test_mpi_sendrecv.f90>

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2015/04/26744.php