Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
Thank you, Gilles, for the pointer. I see which operations are supported in SM now. On Wed, Dec 9, 2015 at 8:05 PM, Gilles Gouaillardet wrote: > Saliya, > > from ompi/mca/coll/sm/coll_sm_module.c in mca_coll_sm_comm_query() > sm_module->super.coll_allgatherv = NULL; > > that means the coll sm module d

Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Gilles Gouaillardet
Saliya, from ompi/mca/coll/sm/coll_sm_module.c in mca_coll_sm_comm_query() sm_module->super.coll_allgatherv = NULL; that means the coll sm module does *not* implement allgatherv, so openmpi will use the next module (which is very likely the default module, that is why there is no performance i
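The fallback described above can be pictured with a small model. This is only an illustrative sketch in Python, not Open MPI's actual selection code: the `CollModule` class, the `select_per_collective` helper, the module name `default`, its priority of 30, and the assumption that `sm` provides `barrier` are all hypothetical. The only facts taken from the thread are that the sm module leaves `coll_allgatherv` as NULL and that its priority was set to 35.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class CollModule:
    """Hypothetical stand-in for a collective module and its function table."""
    name: str
    priority: int
    # A value of None models a NULL function pointer (collective not implemented).
    functions: Dict[str, Optional[Callable]] = field(default_factory=dict)


def select_per_collective(modules: List[CollModule],
                          collectives: List[str]) -> Dict[str, Optional[str]]:
    """For each collective, pick the highest-priority module that implements it."""
    table: Dict[str, Optional[str]] = {}
    for coll in collectives:
        candidates = [m for m in modules if m.functions.get(coll) is not None]
        best = max(candidates, key=lambda m: m.priority, default=None)
        table[coll] = best.name if best is not None else None
    return table


def _placeholder(*args):
    """Dummy implementation; stands in for a real collective routine."""
    return None


modules = [
    # Priority 35 comes from the thread; allgatherv is NULL as quoted from the source file.
    CollModule("sm", 35, {"barrier": _placeholder, "allgatherv": None}),
    # Assumed lower-priority module that implements everything.
    CollModule("default", 30, {"barrier": _placeholder, "allgatherv": _placeholder}),
]

table = select_per_collective(modules, ["barrier", "allgatherv"])
print(table)
```

In this model `barrier` resolves to `sm`, while `allgatherv` falls through to `default`, which matches the observation that raising the sm priority made no difference for an allgatherv benchmark.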

Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
I did this, but output is a bit unclear to me. For example it has lines like [j-053:221827] mca: base: components_register: found loaded component sm and in the same node, same process reports, [j-053:221827] coll:find_available: coll component sm is not available Does this mean SM is not avail
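One way to make such verbose output easier to read is to condense it per process and component. The sketch below is a hedged example, not an Open MPI tool: the helper name `summarize_availability` and the regular expression are assumptions built only from the two log lines quoted above, and the positive form ("is available") is assumed by analogy rather than taken from the thread. The "found loaded component" line most likely means only that the component was loaded and registered, so the sketch ignores it and records the later availability verdict.

```python
import re
from typing import Dict, Iterable, Tuple

# Pattern derived from the quoted line
# "[j-053:221827] coll:find_available: coll component sm is not available".
# The "is available" variant is an assumption.
AVAILABILITY_RE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+"
    r"coll:find_available: coll component (?P<name>\S+) is (?P<neg>not )?available"
)


def summarize_availability(lines: Iterable[str]) -> Dict[Tuple[str, int, str], bool]:
    """Map (host, pid, component) to True if reported available, else False."""
    result: Dict[Tuple[str, int, str], bool] = {}
    for line in lines:
        match = AVAILABILITY_RE.match(line.strip())
        if match:
            key = (match["host"], int(match["pid"]), match["name"])
            result[key] = match["neg"] is None
    return result


sample = [
    "[j-053:221827] mca: base: components_register: found loaded component sm",
    "[j-053:221827] coll:find_available: coll component sm is not available",
]
summary = summarize_availability(sample)
print(summary)
```

For the two quoted lines, the summary contains a single entry marking `sm` as not available for process 221827 on `j-053`; the registration line is skipped because it says nothing about availability.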

Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Aurélien Bouteiller
Try to run with coll_base_verbose 1000, just to see what collective module got effectively loaded. Aurélien -- Aurélien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/ > Le 9 déc. 2015 à 09:53, Saliya Ekanayake a écrit : > > Hi, > > In a previous ema

[OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
Hi, In a previous email, I wanted to know how to enable shared memory collectives and I was told setting the coll_sm_priority to anything over 30 should do it. I tested this for a microbenchmark on allgatherv, but it didn't improve performance over the default setting. See below, where I tested f

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
ok, I can confirm that once I update the file_get_position function to what we have in master and the 2.x series, your test passes with ompio in the 1.10 series as well. I am happy to provide a patch for testing, and to submit a pr. I am however worried since we know that ompio in the 1.10 seri

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
ok, forget it, I found the issue. I totally forgot that in the 1.10 series I have to manually force ompio (it is the default on master and 2.x). It fails now for me as well with v1.10, will let you know what I find. Thanks Edgar On 12/9/2015 9:30 AM, Edgar Gabriel wrote: what does the mount

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
what does the mount command return? On 12/9/2015 9:27 AM, Paul Kapinos wrote: Dear Edgar, On 12/09/15 16:16, Edgar Gabriel wrote: I tested your code in master and v1.10 (on my local machine), and for both versions of ompio I get exactly the same (correct) output that you had with romio. I've

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos
Dear Edgar, On 12/09/15 16:16, Edgar Gabriel wrote: I tested your code in master and v1.10 (on my local machine), and for both versions of ompio I get exactly the same (correct) output that you had with romio. I've tested it on a local hard disk. pk224850@cluster:/tmp/pk224850/cluster_15384/T

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
Paul, I tested your code in master and v1.10 (on my local machine), and for both versions of ompio I get exactly the same (correct) output that you had with romio. However, I also noticed that in the ompio version that is in the v1.10 branch, the MPI_File_get_size function is not implemented o

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Edgar Gabriel
I will look at your test case and see what is going on in ompio. That being said, the vast number of fixes and improvements that went into ompio over the last two years were not back ported to the 1.8 (and thus 1.10) series, since it would have required changes to the interfaces of the framewor

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos
Sorry, forgot to mention: 1.10.1 Open MPI: 1.10.1 Open MPI repo revision: v1.10.0-178-gb80f802 Open MPI release date: Nov 03, 2015 Open RTE: 1.10.1 Open RTE repo revision: v1.10.0-178-gb80f802 Open RTE release date: Nov 03, 2015 OPAL:

Re: [OMPI users] OMPIO correctnes issues

2015-12-09 Thread Gilles Gouaillardet
Paul, which OpenMPI version are you using ? thanks for providing a simple reproducer, that will make things much easier from now. (and at first glance, that might not be a very tricky bug) Cheers, Gilles On Wednesday, December 9, 2015, Paul Kapinos wrote: > Dear Open MPI developers, > did OM

[OMPI users] OMPIO correctnes issues

2015-12-09 Thread Paul Kapinos
Dear Open MPI developers, has OMPIO (1) reached a 'usable-stable' state? As we reported in (2), we had some trouble building Open MPI with ROMIO, a fact that was hidden by the OMPIO implementation stepping into the MPI_IO breach. The fact 'ROMIO isn't AVBL' was detected after users complained 'MPI_I

Re: [OMPI users] Odd behavior with subarray datatypes in OpenMPI 1.10.1

2015-12-09 Thread Gilles Gouaillardet
Daniel, your program works fine with mpich, and this is very likely an OpenMPI bug. Here is an intermediate patch that solves your problem, but I still have to fully test it. Best regards, Gilles On 12/9/2015 2:56 AM, GARMANN, DANIEL J DR-02 USAF AFMC AFRL/RQVA wrote: Hello all, I've noticed a