So I reworked the idea, got it compiling, and got it working.
The non-standard flags are now prefixed with OMPI_, while the standard one keeps its MPI_ prefix.
I also added two more split types.
The manual is updated as well.
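
To show the intended usage, here is a minimal sketch (the OMPI_COMM_TYPE_SOCKET name, and the assumption that the constants are exposed through the Fortran mpi module, come from my branch, not from the standard):

program ompi_split_usage
  use mpi
  implicit none
  integer :: ierr, comm_shared, comm_socket, n_shared, n_socket

  call MPI_Init(ierr)

  ! Standard MPI-3 split: ranks sharing a memory space
  call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
       MPI_INFO_NULL, comm_shared, ierr)

  ! Non-standard split from this branch: ranks sharing a socket
  ! (OMPI_COMM_TYPE_SOCKET is an Open MPI-only, assumed name)
  call MPI_Comm_split_type(MPI_COMM_WORLD, OMPI_COMM_TYPE_SOCKET, 0, &
       MPI_INFO_NULL, comm_socket, ierr)

  call MPI_Comm_size(comm_shared, n_shared, ierr)
  call MPI_Comm_size(comm_socket, n_socket, ierr)
  print '(a,i0,a,i0)', 'shared size: ', n_shared, ', socket size: ', n_socket

  call MPI_Comm_free(comm_shared, ierr)
  call MPI_Comm_free(comm_socket, ierr)
  call MPI_Finalize(ierr)
end program ompi_split_usage

The first call is plain MPI-3 and portable; the second only makes sense under Open MPI, hence the OMPI_ prefix.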

>> Note to devs:
I ran into problems right after running the autogen.pl script.
Procedure:
$> git clone .. ompi
$> cd ompi
$> ./autogen.pl
My autotools versions:
m4: 1.4.17
automake: 1.14
autoconf: 2.69
libtool: 2.4.3
autogen.pl completes successfully (the autogen output is attached in case
it is needed)
$> mkdir build
$> cd build
$> ../configure --with-platform=optimized
I have attached config.log (note that I have tested with both the
shipped hwloc 1.9.1 and hwloc 1.10.0)
$> make all
The error message is:
make[2]: Entering directory '/home/nicpa/test/build/opal/libltdl'
CDPATH="${ZSH_VERSION+.}:" && cd ../../../opal/libltdl && /bin/bash
/home/nicpa/test/config/missing aclocal-1.14 -I ../../config
aclocal-1.14: error: ../../config/autogen_found_items.m4:308: file
'opal/mca/backtrace/configure.m4' does not exist
This error message is the same as the one reported in:
http://www.open-mpi.org/community/lists/devel/2013/07/12504.php
My work-around is simple; it concerns the generated ACLOCAL_AMFLAGS variable
in build/opal/libltdl/Makefile:
OLD:
ACLOCAL_AMFLAGS = -I ../../config
CORRECT:
ACLOCAL_AMFLAGS = -I ../../
Either the configure script creates the wrong include path for the m4
scripts, or the m4 scripts are not fully copied to the config directory.
The fix is simple and it works; I just wonder why it is needed.
<< End note to devs

First, here is my test system 1:
$> hwloc-info
depth 0: 1 Machine (type #1)
depth 1: 1 Socket (type #3)
depth 2: 1 L3Cache (type #4)
depth 3: 2 L2Cache (type #4)
depth 4: 2 L1dCache (type #4)
depth 5: 2 L1iCache (type #4)
depth 6: 2 Core (type #5)
depth 7: 4 PU (type #6)
Special depth -3: 2 Bridge (type #9)
Special depth -4: 4 PCI Device (type #10)
Special depth -5: 5 OS Device (type #11)
And here is my test system 2 (also hwloc-info output):
depth 0: 1 Machine (type #1)
depth 1: 1 Socket (type #3)
depth 2: 1 L3Cache (type #4)
depth 3: 4 L2Cache (type #4)
depth 4: 4 L1dCache (type #4)
depth 5: 4 L1iCache (type #4)
depth 6: 4 Core (type #5)
depth 7: 8 PU (type #6)
Special depth -3: 3 Bridge (type #9)
Special depth -4: 3 PCI Device (type #10)
Special depth -5: 4 OS Device (type #11)

Here is an excerpt of what it can do. I have attached a Fortran program
that creates a communicator with each of the split types.
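
A rough sketch of that program follows (this is not the attached comm_split.f90 itself; it only loops over a handful of the types, and the OMPI_ constant names, plus their availability through the Fortran mpi module, are assumptions based on my branch):

program split_all_types
  use mpi
  implicit none
  integer, parameter :: nt = 4
  integer :: types(nt)
  character(len=6) :: names(nt)
  integer :: i, ierr, wrank, newcomm, lrank, lsize

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, wrank, ierr)

  ! Only a subset of the types is listed here; the attached program
  ! runs through all of them.
  types = (/ MPI_COMM_TYPE_SHARED, OMPI_COMM_TYPE_NUMA, &
             OMPI_COMM_TYPE_SOCKET, OMPI_COMM_TYPE_L3CACHE /)
  names = (/ 'Shared', 'Numa  ', 'Socket', 'L3    ' /)

  do i = 1, nt
    ! Split COMM_WORLD by the requested hardware domain and report
    ! each rank's position in the resulting communicator
    call MPI_Comm_split_type(MPI_COMM_WORLD, types(i), 0, &
         MPI_INFO_NULL, newcomm, ierr)
    call MPI_Comm_rank(newcomm, lrank, ierr)
    call MPI_Comm_size(newcomm, lsize, ierr)
    write(*,'(a,i0,a,i0,a,i0,a)') 'Comm using '//trim(names(i))//' Node: ', &
         wrank, ' local rank: ', lrank, ' out of ', lsize, ' ranks'
    call MPI_Comm_free(newcomm, ierr)
  end do

  call MPI_Finalize(ierr)
end program split_all_types

Running over 4 ranks produces output of the same form as the following: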

Example of MPI_Comm_Split_Type

Currently using 4 nodes.

Comm using CU Node: 2 local rank: 2 out of 4 ranks
Comm using CU Node: 3 local rank: 3 out of 4 ranks
Comm using CU Node: 1 local rank: 1 out of 4 ranks
Comm using CU Node: 0 local rank: 0 out of 4 ranks

Comm using Host Node: 0 local rank: 0 out of 4 ranks
Comm using Host Node: 2 local rank: 2 out of 4 ranks
Comm using Host Node: 3 local rank: 3 out of 4 ranks
Comm using Host Node: 1 local rank: 1 out of 4 ranks

Comm using Board Node: 2 local rank: 2 out of 4 ranks
Comm using Board Node: 3 local rank: 3 out of 4 ranks
Comm using Board Node: 1 local rank: 1 out of 4 ranks
Comm using Board Node: 0 local rank: 0 out of 4 ranks

Comm using Node Node: 0 local rank: 0 out of 4 ranks
Comm using Node Node: 1 local rank: 1 out of 4 ranks
Comm using Node Node: 2 local rank: 2 out of 4 ranks
Comm using Node Node: 3 local rank: 3 out of 4 ranks

Comm using Shared Node: 0 local rank: 0 out of 4 ranks
Comm using Shared Node: 3 local rank: 3 out of 4 ranks
Comm using Shared Node: 1 local rank: 1 out of 4 ranks
Comm using Shared Node: 2 local rank: 2 out of 4 ranks

Comm using Numa Node: 0 local rank: 0 out of 1 ranks
Comm using Numa Node: 2 local rank: 0 out of 1 ranks
Comm using Numa Node: 3 local rank: 0 out of 1 ranks
Comm using Numa Node: 1 local rank: 0 out of 1 ranks

Comm using Socket Node: 1 local rank: 0 out of 1 ranks
Comm using Socket Node: 2 local rank: 0 out of 1 ranks
Comm using Socket Node: 3 local rank: 0 out of 1 ranks
Comm using Socket Node: 0 local rank: 0 out of 1 ranks

Comm using L3 Node: 0 local rank: 0 out of 1 ranks
Comm using L3 Node: 3 local rank: 0 out of 1 ranks
Comm using L3 Node: 1 local rank: 0 out of 1 ranks
Comm using L3 Node: 2 local rank: 0 out of 1 ranks

Comm using L2 Node: 2 local rank: 0 out of 1 ranks
Comm using L2 Node: 3 local rank: 0 out of 1 ranks
Comm using L2 Node: 1 local rank: 0 out of 1 ranks
Comm using L2 Node: 0 local rank: 0 out of 1 ranks

Comm using L1 Node: 0 local rank: 0 out of 1 ranks
Comm using L1 Node: 1 local rank: 0 out of 1 ranks
Comm using L1 Node: 2 local rank: 0 out of 1 ranks
Comm using L1 Node: 3 local rank: 0 out of 1 ranks

Comm using Core Node: 0 local rank: 0 out of 1 ranks
Comm using Core Node: 3 local rank: 0 out of 1 ranks
Comm using Core Node: 1 local rank: 0 out of 1 ranks
Comm using Core Node: 2 local rank: 0 out of 1 ranks

Comm using HW Node: 2 local rank: 0 out of 1 ranks
Comm using HW Node: 3 local rank: 0 out of 1 ranks
Comm using HW Node: 1 local rank: 0 out of 1 ranks
Comm using HW Node: 0 local rank: 0 out of 1 ranks

This is the output on both systems (note that on the first system I
oversubscribe the node). I have not tested it on a cluster :(.
One thing that worries me: shouldn't the SOCKET and L3-cache split types
be of size 4? I only have one socket and one L3 cache, so all four ranks
ought to be sharing them.
I am also not sure about the NUMA result in this case. If you need any more
information about my setup to debug this, please let me know.
Or am I completely missing something?

I tried looking into opal/mca/hwloc/hwloc.h, but I have no idea whether
the definitions there are correct or not.

If you think it is worthwhile, I can make a pull request at its current stage.


2014-11-27 13:22 GMT+00:00 Nick Papior Andersen <nickpap...@gmail.com>:

> No worries :)
>
> 2014-11-27 14:20 GMT+01:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
>
> Many thanks!
>>
>> Note that it's a holiday week here in the US -- I'm only on for a short
>> time here this morning; I'll likely disappear again shortly until next
>> week.  :-)
>>
>>
>>
>> On Nov 27, 2014, at 8:12 AM, Nick Papior Andersen <nickpap...@gmail.com>
>> wrote:
>>
>> > Sure, I will make the changes and commit to make them OMPI specific.
>> >
>> > I will post forward my problems on the devel list.
>> >
>> > I will keep you posted. :)
>> >
>> > 2014-11-27 13:58 GMT+01:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com
>> >:
>> > On Nov 26, 2014, at 2:08 PM, Nick Papior Andersen <nickpap...@gmail.com>
>> wrote:
>> >
>> > > Here is my commit-msg:
>> > > "
>> > > We can now split communicators based on hwloc full capabilities up to
>> BOARD.
>> > > I.e.:
>> > > HWTHREAD,CORE,L1CACHE,L2CACHE,L3CACHE,SOCKET,NUMA,NODE,BOARD
>> > > where NODE is the same as SHARED.
>> > > "
>> > >
>> > > Maybe what I did could be useful somehow?
>> > > I mean to achieve the effect one could do:
>> > > comm_split_type(MPI_COMM_TYPE_Node,comm)
>> > > create new group from all nodes not belonging to this group.
>> > > This can even be more fine tuned if one wishes to create a group of
>> "master" cores on each node.
>> >
>> > I will say that there was a lot of debate about this kind of
>> functionality at the MPI Forum.  The problem is that although x86-based
>> clusters are quite common these days, they are not the only kind of
>> machines used for HPC out there, and the exact definitions of these kinds
>> of concepts (hwthread, core, lXcache, socket, numa, ...etc.) can vary
>> between architectures.
>> >
>> > Hence, the compromise was to just have MPI_COMM_TYPE_SHARED, where the
>> resulting communicator contains processes that share a single memory space.
>> >
>> > That being said, since OMPI uses hwloc for all of its supported
>> architectures, it might be worthwhile to have an OMPI extension for
>> OMPI_COMM_TYPE_<foo> for the various different types.  One could/should
>> only use these new constants if the OPEN_MPI macro is defined and is 1.
>> >
>> > And *that* being said, one of the goals of MPI is portability, so
>> anyone using these constants would inherently be non-portable.  :-)
>> >
>> > > I have not been able to compile it due to my autogen.pl giving me
>> some errors.
>> >
>> > What kind of errors?  (we might want to move this discussion to the
>> devel list...)
>> >
>> > >  However, I think it should compile just fine.
>> > >
>> > > Do you think it could be useful?
>> > >
>> > > If interested you can find my, single commit branch, at:
>> https://github.com/zerothi/ompi
>> >
>> > This looks interesting.
>> >
>> > Can you file a pull request against the ompi master, and send
>> something to the devel list about this functionality?
>> >
>> > I'd still strongly suggest renaming these constants with an "OMPI_" prefix to
>> differentiate them from standard MPI constants / functionality.
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>> >
>> >
>> >
>> > --
>> > Kind regards Nick
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>
>
>
> --
> Kind regards Nick
>



-- 
Kind regards Nick

Attachment: autogen.out.bz2
Description: BZip2 compressed data

Attachment: config.log.bz2
Description: BZip2 compressed data

Attachment: comm_split.f90
Description: Binary data
