Re: [OMPI users] Troubles with linking C++ standard library with openmpi 1.10

2016-03-07 Thread Gilles Gouaillardet

Jordan,

do you really need vt?
if not, a trivial workaround is to
configure --disable-vt ...
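for example (prefix path illustrative):

./configure --disable-vt --prefix=/opt/openmpi-1.10.2
make && make install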

what is your configure command line?
assuming g++ is your C++ compiler, what does g++ --version say?

Cheers,

Gilles



On 3/7/2016 1:32 PM, Jordan Willis wrote:


Hi everyone,

I have tried everything to compile Open MPI. It used to compile on my 
system, and I’m not sure what has changed in my C++ libraries to cause 
this error. I get the following when trying to 
compile contrib/vt/vt/extlib/otf/tools/otfprofile


make[8]: Entering directory 
`/dnas/apps/openmpi/openmpi-1.10.2/ompi/contrib/vt/vt/extlib/otf/tools/otfprofile'

  CXXLD  otfprofile
otfprofile-collect_data.o: In function `std::string::_M_check(unsigned 
long, char const*) const':
/usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
`std::__throw_out_of_range_fmt(char const*, ...)'
otfprofile-create_latex.o: In function `std::string::_M_check(unsigned 
long, char const*) const':
/usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
`std::__throw_out_of_range_fmt(char const*, ...)'
/usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
`std::__throw_out_of_range_fmt(char const*, ...)'
otfprofile-create_filter.o: In function 
`std::string::_M_check(unsigned long, char const*) const':
/usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
`std::__throw_out_of_range_fmt(char const*, ...)'
otfprofile-create_filter.o: In function 
`std::vector*, std::allocator*> 
>::_M_range_check(unsigned long) const':
/usr/include/c++/4.9/bits/stl_vector.h:803: undefined reference to 
`std::__throw_out_of_range_fmt(char const*, ...)'
otfprofile-create_filter.o:/usr/include/c++/4.9/bits/stl_vector.h:803: 
more undefined references to `std::__throw_out_of_range_fmt(char 
const*, ...)' follow

collect2: error: ld returned 1 exit status
make[8]: *** [otfprofile] Error 1

From what I found online, this may be due to trying to use gcc-4.8 
functions with a 4.9 compiler. So I have tried switching to 4.8 just to 
check. They also say you may have to update your toolchain to force 
GCC 4.9, although I’m not sure I know how to do this. I have also tried 
compiling Open MPI 1.8 (last stable) and get the same error. I have also 
reinstalled all my packages using aptitude.
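One thing worth checking (commands are a sketch; the library path may 
differ on other systems) is which libstdc++ the link step actually 
resolves against:

g++ --version
ldconfig -p | grep libstdc++
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX

If GLIBCXX_3.4.20 (the GCC 4.9 symbol set that provides 
std::__throw_out_of_range_fmt) does not show up in the output, the 
runtime library is older than the headers the code was compiled against.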


The reason I’m trying to do a custom compile is that I’m trying to 
build against the PMI libraries that come with SLURM, although I get 
the same error on a basic configuration.
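(For reference, a SLURM/PMI build would typically be configured along 
these lines -- a sketch, with an illustrative SLURM install path:

./configure --with-slurm --with-pmi=/usr/local/slurm --disable-vt ...

and the resulting binaries launched with something like 
srun --mpi=pmi2 -n 4 ./app.)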


I’m on Ubuntu Server 14.04. I think I have exhausted my 
troubleshooting ideas and I’m reaching out to you. My configuration 
log can be sent on request, but the attachment causes my message to 
get bounced from the list.


Thanks so much,
Jordan




___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2016/03/28648.php




Re: [OMPI users] Troubles with linking C++ standard library with openmpi 1.10

2016-03-07 Thread Jordan Willis
Thanks Gilles, 

Looks like it works when excluding VT. I was using g++ 4.9 (per g++ --version), btw. 

Thank you,
Jordan
> On Mar 6, 2016, at 9:05 PM, Gilles Gouaillardet  wrote:
> 
> Jordan,
> 
> do you really need vt?
> if not, a trivial workaround is to
> configure --disable-vt ...
> 
> what is your configure command line?
> assuming g++ is your C++ compiler, what does g++ --version say?
> 
> Cheers,
> 
> Gilles
> 
> 
> 
> On 3/7/2016 1:32 PM, Jordan Willis wrote:
>> 
>> Hi everyone,
>> 
>> I have tried everything to compile Open MPI. It used to compile on my system, 
>> and I’m not sure what has changed in my C++ libraries to cause this error. I 
>> get the following when trying to compile 
>> contrib/vt/vt/extlib/otf/tools/otfprofile
>> 
>> make[8]: Entering directory 
>> `/dnas/apps/openmpi/openmpi-1.10.2/ompi/contrib/vt/vt/extlib/otf/tools/otfprofile'
>>   CXXLD  otfprofile
>> otfprofile-collect_data.o: In function `std::string::_M_check(unsigned long, 
>> char const*) const':
>> /usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
>> `std::__throw_out_of_range_fmt(char const*, ...)'
>> otfprofile-create_latex.o: In function `std::string::_M_check(unsigned long, 
>> char const*) const':
>> /usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
>> `std::__throw_out_of_range_fmt(char const*, ...)'
>> /usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
>> `std::__throw_out_of_range_fmt(char const*, ...)'
>> otfprofile-create_filter.o: In function `std::string::_M_check(unsigned 
>> long, char const*) const':
>> /usr/include/c++/4.9/bits/basic_string.h:324: undefined reference to 
>> `std::__throw_out_of_range_fmt(char const*, ...)'
>> otfprofile-create_filter.o: In function `std::vector*, 
>> std::allocator*> >::_M_range_check(unsigned long) const':
>> /usr/include/c++/4.9/bits/stl_vector.h:803: undefined reference to 
>> `std::__throw_out_of_range_fmt(char const*, ...)'
>> otfprofile-create_filter.o:/usr/include/c++/4.9/bits/stl_vector.h:803: more 
>> undefined references to `std::__throw_out_of_range_fmt(char const*, ...)' 
>> follow
>> collect2: error: ld returned 1 exit status
>> make[8]: *** [otfprofile] Error 1
>> 
>> From what I found online, this may be due to trying to use gcc-4.8 functions 
>> with a 4.9 compiler. So I have tried switching to 4.8 just to check. They also 
>> say you may have to update your toolchain to force GCC 4.9, although I’m not 
>> sure I know how to do this. I have also tried compiling Open MPI 1.8 (last 
>> stable) and get the same error. I have also reinstalled all my packages using 
>> aptitude.
>> 
>> The reason I’m trying to do a custom compile is that I’m trying to build 
>> against the PMI libraries that come with SLURM, although I get the same error 
>> on a basic configuration.
>> 
>> I’m on Ubuntu Server 14.04. I think I have exhausted my troubleshooting 
>> ideas and I’m reaching out to you. My configuration log can be sent on 
>> request, but the attachment causes my message to get bounced from the list. 
>> 
>> Thanks so much,
>> Jordan
>> 
>> 
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2016/03/28648.php 
>> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/03/28649.php



Re: [OMPI users] MPI_INIT gets stuck

2016-03-07 Thread Marco Atzeri



On 06/03/2016 10:06, Marco Lubosch wrote:

Hello guys,

I am trying to take my first steps with Open MPI and I finally got it to
work on Cygwin64 (Windows 7 64-bit).
I am able to compile plain C code without any issues via "mpicc ..." but
when I try to initialize MPI, the program gets stuck in
"MPI_INIT" without creating CPU load. Example from
https://svn.open-mpi.org/source/xref/ompi_1.8/examples/:

#include <stdio.h>
#include "mpi.h"
int main(int argc, char* argv[])
{
 int rank, size, len;
 char version[MPI_MAX_LIBRARY_VERSION_STRING];
 printf("1\n");
 MPI_Init(&argc, &argv);
 printf("2\n");
 MPI_Comm_rank(MPI_COMM_WORLD, &rank);
 printf("3\n");
 MPI_Comm_size(MPI_COMM_WORLD, &size);
 printf("4\n");
 MPI_Get_library_version(version, &len);
 printf("5\n");
 printf("Hello, world, I am %d of %d, (%s, %d)\n", rank, size,
version, len);
 MPI_Finalize();
 printf("6\n");
 return 0;
}

Compiling works perfectly fine with "mpicc -o hello_c.exe hello_c.c".
But when I run it with "mpirun -np 4 ./hello_c" it creates 4 processes
printing "1", which then keep on running without doing anything. I then
have to kill the processes manually to continue working in Cygwin.

Can you tell me what I am doing wrong?

Thanks
Marco

PS: Installed packages on Cygwin are libopenmpi, libopenmpi-devel,
openmpi, gcc-core





It seems to be a local issue. On my W7 64-bit system:

$ mpirun -n 4 ./prova_mpi.exe
1
1
1
1
2
3
4
5
Hello, world, I am 0 of 4, (Open MPI v1.8.8, .., Aug 05, 2015, 126)
2
3
4
5
Hello, world, I am 2 of 4, (Open MPI v1.8.8, package: ..., Aug 05, 2015, 
126)

2
3
4
5
Hello, world, I am 1 of 4, (Open MPI v1.8.8, ... , Aug 05, 2015, 126)
2
3
4
5
Hello, world, I am 3 of 4, (Open MPI v1.8.8, ... , Aug 05, 2015, 126)
6
6
6
6



Re: [OMPI users] openmpi bug on mac os 10.11.3 ?

2016-03-07 Thread Nathan Hjelm

Just want to point out that you do not want to

#include 
"/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/machine/types.h"

instead:

#include <machine/types.h>

If you want to use a different OS X SDK use -isysroot.
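For example (a sketch, reusing the SDK path quoted below):

mpicc -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -c first.c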

-Nathan

On Sat, Mar 05, 2016 at 09:38:17PM +0100, Hans-Jürgen Greif wrote:
>Hello Jeff Squyres,
>I' m using: openmpi-1.10.2.tar.gz
>The file first.c:
>#include "mpi.h"
    >#include <stdio.h>
    >#include <stdlib.h>
    >#include <string.h>
    >//#include <machine/types.h>
>#include
>
> "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/machine/types.h"
>//using namespace std;
>int main(int argc, char **argv )
>{
>  char text[20];
>  int myrank, size, sender=0, adressat=1, tag=99;
>  MPI_Status status;
>  MPI_Init( & argc, & argv );
>  MPI_Comm_rank(MPI_COMM_WORLD, & myrank);
>  MPI_Comm_size(MPI_COMM_WORLD, & size); 
>  printf(" myrank = %d \n",myrank);
>  printf(" size = %d \n",  size);
>  //printf("Hello, world! "
>  // "from process %d of %d\n", myrank, size);
>  if (size > 2 ) 
>  {
>   printf("Beispiel fu:r 2 Tasks \n");
>  MPI_Finalize();
>   exit(1);
>  }
>  if ( myrank == 0 )
>  {
>strcpy(text, "Hallo zusammen");
>MPI_Send(text, strlen(text), MPI_CHAR, adressat, tag, MPI_COMM_WORLD);
>  }
>  else 
>  {
    >MPI_Recv(text, 20, MPI_CHAR, sender, tag, MPI_COMM_WORLD, & status);
    >printf("Task %d empfing: %s: \n", myrank, text);
    >  }
    >
>  MPI_Finalize(); 
>  return 0;
>}
    >first.c is working, but it is much simpler. On openSUSE 13.2, second.c
    >runs fine. Test second.c on your machine; I cannot debug it.
>Regards,
>Hans-Juergen Greif
> 
    >Hans-Jürgen Greif
>hans_juergen.gr...@kabelbw.de
> 
    >  On 05.03.2016 at 16:54, Jeff Squyres (jsquyres) wrote:
>  What version of Open MPI are you using?
> 
>  Can you send all the information listed here:
> 
> https://www.open-mpi.org/community/help/
> 
    >On Mar 5, 2016, at 5:35 AM, Hans-Jürgen Greif
> wrote:
> 
>Hello,
> 
>on mac os 10.11.3 I have found an error:
> 
>mpirun -np 2 valgrind ./second
>==612== Memcheck, a memory error detector
>==612== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et
>al.
>==612== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright
>info
>==612== Command: ./second
>==612==
>==611== Memcheck, a memory error detector
>==611== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et
>al.
>==611== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright
>info
>==611== Command: ./second
>==611==
>--612-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
>--611-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
>--612-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>2 times)
>--611-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>2 times)
>--611-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>4 times)
>--612-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>4 times)
>--611-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>8 times)
>--612-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated
>8 times)
    >==612== Conditional jump or move depends on uninitialised value(s)
    >==611== Conditional jump or move depends on uninitialised value(s)
    >==611==    at 0x10BED: main (second.c:39)
    >==611==
    >==612==    at 0x10D1C: main (second.c:60)
    >==612==
    >==611== Conditional jump or move depends on uninitialised value(s)
    >==611==    at 0x100060781: MPI_Win_post (in /usr/local/openmpi/lib/libmpi.12.dylib)
    >==611==    by 0x10C69: main (second.c:43)
    >==611==
    >==611== Conditional jump or move depends on uninitialised value(s)
    >==611==    at 0x100413E98: __ultoa (in /usr/lib/system/libsystem_c.dylib)
    >==611==    by 0x10041136C: __vfprintf (in /usr/lib/system/libsystem_c.dylib)
    >==611==    by 0x1004396C8: __v2printf (in /usr/lib/system/libsystem_c.dylib)
    >==611==    by 0x10040EF51: _vasprintf (in /usr/lib/system/libsystem_c.dylib)
    >==611==    by 0x1001C379E: opal_show_help_vstring (in /usr/local/openmpi/lib/libopen-pal.13.dylib)
    >==611==    by 0x100128231: orte_show_help (in /usr/local/openmpi/lib/libopen-rte.12.dylib)
    >==611==    by 0x10002069E: backend_fatal (in /usr/local/openmpi/lib/libmpi.12.dylib)
    >==611==    by 0x1000

Re: [OMPI users] MPI_INIT gets stuck

2016-03-07 Thread Marco Lubosch

Thanks Marco,

I reinstalled Cygwin and OMPI like 10 times. I had an issue with 
gcc (MinGW) because it was preinstalled under Windows. I then had to 
remove it and reinstall gcc under Cygwin and got it working, but as I 
said, only compiling plain C code with "mpicc". I also disabled the 
Windows Firewall and tried a different router.


Do you have any suggestions as to what could cause this problem?

Greetings
Marco

On 07.03.2016 at 15:26, Marco Atzeri wrote:



On 06/03/2016 10:06, Marco Lubosch wrote:

Hello guys,

I am trying to take my first steps with Open MPI and I finally got it to
work on Cygwin64 (Windows 7 64-bit).
I am able to compile plain C code without any issues via "mpicc ..." but
when I try to initialize MPI, the program gets stuck in
"MPI_INIT" without creating CPU load. Example from
https://svn.open-mpi.org/source/xref/ompi_1.8/examples/:

#include <stdio.h>
#include "mpi.h"
int main(int argc, char* argv[])
{
 int rank, size, len;
 char version[MPI_MAX_LIBRARY_VERSION_STRING];
 printf("1\n");
 MPI_Init(&argc, &argv);
 printf("2\n");
 MPI_Comm_rank(MPI_COMM_WORLD, &rank);
 printf("3\n");
 MPI_Comm_size(MPI_COMM_WORLD, &size);
 printf("4\n");
 MPI_Get_library_version(version, &len);
 printf("5\n");
 printf("Hello, world, I am %d of %d, (%s, %d)\n", rank, size,
version, len);
 MPI_Finalize();
 printf("6\n");
 return 0;
}

Compiling works perfectly fine with "mpicc -o hello_c.exe hello_c.c".
But when I run it with "mpirun -np 4 ./hello_c" it creates 4 processes
printing "1", which then keep on running without doing anything. I then
have to kill the processes manually to continue working in Cygwin.

Can you tell me what I am doing wrong?

Thanks
Marco

PS: Installed packages on Cygwin are libopenmpi, libopenmpi-devel,
openmpi, gcc-core





It seems to be a local issue. On my W7 64-bit system:

$ mpirun -n 4 ./prova_mpi.exe
1
1
1
1
2
3
4
5
Hello, world, I am 0 of 4, (Open MPI v1.8.8, .., Aug 05, 2015, 126)
2
3
4
5
Hello, world, I am 2 of 4, (Open MPI v1.8.8, package: ..., Aug 05, 
2015, 126)

2
3
4
5
Hello, world, I am 1 of 4, (Open MPI v1.8.8, ... , Aug 05, 2015, 126)
2
3
4
5
Hello, world, I am 3 of 4, (Open MPI v1.8.8, ... , Aug 05, 2015, 126)
6
6
6
6

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2016/03/28651.php







Re: [OMPI users] MPI_INIT gets stuck

2016-03-07 Thread Marco Atzeri

On 07/03/2016 18:58, Marco Lubosch wrote:

Thanks Marco,

I reinstalled Cygwin and OMPI like 10 times. I had an issue with
gcc (MinGW) because it was preinstalled under Windows. I then had to
remove it and reinstall gcc under Cygwin and got it working, but as I
said, only compiling plain C code with "mpicc". I also disabled the
Windows Firewall and tried a different router.

Do you have any suggestions as to what could cause this problem?

Greetings
Marco



Does it work without networks?
In the past I saw issues with virtual network drivers.
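For example (a sketch; MCA component names as in the 1.8 series), you can 
take the network out of the picture by restricting Open MPI to the self 
and shared-memory transports:

mpirun -n 4 --mca btl self,sm ./hello_c.exe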

In addition as mentioned on
https://cygwin.com/problems.html

"Run cygcheck -s -v -r > cygcheck.out and include that file as an 
attachment in your report. Please do not compress or otherwise encode 
the output. Just attach it as a straight text file so that it can be 
easily viewed."


Send me a copy of cygcheck.out; I will look for possible Cygwin problems.

Regards
Marco





Re: [OMPI users] Nonblocking neighborhood collectives with distributed graph creation

2016-03-07 Thread Jun Kudo
Gilles,
Thanks for the small bug fix.  It helped clear up that test case but I'm
again running into another segmentation fault on a more complicated problem.

I've attached another 'working' example.  This time I am using the
MPI_Ineighbor_alltoallw on a triangular topology; node 0 communicates
bi-directionally with nodes 1 and 2, node 1 with nodes 0 and 2, and node 2
with nodes 0 and 1.  Each node is sending one double (with value my_rank) to
each of its neighbors.

The code has two different calls to the MPI API that only differ in the
receive buffer arguments.  In both versions, I am sending from and
receiving into the same static array.  In the working (non-segfaulting)
version, I am receiving into the latter half of the array by pointing to
the start of the second half (&send_number[2]) and specifying displacements
of 0 and 8 bytes.  In the segfaulting version, I am again receiving into
the latter half of the array by pointing to the start of the array
(send_number) with displacements of 16 and 24 bytes.
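For concreteness, the two variants as described would look roughly like
this (a sketch reusing my_rank and the graph communicator from the
attached program; the array names here are illustrative):

double send_number[4] = { (double) my_rank, (double) my_rank, 0.0, 0.0 };
int counts[2] = { 1, 1 };
MPI_Datatype types[2] = { MPI_DOUBLE, MPI_DOUBLE };
MPI_Aint sdispls[2] = { 0, 8 };        /* send my_rank to each neighbor */
MPI_Aint rdispls_rel[2] = { 0, 8 };    /* relative to &send_number[2] */
MPI_Aint rdispls_abs[2] = { 16, 24 };  /* relative to send_number */
MPI_Request req;

/* working version: recvbuf points at the second half of the array */
MPI_Ineighbor_alltoallw(send_number, counts, sdispls, types,
                        &send_number[2], counts, rdispls_rel, types,
                        mpi_comm_with_graph, &req);
MPI_Wait(&req, MPI_STATUS_IGNORE);

/* segfaulting version: recvbuf == sendbuf, displacements skip the first half */
MPI_Ineighbor_alltoallw(send_number, counts, sdispls, types,
                        send_number, counts, rdispls_abs, types,
                        mpi_comm_with_graph, &req);
MPI_Wait(&req, MPI_STATUS_IGNORE);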

The program, run with the command 'mpirun -n 3
./simpleneighborhood_multiple' and compiled with the latest Open MPI (1.10.2) +
patch, encounters a segmentation fault when receiving using the latter
variant. The same program compiled with MPICH (3.2) runs without any
problems and with the expected results.

Let me know if I'm screwing anything up.  Thanks for the help.

Sincerely,
Jun


On Mon, Feb 29, 2016 at 9:34 PM, Gilles Gouaillardet 
wrote:

> Thanks for the report and the test case,
>
> this is a bug and i pushed a commit to master.
> for the time being, you can download a patch for v1.10 at
> https://github.com/ggouaillardet/ompi-release/commit/4afdab0aa86e5127767c4dfbdb763b4cb641e37a.patch
>
> Cheers,
>
> Gilles
>
>
> On 3/1/2016 12:17 AM, Jun Kudo wrote:
>
> Hello,
> I'm trying to use the neighborhood collective communication capabilities
> (MPI_Ineighbor_x) of MPI coupled with the distributed graph constructor
> (MPI_Dist_graph_create_adjacent) but I'm encountering a segmentation fault
> on a test case.
>
> I have attached a 'working' example where I create a MPI communicator with
> a simple distributed graph topology where Rank 0 contains Node 0 that
> communicates bi-directionally (receiving from and sending to) with Node 1
> located on Rank 1.  I then attempt to send integer messages using the
> neighborhood collective MPI_Ineighbor_alltoall.  The program run with the
> command 'mpirun -n 2 ./simpleneighborhood' compiled with the latest
> OpenMPI  (1.10.2) encounters a segmentation fault during the non-blocking
> call.  The same program compiled with MPICH (3.2) runs without any problems
> and with the expected results.  To muddy the waters a little more, the same
> program compiled with OpenMPI but using the blocking neighborhood
> collective, MPI_Neighbor_alltoall, seems to run just fine as well.
>
> I'm not really sure at this point if I'm making a simple mistake in the
> construction of my test or if something is more fundamentally wrong.  I
> would appreciate any insight into my problem!
>
> Thanks ahead of the time for help and let me know if I can provide anymore
> information.
>
> Sincerely,
> Jun
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/02/28608.php
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/02/28610.php
>
#include <mpi.h>
#include <cstdio>  /* assumed; the original include lines were garbled in the archive */

int main (int argc, char* argv[]) {
  MPI_Init(nullptr, nullptr);
  //--> Connect graph to my mpi communicator
  bool reorder = false;
  int indegree  = 2;
  int outdegree = 2;
  int sources[indegree];
  int sweights[indegree];
  int destinations[outdegree];
  int dweights[outdegree];
  int my_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); //get my rank
  if (my_rank == 0) {
sources[0] = 1;
sources[1] = 2;
sweights[0] = 1;
sweights[1] = 1;

destinations[0] = 1;
destinations[1] = 2;
dweights[0] = 1;
dweights[1] = 1;

  }else if (my_rank == 1) {
sources[0] = 0;
sources[1] = 2;
sweights[0] = 1;
sweights[1] = 1;

destinations[0] = 0;
destinations[1] = 2;
dweights[0] = 1;
dweights[1] = 1;
  }else if (my_rank == 2) {
sources[0] = 0;
sources[1] = 1;
sweights[0] = 1;
sweights[1] = 1;

destinations[0] = 0;
destinations[1] = 1;
dweights[0] = 1;
dweights[1] = 1;
  }

  MPI_Info mpi_info = MPI_INFO_NULL;
  MPI_Info_create(&mpi_info);
  MPI_Comm mpi_comm_with_graph;
  MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, indegree, sources,
 sweights, outdegree,
 destinations, dweights,
 mpi_info, reorder, &mp

Re: [OMPI users] Nonblocking neighborhood collectives with distributed graph creation

2016-03-07 Thread Gilles Gouaillardet

Jun,

a patch is available at 
https://github.com/ggouaillardet/ompi-release/commit/f277beace9fbe8dd71f733602b5d4b0344d77a29.patch

this is not a bulletproof one, but it does fix your problem.
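applying it against a 1.10.2 source tree might look like this (a sketch; 
it assumes the patch file was downloaded into the parent directory):

cd openmpi-1.10.2
patch -p1 < ../f277beace9fbe8dd71f733602b5d4b0344d77a29.patch

followed by the usual configure / make / make install.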

in this case, MPI_Ineighbor_alltoallw is invoked with sendbuf == 
recvbuf, and internally, libnbc considers this an in-place alltoall 
and hence allocates a temporary buffer (which is now (almost) correctly 
used with this patch).

this is suboptimal, since even if sendbuf == recvbuf, the displacements 
you use ensure there is no overlap.

bottom line, this patch does fix your problem, but because of the libnbc 
internals, the second MPI_Ineighbor_alltoallw is suboptimal (assuming 
such a call is allowed by the standard)


master does things differently, and there is no such bug here.


George,

is it valid (per the MPI standard) to invoke MPI_Ineighbor_alltoallw 
with sendbuf == recvbuf?


bonus question:
what if we have sendbuf != recvbuf, but the data overlap because of the 
displacements?

for example (pseudo-code):
int buf[1];
MPI_Ineighbor_alltoallw(buf, 1, {0}, MPI_INT, buf+1, 1, {-4}, MPI_INT, 
MPI_COMM_WORLD, &request)
is this allowed per the MPI standard? If yes, then the implementation 
should figure this out, and I am pretty sure it does not currently.


Cheers,

Gilles


On 3/8/2016 9:18 AM, Jun Kudo wrote:

Gilles,
Thanks for the small bug fix.  It helped clear up that test case but 
I'm again running into another segmentation fault on a more 
complicated problem.


I've attached another 'working' example.  This time I am using the 
MPI_Ineighbor_alltoallw on a triangular topology; node 0 communicates 
bi-directionally with nodes 1 and 2, node 1 with nodes 0 and 2, and 
node 2 with nodes 0 and 1.  Each node is sending one double (with value 
my_rank) to each of its neighbors.


The code has two different calls to the MPI API that only differ in 
the receive buffer arguments.  In both versions, I am sending from and 
receiving into the same static array.  In the working 
(non-segfaulting) version, I am receiving into the latter half of the 
array by pointing to the start of the second half (&send_number[2]) 
and specifying displacements of 0 and 8 bytes.  In the segfaulting 
version, I am again receiving into the latter half of the array by 
pointing to the start of the array (send_number) with displacements of 
16 and 24 bytes.


The program run with the command 'mpirun -n 3 
./simpleneighborhood_multiple' compiled with the latest OpenMPI 
(1.10.2) + patch encounters a segmentation fault when receiving using 
the latter variant.  The same program compiled with MPICH (3.2) runs 
without any problems and with the expected results.


Let me know if I'm screwing anything up.  Thanks for the help.

Sincerely,
Jun


On Mon, Feb 29, 2016 at 9:34 PM, Gilles Gouaillardet 
<gil...@rist.or.jp> wrote:


Thanks for the report and the test case,

this is a bug and i pushed a commit to master.
for the time being, you can download a patch for v1.10 at

https://github.com/ggouaillardet/ompi-release/commit/4afdab0aa86e5127767c4dfbdb763b4cb641e37a.patch

Cheers,

Gilles


On 3/1/2016 12:17 AM, Jun Kudo wrote:

Hello,
I'm trying to use the neighborhood collective communication
capabilities (MPI_Ineighbor_x) of MPI coupled with the
distributed graph constructor (MPI_Dist_graph_create_adjacent)
but I'm encountering a segmentation fault on a test case.

I have attached a 'working' example where I create a MPI
communicator with a simple distributed graph topology where Rank
0 contains Node 0 that communicates bi-directionally (receiving
from and sending to) with Node 1 located on Rank 1.  I then
attempt to send integer messages using the neighborhood
collective MPI_Ineighbor_alltoall. The program run with the
command 'mpirun -n 2 ./simpleneighborhood' compiled with the
latest OpenMPI  (1.10.2) encounters a segmentation fault during
the non-blocking call.  The same program compiled with MPICH
(3.2) runs without any problems and with the expected results. 
To muddy the waters a little more, the same program compiled with
OpenMPI but using the blocking neighborhood collective,
MPI_Neighbor_alltoall, seems to run just fine as well.

I'm not really sure at this point if I'm making a simple mistake
in the construction of my test or if something is more
fundamentally wrong.  I would appreciate any insight into my
problem!

Thanks ahead of the time for help and let me know if I can
provide anymore information.

Sincerely,
Jun


___
users mailing list
us...@open-mpi.org 
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/02/28608.php



___
users mailing list
us...@open-mpi.org