Re: [OMPI users] error from MPI_Allgather
Jeff Squyres (cisco.com) writes:

> Two things:
>
> 1. That looks like an MPICH error message (i.e., it's not from Open MPI -- Open MPI and MPICH2 are entirely different software packages with different developers and behaviors). You might want to contact them for more specific details.
>
> 2. That being said, it looks like you used the same buffer for both the sbuf and rbuf. MPI does not allow you to do that; you need to specify different buffers for those arguments.

Hi Jeff,

Thank you for your reply. The problem occurs with Open MPI. I understood the issue as you explained it in your reply, but how can I set different buffers for them?

Thank you,
Rajesh
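For anyone hitting the same "Buffers must not be aliased" error, the two standard-conforming options are to pass distinct send and receive buffers, or to pass MPI_IN_PLACE as the send buffer so each rank's contribution is read from its own slot of the receive buffer. A minimal C sketch of both options; the variable names are illustrative and not taken from the original program:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int *recvbuf = malloc(size * sizeof(int));
        int sendval  = rank;

        /* Option 1: distinct send and receive buffers. */
        MPI_Allgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

        /* Option 2: MPI_IN_PLACE -- each rank first stores its own contribution
           in its slot of the receive buffer; no separate send buffer is needed. */
        recvbuf[rank] = rank;
        MPI_Allgather(MPI_IN_PLACE, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

        if (rank == 0)
            printf("gathered %d values\n", size);

        free(recvbuf);
        MPI_Finalize();
        return 0;
    }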
[OMPI users] error from MPI_Allgather
Hello,

I get an error while using mpirun. Could anyone please help me solve this? I googled it and found some discussion, but it was too technical for me since I am not very familiar with MPI programs. Is this due to an installation problem or the program I am running?

Fatal error in PMPI_Allgather: Invalid buffer pointer, error stack:
PMPI_Allgather(958): MPI_Allgather(sbuf=0x6465d30, scount=1, MPI_INTEGER, rbuf=0x6465d30, rcount=1, MPI_INTEGER, MPI_COMM_WORLD) failed
PMPI_Allgather(931): Buffers must not be aliased

Thank you very much

--
Dr. Rajesh J.
Postdoctoral Research Associate,
Center for Global Environmental Research,
National Institute for Environmental Studies,
16-2 Onogawa, Tsukuba, Ibaraki, 305-8506 Japan
[OMPI users] MPI tests
Hello,

I am looking for MPI tests that measure performance, not just basic features. What kinds of tests can I add to MTT (the Open MPI testing tool)? Where can I find open-source tests to measure Open MPI performance? Any information you can provide will be helpful.

Thanks!
Shans
[OMPI users] Reinitialize MPI_COMM_WORLD
Hi,

I have a simple MPI program that uses MPI_Comm_spawn to create additional child processes. Using MPI_Intercomm_merge, I merge the child and parent communicators, resulting in a single expanded user-defined intracommunicator. I know MPI_COMM_WORLD is a constant which is statically initialized during the MPI_Init call. But is there a way to update the value of MPI_COMM_WORLD at runtime to reflect this expanded set of processes? Is it possible to somehow reinitialize MPI_COMM_WORLD using the ompi_comm_init() function?

Regards,
Rajesh
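MPI_COMM_WORLD is a predefined handle that the standard does not let an application reassign, and ompi_comm_init() is an internal Open MPI routine rather than part of the MPI API. The usual workaround is to keep an application-level "world" communicator that starts out as MPI_COMM_WORLD and is replaced by the merged intracommunicator after each spawn. A minimal parent-side sketch with illustrative names; the spawned child would call MPI_Comm_get_parent and MPI_Intercomm_merge(parent, 1, ...) on its side, as in the code later in this thread:

    #include <mpi.h>

    /* Application-level "world" communicator: starts out as MPI_COMM_WORLD and
       is replaced by the merged intracommunicator after every spawn. */
    static MPI_Comm app_world = MPI_COMM_NULL;

    /* Parent-side helper: spawn one child and fold it into app_world. */
    static void grow_world(char *child_cmd)
    {
        MPI_Comm intercomm, merged;

        MPI_Comm_spawn(child_cmd, MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0, app_world, &intercomm, MPI_ERRCODES_IGNORE);
        MPI_Intercomm_merge(intercomm, 0, &merged);
        MPI_Comm_free(&intercomm);

        if (app_world != MPI_COMM_WORLD)
            MPI_Comm_free(&app_world);   /* never free MPI_COMM_WORLD itself */
        app_world = merged;
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        app_world = MPI_COMM_WORLD;

        grow_world("./child");           /* use app_world instead of MPI_COMM_WORLD
                                            for all later collectives */

        if (app_world != MPI_COMM_WORLD)
            MPI_Comm_free(&app_world);
        MPI_Finalize();
        return 0;
    }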
[OMPI users] MPI_Finalize segmentation fault with MPI_Intercomm_merge
Hi,

I am trying to write a simple code which does the following: a master process running on 'n' processors spawns 4 processes using MPI_Comm_spawn_multiple. After spawning, the intercommunicator between the master and the spawned processes is merged using MPI_Intercomm_merge to create a new common intracommunicator for the expanded set of processes. These steps are repeated in a loop: execute the master processes, spawn new processes, and merge the communicators to get a global communicator. In this example the new processes are always spawned on the same 4 nodes.

After the loop completes, when I call MPI_Finalize, I get a segmentation fault. I do not get a segmentation fault if I run the loop only once, i.e. call MPI_Intercomm_merge only once. Is there something wrong with my program, or is it a known issue with MPI_Intercomm_merge when called multiple times?

I have pasted the sample code below. It has 3 files: master.c, spawn.c, hello.c. I will be glad to clarify if anything looks confusing. Any help will be appreciated.

Master function. This function calls MPI_Comm_spawn_multiple the first time (master.c):

    #include <stdio.h>
    #include <mpi.h>

    /* Global variables */
    MPI_Comm grid_comm;    /* new global communicator after spawning */
    int loop = 0;          /* number of iterations */
    int newprocess = -1;   /* identifies whether the current process is an
                              old process or a spawned process */

    int mpicomm_spawn(void);   /* defined in spawn.c */

    int main(int argc, char **argv)
    {
        int size, rank;

        MPI_Init(&argc, &argv);
        grid_comm = MPI_COMM_WORLD;
        newprocess = 0;

        for ( ; loop < 2; loop++) {
            fprintf(stdout, "\n\nLOOP in main =%d\n", loop);
            mpicomm_spawn();
            /* Broadcasting the loop value to spawned processes so that the new
               processes join the next iteration with the correct loop value. */
            MPI_Bcast(&loop, 1, MPI_INT, 0, grid_comm);
            MPI_Comm_size(grid_comm, &size);
            MPI_Comm_rank(grid_comm, &rank);
        }

        fprintf(stdout, "Exiting...main..rank=%d\n", rank);
        fflush(stdout);
        MPI_Barrier(grid_comm);
        MPI_Comm_free(&grid_comm);
        MPI_Finalize();
    }

Spawning function (spawn.c):

    #include <stdlib.h>
    #include <string.h>
    #include <mpi.h>

    extern MPI_Comm grid_comm;
    extern int newprocess;

    int mpicomm_spawn(void)
    {
        MPI_Comm parent, intercomm;
        int rank, nprocs = 4, size, nspawned;
        MPI_Info info[4];
        char *host = (char *) "host"; /* string to be stored as a key in MPI_Info */
        char *commands[4];            /* array of executable names to be spawned */
        int maxprocs[4];              /* maximum number of processes that can be
                                         spawned for each command */
        char ***args = NULL;          /* array of arguments for each executable */
        int i;                        /* loop counter */
        char nodenames[4][50];

        MPI_Comm_get_parent(&parent);

        if (newprocess == 0) {
            /* Master processes */
            strcpy(nodenames[0], "n1009");
            strcpy(nodenames[1], "n1010");
            strcpy(nodenames[2], "n1011");
            strcpy(nodenames[3], "n1012");
            for (i = 0; i < 4; i++) {
                commands[i] = (char *) malloc(sizeof(char) * 50);
                strcpy(commands[i], "./hello");
                maxprocs[i] = 1;
                MPI_Info_create(&info[i]);
                MPI_Info_set(info[i], host, nodenames[i]);
            }
            nspawned = MPI_Comm_spawn_multiple(nprocs, commands, args, maxprocs,
                                               info, 0, grid_comm, &intercomm,
                                               MPI_ERRCODES_IGNORE);
            MPI_Intercomm_merge(intercomm, 0, &grid_comm);
        } else {
            /* This part of the code is executed by the newly spawned process */
            newprocess = 0;
            MPI_Intercomm_merge(parent, 1, &grid_comm);
        }
    }

Function that needs to be called while spawning (hello.c):

    #include <stdio.h>
    #include <mpi.h>

    /* Global variables */
    MPI_Comm grid_comm;    /* new global communicator after spawning */
    int loop = 0;          /* number of iterations */
    int newprocess = -1;   /* identifies whether the current process is an
                              old process or a spawned process */

    int mpicomm_spawn(void);   /* defined in spawn.c */

    int main(int argc, char **argv)
    {
        int myrank, size;

        MPI_Init(&argc, &argv);
        while (loop < 2) {
            if (newprocess != 0) {
                newprocess = 1;
                mpicomm_spawn();
            } else
                mpicomm_spawn();
            MPI_Comm_rank(grid_comm, &myrank);
            MPI_Bcast(&loop, 1, MPI_INT, 0, grid_comm);
            fprintf(stdout, "\n\n<
Re: [OMPI users] MPI_Finalize segmentation fault with MPI_Intercomm_merge
As a followup to my problem, I tested this sample code with LAM/MPI and it worked perfectly without any segmentation faults. Has anyone tried this and faced this issue? Any help will be appreciated.

Regards,
Rajesh

-- Forwarded message --
From: Rajesh Sudarsan
Date: Jan 15, 2008 12:33 AM
Subject: MPI_Finalize segmentation fault with MPI_Intercomm_merge
To: Open MPI Users

Hi,

I am trying to write a simple code which does the following: a master process running on 1 processor spawns 4 processes using MPI_Comm_spawn_multiple. After spawning, the intercommunicator between the master and the spawned processes is merged using MPI_Intercomm_merge to create a new common intracommunicator for the expanded set of processes. This new set of processes calls MPI_Comm_spawn to create more processes, which are merged together to get a new intracommunicator. These steps are repeated in a loop. The problem is that when I call MPI_Finalize at the end of the loop, I get a segmentation fault. I do not get a segmentation fault if I run the loop only once and call MPI_Intercomm_merge only once. Is there something wrong with my program, or is it a known issue with MPI_Intercomm_merge when called multiple times? The sample code (master.c, spawn.c, hello.c) is the same as the listing quoted above. I will be glad to clarify if anything looks confusing. Any help will be appreciated.
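One pattern worth checking in the listing above (an observation about the posted code, not a confirmed explanation of the MPI_Finalize crash): neither the intercommunicator returned by MPI_Comm_spawn_multiple nor the previously merged grid_comm is ever freed, so communicator handles accumulate across iterations. A sketch of the parent branch with explicit cleanup, reusing the variable names from the listing and substituting MPI_ARGVS_NULL for the NULL args pointer:

    #include <mpi.h>

    extern MPI_Comm grid_comm;   /* the application's current "world", as above */

    /* Parent-side spawn-and-merge step with explicit cleanup of the
       intermediate communicators.  Arguments mirror the listing above. */
    void spawn_and_merge(int nprocs, char *commands[], int maxprocs[],
                         MPI_Info info[])
    {
        MPI_Comm intercomm, merged;

        MPI_Comm_spawn_multiple(nprocs, commands, MPI_ARGVS_NULL, maxprocs,
                                info, 0, grid_comm, &intercomm,
                                MPI_ERRCODES_IGNORE);
        MPI_Intercomm_merge(intercomm, 0, &merged);

        MPI_Comm_free(&intercomm);          /* the intercomm is no longer needed */
        if (grid_comm != MPI_COMM_WORLD)
            MPI_Comm_free(&grid_comm);      /* free the previous merged comm,
                                               never MPI_COMM_WORLD itself */
        grid_comm = merged;
    }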
[OMPI users] MPI + Mixed language coding (Fortran90 + C++)
Hello MPI Users,

I am completely new to MPI and have a basic question concerning MPI and mixed-language coding. I hope some of you can help me out. Is it possible to access FORTRAN common blocks from C++ in an MPI-compiled code? It works without MPI, but as soon as I switch to MPI the access to the common blocks no longer works.

I have a Linux MPI executable which loads a shared library at runtime and resolves all undefined symbols, etc. The shared library is written in C++ and the MPI executable is written in FORTRAN. Some of the input that the shared library looks for is in the Fortran common blocks. When I access those common blocks at runtime, the values are not initialized. I would like to know if what I am doing is possible. I hope my problem is clear.

Your suggestions are welcome!

Thank you,
Rajesh
Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
Hello Jeff Squyres,

Thank you very much for the immediate reply. I am able to successfully access the data from the common block, but the values are zero. In my algorithm I even update a common block, but the update made by the shared library is not taken into account by the executable. Can you please be very specific about how to make the parallel algorithm aware of the data? Actually, I am not writing any MPI code myself; it's the executable (third-party software) that does that part. All that I am doing is compiling my code with the MPI C compiler and adding it to the LD_LIBRARY_PATH. In fact, I did a simple test by creating a shared library from FORTRAN code, and the update made to the common block is taken into account by the executable. Is there any flag or pragma that needs to be activated for mixed-language MPI?

Thank you once again for the reply.
Rajesh

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, 31 October 2008 18:53
To: Open MPI Users
Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)

On Oct 31, 2008, at 11:57 AM, Rajesh Ramaya wrote:

> I am completely new to MPI. I have a basic question concerning MPI and mixed-language coding. I hope any of you could help me out. Is it possible to access FORTRAN common blocks in C++ in an MPI-compiled code? It works without MPI, but as soon as I switch to MPI the access to the common blocks does not work anymore.
> I have a Linux MPI executable which loads a shared library at runtime and resolves all undefined symbols, etc. The shared library is written in C++ and the MPI executable is written in FORTRAN. Some of the input that the shared library is looking for is in the Fortran common blocks. As I access those common blocks during runtime, the values are not initialized. I would like to know if what I am doing is possible? I hope that my problem is clear.

Generally, MPI should not get in the way of sharing common blocks between Fortran and C/C++. Indeed, in Open MPI itself, we share a few common blocks between Fortran and the main C Open MPI implementation.

What is the exact symptom that you are seeing? Is the application failing to resolve symbols at run-time, possibly indicating that something hasn't instantiated a common block? Or are you able to successfully access the data from the common block, but it doesn't have the values you expect (e.g., perhaps you're seeing all zeros)?

If the former, you might want to check your build procedure. You *should* be able to simply replace your C++ / F90 compilers with mpicxx and mpif90, respectively, and be able to build an MPI version of your app.

If the latter, you might need to make your parallel algorithm aware of what data is available in which MPI process -- perhaps not all the data is filled in on each MPI process...?

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
Hello Jeff, Gustavo, Mi,

Thanks for the advice. I am familiar with the differences in compiler code generation for C, C++, and FORTRAN. I even tried to look at some of the common block symbols. The name of the symbol remains the same. The only difference that I observe is that in the FORTRAN-compiled *.o the symbol is defined:

    00515bc0 B aux7loc_

while in the C++-compiled code it is undefined:

    U aux7loc_

i.e. the memory is not allocated, as it has been declared as extern in C++. When the executable loads the shared library, it finds all the undefined symbols; at least, if it had failed to find even a single symbol it would print an undefined-symbol error. I am completely stuck and do not know how to continue further.

Thanks,
Rajesh

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Mi Yan
Sent: Saturday, 1 November 2008 23:26
To: Open MPI Users
Cc: 'Open MPI Users'; users-boun...@open-mpi.org
Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)

So your tests show:
1. "Shared library in FORTRAN + MPI executable in FORTRAN" works.
2. "Shared library in C++ + MPI executable in FORTRAN" does not work.

It seems to me that the symbols in the C library are not really recognized by the FORTRAN executable as you thought. What compilers did you use to build Open MPI?

Different compilers have different conventions for handling symbols. E.g., if there is a variable "var_foo" in your FORTRAN code, some FORTRAN compilers will save "var_foo_" in the object file by default; if you want to access "var_foo" in C code, you actually need to refer to "var_foo_" in the C code. If you define "var_foo" inside a module, some FORTRAN compilers may append the module name to "var_foo". So I suggest checking the symbols in the object files generated by your FORTRAN and C compilers to see the difference.

Mi
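Following the symbol-name discussion above, this is the usual pattern for sharing a Fortran common block with a C/C++ shared library, assuming a compiler that lowercases the block name and appends one trailing underscore (which is what the aux7loc_ entries in the nm output suggest). The members shown are purely illustrative; the real layout of that common block is not known from this thread:

    /* Fortran side (in the MPI executable), shown here as a comment:
     *
     *       DOUBLE PRECISION XVAL(10)
     *       INTEGER NVAL
     *       COMMON /AUX7LOC/ XVAL, NVAL
     *
     * With a compiler that lowercases names and appends one underscore,
     * the block appears in the object file as the defined symbol "aux7loc_"
     * (the "B" line in the nm output above).
     */

    /* C/C++ side (in the shared library): declare a struct with the same
       layout and bind it to the same external symbol.  In C++ it must be
       declared extern "C" so the name is not mangled; the storage itself is
       defined by the Fortran executable, which therefore has to export its
       symbols (e.g. linked with -rdynamic / --export-dynamic). */
    #ifdef __cplusplus
    extern "C" {
    #endif

    extern struct {
        double xval[10];   /* matches DOUBLE PRECISION XVAL(10) */
        int    nval;       /* matches INTEGER NVAL              */
    } aux7loc_;

    #ifdef __cplusplus
    }
    #endif

    /* Example accessor used inside the shared library. */
    double first_xval(void)
    {
        return aux7loc_.xval[0];
    }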
[OMPI users] Machinefile option in openmpi-1.3.2
Hi,

I tested a simple hello world program on 5 nodes, each with dual quad-core processors. I noticed that Open MPI does not always follow the order of the processors indicated in the machinefile. Depending upon the number of processors requested, Open MPI does some type of sorting to find the best node fit for a particular job and runs on those nodes. Is there a way to make Open MPI turn off this sorting and strictly follow the order indicated in the machinefile?

mpiexec supports three options to specify the machinefile: default-machinefile, hostfile, and machinefile. Can anyone tell me the difference between these three options?

Any help would be greatly appreciated.

Thanks,
Rajesh
Re: [OMPI users] Machinefile option in openmpi-1.3.2
Rank 2 of C version says: Hello world!..hostname = n106
Rank 5 of C version says: Hello world!..hostname = n106
Rank 9 of C version says: Hello world!..hostname = n105

Thanks,
Rajesh

On Fri, Jun 19, 2009 at 10:40 PM, Ralph Castain wrote:
> If you do "man orte_hosts", you'll see a full explanation of how the various machinefile options work.
>
> The default mapper doesn't do any type of sorting - it is a round-robin mapper that just works its way through the provided nodes. We don't reorder them in any way.
>
> However, it does depend on the number of slots we are told each node has, so that might be what you are encountering. If you do a --display-map and send it along, I might be able to spot the issue.
>
> Thanks
> Ralph
>
> On Fri, Jun 19, 2009 at 1:35 PM, Rajesh Sudarsan wrote:
>>
>> Hi,
>>
>> I tested a simple hello world program on 5 nodes each with dual quad-core processors. I noticed that openmpi does not always follow the order of the processors indicated in the machinefile. Depending upon the number of processors requested, openmpi does some type of sorting to find the best node fit for a particular job and runs on them. Is there a way to make openmpi to turn off this sorting and strictly follow the order indicated in the machinefile?
>>
>> mpiexec supports three options to specify the machinefile - default-machinefile, hostfile, and machinefile. Can anyone tell what is the difference between these three options?
>>
>> Any help would be greatly appreciated.
>>
>> Thanks,
>> Rajesh
Re: [OMPI users] Machinefile option in openmpi-1.3.2
Thanks Ralph. It worked.

Regards,
Rajesh

On Sat, Jun 20, 2009 at 10:28 AM, Ralph Castain wrote:
> Ah, yes - that is definitely true. What you need to use is the "seq" (for "sequential") mapper. Do the following on your cmd line:
>
> --hostfile hostfile -mca rmaps seq
>
> This will cause OMPI to map the process ranks according to the order in the hostfile. You need to specify one line for each node/rank, just as you have done.
> Ralph
>
> On Fri, Jun 19, 2009 at 10:24 PM, Rajesh Sudarsan wrote:
>>
>> Hi Ralph,
>>
>> Thanks for the reply. The default mapper does round-robin assignment as long as I do not specify the machinefile in the following format:
>>
>> n1
>> n2
>> n2
>> n1
>>
>> where n1 and n2 are two nodes in the cluster and I use two slots within each node.
>>
>> I have pasted the output and the display map for execution on 2, 4, 8, and 16 processors. The mapper does not use the nodes in the order in which they are listed in the file.
>>
>> The machinefile that I tested with uses two nodes, n105 and n106, with 8 cores in each node:
>>
>> n105
>> n105
>> n105
>> n105
>> n106
>> n106
>> n106
>> n106
>> n106
>> n106
>> n106
>> n106
>> n105
>> n105
>> n105
>> n105
>>
>> When I run a hello world program on 2 processors which prints the hostname, the output and the display map are as follows:
>>
>> $ mpiexec --display-map -machinefile m3 -np 2 ./hello
>>
>> JOB MAP
>>
>> Data for node: Name: n106  Num procs: 2
>>   Process OMPI jobid: [7838,1] Process rank: 0
>>   Process OMPI jobid: [7838,1] Process rank: 1
>>
>> =
>> Rank 0 is present in C version of Hello World...hostname = n106
>> Rank 1 of C version says: Hello world!..hostname = n106
>>
>> On 4 processors the output is as follows:
>>
>> $ mpiexec --display-map -machinefile m3 -np 4 ./hello
>>
>> JOB MAP
>>
>> Data for node: Name: n106  Num procs: 4
>>   Process OMPI jobid: [7294,1] Process rank: 0
>>   Process OMPI jobid: [7294,1] Process rank: 1
>>   Process OMPI jobid: [7294,1] Process rank: 2
>>   Process OMPI jobid: [7294,1] Process rank: 3
>>
>> =
>> Rank 0 is present in C version of Hello World...hostname = n106
>> Rank 1 of C version says: Hello world!..hostname = n106
>> Rank 3 of C version says: Hello world!..hostname = n106
>> Rank 2 of C version says: Hello world!..hostname = n106
>>
>> On 8 processors the output is as follows:
>>
>> $ mpiexec --display-map -machinefile m3 -np 8 ./hello
>>
>> JOB MAP
>>
>> Data for node: Name: n106  Num procs: 8
>>   Process OMPI jobid: [7264,1] Process rank: 0
>>   Process OMPI jobid: [7264,1] Process rank: 1
>>   Process OMPI jobid: [7264,1] Process rank: 2
>>   Process OMPI jobid: [7264,1] Process rank: 3
>>   Process OMPI jobid: [7264,1] Process rank: 4
>>   Process OMPI jobid: [7264,1] Process rank: 5
>>   Process OMPI jobid: [7264,1] Process rank: 6
>>   Process OMPI jobid: [7264,1] Process rank: 7
>>
>> =
>> Rank 3 of C version says: Hello world!..hostname = n106
>> Rank 7 of C version says: Hello world!..hostname = n106
>> Rank 0 is present in C version of Hello World...hostname = n106
>> Rank 2 of C version says: Hello world!..hostname = n106
>> Rank 4 of C version says: Hello world!..hostname = n106
>> Rank 6 of C version says: Hello world!..hostname = n106
>> Rank 5 of C version says: Hello world!..hostname = n106
>> Rank 1 of C version says: Hello world!..hostname = n106
>>
>> On 16 processors the output is as follows:
>>
>> $ mpiexec --display-map -machinefile m3 -np 16 ./hello
>>
>> JOB MAP
>>
>> Data for node: Name: n106  Num procs: 8
>>   Process OMPI jobid: [7266,1] Process rank: 0
>>   Process OMPI jobid: [7266,1] Process rank: 1
>>   Process OMPI jobid: [7266,1] Process rank: 2
>>   Process OMPI jo
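For reference, the working setup from this thread reduces to the sequential mapper. A minimal sketch, with an illustrative hostfile and program name and the flags quoted above:

    $ cat hostfile
    n105
    n106
    n106
    n105
    $ mpiexec --display-map --hostfile hostfile -mca rmaps seq -np 4 ./hello

With the "seq" mapper the ranks should follow the file order line by line: rank 0 on n105, ranks 1 and 2 on n106, and rank 3 on n105.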