Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager
Thank you, Ralph. I just hope that it helps you to improve the quality of the
openmpi-1.7 series.

Tetsuya Mishima

> Hmmm... okay, I understand the scenario. Must be something in the algo when
> it only has one node, so it shouldn't be too hard to track down.
>
> I'm off on travel for a few days, but will return to this when I get back.
>
> Sorry for the delay - will try to look at this while I'm gone, but can't
> promise anything :-(
>
>
> On Dec 10, 2013, at 6:58 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> > Hi Ralph, sorry for the confusion.
> >
> > We usually log on to "manage", which is our control node. From manage, we
> > submit jobs or enter a remote node such as node03 via Torque interactive
> > mode (qsub -I).
> >
> > At that time, instead of Torque, I just did rsh to node03 from manage
> > and ran myprog on the node. I hope you can understand what I did.
> >
> > Now, I retried with "-host node03", which still causes the problem:
> > (I confirmed that a local run on manage caused the same problem too)
> >
> > [mishima@manage ~]$ rsh node03
> > Last login: Wed Dec 11 11:38:57 from manage
> > [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > [mishima@node03 demos]$
> > [mishima@node03 demos]$ mpirun -np 8 -host node03 -report-bindings
> > -cpus-per-proc 4 -map-by socket myprog
> > --------------------------------------------------------------------------
> > A request was made to bind to that would result in binding more
> > processes than cpus on a resource:
> >
> >    Bind to:     CORE
> >    Node:        node03
> >    #processes:  2
> >    #cpus:       1
> >
> > You can override this protection by adding the "overload-allowed"
> > option to your binding directive.
> > --------------------------------------------------------------------------
> >
> > It's strange, but I have to report that "-map-by socket:span" worked well.
> >
> > [mishima@node03 demos]$ mpirun -np 8 -host node03 -report-bindings
> > -cpus-per-proc 4 -map-by socket:span myprog
> > [node03.cluster:11871] MCW rank 2 bound to socket 1[core 8[hwt 0]],
> > socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]:
> > [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> > [node03.cluster:11871] MCW rank 3 bound to socket 1[core 12[hwt 0]],
> > socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
> > [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
> > [node03.cluster:11871] MCW rank 4 bound to socket 2[core 16[hwt 0]],
> > socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]:
> > [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> > [node03.cluster:11871] MCW rank 5 bound to socket 2[core 20[hwt 0]],
> > socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]:
> > [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
> > [node03.cluster:11871] MCW rank 6 bound to socket 3[core 24[hwt 0]],
> > socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]:
> > [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
> > [node03.cluster:11871] MCW rank 7 bound to socket 3[core 28[hwt 0]],
> > socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]:
> > [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
> > [node03.cluster:11871] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> > socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
> > [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > [node03.cluster:11871] MCW rank 1 bound to socket 0[core 4[hwt 0]],
> > socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
> > [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > Hello world from process 2 of 8
> > Hello world from process 6 of 8
> > Hello world from process 3 of 8
> > Hello world from process 7 of 8
> > Hello world from process 1 of 8
> > Hello world from process 5 of 8
> > Hello world from process 0 of 8
> > Hello world from process 4 of 8
> >
> > Regards,
> > Tetsuya Mishima
> >
> >
> >> On Dec 10, 2013, at 6:05 PM, tmish...@jcity.maeda.co.jp wrote:
> >>
> >>>
> >>> Hi Ralph,
> >>>
> >>> I tried again with -cpus-per-proc 2 as shown below.
> >>> Here, I found that "-map-by socket:span" worked well.
> >>>
> >>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 2
> >>> -map-by socket:span myprog
> >>> [node03.cluster:10879] MCW rank 2 bound to socket 1[core 8[hwt 0]],
> >>> socket 1[core 9[hwt 0]]:
> >>> [./././././././.][B/B/./././././.][./././././././.][./././././././.]
> >>> [node03.cluster:10879] MCW rank 3 bound to socket 1[core 10[hwt 0]],
> >>> socket 1[core 11[hwt 0]]:
> >>> [./././././././.][././B/B/./././.][./././././././.][./././././././.]
> >>> [node03.cluster:10879] MCW rank 4 bound to socket 2[core 16[hwt 0]],
> >>> socket 2[core 17[hwt 0]]: [./././
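The myprog demo used in the runs above is not included in the thread. For
readers who want to reproduce the mapping/binding behavior, a minimal stand-in
that produces the same "Hello world from process N of M" lines is sketched
below; this is purely an assumed replacement for the demo program, not the
actual code from the thread.

/* Assumed stand-in for the "myprog" demo used in the runs above; the real
 * program is not shown in the thread. It only reports rank and size, which
 * matches the "Hello world from process N of M" output. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Run under mpirun with -report-bindings and the -cpus-per-proc / -map-by
options quoted above, it exercises only the launcher's mapping logic, which is
where the reported problem lives.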
Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing
On Dec 10, 2013, at 10:42 AM, Dave Love wrote:

> This doesn't seem to have been fixed, and I think it's going to bite here.
> Is this the right change?

Thanks for reminding us.

> --- openmpi-1.6.5/ompi/config/ompi_setup_mpi_fortran.m4~  2012-04-03 15:30:24.0 +0100
> +++ openmpi-1.6.5/ompi/config/ompi_setup_mpi_fortran.m4   2013-12-10 12:23:54.232854527 +
> @@ -127,8 +127,8 @@
>  AC_MSG_RESULT([skipped (no Fortran bindings)])
>  else
>  bytes=`expr 4 \* $ac_cv_sizeof_int + $ac_cv_sizeof_size_t`
> -num_integers=`expr $bytes / $OMPI_SIZEOF_FORTRAN_INTEGER`
> -sanity=`expr $num_integers \* $OMPI_SIZEOF_FORTRAN_INTEGER`
> +num_integers=`expr $bytes / $ac_cv_sizeof_int`
> +sanity=`expr $num_integers \* $ac_cv_sizeof_int`

I think this is right, but it has different implications for different series:

1. No more releases are planned for the v1.6 series. We can commit this fix
over there, and it will be available via nightly tarballs. There are also ABI
implications -- see #2, below.

2. This fix changes the ABI for the 1.5/1.6 and 1.7/1.8 series (separately, of
course). As such, we will need to make this a non-default configure option.
E.g., only do this new behavior if --enable-abi-breaking-fortran-status-i8-fix
is specified (or some name like that). By default, we have to keep the ABI for
the entire 1.5/1.6 and 1.7/1.8 series -- so if you specify this switch, you
acknowledge that you're breaking ABI for the -i8 case.

3. For the v1.9 series (i.e., currently the SVN trunk), we can make this be
the default, and the --enable-abi-breaking... switch will not exist.

Sound ok?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
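For readers following the quoted expr lines: the check builds a byte count
from 4 C ints plus a size_t, divides it by an integer size (the Fortran
INTEGER size before the patch, the C int size after), and re-multiplies to get
"sanity", which is presumably compared against "bytes" later in the macro (not
shown in the hunk). The small C program below is only an illustration of those
quantities on a given platform; it is not part of the patch or of Open MPI.

/* Illustrative only: print the sizes that the configure sanity check above
 * works with. The 4*int + size_t expression mirrors the quoted `expr` line. */
#include <stdio.h>
#include <mpi.h>

int main(void)
{
    size_t bytes = 4 * sizeof(int) + sizeof(size_t);
    printf("sizeof(int)        = %zu\n", sizeof(int));
    printf("sizeof(size_t)     = %zu\n", sizeof(size_t));
    printf("4*int + size_t     = %zu bytes\n", bytes);
    printf("sizeof(MPI_Status) = %zu\n", sizeof(MPI_Status));
    return 0;
}

On a typical LP64 system this prints 24 bytes, which divides evenly by a
4-byte int but which an 8-byte Fortran INTEGER (-i8) handles differently --
the situation the patch and the ABI discussion above are about.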
Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing
All,

Per George's recommendation, I found the following definition of
ompi_fortran_integer_t:

#define ompi_fortran_integer_t long long

My application is working fine with the MPI_STATUS_IGNORE flag set.

Cheers,
--Jim

On Tue, Nov 12, 2013 at 3:42 AM, George Bosilca wrote:

> On Nov 12, 2013, at 00:38, Jeff Squyres (jsquyres) wrote:
>
> > 2. In the 64 bit case, you'll have a difficult time extracting the MPI
> > status values from the 8-byte INTEGERs in the status array in Fortran
> > (because the first 2 of 3 will each really be 2 4-byte integers).
>
> My understanding is that in Fortran, explicitly typed variables will retain
> their expected size. Thus, instead of declaring
>
> INTEGER :: status[MPI_STATUS_SIZE]
>
> one should go for
>
> INTEGER*4 :: status[MPI_STATUS_SIZE]
>
> This should make it work right now. However, it is a non-standard
> solution, and we should fix the status handling internally in Open MPI.
>
> Looking at the code, I think that correctly detecting the type of our
> ompi_fortran_integer_t during configure (which should be a breeze if the
> correct flags are passed) should solve all issues here, as we are protecting
> the status conversion between C and Fortran.
>
> Jim, can you go into the include directory of your Open MPI installation and
> grep for the definition of ompi_fortran_integer_t, please?
>
> George.
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
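The "status conversion between C and Fortran" George mentions is exposed at
the API level by the standard MPI_Status_c2f / MPI_Status_f2c routines. The
sketch below shows that round trip; it assumes a typical x86_64 build where
the byte arithmetic discussed in this thread gives MPI_STATUS_SIZE = 6
(24 bytes / 4-byte INTEGER), so adjust the array size to match your mpif.h.

/* Sketch of the C <-> Fortran status round trip via the standard
 * MPI_Status_c2f / MPI_Status_f2c calls. The Fortran status is an array of
 * MPI_STATUS_SIZE INTEGERs, seen from C as MPI_Fint; 6 is assumed here and
 * may differ on other builds (e.g. with -i8). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    /* Obtain a real status from a send/receive to self. */
    int sendbuf = 7, recvbuf = 0;
    MPI_Status c_status;
    MPI_Sendrecv(&sendbuf, 1, MPI_INT, 0, 99,
                 &recvbuf, 1, MPI_INT, 0, 99,
                 MPI_COMM_SELF, &c_status);

    /* Convert to the Fortran representation and back. */
    MPI_Fint f_status[6];            /* assumed MPI_STATUS_SIZE */
    MPI_Status_c2f(&c_status, f_status);

    MPI_Status back;
    MPI_Status_f2c(f_status, &back);
    printf("source=%d tag=%d\n", back.MPI_SOURCE, back.MPI_TAG);

    MPI_Finalize();
    return 0;
}

If the Fortran INTEGER size is detected incorrectly, this translation layer is
exactly where the 8-byte-INTEGER status array gets mis-sliced, which is the
failure mode Jeff describes above.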
[OMPI users] environment variables and MPI_Comm_spawn
Hi all,

I'm developing on Open MPI 1.4.5-ubuntu2 on Ubuntu 13.10 (so, Ubuntu's
packaged Open MPI) at the moment.

I'd like to pass environment variables to processes started via
MPI_Comm_spawn. Unfortunately, the MPI 3.0 standard (at least) does not seem
to specify a way to do this, so I have been searching for
implementation-specific ways to accomplish my task.

I have tried setting the environment variable using the POSIX setenv(3) call,
but it seems that Open MPI comm-spawn'd processes do not inherit environment
variables. See the attached two C99 programs: one prints out the environment
it receives, and the other sets the MEANING_OF_LIFE environment variable,
spawns the previous 'env printing' program, and exits. I run via:

  $ env -i HOME=/home/tfogal \
    PATH=/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin \
    mpirun -x TJFVAR=testing -n 5 ./mpienv ./envpar

and expect (well, hope) to find the MEANING_OF_LIFE in 'envpar's output. I do
see TJFVAR, but the MEANING_OF_LIFE sadly does not propagate.

Perhaps I am asking the wrong question... I found another MPI implementation
which allowed passing such information via the MPI_Info argument, but I could
find no documentation of similar functionality in Open MPI. Is there a way to
accomplish what I'm looking for? I could even be convinced to hack the source,
but a starting pointer would be appreciated.

Thanks,

-tom

The first program (run above as ./mpienv), which sets the variable and spawns
its argument:

#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

#define ROOT(stmt) \
  do { \
    if(rank() == 0) { stmt; } \
  } while(0)

static size_t rank();
static size_t size();
static void rebuild_args(size_t argc, char* argv[], char** cmd, char* subv[]);

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  if(rank() == 0 && argc < 2) {
    fprintf(stderr, "Need at least one argument: the binary to run.\n");
    MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
  }
  ROOT(printf("Running on %zu procs.\n", size()));

  /* MPI (annoyingly) displaces the argument list by one, so rebuild it. */
  char* subv[argc];
  memset(subv, 0, sizeof(char*)*argc);
  char* command;
  assert(argc > 0);
  rebuild_args((size_t)argc, argv, &command, subv);

  MPI_Comm intercomm; /* we don't need, but MPI requires. */
  int errors[size()];
  if(setenv("MEANING_OF_EVERYTHING", "42", 1) != 0) {
    fprintf(stderr, "[%zu] failed setting MEANING_OF_EVERYTHING env var.\n",
            rank());
  }
  int spawn = MPI_Comm_spawn(command, subv, (int)size(), MPI_INFO_NULL, 0,
                             MPI_COMM_WORLD, &intercomm, errors);
  if(spawn != MPI_SUCCESS) {
    fprintf(stderr, "[%zu] spawn error: %d\n", rank(), spawn);
  }
  for(size_t i=0; rank()==0 && i < size(); ++i) {
    if(errors[i] != MPI_SUCCESS) {
      printf("process %zu error: %d\n", i, errors[i]);
    }
  }

  MPI_Finalize();
  return 0;
}

static size_t rank()
{
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  return (size_t)rank;
}

static size_t size()
{
  int size;
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  return (size_t)size;
}

static void rebuild_args(size_t argc, char* argv[], char** cmd, char* subv[])
{
  /* argv[0] is the name of this program.
   * argv[1] is the name of the program the user wanted to run, "child"
   * argv[x] for x > 1 are the arguments of "child". */
  for(size_t i=2; i < argc; ++i) {
    subv[i-2] = argv[i];
  }
  *cmd = argv[1];
}

The second program (./envpar), which prints the environment it receives:

#include <stdio.h>
#include <mpi.h>

static size_t rank()
{
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  return (size_t)rank;
}

extern char** environ;

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  for(char** ev=environ; rank() == 0 && *ev; ++ev) {
    printf("env: %s\n", *ev);
  }
  MPI_Finalize();
}
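On the MPI_Info point raised above: the MPI standard does reserve a handful of
info keys for MPI_Comm_spawn (such as "wdir", "host", and "path"), but any key
for passing environment variables is implementation-specific, and as the post
notes, none is documented for this version of Open MPI. The sketch below shows
only the general info mechanism, using the standard "wdir" key; the
environment-style key is purely hypothetical and left commented out.

/* Sketch of passing hints to MPI_Comm_spawn through an MPI_Info object.
 * "wdir" is a key reserved by the MPI standard for spawn; an
 * environment-passing key is NOT standard and (per the discussion above) not
 * documented in Open MPI, so the commented-out line is purely hypothetical. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "wdir", "/tmp");   /* standard reserved key */
    /* MPI_Info_set(info, "env", "MEANING_OF_LIFE=42");
       ^ hypothetical key, not documented for this Open MPI version */

    MPI_Comm intercomm;
    int errcodes[1];
    MPI_Comm_spawn("./envpar", MPI_ARGV_NULL, 1, info, 0,
                   MPI_COMM_WORLD, &intercomm, errcodes);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}

MPI_Info_create/MPI_Info_set/MPI_Info_free and MPI_ARGV_NULL are standard MPI;
whether any info key makes spawned children inherit environment variables is
exactly the implementation-specific question this post asks.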