Re: [OMPI users] Can you set the gid of the processes created by mpirun?
Typically it is something like 'qsub -W group_list=groupB myjob.sh'. Ultimately myjob.sh runs with gid groupB on some host in the cluster. When that script reaches the mpirun command, mpirun and the processes started on the same host all run with gid groupB, but any of the spawned processes that start on other hosts run with the user's default group, say groupA. It did occur to me that the launching technique might have some ability to influence this behavior, as you indicated. I don't know what launcher is being used in our case; I guess it's rsh/ssh.

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
Sent: Wednesday, September 07, 2011 12:24 PM
To: Open MPI Users
Subject: Re: [OMPI users] Can you set the gid of the processes created by mpirun?

Hi,

you mean you change the group id of the user before you submit the job? In GridEngine you can specify whether the actual group id should be used for the job, or the default login group. With a tight integration, the slave processes will also run with the same group id.

-- Reuti

> Ed
>
> From: Ralph Castain [mailto:r...@open-mpi.org]
> Sent: Wednesday, September 07, 2011 8:53 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Can you set the gid of the processes created by mpirun?
>
> On Sep 7, 2011, at 7:38 AM, Blosch, Edwin L wrote:
>
> > The mpirun command is invoked when the user's group is 'set group' to group 650. When the rank 0 process creates files, they have group ownership 650. But the user's login group is group 1040. The child processes that get started on other nodes run with group 1040, and the files they create have group ownership 1040.
> > Is there a way to tell mpirun to start the child processes with the same uid and gid as the rank 0 process?
>
> I'm afraid not - never came up before. Could be done, but probably not right away. What version are you using?
>
> Thanks

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Can you set the gid of the processes created by mpirun?
In my case the directories are actually the "tmp" directories created by the job-scheduling system, but I think a wrapper script could chgrp and setgid them appropriately so that a process running as group 1040 would effectively write files with group ownership 650. Thanks for the clever idea.

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
Sent: Thursday, September 15, 2011 12:23 PM
To: Open MPI Users
Subject: Re: [OMPI users] Can you set the gid of the processes created by mpirun?

Edwin,

going back to this:

>> The mpirun command is invoked when the user's group is 'set group' to group 650. When the rank 0 process creates files, they have group ownership 650. But the user's login group is group 1040. The child processes that get started on other nodes run with group 1040, and the files they create have group ownership 1040.

What about setting the setgid flag on the directory? Files created therein will then inherit the group from the directory (which has to be set to the group in question, of course).

-- Reuti

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
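For the archives, a minimal sketch of that wrapper step, assuming the scheduler exports the per-job scratch directory path as $TMPDIR and the submitting user is a member of group 650:

   chgrp 650 "$TMPDIR"    # give the scratch directory the desired group
   chmod g+s "$TMPDIR"    # setgid bit: files created inside inherit group 650

Note that the setgid bit only affects files created after it is set, and subdirectories created later inherit the bit as well.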
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
Thank you for all this information. Your diagnosis is totally right. I actually sent an e-mail yesterday but apparently it never got through :<

It IS the MPI application that is failing to link, not OpenMPI itself; my e-mail was not well written -- sorry, Brice.

The situation is this: I am trying to compile using an OpenMPI 1.5.4 that was built to be rooted in /release, but it is not placed there yet (testing); it is currently under /builds/release. I have set OPAL_PREFIX in the environment, with the intention of helping the compiler wrappers work right. Under /release, I currently have OpenMPI 1.4.3, whereas the OpenMPI under /builds/release is 1.5.4.

What I am getting is this: the mpif90 wrapper (under /builds/release/openmpi/bin) puts -I/release instead of -I/builds/release, but it includes -L/builds/release. So I'm getting headers from 1.4.3 when compiling, but the libmpi from 1.5.4 when linking.

I did a quick "move 1.4.3 out of the way and put 1.5.4 over to /release where it belongs" test, and my application did link without errors, so I think that confirms the nature of the problem.

Is it a bug that mpif90 didn't pay attention to OPAL_PREFIX in the -I but did use it in the -L?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, September 30, 2011 7:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'

I think the issue here is that it's linking the *MPI application* that is causing the problem. Is that right?

If so, can you send your exact application compile line, and the output of that compile line with "--showme" at the end?

On Sep 29, 2011, at 4:24 PM, Brice Goglin wrote:

> On 28/09/2011 23:02, Blosch, Edwin L wrote:
>> Jeff,
>>
>> I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable.
>
> This function is likely available... in the dynamic version of libnuma (that's why configure is happy), but make is probably trying to link with the static version, which isn't available on your machine. That's my guess, at least.
>
>> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?
>
> Instead of doing
>   ./configure ...
>   make
> do
>   ./configure
>   make V=1
>
> It will make the output more verbose. Once you get the failure, please send the last 15 lines or so. We will look at these verbose lines to understand how things are being compiled (which linker flags, which libraries, ...).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
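For what it's worth, the wrapper's view of its own flags can be checked without compiling anything. Under the assumption that the relocated 1.5.4 wrappers are first in PATH and that the relocated prefix is /builds/release/openmpi (the exact path is an assumption here), something like:

   export OPAL_PREFIX=/builds/release/openmpi   # relocated install root (assumed path)
   which mpif90                                 # confirm which wrapper is being picked up
   mpif90 --showme:compile                      # the -I flags the wrapper will add
   mpif90 --showme:link                         # the -L and -l flags the wrapper will add

Comparing the two --showme outputs makes the -I/-L mismatch Jeff asks about visible directly.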
Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
Thanks very much, exactly what I wanted to hear. How big is /tmp?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway. On our cluster, all the compute nodes are completely diskless. The root file system, including /tmp, resides in memory (ramdisk). OpenMPI puts its session directories therein. All our jobs run through a batch system (Torque). At the conclusion of each batch job, an epilogue process runs that removes all files belonging to the owner of the current batch job from /tmp (and also looks for and kills orphan processes belonging to the user). This epilogue had to be written by our systems staff. I believe this is a fairly common configuration for diskless clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
> Thanks for the help. A couple of follow-up questions; maybe this starts to go outside OpenMPI:
>
> What's wrong with using /dev/shm? I think you said earlier in this thread that this was not a safe place.
>
> If the NFS mount point is moved from /tmp to /work, would a /tmp magically appear in the filesystem for a stateless node? How big would it be, given that there is no local disk? That may be something I have to ask the vendor, which I've tried, but they don't quite seem to get the question.
>
> Thanks
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Thursday, November 03, 2011 5:22 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>
> On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:
>
>> I might be missing something here. Is there a side effect or performance loss if you don't use the sm btl? Why would it exist if there is a wholly equivalent alternative? What happens to traffic that is intended for another process on the same node?
>
> There is a definite performance impact, and we wouldn't recommend doing what Eugene suggested if you care about performance.
>
> The correct solution here is to get your sys admin to make /tmp local. Making /tmp NFS-mounted across multiple nodes is a major "faux pas" in the Linux world - it should never be done, for the reasons stated by Jeff.
>
>> Thanks
>>
>> -----Original Message-----
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 1:23 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>>
>> Right. Actually "--mca btl ^sm". (Was missing "btl".)
>>
>> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session file on /tmp, which is NFS-mounted and thus not a good choice.
>>>
>>> Are you suggesting something like --mca ^sm?
>>>
>>> -----Original Message-----
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Eugene Loh
>>> Sent: Thursday, November 03, 2011 12:54 PM
>>> To: us...@open-mpi.org
>>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage
>>>
>>> I've not been following closely. Why must one use shared-memory communications? How about using other BTLs in a "loopback" fashion?
--
Best regards,

David Turner
User Services Group      email: dptur...@lbl.gov
NERSC Division           phone: (510) 486-4027
Lawrence Berkeley Lab    fax:   (510) 486-4316

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
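For completeness, the two workarounds discussed in this thread correspond to command lines roughly like the following (the MCA parameter name is as in the 1.4/1.6 series, and the node-local path is just a placeholder):

   # avoid the sm BTL entirely (works, but costs on-node communication performance)
   mpirun --mca btl ^sm -np 16 ./a.out

   # or keep sm and point the session directories at a node-local directory
   mpirun --mca orte_tmpdir_base /path/to/node-local/tmp -np 16 ./a.out

As Ralph notes above, the preferred fix is a genuinely local /tmp; these are stopgaps.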
[OMPI users] Confused on simple MPI/OpenMP program
Consider this Fortran program snippet:

   program test

   ! everybody except rank=0 exits.
   call mpi_init(ierr)
   call mpi_comm_rank(MPI_COMM_WORLD,irank,ierr)
   if (irank /= 0) then
      call mpi_finalize(ierr)
      stop
   endif

   ! rank 0 tries to set number of OpenMP threads to 4
   call omp_set_num_threads(4)
   nthreads = omp_get_max_threads()
   print*, "nthreads = ", nthreads

   call mpi_finalize(ierr)

   end program test

It is compiled like this: 'mpif90 -o test -O2 -openmp test.f90' (Intel 11.x)

When I run it like this: mpirun -np 2 ./test

The output is: "nthreads = 0"

Does that make sense? I was expecting 4.

If I comment out the MPI lines and run the program serially (but still compiled with mpif90), then I get the expected output value 4.

I'm sure I must be overlooking something basic here. Please enlighten me. Does this have anything to do with how I've configured OpenMPI?

Thanks,

Ed
Re: [OMPI users] Confused on simple MPI/OpenMP program
I figured it out. In the real application I also did not have the 'use' statement, and there was an IMPLICIT statement causing the omp_get_max_threads() function to be automatically compiled as a real function instead of as an integer, with the integers promoted to 8 bytes by -i8. Once I added the 'use omp_lib' statement, the compiler caught the mismatch.

Just to verify, I added the 'use omp_lib' statement and ran the test program by itself. I do get '4' as expected, regardless of whether or not I run the program under MPI. So there is no OpenMPI-related issue. I thought it was OpenMPI-related because, after commenting out the MPI calls, I got the right answer. But that was probably just a coincidence.

Thanks,

Ed

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
Sent: Thursday, April 04, 2013 7:13 AM
To: Open MPI Users
Subject: Re: [OMPI users] Confused on simple MPI/OpenMP program

Hi,

On 04.04.2013, at 04:35, Ed Blosch wrote:

> Consider this Fortran program snippet:
>
> program test

use omp_lib
include 'mpif.h'

might be missing.

> ! everybody except rank=0 exits.
> call mpi_init(ierr)
> call mpi_comm_rank(MPI_COMM_WORLD,irank,ierr)
> if (irank /= 0) then
>    call mpi_finalize(ierr)
>    stop
> endif
>
> ! rank 0 tries to set number of OpenMP threads to 4
> call omp_set_num_threads(4)
> nthreads = omp_get_max_threads()
> print*, "nthreads = ", nthreads
>
> call mpi_finalize(ierr)
>
> end program test
>
> It is compiled like this: 'mpif90 -o test -O2 -openmp test.f90' (Intel 11.x)
>
> When I run it like this: mpirun -np 2 ./test
>
> The output is: "nthreads = 0"
>
> Does that make sense? I was expecting 4.
>
> If I comment out the MPI lines and run the program serially (but still compiled with mpif90), then I get the expected output value 4.

Nope, for me it's still 0 then.

-- Reuti

> I'm sure I must be overlooking something basic here. Please enlighten me.
> Does this have anything to do with how I've configured OpenMPI?
>
> Thanks,
>
> Ed

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
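For anyone finding this thread later, a corrected version of the test program is sketched below. The only substantive changes from the original snippet are the 'use omp_lib' line and explicit declarations; it is compiled the same way ('mpif90 -o test -O2 -openmp test.f90'):

   program test
   use omp_lib            ! provides the proper interface for omp_get_max_threads()
   implicit none
   include 'mpif.h'
   integer :: ierr, irank, nthreads

   ! everybody except rank=0 exits.
   call mpi_init(ierr)
   call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr)
   if (irank /= 0) then
      call mpi_finalize(ierr)
      stop
   endif

   ! rank 0 sets the number of OpenMP threads to 4
   call omp_set_num_threads(4)
   nthreads = omp_get_max_threads()   ! now correctly resolved as an integer function
   print*, "nthreads = ", nthreads

   call mpi_finalize(ierr)
   end program test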
Re: [OMPI users] EXTERNAL: Re: basic questions about compiling OpenMPI
Much appreciated, guys. I am a middle man in a discussion over whether MPI should be handled by apps people or systems people, and there was some confusion when we saw that RHEL6 had an RPM for OpenMPI. Your comments make it clear that there is a pretty strong preference to build OpenMPI on the system where it will be used, with compilers that match your application's compiler. Just a prereq for supporting a development environment.

-------- Original message --------
From: Tim Prince
Date:
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: basic questions about compiling OpenMPI

On 5/25/2013 8:26 AM, Jeff Squyres (jsquyres) wrote:
> On May 23, 2013, at 9:50 AM, "Blosch, Edwin L" wrote:
>
>> Excellent. Now I've read the FAQ and noticed that it doesn't mention the issue with the Fortran 90 .mod signatures. Our applications are Fortran. So your replies are very helpful -- now I know it really isn't practical for us to use the default OpenMPI shipped with RHEL6, since we use both Intel and PGI compilers and have several applications to accommodate. Presumably if all the applications did INCLUDE 'mpif.h' instead of 'USE MPI' then we could get things working, but it's not a great workaround.
>
> No, not even if they use mpif.h. Here's a chunk of text from the v1.6 README:
>
> - While it is possible -- on some platforms -- to configure and build
>   Open MPI with one Fortran compiler and then build MPI applications
>   with a different Fortran compiler, this is not recommended. Subtle
>   problems can arise at run time, even if the MPI application
>   compiled and linked successfully.
>
>   Specifically, the following two cases may not be portable between
>   different Fortran compilers:
>
>   1. The C constants MPI_F_STATUS_IGNORE and MPI_F_STATUSES_IGNORE
>      will only compare properly to Fortran applications that were
>      created with Fortran compilers that use the same name-mangling
>      scheme as the Fortran compiler with which Open MPI was configured.
>
>   2. Fortran compilers may have different values for the logical
>      .TRUE. constant. As such, any MPI function that uses the Fortran
>      LOGICAL type may only get .TRUE. values back that correspond to
>      the .TRUE. value of the Fortran compiler with which Open MPI was
>      configured. Note that some Fortran compilers allow forcing
>      .TRUE. to be 1 and .FALSE. to be 0. For example, the Portland
>      Group compilers provide the "-Munixlogical" option, and Intel
>      compilers (version >= 8.) provide the "-fpscomp logicals" option.
>
>   You can use the ompi_info command to see the Fortran compiler with
>   which Open MPI was configured.

Even when the name-mangling obstacle doesn't arise (it shouldn't for the cited case of gfortran vs. ifort), run-time library function usage is likely to conflict between the compiler used to build the MPI Fortran library and the compiler used to build the application. So there really isn't a good incentive to retrogress away from the USE files simply to avoid one aspect of mixing incompatible compilers.

--
Tim Prince

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
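As the quoted README says, ompi_info reports which Fortran compiler an installation was configured with; two quick checks (the exact output labels vary a little between versions):

   ompi_info | grep -i fortran    # lists the Fortran 77/90 compilers recorded at configure time
   mpif90 --showme                # shows the underlying compiler command the wrapper will invoke

Running these against the RHEL6 RPM installation and against a locally built one makes the mismatch with Intel or PGI application builds easy to spot.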
Re: [OMPI users] Application hangs on mpi_waitall
It ran a bit longer but still deadlocked. All matching sends are posted 1:1 with posted recvs, so it is a delivery issue of some kind. I'm running a debug-compiled version tonight to see what that might turn up. I may try to rewrite with blocking sends and see if that works. I can also try adding a barrier (irecvs, barrier, isends, waitall) to make sure sends are not buffering while waiting for recvs to be posted.

Sent via the Samsung Galaxy S™ III, an AT&T 4G LTE smartphone

-------- Original message --------
From: George Bosilca
Date:
To: Open MPI Users
Subject: Re: [OMPI users] Application hangs on mpi_waitall

Ed,

I'm not sure, but there might be a case where the BTL is getting overwhelmed by the non-blocking operations while trying to set up the connection. There is a simple test for this. Add an MPI_Alltoall with a reasonable size (100k) before you start posting the non-blocking receives, and let's see if this solves your issue.

George.

On Jun 26, 2013, at 04:02, eblo...@1scom.net wrote:

> An update: I recoded the mpi_waitall as a loop over the requests with mpi_test and a 30-second timeout. The timeout happens unpredictably, sometimes after 10 minutes of run time, other times after 15 minutes, for the exact same case.
>
> After 30 seconds, I print out the status of all outstanding receive requests. The message tags that are outstanding have definitely been sent, so I am wondering why they are not getting received.
>
> As I said before, everybody posts non-blocking standard receives, then non-blocking standard sends, then calls mpi_waitall. Each process is typically waiting on 200 to 300 requests. Is deadlock possible via this implementation approach under some kind of unusual conditions?
>
> Thanks again,
>
> Ed
>
>> I'm running OpenMPI 1.6.4 and seeing a problem where mpi_waitall never returns. The case runs fine with MVAPICH. The logic associated with the communications has been extensively debugged in the past; we don't think it has errors. Each process posts non-blocking receives, non-blocking sends, and then does waitall on all the outstanding requests.
>>
>> The work is broken down into 960 chunks. If I run with 960 processes (60 nodes of 16 cores each), things seem to work. If I use 160 processes (each process handling 6 chunks of work), then each process is handling 6 times as much communication, and that is the case that hangs with OpenMPI 1.6.4; again, it seems to work with MVAPICH. Is there an obvious place to start, diagnostically? We're using the openib btl.
>>
>> Thanks,
>>
>> Ed

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
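For reference, the diagnostic Ed describes (replacing mpi_waitall with a polling loop over mpi_test plus a timeout) might be sketched roughly as below; the array names, the request count nreq, and the 30-second limit are placeholders taken from the description in the thread:

   ! assumes: include 'mpif.h' (or use mpi), req(1:nreq) holds the outstanding
   ! requests, and stat is dimensioned (MPI_STATUS_SIZE, nreq)
   integer          :: i, ierr, ndone
   logical          :: flag, done(nreq)
   double precision :: tstart

   tstart = MPI_Wtime()
   ndone  = 0
   done   = .false.
   do while (ndone < nreq)
      do i = 1, nreq
         if (.not. done(i)) then
            call MPI_Test(req(i), flag, stat(1,i), ierr)   ! completes the request if it is done
            if (flag) then
               done(i) = .true.
               ndone   = ndone + 1
            end if
         end if
      end do
      if (MPI_Wtime() - tstart > 30.0d0) then
         print *, 'timeout: ', nreq - ndone, ' requests still outstanding'
         exit
      end if
   end do

This does the same completion work as mpi_waitall but lets you report exactly which requests are stuck when the timeout trips.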
Re: [OMPI users] Open MPI exited on signal 11 (Segmentation fault). Trying to run a script that uses Open MPI
Compile with -traceback and -check all if using Intel; otherwise find the equivalent compiler options to check array-bounds accesses and to dump a stack trace. Then compile a debug build and run that way. Assuming it fails, you will probably get good information on the source of the problem. If it doesn't fail, then the compiler has a bug (not as rare as you might think). You need to look at the application's output, not the output from mpirun.

Ed

Sent via the Samsung Galaxy S™ III, an AT&T 4G LTE smartphone

-------- Original message --------
From: Ralph Castain
Date:
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI exited on signal 11 (Segmentation fault). Trying to run a script that uses Open MPI

Well, it's telling you that your program segfaulted - so I'd start with that, perhaps looking for any core file it might have dropped.

On Jul 4, 2013, at 8:36 PM, Rick White wrote:

Hello,

I have this error:
mpiexec noticed that process rank 1 with PID 16087 on node server exited on signal 11 (Segmentation fault)

Wondering how to fix it?

Cheers and many thanks,
Rick

--
Richard Allen White III M.S.
PhD Candidate - Suttle Lab
Department of Microbiology & Immunology
The University of British Columbia
Vancouver, BC, Canada
cell. 604-440-5150
http://www.ocgy.ubc.ca/~suttle/

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
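To make that advice concrete, a debug build of a Fortran MPI application with the Intel compilers could look something like this (the file and program names are placeholders):

   mpif90 -g -O0 -traceback -check all -o myapp_debug myapp.f90
   mpirun -np 2 ./myapp_debug

With -check all the Intel runtime flags out-of-bounds array accesses (among other checks), and -traceback makes it print the source file and line number where the segfault occurred, which usually pinpoints the bug far faster than staring at the mpirun message.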