On Fri, Jan 15, 2016 at 7:53 AM, Matt Thompson <fort...@gmail.com> wrote:
> All,
>
> I'm not too sure if this is an MPI issue, a Fortran issue, or something
> else, but I thought I'd ask the MPI gurus here first since my web search
> failed me.
>
> There is a chance that in the future I might want/need to query an
> environment variable in a Fortran program, namely to figure out which
> switch a currently running process is on (via SLURM_TOPOLOGY_ADDR in my
> case) and perhaps make a "per-switch" communicator. [1]
>
> So, I coded up a boring Fortran program whose only exciting lines are:
>
>     call MPI_Get_Processor_Name(processor_name,name_length,ierror)
>     call get_environment_variable("HOST",host_name)
>
>     write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, &
>        "is on processor", trim(processor_name)
>     write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, &
>        "is on host", trim(host_name)
>
> I decided to try the HOST environment variable first because it is simple
> and different on each node (I didn't want to take many, many nodes to find
> the point where a switch is traversed). I then grabbed two nodes with 4
> processes per node and...:
>
>     (1046) $ echo "$SLURM_NODELIST"
>     borgj[020,036]
>     (1047) $ pdsh -w "$SLURM_NODELIST" echo '$HOST'
>     borgj036: borgj036
>     borgj020: borgj020
>     (1048) $ mpifort -o hostenv.x hostenv.F90
>     (1049) $ mpirun -np 8 ./hostenv.x | sort -g -k2
>     Process 0 of 8 is on host borgj020
>     Process 0 of 8 is on processor borgj020
>     Process 1 of 8 is on host borgj020
>     Process 1 of 8 is on processor borgj020
>     Process 2 of 8 is on host borgj020
>     Process 2 of 8 is on processor borgj020
>     Process 3 of 8 is on host borgj020
>     Process 3 of 8 is on processor borgj020
>     Process 4 of 8 is on host borgj020
>     Process 4 of 8 is on processor borgj036
>     Process 5 of 8 is on host borgj020
>     Process 5 of 8 is on processor borgj036
>     Process 6 of 8 is on host borgj020
>     Process 6 of 8 is on processor borgj036
>     Process 7 of 8 is on host borgj020
>     Process 7 of 8 is on processor borgj036
>
> It looks like MPI_Get_Processor_Name is doing its thing, but the HOST one
> seems only to be reflecting the first host. My guess is that Open MPI
> doesn't export each process's environment separately to every process, so
> it is reflecting HOST from process 0.

I would guess that what is actually happening is that SLURM is exporting
all of the variables from the host node, including the $HOST variable, and
overwriting the defaults on the other nodes. You should use the SLURM
options to limit the list of variables that you export from the host node
to only those that you need.

> So, I guess my question is: can this be done? Is there an option to Open
> MPI that might do it? Or is this just something MPI doesn't do? Or is my
> Google-fu just too weak to figure out the right search phrase to find the
> answer to this probable FAQ?
>
> Matt
>
> [1] Note, this might be unnecessary, but I got to the point where I
> wanted to see if I *could* do it, rather than whether I *should*.
>
> --
> Matt Thompson
> Man Among Men
> Fulcrum of History

--
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO
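
For reference, a complete version of the test program quoted above might
look like the following. This is a minimal sketch: only the two calls and
the two write statements are quoted from the message; the program name,
the declarations, and the MPI setup/teardown around them are assumptions.

    ! Minimal sketch of the test program described above; the structure
    ! around the quoted lines is assumed, not taken from the original.
    program hostenv
       use mpi
       implicit none
       integer :: myid, npes, name_length, ierror
       character(len=MPI_MAX_PROCESSOR_NAME) :: processor_name
       character(len=255) :: host_name

       call MPI_Init(ierror)
       call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierror)
       call MPI_Comm_size(MPI_COMM_WORLD, npes, ierror)

       ! The two "exciting" lines: one name from MPI, one from the
       ! process's local environment.
       call MPI_Get_Processor_Name(processor_name, name_length, ierror)
       call get_environment_variable("HOST", host_name)

       write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, &
          "is on processor", trim(processor_name)
       write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, &
          "is on host", trim(host_name)

       call MPI_Finalize(ierror)
    end program hostenv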
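On the suggestion about limiting exports: with SLURM, srun's --export
option (for example --export=NONE, or --export=ALL plus explicit
NAME=value entries) controls which environment variables propagate from
the submitting shell to the remote tasks. This is offered as a pointer
rather than a verified fix, since what the Fortran program actually sees
also depends on how mpirun launches its daemons under SLURM.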
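
And on the per-switch communicator in footnote [1]: if each rank can read
its own SLURM_TOPOLOGY_ADDR (the open question in this thread),
MPI_Comm_split is enough to build one. The sketch below is hedged
accordingly: it assumes the per-task environment problem is solved, and it
derives the split color from a naive string hash, so distinct switches
could in principle collide; gathering the strings and numbering them
explicitly would be more robust.

    ! Hedged sketch: split MPI_COMM_WORLD into per-leaf-switch
    ! communicators using SLURM_TOPOLOGY_ADDR.  Assumes each rank sees
    ! its own node's value of the variable.
    subroutine make_switch_comm(switch_comm, ierror)
       use mpi
       implicit none
       integer, intent(out) :: switch_comm, ierror
       character(len=255) :: topo_addr
       integer :: color, myid, i, n

       ! SLURM_TOPOLOGY_ADDR is a period-separated path from the top
       ! switch down to the node name; drop the last component so the
       ! key is the leaf switch, not the node.
       call get_environment_variable("SLURM_TOPOLOGY_ADDR", topo_addr)
       n = index(trim(topo_addr), '.', back=.true.) - 1
       if (n < 1) n = len_trim(topo_addr)

       ! Naive non-negative hash of the switch path -> split color.
       ! Collisions are possible; a robust version would gather the
       ! strings across ranks and number them.
       color = 0
       do i = 1, n
          color = mod(color * 31 + ichar(topo_addr(i:i)), 65521)
       end do

       call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierror)
       call MPI_Comm_split(MPI_COMM_WORLD, color, myid, switch_comm, ierror)
    end subroutine make_switch_comm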