Matt Thompson <fort...@gmail.com> writes:

> All,
>
> I'm not too sure if this is an MPI issue, a Fortran issue, or something
> else but I thought I'd ask the MPI gurus here first since my web search
> failed me.
>
> There is a chance in the future I might want/need to query an environment
> variable in a Fortran program, namely to figure out what switch a currently
> running process is on (via SLURM_TOPOLOGY_ADDR in my case) and perhaps make
> a "per-switch" communicator.[1]
>
> So, I coded up a boring Fortran program whose only exciting lines are:
>
>    call MPI_Get_Processor_Name(processor_name,name_length,ierror)
>    call get_environment_variable("HOST",host_name)
>
>    write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, "is on processor", trim(processor_name)
>    write (*,'(A,X,I4,X,A,X,I4,X,A,X,A)') "Process", myid, "of", npes, "is on host", trim(host_name)
>
> I decided to try out the HOST environment variable first because it is
> simple and different per node.

For what it's worth, HOST isn't in the default environment on our
cluster.  HOSTNAME is defined as a bash variable (but not in the POSIX
shell), but tasks may or may not be launched via a shell and inherit it.

[...]

> It looks like MPI_Get_Processor_Name is doing its thing, but the HOST one
> seems to only be reflecting the first host. My guess is that OpenMPI
> doesn't export every process's environment separately to every process, so
> it is reflecting HOST from process 0.

In addition to what's already been said:  First, was something wrong
with mpi_get_processor_name, or was this just an illustration?

Second, I'd expect the environment even a serial job gets to depend on
the resource manager and how it is configured -- there's no particular
reason the batch resource manager should export specific variables to
it.¹  I don't know what SLURM does, but HOSTNAME happens to be in the
list of variables SGE defines for the job, reset from any values
exported from the submission environment.  Beyond that, what an MPI
process sees depends on how it was launched (via qrsh in SGE's case).
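Because of that, it's worth distinguishing "variable not set in this
task's environment" from "variable set but empty or stale".  A minimal
sketch (plain Fortran 2003+, no MPI needed) using the optional status
argument of get_environment_variable:

```fortran
program env_check
  implicit none
  character(len=256) :: val
  integer :: stat, length

  ! status = 0: success; 1: variable does not exist; -1: value truncated.
  ! Checking it avoids mistaking an inherited or blank value (e.g. rank
  ! 0's HOST forwarded by the launcher) for a locally meaningful answer.
  call get_environment_variable("HOSTNAME", val, length=length, status=stat)
  if (stat == 0) then
     write (*,'(A)') "HOSTNAME = " // trim(val)
  else if (stat == 1) then
     write (*,'(A)') "HOSTNAME is not set in this task's environment"
  else
     write (*,'(A)') "HOSTNAME could not be queried cleanly"
  end if
end program env_check
```

Run it under the launcher in question (srun, qrsh, mpiexec) rather than
interactively, since that is exactly the difference at issue.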

I happen to have a copy of the timeless classic mpi-hello.f90 lying
around with these lines.  They do the same thing when run with OMPI
under SGE if you swap the commenting (as the demo doubtless should have
shown with a runtime switch...).

  !  call MPI_Get_processor_name (hostname, l, ierr)
    call get_environment_variable ("HOSTNAME", hostname)
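For the per-switch communicator mentioned in [1], a sketch that doesn't
trust the environment at all once the strings are in hand: gather every
rank's SLURM_TOPOLOGY_ADDR (assuming Slurm's topology plugin sets it per
task -- it may not on a given cluster, hence the fallback) and split on
the lowest rank holding the same string, which avoids the silent
collisions a hash-based colour would risk.

```fortran
program per_switch
  use mpi
  implicit none
  integer, parameter :: maxlen = 256
  character(len=maxlen) :: my_addr
  character(len=maxlen), allocatable :: all_addr(:)
  integer :: ierr, rank, npes, i, color, stat
  integer :: switch_comm, srank, ssize

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, npes, ierr)

  ! Assumption: Slurm sets SLURM_TOPOLOGY_ADDR in each task's
  ! environment; if not, fall back to the processor name so the
  ! split still does something sensible (per node instead of per switch).
  call get_environment_variable("SLURM_TOPOLOGY_ADDR", my_addr, status=stat)
  if (stat /= 0) call MPI_Get_processor_name(my_addr, i, ierr)

  ! Gather every rank's string so all ranks agree on the colouring.
  allocate(all_addr(npes))
  call MPI_Allgather(my_addr, maxlen, MPI_CHARACTER, &
                     all_addr, maxlen, MPI_CHARACTER, MPI_COMM_WORLD, ierr)

  ! Colour = lowest world rank sharing my string; identical strings
  ! therefore always get identical colours.
  color = 0
  do i = 1, npes
     if (all_addr(i) == my_addr) then
        color = i - 1
        exit
     end if
  end do

  call MPI_Comm_split(MPI_COMM_WORLD, color, rank, switch_comm, ierr)
  call MPI_Comm_rank(switch_comm, srank, ierr)
  call MPI_Comm_size(switch_comm, ssize, ierr)
  write (*,'(A,I4,A,I4,A,I4)') "world rank", rank, " is rank", srank, &
       " of", ssize
  call MPI_Comm_free(switch_comm, ierr)
  call MPI_Finalize(ierr)
end program per_switch
```

The Allgather of fixed-length character buffers costs npes*maxlen bytes
per rank, which is negligible next to getting the grouping wrong.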

__
1. In fact, it can be a security issue to export things like LD_PRELOAD,
   depending on how jobs are started; I had bad experience of the
   CVE process from fixing that after two of us coincidentally
   noticed different aspects of it being allowed.  I've an idea it's
   been an issue for at least Torque too.
