After talking with the SLURM folks and tracking down the history of how OMPI dealt with this variable, I have made a change to OMPI's use of it. This should now work correctly in the upcoming release.

Thanks
Ralph

On Aug 24, 2009, at 2:22 PM, matthew.pi...@ndsu.edu wrote:

Hello again,

As you requested:

node64-test ~>salloc -n7
salloc: Granted job allocation 827

node64-test ~>srun hostname
node64-17.xxxx.xxxx.xxxx.xxxx
node64-17.xxxx.xxxx.xxxx.xxxx
node64-20.xxxx.xxxx.xxxx.xxxx
node64-18.xxxx.xxxx.xxxx.xxxx
node64-19.xxxx.xxxx.xxxx.xxxx
node64-18.xxxx.xxxx.xxxx.xxxx
node64-19.xxxx.xxxx.xxxx.xxxx

node64-test ~>printenv | grep SLURM
SLURM_NODELIST=node64-[17-20]
SLURM_NNODES=4
SLURM_JOBID=827
SLURM_TASKS_PER_NODE=2(x3),1
SLURM_JOB_ID=827
SLURM_NPROCS=7
SLURM_JOB_NODELIST=node64-[17-20]
SLURM_JOB_CPUS_PER_NODE=2(x4)
SLURM_JOB_NUM_NODES=4

Thanks again for your time.
Matt

Very interesting! I see the problem: we have never encountered SLURM_TASKS_PER_NODE in that format, while SLURM_JOB_CPUS_PER_NODE indicates that we have indeed been allocated two processors on each of the nodes. In your -n3 case, SLURM_TASKS_PER_NODE=2,1 describes 3 tasks, but SLURM_JOB_CPUS_PER_NODE=2(x2) describes 4 processors. So when you just do mpirun without specifying the number of processes, we launch 4 processes (2 on each node), since that is what SLURM told us we have been given.

Interesting configuration you have there.

I can add some logic that tests for internal consistency between the two
and compensates for the discrepancy. Can you get a slightly bigger
allocation, one that covers several nodes? For example, "salloc -n7"? And
then send the output again from "printenv | grep SLURM"?

I need to see if your configuration uses a regex to describe
SLURM_TASKS_PER_NODE, and what it looks like.
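
Just to make the idea concrete, here is a rough sketch of the kind of
expansion and consistency check I have in mind. This is not the actual
Open MPI code (the helper name and the output are made up for the
example); it just expands SLURM's compressed "count(xreps)" notation and
compares the totals against SLURM_NPROCS:

/* Illustration only, not Open MPI's actual code.  Expands SLURM's
 * compressed "count(xreps)" notation, e.g. "2(x3),1" -> 2+2+2+1 = 7,
 * and cross-checks the totals against SLURM_NPROCS. */
#include <stdio.h>
#include <stdlib.h>

/* Sum the entries of a SLURM list such as "2(x3),1" or "2(x4)". */
static int sum_slurm_list(const char *list)
{
    int total = 0;
    const char *p = list;

    while (*p != '\0') {
        int count = 0, reps = 1, consumed = 0;

        /* Each entry is either "N" or "N(xM)". */
        if (sscanf(p, "%d(x%d)%n", &count, &reps, &consumed) != 2) {
            reps = 1;
            if (sscanf(p, "%d%n", &count, &consumed) != 1)
                break;                      /* unrecognized entry: stop  */
        }
        total += count * reps;

        p += consumed;                      /* step past this entry      */
        if (*p == ',')                      /* ...and the comma, if any  */
            p++;
    }
    return total;
}

int main(void)
{
    const char *tasks  = getenv("SLURM_TASKS_PER_NODE");     /* "2,1"   */
    const char *cpus   = getenv("SLURM_JOB_CPUS_PER_NODE");  /* "2(x2)" */
    const char *nprocs = getenv("SLURM_NPROCS");              /* "3"     */

    if (tasks == NULL || cpus == NULL || nprocs == NULL) {
        fprintf(stderr, "not running inside a SLURM allocation?\n");
        return 1;
    }

    int ntasks = sum_slurm_list(tasks);
    int ncpus  = sum_slurm_list(cpus);

    printf("tasks requested: %d, cpus allocated: %d\n", ntasks, ncpus);

    /* If the two disagree, trust the task count, since that is what
     * the user actually asked for with salloc -n. */
    if (ntasks != ncpus && ntasks == atoi(nprocs))
        printf("discrepancy: would use %d slots instead of %d\n",
               ntasks, ncpus);

    return 0;
}

With the values from your -n3 allocation it would report 3 tasks against
4 allocated CPUs, which is exactly where the extra process comes from.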

Thanks
Ralph



On Mon, Aug 24, 2009 at 1:55 PM, <matthew.pi...@ndsu.edu> wrote:

Hello,

Hopefully the below information will be helpful.

SLURM Version: 1.3.15

node64-test ~>salloc -n3
salloc: Granted job allocation 826

node64-test ~>srun hostname
node64-24.xxxx.xxxx.xxxx.xxxx
node64-25.xxxx.xxxx.xxxx.xxxx
node64-24.xxxx.xxxx.xxxx.xxxx

node64-test ~>printenv | grep SLURM
SLURM_NODELIST=node64-[24-25]
SLURM_NNODES=2
SLURM_JOBID=826
SLURM_TASKS_PER_NODE=2,1
SLURM_JOB_ID=826
SLURM_NPROCS=3
SLURM_JOB_NODELIST=node64-[24-25]
SLURM_JOB_CPUS_PER_NODE=2(x2)
SLURM_JOB_NUM_NODES=2

node64-test ~>mpirun --display-allocation hostname

======================   ALLOCATED NODES   ======================

Data for node: Name: node64-test.xxxx.xxxx.xxxx.xxxx   Num slots: 0    Max slots: 0
Data for node: Name: node64-24 Num slots: 2    Max slots: 0
Data for node: Name: node64-25 Num slots: 2    Max slots: 0

=================================================================
node64-24.xxxx.xxxx.xxxx.xxxx
node64-24.xxxx.xxxx.xxxx.xxxx
node64-25.xxxx.xxxx.xxxx.xxxx
node64-25.xxxx.xxxx.xxxx.xxxx


Thanks,
Matt

Haven't seen that before on any of our machines.

Could you do "printenv | grep SLURM" after the salloc and send the results?

What version of SLURM is this?

Please run "mpirun --display-allocation hostname" and send the results.

Thanks
Ralph

On Mon, Aug 24, 2009 at 11:30 AM, <matthew.pi...@ndsu.edu> wrote:

Hello,

I seem to have run into an interesting problem with Open MPI. After
allocating 3 processors and confirming that the 3 processors are
allocated, mpirun on a simple mpitest program runs on 4 processors. We
have 2 processors per node, and I can repeat this case with any odd
number of processors: Open MPI seems to take any remaining processors on
the box. We are running Open MPI v1.3.3. Here is an example of what
happens:

node64-test ~>salloc -n3
salloc: Granted job allocation 825

node64-test ~>srun hostname
node64-28.xxxx.xxxx.xxxx.xxxx
node64-28.xxxx.xxxx.xxxx.xxxx
node64-29.xxxx.xxxx.xxxx.xxxx

node64-test ~>MX_RCACHE=0 LD_LIBRARY_PATH="/hurd/mpi/openmpi/lib:/usr/local/mx/lib" mpirun mpi_pgms/mpitest
MPI domain size: 4
I am rank 000 - node64-28.xxxx.xxxx.xxxx.xxxx
I am rank 003 - node64-29.xxxx.xxxx.xxxx.xxxx
I am rank 001 - node64-28.xxxx.xxxx.xxxx.xxxx
I am rank 002 - node64-29.xxxx.xxxx.xxxx.xxxx



For those who may be curious, here is the program:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Every rank reports the node it is running on. */
    MPI_Get_processor_name(processor_name, &namelen);
    fprintf(stdout, "My name is: %s\n", processor_name);

    /* Rank 0 also reports how many processes were actually started. */
    if (rank == 0)
    {
        fprintf(stdout, "Cluster size is: %d\n", size);
    }

    MPI_Finalize();
    return 0;
}


I'm curious whether this is a bug in the way Open MPI interprets SLURM
environment variables. If you have any ideas or need any more
information, let me know.


Thanks.
Matt
