[OMPI users] Slot count parameter in hostfile ignored

2017-09-07 Thread Maksym Planeta
Hello,

I'm trying to tell Open MPI how many processes per node I want to use, but
mpirun seems to ignore the configuration I provide.

I create the following hostfile:

$ cat hostfile.16
taurusi6344 slots=16
taurusi6348 slots=16

And then start the app as follows:

$ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
 Data for JOB [42099,1] offset 0

 ========================   JOB MAP   ========================

 Data for node: taurusi6344   Num slots: 1   Max slots: 0   Num procs: 1
        Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

 Data for node: taurusi6348   Num slots: 1   Max slots: 0   Num procs: 1
        Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound: socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

 =============================================================

taurusi6344
taurusi6348

If I request more than 2 processes via -np, I get the following error message:

$ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  hostname

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

The Open MPI version is "mpirun (Open MPI) 2.1.0".

SLURM is also installed, version "slurm 16.05.7-Bull.1.1-20170512-1252".

Could you help me make Open MPI respect the slots parameter?
-- 
Regards,
Maksym Planeta




Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-07 Thread r...@open-mpi.org
My best guess is that SLURM has only allocated 2 slots, and we respect the RM
regardless of what you say in the hostfile. You can check this by adding
--display-allocation to your command line. You probably need to tell SLURM to
allocate more cpus/node.
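
For example (a sketch; the node counts and the application name below are
placeholders, and your site's SLURM limits may differ):

$ # Show what mpirun believes the resource manager allocated:
$ mpirun --display-allocation -machinefile hostfile.16 -np 2 hostname

$ # Ask SLURM for 16 tasks per node on 2 nodes, then run inside that allocation:
$ salloc -N 2 --ntasks-per-node=16
$ mpirun -np 32 ./my_app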


> On Sep 7, 2017, at 3:33 AM, Maksym Planeta wrote:
> [quoted message trimmed]


[OMPI users] Cygwin64 mpiexec freezes

2017-09-07 Thread Llelan D.

  
  
Windows 10 64-bit, Cygwin64, openmpi 1.10.7-1 (dev, c, c++, fortran), GCC 6.3.0-2 (core, gcc, g++, fortran)
I am compiling the standard "hello_c.c" example with mpicc:
$ mpicc -g hello_c.c -o hello_c

The showme:
gcc -g hello_c.c -o hello_c -fexceptions -L/usr/lib -lmpi -lopen-rte -lopen-pal -lm -lgdi32

This successfully creates hello_c.exe. When I run it directly, it performs as expected (the first run brings up a Windows Firewall dialog and I click Accept):
$ ./hello_c
Hello World! I am 0 of 1, (Open MPI v1.10.7, package: Open MPI marco@GE-MATZERI-EU Distribution, ident: 1.10.7, repo rev: v1.10.6-48-g5e373bf, May 16, 2017, 129)

However, when I run it using mpiexec:
$ mpiexec -n 4 ./hello_c

$ ^C

Nothing is displayed and I have to ^C out. If I insert a puts("Start") just before the call to MPI_Init(&argc, &argv), and a puts("MPI_Init done.") just after, mpiexec prints "Start" for each process (4 times for the above example) and then freezes. It never returns from the call to MPI_Init(...); a sketch of this instrumentation appears after the listing below.

This is a freshly installed Cygwin64 and other non-MPI programs work fine. Can anyone give me an idea of what is going on?

hello_c.c
---
#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
  int rank, size, len;
  char version[MPI_MAX_LIBRARY_VERSION_STRING];

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Get_library_version(version, &len);
  printf("Hello World! I am %d of %d, (%s, %d)\n", rank, size, version, len);

  MPI_Finalize();

  return 0;
}
---
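
A minimal sketch of the instrumentation described above (the two puts() calls
are the only additions to the listing):

---
  puts("Start");              /* printed by each process before MPI_Init */
  MPI_Init(&argc, &argv);
  puts("MPI_Init done.");     /* never printed when mpiexec hangs */
---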


  


Re: [OMPI users] Cygwin64 mpiexec freezes

2017-09-07 Thread Marco Atzeri

On 07/09/2017 21:12, Llelan D. wrote:
[quoted message trimmed]

Same here. I will investigate whether it is a side effect of the new 6.3.0-2
compiler or of the latest Cygwin.

Regards
Marco


[OMPI users] Errors when compiled with Cygwin MinGW gcc

2017-09-07 Thread Llelan D.

  
  
Windows 10 64-bit, Cygwin64, openmpi 1.10.7-1 (dev, c, c++, fortran), x86_64-w64-mingw32-gcc 6.3.0-1 (core, gcc, g++, fortran)
I am compiling the standard "hello_c.c" example with mpicc configured to use the Cygwin-installed MinGW gcc compiler:
$ export OMPI_CC=x86_64-w64-mingw32-gcc
$ mpicc -idirafter /cygdrive/c/cygwin64/usr/include hello_c.c -o hello_c 

For some unknown reason, I have to manually include the "usr/include" directory to pick up the "mpi.h" header, and it must be searched after the standard header directories to avoid "time_t" typedef conflicts. The showme:
x86_64-w64-mingw32-gcc -idirafter /cygdrive/c/cygwin64/usr/include hello_c.c -o hello_c -fexceptions -L/usr/lib -lmpi -lopen-rte -lopen-pal -lm -lgdi32

This successfully creates hello_c.exe. Running it either directly or with mpiexec displays the following errors for each process:

$ ./hello_c
  1 [main] hello_c 18116 child_copy: cygheap read copy failed, 0x180307408..0x180319318, done 0, windows pid 18116, Win32 error 6
    112 [main] hello_c 18116 D:\mpi\examples\hello_c.exe: *** fatal error - ccalloc would have returned NULL
$ mpiexec -n 4 hello_c
  1 [main] hello_c 15660 child_copy: cygheap read copy failed, 0x180307408..0x1803216B0, done 0, windows pid 15660, Win32 error 6
182 [main] hello_c 15660 D:\mpi\examples\hello_c.exe: *** fatal error - ccalloc would have returned NULL
  2 [main] hello_c 7852 child_copy: cygheap read copy failed, 0x180307408..0x18031F588, done 0, windows pid 7852, Win32 error 6
223 [main] hello_c 7852 D:\mpi\examples\hello_c.exe: *** fatal error - ccalloc would have returned NULL
  1 [main] hello_c 16464 child_copy: cygheap read copy failed, 0x180307408..0x1803208E0, done 0, windows pid 16464, Win32 error 6
215 [main] hello_c 16464 D:\mpi\examples\hello_c.exe: *** fatal error - ccalloc would have returned NULL
  2 [main] hello_c 17184 child_copy: cygheap read copy failed, 0x180307408..0x180322710, done 0, windows pid 17184, Win32 error 6
281 [main] hello_c 17184 D:\mpi\examples\hello_c.exe: *** fatal error - ccalloc would have returned NULL
Does anyone have any ideas as to what is causing these errors? Can an Open MPI application even be compiled with the Cygwin-installed MinGW gcc compiler?
hello_c.c
---
#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
  int rank, size, len;
  char version[MPI_MAX_LIBRARY_VERSION_STRING];

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Get_library_version(version, &len);
  printf("Hello World! I am %d of %d, (%s, %d)\n", rank, size, version, len);

  MPI_Finalize();

  return 0;
}
---


  
