Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Maksym Planeta
Indeed, mpirun shows slots=1 per node, but I create the allocation with 
--ntasks-per-node 24, so I do have all cores of the node allocated.

When I use srun I can get all the cores.
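
For example, a rough way to compare the two views from inside the allocation
(using the --display-allocation option suggested in the reply below):

$ srun hostname | sort | uniq -c          # Slurm's view: should report 24 tasks per node
$ mpirun --display-allocation hostname    # Open MPI's view: the slots mpirun believes it has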

On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
> My best guess is that SLURM has only allocated 2 slots, and we respect the RM 
> regardless of what you say in the hostfile. You can check this by adding 
> --display-allocation to your cmd line. You probably need to tell slurm to 
> allocate more cpus/node.
> 
> 
>> On Sep 7, 2017, at 3:33 AM, Maksym Planeta  
>> wrote:
>>
>> Hello,
>>
>> I'm trying to tell OpenMPI how many processes per node I want to use, but 
>> mpirun seems to ignore the configuration I provide.
>>
>> I create the following hostfile:
>>
>> $ cat hostfile.16
>> taurusi6344 slots=16
>> taurusi6348 slots=16
>>
>> And then start the app as follows:
>>
>> $ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
>> Data for JOB [42099,1] offset 0
>>
>>    JOB MAP   
>>
>> Data for node: taurusi6344    Num slots: 1    Max slots: 0    Num procs: 1
>> Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound: socket 
>> 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 
>> 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 
>> 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 
>> 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>
>> Data for node: taurusi6348    Num slots: 1    Max slots: 0    Num procs: 1
>> Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound: socket 
>> 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 
>> 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 
>> 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 
>> 0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>
>> =
>> taurusi6344
>> taurusi6348
>>
>> If I request more than 2 processes (e.g. "-np 4" instead of "-np 2"), I get the following error message:
>>
>> $ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
>> --
>> There are not enough slots available in the system to satisfy the 4 slots
>> that were requested by the application:
>>   hostname
>>
>> Either request fewer slots for your application, or make more slots available
>> for use.
>> --
>>
>> The OpenMPI version is "mpirun (Open MPI) 2.1.0"
>>
>> Also there is SLURM installed with version "slurm 
>> 16.05.7-Bull.1.1-20170512-1252"
>>
>> Could you help me get OpenMPI to respect the slots parameter?
>> -- 
>> Regards,
>> Maksym Planeta
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
> 

-- 
Regards,
Maksym Planeta



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Gilles Gouaillardet

Maxsym,


can you please post your sbatch script ?

fwiw, i am unable to reproduce the issue with the latest v2.x from github.


by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?


Cheers,


Gilles


On 9/8/2017 4:19 PM, Maksym Planeta wrote:

Indeed mpirun shows slots=1 per node, but I create allocation with 
--ntasks-per-node 24, so I do have all cores of the node allocated.

When I use srun I can get all the cores.

On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:

My best guess is that SLURM has only allocated 2 slots, and we respect the RM 
regardless of what you say in the hostfile. You can check this by adding 
--display-allocation to your cmd line. You probably need to tell slurm to 
allocate more cpus/node.



On Sep 7, 2017, at 3:33 AM, Maksym Planeta  
wrote:

Hello,

I'm trying to tell OpenMPI how many processes per node I want to use, but 
mpirun seems to ignore the configuration I provide.

I create following hostfile:

$ cat hostfile.16
taurusi6344 slots=16
taurusi6348 slots=16

And then start the app as follows:

$ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
Data for JOB [42099,1] offset 0

   JOB MAP   

Data for node: taurusi6344 Num slots: 1Max slots: 0Num procs: 1
 Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound: socket 
0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 
0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 
0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 
0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 
0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

Data for node: taurusi6348 Num slots: 1Max slots: 0Num procs: 1
 Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound: socket 
0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 
0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 
0[core 6[hwt 0]], socket 0[core 7[hwt 0]], socket 0[core 8[hwt 0]], socket 
0[core 9[hwt 0]], socket 0[core 10[hwt 0]], socket 0[core 11[hwt 
0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

=
taurusi6344
taurusi6348

If I put anything more than 2 in "-np 2", I get following error message:

$ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
--
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
   hostname

Either request fewer slots for your application, or make more slots available
for use.
--

The OpenMPI version is "mpirun (Open MPI) 2.1.0"

Also there is SLURM installed with version "slurm 
16.05.7-Bull.1.1-20170512-1252"

Could you help me to enforce OpenMPI to respect slots paremeter?
--
Regards,
Maksym Planeta

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Maksym Planeta
I start an interactive allocation, and I just noticed that the problem 
happens when I join this allocation from another shell.

Here is how I join:

srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash

And here is how I create the allocation:

srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 
--partition=haswell --x11 bash


On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:
> Maxsym,
> 
> 
> can you please post your sbatch script ?
> 
> fwiw, i am unable to reproduce the issue with the latest v2.x from github.
> 
> 
> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 9/8/2017 4:19 PM, Maksym Planeta wrote:
>> Indeed mpirun shows slots=1 per node, but I create allocation with 
>> --ntasks-per-node 24, so I do have all cores of the node allocated.
>>
>> When I use srun I can get all the cores.
>>
>> On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
>>> My best guess is that SLURM has only allocated 2 slots, and we 
>>> respect the RM regardless of what you say in the hostfile. You can 
>>> check this by adding --display-allocation to your cmd line. You 
>>> probably need to tell slurm to allocate more cpus/node.
>>>
>>>
 On Sep 7, 2017, at 3:33 AM, Maksym Planeta 
  wrote:

 Hello,

 I'm trying to tell OpenMPI how many processes per node I want to 
 use, but mpirun seems to ignore the configuration I provide.

 I create following hostfile:

 $ cat hostfile.16
 taurusi6344 slots=16
 taurusi6348 slots=16

 And then start the app as follows:

 $ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
 Data for JOB [42099,1] offset 0

    JOB MAP   

 Data for node: taurusi6344 Num slots: 1Max slots: 0Num 
 procs: 1
  Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound: 
 socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 
 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 
 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], 
 socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 
 10[hwt 0]], socket 0[core 11[hwt 
 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

 Data for node: taurusi6348 Num slots: 1Max slots: 0Num 
 procs: 1
  Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound: 
 socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 
 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 
 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]], 
 socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core 
 10[hwt 0]], socket 0[core 11[hwt 
 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

 =
 taurusi6344
 taurusi6348

 If I put anything more than 2 in "-np 2", I get following error 
 message:

 $ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
 -- 

 There are not enough slots available in the system to satisfy the 4 
 slots
 that were requested by the application:
hostname

 Either request fewer slots for your application, or make more slots 
 available
 for use.
 -- 


 The OpenMPI version is "mpirun (Open MPI) 2.1.0"

 Also there is SLURM installed with version "slurm 
 16.05.7-Bull.1.1-20170512-1252"

 Could you help me to enforce OpenMPI to respect slots paremeter?
 -- 
 Regards,
 Maksym Planeta

 ___
 users mailing list
 users@lists.open-mpi.org
 https://lists.open-mpi.org/mailman/listinfo/users
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/users
>>>
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users

-- 
Regards,
Maksym Planeta



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Maksym Planeta




> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?



OpenMPI 2.1.0 is the latest on our cluster.

--
Regards,
Maksym Planeta



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Gilles Gouaillardet

Thanks, now i can reproduce the issue


Cheers,


Gilles


On 9/8/2017 5:20 PM, Maksym Planeta wrote:

I run start an interactive allocation and I just noticed that the problem 
happens, when I join this allocation from another shell.

Here is how I join:

srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash

And here is how I create the allocation:

srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 
--partition=haswell --x11 bash


On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:

Maxsym,


can you please post your sbatch script ?

fwiw, i am unable to reproduce the issue with the latest v2.x from github.


by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?


Cheers,


Gilles


On 9/8/2017 4:19 PM, Maksym Planeta wrote:

Indeed mpirun shows slots=1 per node, but I create allocation with
--ntasks-per-node 24, so I do have all cores of the node allocated.

When I use srun I can get all the cores.

On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:

My best guess is that SLURM has only allocated 2 slots, and we
respect the RM regardless of what you say in the hostfile. You can
check this by adding --display-allocation to your cmd line. You
probably need to tell slurm to allocate more cpus/node.



On Sep 7, 2017, at 3:33 AM, Maksym Planeta
 wrote:

Hello,

I'm trying to tell OpenMPI how many processes per node I want to
use, but mpirun seems to ignore the configuration I provide.

I create following hostfile:

$ cat hostfile.16
taurusi6344 slots=16
taurusi6348 slots=16

And then start the app as follows:

$ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
Data for JOB [42099,1] offset 0

   JOB MAP   

Data for node: taurusi6344 Num slots: 1Max slots: 0Num
procs: 1
  Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound:
socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
10[hwt 0]], socket 0[core 11[hwt
0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

Data for node: taurusi6348 Num slots: 1Max slots: 0Num
procs: 1
  Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound:
socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
10[hwt 0]], socket 0[core 11[hwt
0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]

=
taurusi6344
taurusi6348

If I put anything more than 2 in "-np 2", I get following error
message:

$ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
--

There are not enough slots available in the system to satisfy the 4
slots
that were requested by the application:
hostname

Either request fewer slots for your application, or make more slots
available
for use.
--


The OpenMPI version is "mpirun (Open MPI) 2.1.0"

Also there is SLURM installed with version "slurm
16.05.7-Bull.1.1-20170512-1252"

Could you help me to enforce OpenMPI to respect slots paremeter?
--
Regards,
Maksym Planeta

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


[OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Nilesh Kokane
Hello,

How can I compile Open MPI without OpenCL support?

The only link I could find is [1], but Open MPI's configure does not expose this
option.
The reason I'm trying to build Open MPI without OpenCL is that it throws the
following errors even with the NVIDIA-provided OpenCL installed.


./mpicc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
./mpicc: /usr/local/cuda-8.0.61/lib64/libOpenCL.so.1: no version
information available (required by /home/kokanen/opt/lib/libopen-pal.so.20)
/tmp/cc7KaPDe.o: In function `main':
test_cuda_aware.c:(.text+0x5b): undefined reference to `cudaMalloc'
test_cuda_aware.c:(.text+0xd3): undefined reference to `cudaFree'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformIDs@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceIDs@OPENCL_1.0'




[1]. https://www.open-mpi.org/projects/hwloc/doc/v1.7.2/a00014.php

-- 
Regards,
Nilesh Kokane
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread Gilles Gouaillardet
For the time being, you can use
srun --ntasks-per-node 24 --jobid=...
when joining the allocation.
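
For example (an untested sketch, reusing the jobid lookup from your join command):

srun --pty --x11 --ntasks-per-node 24 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash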

This use case looks a bit convoluted to me, so i am not even sure we should 
consider there is a bug in Open MPI.

Ralph, any thoughts  ?

Cheers,

Gilles

Gilles Gouaillardet  wrote:
>Thanks, now i can reproduce the issue
>
>
>Cheers,
>
>
>Gilles
>
>
>On 9/8/2017 5:20 PM, Maksym Planeta wrote:
>> I run start an interactive allocation and I just noticed that the problem 
>> happens, when I join this allocation from another shell.
>>
>> Here is how I join:
>>
>> srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash
>>
>> And here is how I create the allocation:
>>
>> srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 
>> --partition=haswell --x11 bash
>>
>>
>> On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:
>>> Maxsym,
>>>
>>>
>>> can you please post your sbatch script ?
>>>
>>> fwiw, i am unable to reproduce the issue with the latest v2.x from github.
>>>
>>>
>>> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>> On 9/8/2017 4:19 PM, Maksym Planeta wrote:
 Indeed mpirun shows slots=1 per node, but I create allocation with
 --ntasks-per-node 24, so I do have all cores of the node allocated.

 When I use srun I can get all the cores.

 On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
> My best guess is that SLURM has only allocated 2 slots, and we
> respect the RM regardless of what you say in the hostfile. You can
> check this by adding --display-allocation to your cmd line. You
> probably need to tell slurm to allocate more cpus/node.
>
>
>> On Sep 7, 2017, at 3:33 AM, Maksym Planeta
>>  wrote:
>>
>> Hello,
>>
>> I'm trying to tell OpenMPI how many processes per node I want to
>> use, but mpirun seems to ignore the configuration I provide.
>>
>> I create following hostfile:
>>
>> $ cat hostfile.16
>> taurusi6344 slots=16
>> taurusi6348 slots=16
>>
>> And then start the app as follows:
>>
>> $ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
>> Data for JOB [42099,1] offset 0
>>
>>    JOB MAP   
>>
>> Data for node: taurusi6344 Num slots: 1Max slots: 0Num
>> procs: 1
>>   Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound:
>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
>> 10[hwt 0]], socket 0[core 11[hwt
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>
>> Data for node: taurusi6348 Num slots: 1Max slots: 0Num
>> procs: 1
>>   Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound:
>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
>> 10[hwt 0]], socket 0[core 11[hwt
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>>
>> =
>> taurusi6344
>> taurusi6348
>>
>> If I put anything more than 2 in "-np 2", I get following error
>> message:
>>
>> $ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
>> --
>>
>> There are not enough slots available in the system to satisfy the 4
>> slots
>> that were requested by the application:
>> hostname
>>
>> Either request fewer slots for your application, or make more slots
>> available
>> for use.
>> --
>>
>>
>> The OpenMPI version is "mpirun (Open MPI) 2.1.0"
>>
>> Also there is SLURM installed with version "slurm
>> 16.05.7-Bull.1.1-20170512-1252"
>>
>> Could you help me to enforce OpenMPI to respect slots paremeter?
>> -- 
>> Regards,
>> Maksym Planeta
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
>

 ___
 users mailing list
 users@lists.open-mpi.org
 https://lists.open-mpi.org/mailman/listinfo/users
>

Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Gilles Gouaillardet
Nilesh,

Can you
configure --without-nvidia ...
And see if it helps ?

Cheers,

Gilles

Nilesh Kokane  wrote:
>Hello, 
>
>
>How can I compile openmpi without the support of open-cl?
>
>
>The only link I could find is [1], but openmpi doesn't configure this option.
>
>The reason why I'm trying to build openmpi without open-cl is it throws the 
>following errors even with the nvidia installed opencl.
>
>
>
>./mpicc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
>
>./mpicc: /usr/local/cuda-8.0.61/lib64/libOpenCL.so.1: no version information 
>available (required by /home/kokanen/opt/lib/libopen-pal.so.20)
>
>/tmp/cc7KaPDe.o: In function `main':
>
>test_cuda_aware.c:(.text+0x5b): undefined reference to `cudaMalloc'
>
>test_cuda_aware.c:(.text+0xd3): undefined reference to `cudaFree'
>
>/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to 
>`clGetPlatformInfo@OPENCL_1.0'
>
>/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to 
>`clGetPlatformIDs@OPENCL_1.0'
>
>/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to 
>`clGetDeviceInfo@OPENCL_1.0'
>
>/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to 
>`clGetDeviceIDs@OPENCL_1.0'
>
>
>
>
>
>[1]. https://www.open-mpi.org/projects/hwloc/doc/v1.7.2/a00014.php
>
>
>-- 
>
>Regards,
>Nilesh Kokane
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] [OMPI USERS] segmentation fault at startup

2017-09-08 Thread Alberto Ortiz
Hi,
I have a system running openmpi programs on Arch Linux. I had the programs
compiled and running in July, when I was using version 1.10.4 or 1.10.7 of
openmpi, if I remember correctly. Just recently I updated the openmpi version
to 2.1.1 and tried running an already compiled program, and it ran correctly.
The problem came when I tried compiling and running it again. Using mpicc
doesn't seem to give any problems, but when running the program with mpirun
it gives the following message:

mpirun noticed that process rank 0 with PID 0 on node alarm exited on
signal 11 (Segmentation fault).

I have tried putting a printf in the first line of the main function and it
doesn't reach that point, so I have assumed it must be a startup problem. I
have tried running simple programs that only say "hello world" and they run
correctly. What bothers me is that the same code compiled and ran correctly
with an earlier version of openmpi and now it doesn't.

If it helps I am running the programs with "sudo -H mpirun
--allow-run-as-root -hostfile hosts -n 8 main". I need to run it with root
privileges as I am combining SW with HW accelerators and I need to access
some files with root permissions in order to communicate with the HW
accelerators. There is no instantiation or use of those files until after
running some functions in the main program, so there should be no problem
or concern with that part.

Thank you in advance,
Alberto
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Nilesh Kokane
On Fri, Sep 8, 2017 at 3:33 PM, Gilles Gouaillardet
 wrote:
>
> Nilesh,
>
> Can you
> configure --without-nvidia ...
> And see if it helps ?

No, I need Nvidia cuda support.



//Nilesh Kokane
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Nilesh Kokane
On Fri, Sep 8, 2017 at 4:08 PM, Nilesh Kokane  wrote:
> On Fri, Sep 8, 2017 at 3:33 PM, Gilles Gouaillardet
>  wrote:
>>
>> Nilesh,
>>
>> Can you
>> configure --without-nvidia ...
>> And see if it helps ?
>
> No, I need Nvidia cuda support.


Or else, do you have a way to solve these OpenCL errors?

./mpicc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
./mpicc: /usr/local/cuda-8.0.61/lib64/libOpenCL.so.1: no version
information available (required by
/home/kokanen/opt/lib/libopen-pal.so.20)
/tmp/cc7KaPDe.o: In function `main':
test_cuda_aware.c:(.text+0x5b): undefined reference to `cudaMalloc'
test_cuda_aware.c:(.text+0xd3): undefined reference to `cudaFree'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformIDs@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceIDs@OPENCL_1.0'



Pointing to -L/usr/local/cuda-8.0.61/lib64 while compiling with mpicc
didn't help.

Any clues?


-- 
Regards,
Nilesh Kokane
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Errors when compiled with Cygwin MinGW gcc

2017-09-08 Thread Marco Atzeri

On 08/09/2017 02:38, Llelan D. wrote:
Windows 10 64bit, Cygwin64, openmpi 1.10.7-1 (dev, c, c++, fortran), 
x86_64-w64-mingw32-gcc 6.3.0-1 (core, gcc, g++, fortran)


I am compiling the standard "hello_c.c" example with *mpicc* configured 
to use the Cygwin-installed MinGW gcc compiler:


$ export OMPI_CC=x86_64-w64-mingw32-gcc
$ mpicc -idirafter /cygdrive/c/cygwin64/usr/include hello_c.c -o hello_c

For some unknown reason, I have to manually include the "usr/include" 
directory to pick up the "mpi.h" header, and it must be searched after 
the standard header directories to avoid "time_t" typedef conflicts. The 
showme:




you cannot mix Cygwin DLLs with MinGW compilations.
They use different paradigms.


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Gilles Gouaillardet
can you
./mpicc -showme -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
and double check -lcuda is *after* -lopen-pal ?

Cheers,

Gilles

On Fri, Sep 8, 2017 at 7:40 PM, Nilesh Kokane  wrote:
> On Fri, Sep 8, 2017 at 4:08 PM, Nilesh Kokane  
> wrote:
>> On Fri, Sep 8, 2017 at 3:33 PM, Gilles Gouaillardet
>>  wrote:
>>>
>>> Nilesh,
>>>
>>> Can you
>>> configure --without-nvidia ...
>>> And see if it helps ?
>>
>> No, I need Nvidia cuda support.
>
>
> Or else do you have a way to solve this open-cl errors?
>
> ./mpicc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
> ./mpicc: /usr/local/cuda-8.0.61/lib64/libOpenCL.so.1: no version
> information available (required by
> /home/kokanen/opt/lib/libopen-pal.so.20)
> /tmp/cc7KaPDe.o: In function `main':
> test_cuda_aware.c:(.text+0x5b): undefined reference to `cudaMalloc'
> test_cuda_aware.c:(.text+0xd3): undefined reference to `cudaFree'
> /home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
> `clGetPlatformInfo@OPENCL_1.0'
> /home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
> `clGetPlatformIDs@OPENCL_1.0'
> /home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
> `clGetDeviceInfo@OPENCL_1.0'
> /home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
> `clGetDeviceIDs@OPENCL_1.0'
>
>
>
> Pointing to -L/usr/local/cuda-8.0.61/lib64 while compiling with mpicc
> didn't help.
>
> Any clues?
>
>
> --
> Regards,
> Nilesh Kokane
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


[OMPI users] Cuda Aware open mpi 2.0 problem with Nvidia 375 and cuda-8.0 in Ubuntu 16.04

2017-09-08 Thread umashankar
Hi,

I have a successful installation of Nvidia drivers and cuda which is
confirmed by
"nvcc -V"  and  "nvidia-smi".

after configuring the openmpi with
"./configure --with-cuda --prefix=/home/umashankar/softwares/openmpi-2.0.3"

"make all install"

and after exporting the paths,

I ended up with the following error:

mpirun: /usr/local/cuda/lib64/libOpenCL.so.1: no version information
available (required by
/home/umashankar/softwares/openmpi-2.0.3/lib/libopen-pal.so.20)

I tried to look up a solution online, but none of the suggestions helped me.
Please have a look at
https://github.com/mbevand/silentarmy/issues/63
which suggests editing the OpenCL headers and library path before make,
but I could not find where I should make the change.

By the way, I have installed OpenCL through

sudo apt-get install nvidia-opencl-dev
and
sudo apt-get install nvidia-opencl-icd-375

and the "NULL platform behavior" section of clinfo gave me this:

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]No platform
  clCreateContext(NULL, ...) [other]  Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform



So please, can anyone throw some light on where I'm going wrong? Is it
with mpirun, or with OpenCL?

thank you,
Umashankar
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] OpenMPI 1.10.5 oversubscribing cores

2017-09-08 Thread twurgl

I posted this question last year and we ended up not upgrading to the newer
openmpi.  Now I need to change to openmpi 1.10.5 and have the same issue.

Specifically, using 1.4.2, I could run two 12-core jobs on a 24-core node and the
processes would bind to cores with only 1 process per core, i.e. not
oversubscribed.

What I used with 1.4.2 was:
mpirun --mca mpi_paffinity_alone 1 --mca btl openib,tcp,sm,self ...

Now with 1.10.5, I have tried multiple combinations of --map-by core, --bind-to core,
etc., and cannot run 2 jobs on the same node without oversubscribing.

Is there a solution to this?

Thanks for any info
tom
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Slot count parameter in hostfile ignored

2017-09-08 Thread r...@open-mpi.org
It isn’t an issue as there is nothing wrong with OMPI. Your method of joining 
the allocation is a problem. What you have done is to create a job step that 
has only 1 slot/node. We have no choice but to honor that constraint and run 
within it.

What you should be doing is to use salloc to create the allocation. This places 
you inside the main allocation so we can use all of it.
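
For example, a sketch based on the allocation parameters used earlier in this
thread (adjust the X11 handling to whatever your site supports):

$ salloc --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 --partition=haswell
$ mpirun --display-map -np 4 hostname     # now sees all 24 slots on each allocated node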


> On Sep 8, 2017, at 1:27 AM, Gilles Gouaillardet  wrote:
> 
> Thanks, now i can reproduce the issue
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 9/8/2017 5:20 PM, Maksym Planeta wrote:
>> I run start an interactive allocation and I just noticed that the problem 
>> happens, when I join this allocation from another shell.
>> 
>> Here is how I join:
>> 
>> srun --pty --x11 --jobid=$(squeue -u $USER -o %A | tail -n 1) bash
>> 
>> And here is how I create the allocation:
>> 
>> srun --pty --nodes 8 --ntasks-per-node 24 --mem 50G --time=3:00:00 
>> --partition=haswell --x11 bash
>> 
>> 
>> On 09/08/2017 09:58 AM, Gilles Gouaillardet wrote:
>>> Maxsym,
>>> 
>>> 
>>> can you please post your sbatch script ?
>>> 
>>> fwiw, i am unable to reproduce the issue with the latest v2.x from github.
>>> 
>>> 
>>> by any chance, would you be able to test the latest openmpi 2.1.2rc3 ?
>>> 
>>> 
>>> Cheers,
>>> 
>>> 
>>> Gilles
>>> 
>>> 
>>> On 9/8/2017 4:19 PM, Maksym Planeta wrote:
 Indeed mpirun shows slots=1 per node, but I create allocation with
 --ntasks-per-node 24, so I do have all cores of the node allocated.
 
 When I use srun I can get all the cores.
 
 On 09/07/2017 02:12 PM, r...@open-mpi.org wrote:
> My best guess is that SLURM has only allocated 2 slots, and we
> respect the RM regardless of what you say in the hostfile. You can
> check this by adding --display-allocation to your cmd line. You
> probably need to tell slurm to allocate more cpus/node.
> 
> 
>> On Sep 7, 2017, at 3:33 AM, Maksym Planeta
>>  wrote:
>> 
>> Hello,
>> 
>> I'm trying to tell OpenMPI how many processes per node I want to
>> use, but mpirun seems to ignore the configuration I provide.
>> 
>> I create following hostfile:
>> 
>> $ cat hostfile.16
>> taurusi6344 slots=16
>> taurusi6348 slots=16
>> 
>> And then start the app as follows:
>> 
>> $ mpirun --display-map   -machinefile hostfile.16 -np 2 hostname
>> Data for JOB [42099,1] offset 0
>> 
>>    JOB MAP   
>> 
>> Data for node: taurusi6344 Num slots: 1Max slots: 0Num
>> procs: 1
>>  Process OMPI jobid: [42099,1] App: 0 Process rank: 0 Bound:
>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
>> 10[hwt 0]], socket 0[core 11[hwt
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>> 
>> Data for node: taurusi6348 Num slots: 1Max slots: 0Num
>> procs: 1
>>  Process OMPI jobid: [42099,1] App: 0 Process rank: 1 Bound:
>> socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core
>> 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket
>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]],
>> socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]], socket 0[core
>> 10[hwt 0]], socket 0[core 11[hwt
>> 0]]:[B/B/B/B/B/B/B/B/B/B/B/B][./././././././././././.]
>> 
>> =
>> taurusi6344
>> taurusi6348
>> 
>> If I put anything more than 2 in "-np 2", I get following error
>> message:
>> 
>> $ mpirun --display-map   -machinefile hostfile.16 -np 4 hostname
>> --
>> 
>> There are not enough slots available in the system to satisfy the 4
>> slots
>> that were requested by the application:
>>hostname
>> 
>> Either request fewer slots for your application, or make more slots
>> available
>> for use.
>> --
>> 
>> 
>> The OpenMPI version is "mpirun (Open MPI) 2.1.0"
>> 
>> Also there is SLURM installed with version "slurm
>> 16.05.7-Bull.1.1-20170512-1252"
>> 
>> Could you help me to enforce OpenMPI to respect slots paremeter?
>> -- 
>> Regards,
>> Maksym Planeta
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
> ___
> users mailing list
> users@lists.open-mpi.org
> https://li

Re: [OMPI users] OpenMPI 1.10.5 oversubscribing cores

2017-09-08 Thread r...@open-mpi.org
What you probably want to do is add --cpu-list a,b,c... to each mpirun command, 
where each one lists the cores you want to assign to that job.
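
For example, for the two 12-core jobs described below, something along these lines
(a sketch only; the core numbering is an assumption about the node, ./job_a and
./job_b stand in for your applications, and on some 1.10.x builds the option may
be spelled --cpu-set):

mpirun -np 12 --bind-to core --cpu-list 0,1,2,3,4,5,6,7,8,9,10,11           ./job_a &
mpirun -np 12 --bind-to core --cpu-list 12,13,14,15,16,17,18,19,20,21,22,23 ./job_b &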


> On Sep 8, 2017, at 6:46 AM, twu...@goodyear.com wrote:
> 
> 
> I posted this question last year and we ended up not upgrading to the newer
> openmpi.  Now I need to change to openmpi 1.10.5 and have the same issue.
> 
> Specifically, using 1.4.2, I can run two 12 core jobs on a 24 core node and 
> the
> processes would bind to cores and only have 1 process per core.  ie not
> oversubscribe.
> 
> What I used with 1.4.2 was:
> mpirun --mca mpi_paffinity_alone 1 --mca btl openib,tcp,sm,self ...
> 
> Now with 1.10.5, I have tried multiple combinations of map-to core, bind-to 
> core
> etc and cannot run 2 jobs on the same node without oversubcribing.
> 
> Is there a solution to this?
> 
> Thanks for any info
> tom
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] OpenMPI 1.10.5 oversubscribing cores

2017-09-08 Thread Jeff Squyres (jsquyres)
Tom --

If you're going to upgrade, can you upgrade to the latest Open MPI (2.1.1)?  
I.e., unless you have a reason for wanting to stay back at an already-old 
version, you might as well upgrade to the latest latest latest to give you the 
longest shelf life.

I mention this because we are imminently about to release Open MPI v3.0.0, 
which will knock v1.10.x into the "unsupported" category (i.e., we'll be 
supporting v3.0.x, v2.1.x, and v2.0.x).

All that being said: if you need to stay back at the 1.10 series for some 
reason, you should update to the latest 1.10.x: v1.10.7 (not v1.10.5).


> On Sep 8, 2017, at 10:10 AM, r...@open-mpi.org wrote:
> 
> What you probably want to do is add --cpu-list a,b,c... to each mpirun 
> command, where each one lists the cores you want to assign to that job.
> 
> 
>> On Sep 8, 2017, at 6:46 AM, twu...@goodyear.com wrote:
>> 
>> 
>> I posted this question last year and we ended up not upgrading to the newer
>> openmpi.  Now I need to change to openmpi 1.10.5 and have the same issue.
>> 
>> Specifically, using 1.4.2, I can run two 12 core jobs on a 24 core node and 
>> the
>> processes would bind to cores and only have 1 process per core.  ie not
>> oversubscribe.
>> 
>> What I used with 1.4.2 was:
>> mpirun --mca mpi_paffinity_alone 1 --mca btl openib,tcp,sm,self ...
>> 
>> Now with 1.10.5, I have tried multiple combinations of map-to core, bind-to 
>> core
>> etc and cannot run 2 jobs on the same node without oversubcribing.
>> 
>> Is there a solution to this?
>> 
>> Thanks for any info
>> tom
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users


-- 
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Issues with Large Window Allocations

2017-09-08 Thread Joseph Schuchart
We are currently discussing internally how to proceed with this issue on 
our machine. We did a little survey to see the setup of some of the 
machines we have access to, which includes an IBM, a Bull machine, and 
two Cray XC40 machines. To summarize our findings:


1) On the Cray systems, both /tmp and /dev/shm are mounted tmpfs and 
each limited to half of the main memory size per node.
2) On the IBM system, nodes have 64GB and /tmp is limited to 20 GB and 
mounted from a disk partition. /dev/shm, on the other hand, is sized at 
63GB.
3) On the above systems, /proc/sys/kernel/shm* is set up to allow the 
full memory of the node to be used as System V shared memory.
4) On the Bull machine, /tmp is mounted from a disk and fixed to ~100GB 
while /dev/shm is limited to half the node's memory (there are nodes 
with 2TB memory, huge page support is available). System V shmem on the 
other hand is limited to 4GB.


Overall, it seems that there is no globally optimal allocation strategy 
as the best matching source of shared memory is machine dependent.


Open MPI treats System V shared memory as the least favorable option, 
even giving it a lower priority than POSIX shared memory, where 
conflicting names might occur. What's the reason for preferring /tmp and 
POSIX shared memory over System V? It seems to me that the latter is a 
cleaner and safer way (provided that shared memory is not constrained by 
/proc, which could easily be detected) while mmap'ing large files feels 
somewhat hacky. Maybe I am missing an important aspect here though.


The reason I am interested in this issue is that our PGAS library is 
built on top of MPI and allocates pretty much all memory exposed to the 
user through MPI windows. Thus, any limitation from the underlying MPI 
implementation (or system for that matter) limits the amount of usable 
memory for our users.


Given our observations above, I would like to propose a change to the 
shared memory allocator: the priorities would be derived from the 
percentage of main memory each component can cover, i.e.,


Priority = 99*(min(Memory, SpaceAvail) / Memory)

At startup, each shm component would determine the available size (by 
looking at /tmp, /dev/shm, and /proc/sys/kernel/shm*, respectively) and 
set its priority between 0 and 99. A user could force Open MPI to use a 
specific component by manually setting its priority to 100 (which of 
course has to be documented). The priority could factor in other aspects 
as well, such as whether /tmp is actually tmpfs or disk-based if that 
makes a difference in performance.
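
A rough sketch of that computation for a single candidate (here /dev/shm, using
only standard tools; the real logic would of course live inside each shmem
component):

mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
avail_kb=$(df --output=avail -k /dev/shm | tail -n 1)
min_kb=$(( avail_kb < mem_kb ? avail_kb : mem_kb ))
priority=$(( 99 * min_kb / mem_kb ))      # 0..99, as proposed above
echo "derived priority for /dev/shm: $priority"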


This proposal of course assumes that shared memory size is the sole 
optimization goal. Maybe there are other aspects to consider? I'd be 
happy to work on a patch but would like to get some feedback before 
getting my hands dirty. IMO, the current situation is less than ideal 
and prone to cause pain to the average user. In my recent experience, 
debugging this has been tedious and the user in general shouldn't have 
to care about how shared memory is allocated (and administrators don't 
always seem to care, see above).


Any feedback is highly appreciated.

Joseph


On 09/04/2017 03:13 PM, Joseph Schuchart wrote:

Jeff, all,

Unfortunately, I (as a user) have no control over the page size on our 
cluster. My interest in this is more of a general nature because I am 
concerned that our users who use Open MPI underneath our code run into 
this issue on their machine.


I took a look at the code for the various window creation methods and 
now have a better picture of the allocation process in Open MPI. I 
realized that memory in windows allocated through MPI_Win_alloc or 
created through MPI_Win_create is registered with the IB device using 
ibv_reg_mr, which takes significant time for large allocations (I assume 
this is where hugepages would help?). In contrast to this, it seems that 
memory attached through MPI_Win_attach is not registered, which explains 
the lower latency for these allocation I am observing (I seem to 
remember having observed higher communication latencies as well).


Regarding the size limitation of /tmp: I found an opal/mca/shmem/posix 
component that uses shmem_open to create a POSIX shared memory object 
instead of a file on disk, which is then mmap'ed. Unfortunately, if I 
raise the priority of this component above that of the default mmap 
component I end up with a SIGBUS during MPI_Init. No other errors are 
reported by MPI. Should I open a ticket on Github for this?


As an alternative, would it be possible to use anonymous shared memory 
mappings to avoid the backing file for large allocations (maybe above a 
certain threshold) on systems that support MAP_ANONYMOUS and distribute 
the result of the mmap call among the processes on the node?


Thanks,
Joseph

On 08/29/2017 06:12 PM, Jeff Hammond wrote:
I don't know any reason why you shouldn't be able to use IB for 
intra-node transfers.  There are, of course, arguments against doi

Re: [OMPI users] Issues with Large Window Allocations

2017-09-08 Thread Gilles Gouaillardet
Joseph,

Thanks for sharing this !

sysv is imho the worst option because if something goes really wrong, Open MPI 
might leave some shared memory segments behind when a job crashes. From that 
perspective, leaving a big file in /tmp can be seen as the lesser evil.
That being said, there might be other reasons that drove this design

Cheers,

Gilles

Joseph Schuchart  wrote:
>We are currently discussing internally how to proceed with this issue on 
>our machine. We did a little survey to see the setup of some of the 
>machines we have access to, which includes an IBM, a Bull machine, and 
>two Cray XC40 machines. To summarize our findings:
>
>1) On the Cray systems, both /tmp and /dev/shm are mounted tmpfs and 
>each limited to half of the main memory size per node.
>2) On the IBM system, nodes have 64GB and /tmp is limited to 20 GB and 
>mounted from a disk partition. /dev/shm, on the other hand, is sized at 
>63GB.
>3) On the above systems, /proc/sys/kernel/shm* is set up to allow the 
>full memory of the node to be used as System V shared memory.
>4) On the Bull machine, /tmp is mounted from a disk and fixed to ~100GB 
>while /dev/shm is limited to half the node's memory (there are nodes 
>with 2TB memory, huge page support is available). System V shmem on the 
>other hand is limited to 4GB.
>
>Overall, it seems that there is no globally optimal allocation strategy 
>as the best matching source of shared memory is machine dependent.
>
>Open MPI treats System V shared memory as the least favorable option, 
>even giving it a lower priority than POSIX shared memory, where 
>conflicting names might occur. What's the reason for preferring /tmp and 
>POSIX shared memory over System V? It seems to me that the latter is a 
>cleaner and safer way (provided that shared memory is not constrained by 
>/proc, which could easily be detected) while mmap'ing large files feels 
>somewhat hacky. Maybe I am missing an important aspect here though.
>
>The reason I am interested in this issue is that our PGAS library is 
>build on top of MPI and allocates pretty much all memory exposed to the 
>user through MPI windows. Thus, any limitation from the underlying MPI 
>implementation (or system for that matter) limits the amount of usable 
>memory for our users.
>
>Given our observations above, I would like to propose a change to the 
>shared memory allocator: the priorities would be derived from the 
>percentage of main memory each component can cover, i.e.,
>
>Priority = 99*(min(Memory, SpaceAvail) / Memory)
>
>At startup, each shm component would determine the available size (by 
>looking at /tmp, /dev/shm, and /proc/sys/kernel/shm*, respectively) and 
>set its priority between 0 and 99. A user could force Open MPI to use a 
>specific component by manually settings its priority to 100 (which of 
>course has to be documented). The priority could factor in other aspects 
>as well, such as whether /tmp is actually tmpfs or disk-based if that 
>makes a difference in performance.
>
>This proposal of course assumes that shared memory size is the sole 
>optimization goal. Maybe there are other aspects to consider? I'd be 
>happy to work on a patch but would like to get some feedback before 
>getting my hands dirty. IMO, the current situation is less than ideal 
>and prone to cause pain to the average user. In my recent experience, 
>debugging this has been tedious and the user in general shouldn't have 
>to care about how shared memory is allocated (and administrators don't 
>always seem to care, see above).
>
>Any feedback is highly appreciated.
>
>Joseph
>
>
>On 09/04/2017 03:13 PM, Joseph Schuchart wrote:
>> Jeff, all,
>> 
>> Unfortunately, I (as a user) have no control over the page size on our 
>> cluster. My interest in this is more of a general nature because I am 
>> concerned that our users who use Open MPI underneath our code run into 
>> this issue on their machine.
>> 
>> I took a look at the code for the various window creation methods and 
>> now have a better picture of the allocation process in Open MPI. I 
>> realized that memory in windows allocated through MPI_Win_alloc or 
>> created through MPI_Win_create is registered with the IB device using 
>> ibv_reg_mr, which takes significant time for large allocations (I assume 
>> this is where hugepages would help?). In contrast to this, it seems that 
>> memory attached through MPI_Win_attach is not registered, which explains 
>> the lower latency for these allocation I am observing (I seem to 
>> remember having observed higher communication latencies as well).
>> 
>> Regarding the size limitation of /tmp: I found an opal/mca/shmem/posix 
>> component that uses shmem_open to create a POSIX shared memory object 
>> instead of a file on disk, which is then mmap'ed. Unfortunately, if I 
>> raise the priority of this component above that of the default mmap 
>> component I end up with a SIGBUS during MPI_Init. No other errors are 
>> reported by MPI. Should I open a ti

Re: [OMPI users] Issues with Large Window Allocations

2017-09-08 Thread Jeff Hammond
In my experience, POSIX is much more reliable than Sys5.  Sys5 depends on
the value of shmmax, which is often set to a small fraction of node
memory.  I've probably seen the error described on
http://verahill.blogspot.com/2012/04/solution-to-nwchem-shmmax-too-small.html
with NWChem a 1000 times because of this.  POSIX, on the other hand, isn't
limited by SHMMAX (https://community.oracle.com/thread/3828422).
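
For reference, the limits in question can be inspected on a given node with, for
example:

$ cat /proc/sys/kernel/shmmax /proc/sys/kernel/shmall
$ sysctl kernel.shmmax        # raising it requires root, e.g. sysctl -w kernel.shmmax=<bytes>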

POSIX is newer than Sys5, and while Sys5 is supported by Linux and thus
almost ubiquitous, it wasn't supported by Blue Gene, so in an HPC context,
one can argue that POSIX is more portable.

Jeff

On Fri, Sep 8, 2017 at 9:16 AM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Joseph,
>
> Thanks for sharing this !
>
> sysv is imho the worst option because if something goes really wrong, Open
> MPI might leave some shared memory segments behind when a job crashes. From
> that perspective, leaving a big file in /tmp can be seen as the lesser evil.
> That being said, there might be other reasons that drove this design
>
> Cheers,
>
> Gilles
>
> Joseph Schuchart  wrote:
> >We are currently discussing internally how to proceed with this issue on
> >our machine. We did a little survey to see the setup of some of the
> >machines we have access to, which includes an IBM, a Bull machine, and
> >two Cray XC40 machines. To summarize our findings:
> >
> >1) On the Cray systems, both /tmp and /dev/shm are mounted tmpfs and
> >each limited to half of the main memory size per node.
> >2) On the IBM system, nodes have 64GB and /tmp is limited to 20 GB and
> >mounted from a disk partition. /dev/shm, on the other hand, is sized at
> >63GB.
> >3) On the above systems, /proc/sys/kernel/shm* is set up to allow the
> >full memory of the node to be used as System V shared memory.
> >4) On the Bull machine, /tmp is mounted from a disk and fixed to ~100GB
> >while /dev/shm is limited to half the node's memory (there are nodes
> >with 2TB memory, huge page support is available). System V shmem on the
> >other hand is limited to 4GB.
> >
> >Overall, it seems that there is no globally optimal allocation strategy
> >as the best matching source of shared memory is machine dependent.
> >
> >Open MPI treats System V shared memory as the least favorable option,
> >even giving it a lower priority than POSIX shared memory, where
> >conflicting names might occur. What's the reason for preferring /tmp and
> >POSIX shared memory over System V? It seems to me that the latter is a
> >cleaner and safer way (provided that shared memory is not constrained by
> >/proc, which could easily be detected) while mmap'ing large files feels
> >somewhat hacky. Maybe I am missing an important aspect here though.
> >
> >The reason I am interested in this issue is that our PGAS library is
> >build on top of MPI and allocates pretty much all memory exposed to the
> >user through MPI windows. Thus, any limitation from the underlying MPI
> >implementation (or system for that matter) limits the amount of usable
> >memory for our users.
> >
> >Given our observations above, I would like to propose a change to the
> >shared memory allocator: the priorities would be derived from the
> >percentage of main memory each component can cover, i.e.,
> >
> >Priority = 99*(min(Memory, SpaceAvail) / Memory)
> >
> >At startup, each shm component would determine the available size (by
> >looking at /tmp, /dev/shm, and /proc/sys/kernel/shm*, respectively) and
> >set its priority between 0 and 99. A user could force Open MPI to use a
> >specific component by manually settings its priority to 100 (which of
> >course has to be documented). The priority could factor in other aspects
> >as well, such as whether /tmp is actually tmpfs or disk-based if that
> >makes a difference in performance.
> >
> >This proposal of course assumes that shared memory size is the sole
> >optimization goal. Maybe there are other aspects to consider? I'd be
> >happy to work on a patch but would like to get some feedback before
> >getting my hands dirty. IMO, the current situation is less than ideal
> >and prone to cause pain to the average user. In my recent experience,
> >debugging this has been tedious and the user in general shouldn't have
> >to care about how shared memory is allocated (and administrators don't
> >always seem to care, see above).
> >
> >Any feedback is highly appreciated.
> >
> >Joseph
> >
> >
> >On 09/04/2017 03:13 PM, Joseph Schuchart wrote:
> >> Jeff, all,
> >>
> >> Unfortunately, I (as a user) have no control over the page size on our
> >> cluster. My interest in this is more of a general nature because I am
> >> concerned that our users who use Open MPI underneath our code run into
> >> this issue on their machine.
> >>
> >> I took a look at the code for the various window creation methods and
> >> now have a better picture of the allocation process in Open MPI. I
> >> realized that memory in windows allocated through MPI_Win_alloc or
> >> created through MPI_Win_

Re: [OMPI users] Errors when compiled with Cygwin MinGW gcc

2017-09-08 Thread Llelan D.

On 09/08/2017 8:16 AM, Marco Atzeri wrote:

please reply in the mailing list
Oops! My apologies. I'm not used to a mailing list without the reply-to 
set to the mailing list.


Can a version of open mpi be built using x86_64-w64-mingw32-gcc so 
that it will work with code compiled with x86_64-w64-mingw32-gcc?

Not that I am aware
Now that's a problem. Many clients insist that, while you can use Cygwin 
to develop and debug, you can only use a MinGW production version to 
avoid license entanglements (Yes, I know that there is an explicit 
exception to deal with this but the client lawyers just don't care).


I strongly suggest that, to make open-mpi useful in all arenas, both 
Cygwin and Cygwin-MinGW versions be built and distributed through 
Cygwin(64) to settle our persnickety clients (the majority).


Thank you for your help with my questions.


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Sylvain Jeaugey
To solve the undefined references to cudaMalloc and cudaFree, you need 
to link the CUDA runtime. So you should replace -lcuda by -lcudart.


For the OPENCL undefined references, I don't know where those are coming 
from ... could it be that hwloc is compiling OpenCL support but not 
adding -lOpenCL to the mpicc command, thus causing this issue ?


To work around the issue, I would try to uninstall the opencl libraries 
before recompiling Open MPI. Another way could be to add manually the 
OpenCL library with -lOpenCL.
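
Concretely, something along these lines may get the test to link (a sketch only;
the include path and the location of NVIDIA's libOpenCL are guesses based on the
paths in your mail):

mpicc test_cuda_aware.c -o myapp \
    -I/usr/local/cuda-8.0.61/include \
    -L/usr/local/cuda-8.0.61/lib64 -lcudart -lOpenCL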


On 09/08/2017 04:15 AM, Gilles Gouaillardet wrote:

can you
./mpicc -showme -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
and double check -lcuda is *after* -lopen-pal ?

Cheers,

Gilles

On Fri, Sep 8, 2017 at 7:40 PM, Nilesh Kokane  wrote:

On Fri, Sep 8, 2017 at 4:08 PM, Nilesh Kokane  wrote:

On Fri, Sep 8, 2017 at 3:33 PM, Gilles Gouaillardet
 wrote:

Nilesh,

Can you
configure --without-nvidia ...
And see if it helps ?

No, I need Nvidia cuda support.


Or else do you have a way to solve this open-cl errors?

./mpicc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
./mpicc: /usr/local/cuda-8.0.61/lib64/libOpenCL.so.1: no version
information available (required by
/home/kokanen/opt/lib/libopen-pal.so.20)
/tmp/cc7KaPDe.o: In function `main':
test_cuda_aware.c:(.text+0x5b): undefined reference to `cudaMalloc'
test_cuda_aware.c:(.text+0xd3): undefined reference to `cudaFree'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetPlatformIDs@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceInfo@OPENCL_1.0'
/home/kokanen/opt/lib/libopen-pal.so.20: undefined reference to
`clGetDeviceIDs@OPENCL_1.0'



Pointing to -L/usr/local/cuda-8.0.61/lib64 while compiling with mpicc
didn't help.

Any clues?


--
Regards,
Nilesh Kokane
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Build Open-MPI without OpenCL support

2017-09-08 Thread Nilesh Kokane
On Fri, Sep 8, 2017 at 4:45 PM, Gilles Gouaillardet
 wrote:
> can you
> ./mpicc -showme -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o 
> myapp
> and double check -lcuda is *after* -lopen-pal ?

gcc -I/usr/local/cuda-8.0.61/lib64 -lcuda test_cuda_aware.c -o myapp
-I/home/kokanen/opt/include -pthread -Wl,-rpath
-Wl,/home/kokanen/opt/lib -Wl,--enable-new-dtags
-L/home/kokanen/opt/lib -lmpi


-- 
Regards,
Nilesh Kokane
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Cygwin64 mpiexec freezes

2017-09-08 Thread Marco Atzeri

On 07/09/2017 21:56, Marco Atzeri wrote:

On 07/09/2017 21:12, Llelan D. wrote:
Windows 10 64bit, Cygwin64, openmpi 1.10.7-1 (dev, c, c++, fortran), 
GCC 6.3.0-2 (core, gcc, g++, fortran)



However, when I run it using mpiexec:

$ mpiexec -n 4 ./hello_c

$ ^C

Nothing is displayed and I have to ^C out. If I insert a puts("Start") 
just before the call to MPI_Init(&argc, &argv), and a puts("MPI_Init 
done.") just after, mpiexec will print "Start" for each process (4 
times for the above example) and then freeze. It is never returning 
from the call to MPI_Init(...).


This is a freshly installed Cygwin64 and other non-mpi programs work 
fine. Can anyone give me an idea of what is going on?




Same here.
I will investigate to check if it is a side effect of the
new 6.3.0-2 compiler or of the latest Cygwin.



I take that back. It works fine.

$ cygcheck -cd openmpi cygwin gcc-core
Cygwin Package Information
Package  Version
cygwin   2.9.0-2
gcc-core 6.3.0-2
openmpi  1.10.7-1

 $  time mpirun -n 2 ./hello_c.exe
Hello, world, I am 0 of 2, (Open MPI v1.10.7, package: Open MPI 
marco@GE-MATZERI-EU Distribution, ident: 1.10.7, repo rev: 
v1.10.6-48-g5e373bf, May 16, 2017, 129)
Hello, world, I am 1 of 2, (Open MPI v1.10.7, package: Open MPI 
marco@GE-MATZERI-EU Distribution, ident: 1.10.7, repo rev: 
v1.10.6-48-g5e373bf, May 16, 2017, 129)


real0m3.500s
user0m1.309s
sys 0m2.851s


The most likely cause, which also prompted my first reaction, is some
network interface, usually a virtual one, that is seen as active
but is not operative.

In my case, if the "PANGP Virtual Ethernet Adapter" is active,
it causes mpirun/orterun to wait forever.


Look at your network interfaces under

   Control Panel\Network and Internet\Network Connections

check for possible candidates and try disabling them.
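
Alternatively (an untested sketch; the interface name is a placeholder), you can
often restrict Open MPI to a specific interface instead of disabling the adapter:

$ mpiexec --mca btl_tcp_if_include eth0 --mca oob_tcp_if_include eth0 -n 4 ./hello_c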

Regards
Marco





___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users