[OMPI users] OpenMPI against multiple, evolving SLURM versions

2016-01-29 Thread William Law
Hi,

Our group can't find any way to do this and it'd be helpful.

We use slurm and keep upgrading the slurm environment.  OpenMPI bombs out 
against PMI each time the libslurm library changes, which seems to happen fairly 
regularly.  Is there a way to compile against slurm but insulate ourselves from 
the libslurm chaos?  Obviously we will ask the slurm folks too.

[wlaw@some-node /scratch/users/wlaw/imb/src]$ mpirun -n 2 --mca grpcomm ^pmi 
./IMB-MPI1 
[some-node.local:42584] mca: base: component_find: unable to open 
/share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_ess_pmi: 
libslurm.so.28: cannot open shared object file: No such file or directory 
(ignored)
[some-node.local:42585] mca: base: component_find: unable to open 
/share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
libslurm.so.28: cannot open shared object file: No such file or directory 
(ignored)
[some-node.local:42586] mca: base: component_find: unable to open 
/share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
libslurm.so.28: cannot open shared object file: No such file or directory 
(ignored)

(sent it via the wrong email so it bounced. heh)

Upon further investigation it seems like the most appropriate thing would be to 
point it at compile time to libslurm.so instead of libslurm.so.xx; does that 
make sense?

Thanks,

Will

Re: [OMPI users] OpenMPI against multiple, evolving SLURM versions

2016-01-29 Thread Ralph Castain
It makes sense - but isn’t it slurm that is linking libpmi against libslurm? I 
don’t think we are making that connection, so it would be a slurm issue to 
change it.


> On Jan 28, 2016, at 10:12 PM, William Law  wrote:
> 
> Hi,
> 
> Our group can't find anyway to do this and it'd be helpful.
> 
> We use slurm and keep upgrading the slurm environment.  OpenMPI bombs out 
> against PMI each time the libslurm stuff changes, which seems to be fairly 
> regularly.  Is there a way to compile against slurm but insulate ourselves 
> from the libslurm chaos?  Obvious will ask the slurm folks too.
> 
> [wlaw@some-node /scratch/users/wlaw/imb/src]$ mpirun -n 2 --mca grpcomm ^pmi 
> ./IMB-MPI1 
> [some-node.local:42584] mca: base: component_find: unable to open 
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_ess_pmi: 
> libslurm.so.28: cannot open shared object file: No such file or directory 
> (ignored)
> [some-node.local:42585] mca: base: component_find: unable to open 
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
> libslurm.so.28: cannot open shared object file: No such file or directory 
> (ignored)
> [some-node.local:42586] mca: base: component_find: unable to open 
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
> libslurm.so.28: cannot open shared object file: No such file or directory 
> (ignored)
> 
> (sent it via the wrong email so it bounced. heh)
> 
> Upon further investigation it seems like the most appropriate thing would be 
> to point it at compile time to libslurm.so instead of libslurm.so.xx; does 
> that make sense?
> 
> Thanks,
> 
> Will
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/01/28408.php



[OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Diego Avesani
Dear all,

I have created a program in Fortran with OpenMPI; I tested it on my laptop and
it works.
I would like to use it on a cluster that, unfortunately, has Intel MPI.

The program crashes on the cluster and I get the following error:

*Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:*
*MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
status_array=0x744600) failed*
*MPI_Waitall(119): The supplied request in array element 2 was invalid
(kind=0)*

Do OpenMPI and Intel MPI have some differences that I do not know about?

This is my code:

 REQUEST = MPI_REQUEST_NULL
 !send data share with left
 IF(MPIdata%rank.NE.0)THEN
    MsgLength = MPIdata%imaxN
    DO icount=1,MPIdata%imaxN
       iNode = MPIdata%nodeFromUp(icount)
       send_messageL(icount) = R1(iNode)
    ENDDO
    CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP, &
         MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
 ENDIF
 !
 !receive message FROM RIGHT CPU
 IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
    MsgLength = MPIdata%imaxN
    CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP, &
         MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
 ENDIF
 CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
 IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
    DO i=1,MPIdata%imaxN
       iNode=MPIdata%nodeList2Up(i)
       R1(iNode)=recv_messageR(i)
    ENDDO
 ENDIF

Thanks a lot for your help.



Diego


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Gilles Gouaillardet
Diego,

Your code snippet calls MPI_Waitall(2,...),
but the error message reports MPI_Waitall(count=3,...).
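
In MPI_Waitall, the first argument is the number of request handles to wait on,
and it must not be larger than the REQUEST and status arrays you pass. A minimal
sketch of the call site, reusing the names from your snippet (the declarations
are my assumption of what your module contains, not code copied from it):

    ! sketch only -- declarations assumed, not copied from your code
    INTEGER :: REQUEST(2)
    INTEGER :: send_status_list(MPI_STATUS_SIZE,2)

    REQUEST = MPI_REQUEST_NULL   ! every handle starts out as the null request
    ! ... MPI_ISEND fills REQUEST(1), MPI_IRECV fills REQUEST(2), as in your code ...
    CALL MPI_WAITALL(2, REQUEST, send_status_list, MPIdata%iErr)   ! count = 2, never 3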

Cheers,

Gilles

On Friday, January 29, 2016, Diego Avesani  wrote:

> Dear all,
>
> I have created a program in fortran and OpenMPI, I test it on my laptop
> and it works.
> I would like to use it on a cluster that has, unfortunately, intel MPI.
>
> The program crushes on the cluster and I get the following error:
>
> *Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:*
> *MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
> status_array=0x744600) failed*
> *MPI_Waitall(119): The supplied request in array element 2 was invalid
> (kind=0)*
>
> Do OpenMPI and MPI have some difference that I do not know?
>
> this is my code
>
>  REQUEST = MPI_REQUEST_NULL
>  !send data share with left
>  IF(MPIdata%rank.NE.0)THEN
> MsgLength = MPIdata%imaxN
> DO icount=1,MPIdata%imaxN
> iNode = MPIdata%nodeFromUp(icount)
> send_messageL(icount) = R1(iNode)
> ENDDO
> CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP,
> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>  ENDIF
>  !
>  !recive message FROM RIGHT CPU
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> MsgLength = MPIdata%imaxN
> CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP,
> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>  ENDIF
>  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> DO i=1,MPIdata%imaxN
>iNode=MPIdata%nodeList2Up(i)
>R1(iNode)=recv_messageR(i)
> ENDDO
>  ENDIF
>
> Thank a lot your help
>
>
>
> Diego
>
>


Re: [OMPI users] OpenMPI against multiple, evolving SLURM versions

2016-01-29 Thread Gilles Gouaillardet
Is openmpi linked with a static libpmi.a that requires a dynamic libslurm?
That can be checked with ldd mca_ess_pmi.so.

By the way, do the slurm folks increase the libpmi.so version each time slurm is
upgraded? That could be part of the issue...
But if they bump the library version because of ABI changes, it might be a bad
idea to open libxxx.so instead of libxxx.so.y:
generally speaking, libxxx.so.y is provided by the libxxx package, while
libxxx.so is provided by the libxxx-devel package, which means it might not be
available on the compute nodes.
We could also dlopen libxxx instead of linking with it, and have the
sysadmin configure openmpi so it finds the right library (this approach is used
by a prominent vendor, and has other pros but also cons).

Cheers,

Gilles

On Friday, January 29, 2016, Ralph Castain  wrote:

> It makes sense - but isn’t it slurm that is linking libpmi against
> libslurm? I don’t think we are making that connection, so it would be a
> slurm issue to change it.
>
>
> On Jan 28, 2016, at 10:12 PM, William Law  > wrote:
>
> Hi,
>
> Our group can't find anyway to do this and it'd be helpful.
>
> We use slurm and keep upgrading the slurm environment.  OpenMPI bombs out
> against PMI each time the libslurm stuff changes, which seems to be fairly
> regularly.  Is there a way to compile against slurm but insulate ourselves
> from the libslurm chaos?  Obvious will ask the slurm folks too.
>
> [*wlaw*@some-node /scratch/users/wlaw/imb/src]$ mpirun -n 2 --mca grpcomm
> ^pmi ./IMB-MPI1
> [some-node.local:42584] mca: base: component_find: unable to open
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_ess_pmi:
> libslurm.so.28: cannot open shared object file: No such file or directory
> (ignored)
> [some-node.local:42585] mca: base: component_find: unable to open
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi:
> libslurm.so.28: cannot open shared object file: No such file or directory
> (ignored)
> [some-node.local:42586] mca: base: component_find: unable to open
> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi:
> libslurm.so.28: cannot open shared object file: No such file or directory
> (ignored)
>
> (sent it via the wrong email so it bounced. heh)
>
> Upon further investigation it seems like the most appropriate thing would
> be to point it at compile time to libslurm.so instead of libslurm.so.xx;
> does that make sense?
>
> Thanks,
>
> Will
> ___
> users mailing list
> us...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/01/28408.php
>
>
>


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Diego Avesani
Dear all, Dear Gilles,

I do not understand, I am sorry.
I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not able
to find the error.


Thanks a lot



Diego


On 29 January 2016 at 11:58, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Diego,
>
> your code snippet does MPI_Waitall(2,...)
> but the error is about MPI_Waitall(3,...)
>
> Cheers,
>
> Gilles
>
>
> On Friday, January 29, 2016, Diego Avesani 
> wrote:
>
>> Dear all,
>>
>> I have created a program in fortran and OpenMPI, I test it on my laptop
>> and it works.
>> I would like to use it on a cluster that has, unfortunately, intel MPI.
>>
>> The program crushes on the cluster and I get the following error:
>>
>> *Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:*
>> *MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
>> status_array=0x744600) failed*
>> *MPI_Waitall(119): The supplied request in array element 2 was invalid
>> (kind=0)*
>>
>> Do OpenMPI and MPI have some difference that I do not know?
>>
>> this is my code
>>
>>  REQUEST = MPI_REQUEST_NULL
>>  !send data share with left
>>  IF(MPIdata%rank.NE.0)THEN
>> MsgLength = MPIdata%imaxN
>> DO icount=1,MPIdata%imaxN
>> iNode = MPIdata%nodeFromUp(icount)
>> send_messageL(icount) = R1(iNode)
>> ENDDO
>> CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP,
>> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>>  ENDIF
>>  !
>>  !recive message FROM RIGHT CPU
>>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>> MsgLength = MPIdata%imaxN
>> CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP,
>> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>>  ENDIF
>>  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>> DO i=1,MPIdata%imaxN
>>iNode=MPIdata%nodeList2Up(i)
>>R1(iNode)=recv_messageR(i)
>> ENDDO
>>  ENDIF
>>
>> Thank a lot your help
>>
>>
>>
>> Diego
>>
>>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/01/28411.php
>


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Jeff Squyres (jsquyres)
You must have an error elsewhere in your code; as Gilles pointed out, the error 
message states that you are calling MPI_WAITALL with a first argument of 3:

--
MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0, 
status_array=0x744600) failed
--

We can't really help you with problems with Intel MPI; sorry.  You'll need to 
contact their tech support for assistance.



> On Jan 29, 2016, at 6:11 AM, Diego Avesani  wrote:
> 
> Dear all, Dear Gilles,
> 
> I do not understand, I am sorry. 
> I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not able 
> to find the error.
> 
> 
> Thanks a lot
> 
> 
> 
> Diego
> 
> 
> On 29 January 2016 at 11:58, Gilles Gouaillardet 
>  wrote:
> Diego, 
> 
> your code snippet does MPI_Waitall(2,...)
> but the error is about MPI_Waitall(3,...)
> 
> Cheers,
> 
> Gilles
> 
> 
> On Friday, January 29, 2016, Diego Avesani  wrote:
> Dear all, 
> 
> I have created a program in fortran and OpenMPI, I test it on my laptop and 
> it works.
> I would like to use it on a cluster that has, unfortunately, intel MPI.
> 
> The program crushes on the cluster and I get the following error:
> 
> Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:
> MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0, 
> status_array=0x744600) failed
> MPI_Waitall(119): The supplied request in array element 2 was invalid (kind=0)
> 
> Do OpenMPI and MPI have some difference that I do not know?
> 
> this is my code
> 
>  REQUEST = MPI_REQUEST_NULL
>  !send data share with left
>  IF(MPIdata%rank.NE.0)THEN
> MsgLength = MPIdata%imaxN
> DO icount=1,MPIdata%imaxN
> iNode = MPIdata%nodeFromUp(icount)
> send_messageL(icount) = R1(iNode)
> ENDDO
> CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP, 
> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>  ENDIF
>  !
>  !recive message FROM RIGHT CPU
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> MsgLength = MPIdata%imaxN
> CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP, 
> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>  ENDIF
>  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> DO i=1,MPIdata%imaxN
>iNode=MPIdata%nodeList2Up(i)
>R1(iNode)=recv_messageR(i)
> ENDDO
>  ENDIF
> 
> Thank a lot your help
> 
> 
> 
> Diego
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/01/28411.php
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/01/28413.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] OpenMPI against multiple, evolving SLURM versions

2016-01-29 Thread Gilles Gouaillardet
On second thought, is there any chance your sysadmin removed the old
libslurm.so.x but kept the old libpmi.so.y?
In that case, the real issue would be hidden:
your sysadmin "broke" the old libpmi, but you actually want to use the new one.

Cheers,

Gilles

On Friday, January 29, 2016, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Is openmpi linked with a static libpmi.a that requires a dynamic libslurm ?
> that can be checked with ldd mca_ess_pmi.so
>
> btw, do slurm folks increase the libpmi.so version each time slurm is
> upgraded ?
> that could be a part of the issue ...
> but if they increase lib version because of abi changes, it might be a bad
> idea to open libxxx.so instead of libxxx.so.y
> generally speaking, libxxx.so.y is provided by libxxx package, and
> libxxx.so is provided by libxxx-devel package, which means it might not be
> available on compute nodes.
> we could also dlopen libxxx instead of linking with it, and have the
> sysadmin configure openmpi so it finds the right lib (this approach is used
> by a prominent vendor, and has other pros but also cons)
>
> Cheers,
>
> Gilles
>
> On Friday, January 29, 2016, Ralph Castain  > wrote:
>
>> It makes sense - but isn’t it slurm that is linking libpmi against
>> libslurm? I don’t think we are making that connection, so it would be a
>> slurm issue to change it.
>>
>>
>> On Jan 28, 2016, at 10:12 PM, William Law  wrote:
>>
>> Hi,
>>
>> Our group can't find anyway to do this and it'd be helpful.
>>
>> We use slurm and keep upgrading the slurm environment.  OpenMPI bombs out
>> against PMI each time the libslurm stuff changes, which seems to be fairly
>> regularly.  Is there a way to compile against slurm but insulate ourselves
>> from the libslurm chaos?  Obvious will ask the slurm folks too.
>>
>> [*wlaw*@some-node /scratch/users/wlaw/imb/src]$ mpirun -n 2 --mca
>> grpcomm ^pmi ./IMB-MPI1
>> [some-node.local:42584] mca: base: component_find: unable to open
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_ess_pmi:
>> libslurm.so.28: cannot open shared object file: No such file or directory
>> (ignored)
>> [some-node.local:42585] mca: base: component_find: unable to open
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi:
>> libslurm.so.28: cannot open shared object file: No such file or directory
>> (ignored)
>> [some-node.local:42586] mca: base: component_find: unable to open
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi:
>> libslurm.so.28: cannot open shared object file: No such file or directory
>> (ignored)
>>
>> (sent it via the wrong email so it bounced. heh)
>>
>> Upon further investigation it seems like the most appropriate thing would
>> be to point it at compile time to libslurm.so instead of libslurm.so.xx;
>> does that make sense?
>>
>> Thanks,
>>
>> Will
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/01/28408.php
>>
>>
>>


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Diego Avesani
Dear all, Dear Jeff, Dear Gilles,

I am sorry, probably I am being stubborn.

In all my code I have

CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)

how can it become "3"?

The only thing that I can think of is that MPI starts indexing the array from
"0", while Fortran starts from 1. Indeed, I allocate REQUEST(2).

What do you think?

Diego



Diego


On 29 January 2016 at 12:43, Jeff Squyres (jsquyres) 
wrote:

> You must have an error elsewhere in your code; as Gilles pointed, the
> error message states that you are calling MPI_WAITALL with a first argument
> of 3:
>
> --
> MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
> status_array=0x744600) failed
> --
>
> We can't really help you with problems with Intel MPI; sorry.  You'll need
> to contact their tech support for assistance.
>
>
>
> > On Jan 29, 2016, at 6:11 AM, Diego Avesani 
> wrote:
> >
> > Dear all, Dear Gilles,
> >
> > I do not understand, I am sorry.
> > I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not
> able to find the error.
> >
> >
> > Thanks a lot
> >
> >
> >
> > Diego
> >
> >
> > On 29 January 2016 at 11:58, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
> > Diego,
> >
> > your code snippet does MPI_Waitall(2,...)
> > but the error is about MPI_Waitall(3,...)
> >
> > Cheers,
> >
> > Gilles
> >
> >
> > On Friday, January 29, 2016, Diego Avesani 
> wrote:
> > Dear all,
> >
> > I have created a program in fortran and OpenMPI, I test it on my laptop
> and it works.
> > I would like to use it on a cluster that has, unfortunately, intel MPI.
> >
> > The program crushes on the cluster and I get the following error:
> >
> > Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:
> > MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
> status_array=0x744600) failed
> > MPI_Waitall(119): The supplied request in array element 2 was invalid
> (kind=0)
> >
> > Do OpenMPI and MPI have some difference that I do not know?
> >
> > this is my code
> >
> >  REQUEST = MPI_REQUEST_NULL
> >  !send data share with left
> >  IF(MPIdata%rank.NE.0)THEN
> > MsgLength = MPIdata%imaxN
> > DO icount=1,MPIdata%imaxN
> > iNode = MPIdata%nodeFromUp(icount)
> > send_messageL(icount) = R1(iNode)
> > ENDDO
> > CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP,
> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
> >  ENDIF
> >  !
> >  !recive message FROM RIGHT CPU
> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> > MsgLength = MPIdata%imaxN
> > CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP,
> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
> >  ENDIF
> >  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> > DO i=1,MPIdata%imaxN
> >iNode=MPIdata%nodeList2Up(i)
> >R1(iNode)=recv_messageR(i)
> > ENDDO
> >  ENDIF
> >
> > Thank a lot your help
> >
> >
> >
> > Diego
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/01/28411.php
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/01/28413.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/01/28414.php
>


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Jeff Squyres (jsquyres)
On Jan 29, 2016, at 7:55 AM, Diego Avesani  wrote:
> 
> Dear all, Dear Jeff, Dear Gilles,
> 
> I am sorry, porblably I am a stubborn.
> 
> In all my code I have 
> 
> CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
> 
> how can it became "3"?

I don't know.  You'll need to check your code, verify that you sent us the 
right error message, and/or contact Intel MPI technical support.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Gilles Gouaillardet
Diego,

First, you can double check that the program you are running has been compiled
from your sources.

Then you can run your program under a debugger, and browse the stack when
it crashes.

There could be a bug in Intel MPI that incorrectly translates 2 in Fortran
to 3 in C...
but as far as I am concerned, this is extremely unlikely.
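
For example (a hypothetical debugging aid, not something already in your code),
you could add a print with a distinctive marker right before the call and rerun:
if the marker never shows up, the binary you are running was not rebuilt from
the sources you are reading.

    ! temporary, hypothetical debugging aid -- remove once the mismatch is found
    PRINT *, 'DEBUG 2016-01-29 rank', MPIdata%rank, &
             ': SIZE(REQUEST) =', SIZE(REQUEST)
    CALL MPI_WAITALL(2, REQUEST, send_status_list, MPIdata%iErr)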

Cheers,

Gilles

On Friday, January 29, 2016, Diego Avesani  wrote:

> Dear all, Dear Jeff, Dear Gilles,
>
> I am sorry, porblably I am a stubborn.
>
> In all my code I have
>
> CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>
> how can it became "3"?
>
> the only thing that I can think is that MPI starts to allocate the vector
> from "0", while fortran starts from 1. Indeed I allocate REQUEST(2)
>
> what do you think?
>
> Diego
>
>
>
> Diego
>
>
> On 29 January 2016 at 12:43, Jeff Squyres (jsquyres)  > wrote:
>
>> You must have an error elsewhere in your code; as Gilles pointed, the
>> error message states that you are calling MPI_WAITALL with a first argument
>> of 3:
>>
>> --
>> MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
>> status_array=0x744600) failed
>> --
>>
>> We can't really help you with problems with Intel MPI; sorry.  You'll
>> need to contact their tech support for assistance.
>>
>>
>>
>> > On Jan 29, 2016, at 6:11 AM, Diego Avesani > > wrote:
>> >
>> > Dear all, Dear Gilles,
>> >
>> > I do not understand, I am sorry.
>> > I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not
>> able to find the error.
>> >
>> >
>> > Thanks a lot
>> >
>> >
>> >
>> > Diego
>> >
>> >
>> > On 29 January 2016 at 11:58, Gilles Gouaillardet <
>> gilles.gouaillar...@gmail.com
>> > wrote:
>> > Diego,
>> >
>> > your code snippet does MPI_Waitall(2,...)
>> > but the error is about MPI_Waitall(3,...)
>> >
>> > Cheers,
>> >
>> > Gilles
>> >
>> >
>> > On Friday, January 29, 2016, Diego Avesani > > wrote:
>> > Dear all,
>> >
>> > I have created a program in fortran and OpenMPI, I test it on my laptop
>> and it works.
>> > I would like to use it on a cluster that has, unfortunately, intel MPI.
>> >
>> > The program crushes on the cluster and I get the following error:
>> >
>> > Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:
>> > MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
>> status_array=0x744600) failed
>> > MPI_Waitall(119): The supplied request in array element 2 was invalid
>> (kind=0)
>> >
>> > Do OpenMPI and MPI have some difference that I do not know?
>> >
>> > this is my code
>> >
>> >  REQUEST = MPI_REQUEST_NULL
>> >  !send data share with left
>> >  IF(MPIdata%rank.NE.0)THEN
>> > MsgLength = MPIdata%imaxN
>> > DO icount=1,MPIdata%imaxN
>> > iNode = MPIdata%nodeFromUp(icount)
>> > send_messageL(icount) = R1(iNode)
>> > ENDDO
>> > CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP,
>> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>> >  ENDIF
>> >  !
>> >  !recive message FROM RIGHT CPU
>> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>> > MsgLength = MPIdata%imaxN
>> > CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP,
>> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>> >  ENDIF
>> >  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>> > DO i=1,MPIdata%imaxN
>> >iNode=MPIdata%nodeList2Up(i)
>> >R1(iNode)=recv_messageR(i)
>> > ENDDO
>> >  ENDIF
>> >
>> > Thank a lot your help
>> >
>> >
>> >
>> > Diego
>> >
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org 
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/01/28411.php
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org 
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/01/28413.php
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com 
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> users mailing list
>> us...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/01/28414.php
>>
>
>


Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Diego Avesani
Dear all,

I am really sorry for the time that you dedicated to me.

This is what I found:

 REQUEST = MPI_REQUEST_NULL
 !send data share with UP
 IF(MPIdata%rank.NE.0)THEN
    MsgLength = MPIdata%imaxN
    DO icount=1,MPIdata%imaxN
       iNode = MPIdata%nodeFromUp(icount)
       send_messageL(icount) = R1(iNode)
    ENDDO
    CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP, &
         MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
 ENDIF
 !
 !receive message FROM up CPU
 IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
    MsgLength = MPIdata%imaxN
    CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP, &
         MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
 ENDIF
 CALL MPI_WAITALL(nMsg,REQUEST,send_status_list,MPIdata%iErr)
 IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
    DO i=1,MPIdata%imaxN
       iNode=MPIdata%nodeList2Up(i)
       R1(iNode)=recv_messageR(i)
    ENDDO
 ENDIF

As you can see, there is an nMsg which is set equal to "3". Do I have to set
it equal to 2 instead? Am I right?





Diego


On 29 January 2016 at 14:09, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Diego,
>
> First, you can double check the program you are running has been compiled
> from your sources.
>
> then you can run your program under a debugger, and browse the stack when
> it crashes.
>
> there could be a bug in intelmpi, that incorrectly translates 2 in Fortran
> to 3 in C...
> but as far as I am concerned, this is extremely unlikely.
>
> Cheers,
>
> Gilles
>
> On Friday, January 29, 2016, Diego Avesani 
> wrote:
>
>> Dear all, Dear Jeff, Dear Gilles,
>>
>> I am sorry, porblably I am a stubborn.
>>
>> In all my code I have
>>
>> CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>>
>> how can it became "3"?
>>
>> the only thing that I can think is that MPI starts to allocate the vector
>> from "0", while fortran starts from 1. Indeed I allocate REQUEST(2)
>>
>> what do you think?
>>
>> Diego
>>
>>
>>
>> Diego
>>
>>
>> On 29 January 2016 at 12:43, Jeff Squyres (jsquyres) 
>> wrote:
>>
>>> You must have an error elsewhere in your code; as Gilles pointed, the
>>> error message states that you are calling MPI_WAITALL with a first argument
>>> of 3:
>>>
>>> --
>>> MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
>>> status_array=0x744600) failed
>>> --
>>>
>>> We can't really help you with problems with Intel MPI; sorry.  You'll
>>> need to contact their tech support for assistance.
>>>
>>>
>>>
>>> > On Jan 29, 2016, at 6:11 AM, Diego Avesani 
>>> wrote:
>>> >
>>> > Dear all, Dear Gilles,
>>> >
>>> > I do not understand, I am sorry.
>>> > I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not
>>> able to find the error.
>>> >
>>> >
>>> > Thanks a lot
>>> >
>>> >
>>> >
>>> > Diego
>>> >
>>> >
>>> > On 29 January 2016 at 11:58, Gilles Gouaillardet <
>>> gilles.gouaillar...@gmail.com> wrote:
>>> > Diego,
>>> >
>>> > your code snippet does MPI_Waitall(2,...)
>>> > but the error is about MPI_Waitall(3,...)
>>> >
>>> > Cheers,
>>> >
>>> > Gilles
>>> >
>>> >
>>> > On Friday, January 29, 2016, Diego Avesani 
>>> wrote:
>>> > Dear all,
>>> >
>>> > I have created a program in fortran and OpenMPI, I test it on my
>>> laptop and it works.
>>> > I would like to use it on a cluster that has, unfortunately, intel MPI.
>>> >
>>> > The program crushes on the cluster and I get the following error:
>>> >
>>> > Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:
>>> > MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0,
>>> status_array=0x744600) failed
>>> > MPI_Waitall(119): The supplied request in array element 2 was invalid
>>> (kind=0)
>>> >
>>> > Do OpenMPI and MPI have some difference that I do not know?
>>> >
>>> > this is my code
>>> >
>>> >  REQUEST = MPI_REQUEST_NULL
>>> >  !send data share with left
>>> >  IF(MPIdata%rank.NE.0)THEN
>>> > MsgLength = MPIdata%imaxN
>>> > DO icount=1,MPIdata%imaxN
>>> > iNode = MPIdata%nodeFromUp(icount)
>>> > send_messageL(icount) = R1(iNode)
>>> > ENDDO
>>> > CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP,
>>> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>>> >  ENDIF
>>> >  !
>>> >  !recive message FROM RIGHT CPU
>>> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>>> > MsgLength = MPIdata%imaxN
>>> > CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP,
>>> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>>> >  ENDIF
>>> >  CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
>>> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
>>> > DO i=1,MPIdata%imaxN
>>> >iNode=MPIdata%nodeList2Up(i)
>>> >R1(iNode)=recv_messageR(i)
>>> > ENDDO
>>> >  ENDIF
>>> >
>>> > Thank a lot your help
>>> >
>>> >
>>> >
>>> > Diego
>>> >
>>> >
>>> > ___
>>> > users mailing list
>>> > us...@open-mpi.org
>>> > Subscription: http://www.open-mpi

Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Jeff Squyres (jsquyres)
> On Jan 29, 2016, at 9:43 AM, Diego Avesani  wrote:
> 
> Dear all,
>  
> I am really sorry for the time that you dedicated to me.
> 
> this is what I found:
> 
>  REQUEST = MPI_REQUEST_NULL

I'm not enough of a Fortran expert to know -- does this assign MPI_REQUEST_NULL 
to every entry in the REQUEST array?

>  !send data share with UP
>  IF(MPIdata%rank.NE.0)THEN
> MsgLength = MPIdata%imaxN
> DO icount=1,MPIdata%imaxN
> iNode = MPIdata%nodeFromUp(icount)
> send_messageL(icount) = R1(iNode)
> ENDDO
> CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP, 
> MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
>  ENDIF
>  !
>  !recive message FROM up CPU
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> MsgLength = MPIdata%imaxN
> CALL MPI_IRECV(recv_messageR, MsgLength, MPIdata%AUTO_COMP, 
> MPIdata%rank+1, MPIdata%rank+1, MPI_COMM_WORLD, REQUEST(2), MPIdata%iErr)
>  ENDIF

I only see you setting REQUEST(1) and REQUEST(2) above, so I would assume that 
you need to set nMsg to 2.

That being said, it's valid to pass MPI_REQUEST_NULL to any of the 
MPI_WAIT/TEST functions.  So it should be permissible to pass 3 in, if a) 
REQUEST is long enough, b) REQUEST(3) has been initialized to MPI_REQUEST_NULL, 
and c) send_status_list is long enough (you didn't include its declaration 
anywhere).

A major point: if REQUEST or send_status_list is only of length 2, then nMsg 
should not be larger than 2.
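
To make that concrete, here is a minimal, self-contained sketch of the pattern
(the program name, the plain MPI_INTEGER payload, and the variable names are my
assumptions for illustration, not code taken from your program): the request and
status arrays are sized together, every handle is initialized up front, and the
count passed to MPI_WAITALL equals that size.

    ! sketch only: names and payload are illustrative, not from the original program
    PROGRAM waitall_sketch
      USE mpi
      IMPLICIT NONE
      INTEGER :: ierr, rank, ncpu
      INTEGER :: request(2)                        ! exactly two handles
      INTEGER :: status_list(MPI_STATUS_SIZE,2)    ! sized to match request
      INTEGER :: sendbuf, recvbuf

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, ncpu, ierr)

      request = MPI_REQUEST_NULL   ! whole-array assignment: every entry becomes null
      sendbuf = rank

      IF (rank /= 0) THEN          ! send one integer to the left neighbour
         CALL MPI_ISEND(sendbuf, 1, MPI_INTEGER, rank-1, rank, &
                        MPI_COMM_WORLD, request(1), ierr)
      END IF
      IF (rank /= ncpu-1) THEN     ! receive one integer from the right neighbour
         CALL MPI_IRECV(recvbuf, 1, MPI_INTEGER, rank+1, rank+1, &
                        MPI_COMM_WORLD, request(2), ierr)
      END IF

      ! count = 2 = SIZE(request); untouched entries stay MPI_REQUEST_NULL, which is legal
      CALL MPI_WAITALL(2, request, status_list, ierr)

      CALL MPI_FINALIZE(ierr)
    END PROGRAM waitall_sketch

If you don't need the statuses, passing MPI_STATUSES_IGNORE instead of a status
array also sidesteps point (c) above.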

>  CALL MPI_WAITALL(nMsg,REQUEST,send_status_list,MPIdata%iErr)
>  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> DO i=1,MPIdata%imaxN
>iNode=MPIdata%nodeList2Up(i)
>R1(iNode)=recv_messageR(i)
> ENDDO
>  ENDIF
> 
> As you can see there is a nMsg which is set equal to "3". Do I have to set it 
> equal to? Am I right?
> 
> 
> 
> 
> 
> Diego
> 
> 
> On 29 January 2016 at 14:09, Gilles Gouaillardet 
>  wrote:
> Diego,
> 
> First, you can double check the program you are running has been compiled 
> from your sources.
> 
> then you can run your program under a debugger, and browse the stack when it 
> crashes.
> 
> there could be a bug in intelmpi, that incorrectly translates 2 in Fortran to 
> 3 in C...
> but as far as I am concerned, this is extremely unlikely.
> 
> Cheers,
> 
> Gilles
> 
> On Friday, January 29, 2016, Diego Avesani  wrote:
> Dear all, Dear Jeff, Dear Gilles,
> 
> I am sorry, porblably I am a stubborn.
> 
> In all my code I have 
> 
> CALL MPI_WAITALL(2,REQUEST,send_status_list,MPIdata%iErr)
> 
> how can it became "3"?
> 
> the only thing that I can think is that MPI starts to allocate the vector 
> from "0", while fortran starts from 1. Indeed I allocate REQUEST(2)
> 
> what do you think?
> 
> Diego
> 
> 
> 
> Diego
> 
> 
> On 29 January 2016 at 12:43, Jeff Squyres (jsquyres)  
> wrote:
> You must have an error elsewhere in your code; as Gilles pointed, the error 
> message states that you are calling MPI_WAITALL with a first argument of 3:
> 
> --
> MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0, 
> status_array=0x744600) failed
> --
> 
> We can't really help you with problems with Intel MPI; sorry.  You'll need to 
> contact their tech support for assistance.
> 
> 
> 
> > On Jan 29, 2016, at 6:11 AM, Diego Avesani  wrote:
> >
> > Dear all, Dear Gilles,
> >
> > I do not understand, I am sorry.
> > I did a "grep" on my code and I find only "MPI_WAITALL(2", so I am not able 
> > to find the error.
> >
> >
> > Thanks a lot
> >
> >
> >
> > Diego
> >
> >
> > On 29 January 2016 at 11:58, Gilles Gouaillardet 
> >  wrote:
> > Diego,
> >
> > your code snippet does MPI_Waitall(2,...)
> > but the error is about MPI_Waitall(3,...)
> >
> > Cheers,
> >
> > Gilles
> >
> >
> > On Friday, January 29, 2016, Diego Avesani  wrote:
> > Dear all,
> >
> > I have created a program in fortran and OpenMPI, I test it on my laptop and 
> > it works.
> > I would like to use it on a cluster that has, unfortunately, intel MPI.
> >
> > The program crushes on the cluster and I get the following error:
> >
> > Fatal error in MPI_Waitall: Invalid MPI_Request, error stack:
> > MPI_Waitall(271): MPI_Waitall(count=3, req_array=0x7445f0, 
> > status_array=0x744600) failed
> > MPI_Waitall(119): The supplied request in array element 2 was invalid 
> > (kind=0)
> >
> > Do OpenMPI and MPI have some difference that I do not know?
> >
> > this is my code
> >
> >  REQUEST = MPI_REQUEST_NULL
> >  !send data share with left
> >  IF(MPIdata%rank.NE.0)THEN
> > MsgLength = MPIdata%imaxN
> > DO icount=1,MPIdata%imaxN
> > iNode = MPIdata%nodeFromUp(icount)
> > send_messageL(icount) = R1(iNode)
> > ENDDO
> > CALL MPI_ISEND(send_messageL, MsgLength, MPIdata%AUTO_COMP, 
> > MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
> >  ENDIF
> >  !
> >  !recive message FROM RIGHT CPU
> >  IF(MPIdata%rank.NE.MPIdata%nCPU-1)THEN
> > MsgLength = MPIdata%imaxN
> > CALL MPI_IRECV(recv_mess

Re: [OMPI users] difference between OpenMPI - intel MPI mpi_waitall

2016-01-29 Thread Jeff Hammond
On Fri, Jan 29, 2016 at 2:45 AM, Diego Avesani 
wrote:

> Dear all,
>
> I have created a program in fortran and OpenMPI, I test it on my laptop
> and it works.
> I would like to use it on a cluster that has, unfortunately, intel MPI.
>
>
You can install any open-source MPI implementation from user space.  This
includes Open-MPI, MPICH, and MVAPICH2.  If you like Open-MPI, try this:


cd $OMPI_DIR && mkdir build && cd build && \
  ../configure --prefix=$HOME/ompi-install && make -j && make install

...or something like that.  I'm sure the details are properly documented
online.

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/