[OMPI users] performance of MPI_Iallgatherv

2014-04-05 Thread Zehan Cui
Hi,

I'm testing the non-blocking collectives of Open MPI 1.8.

I have two nodes connected with InfiniBand, performing an allgather on 128 MB of data in total.

I split the 128 MB of data into eight pieces, and perform computation and
MPI_Iallgatherv() on one piece of data in each iteration, hoping that the
MPI_Iallgatherv() of the previous iteration can be overlapped with the
computation of the current iteration. An MPI_Wait() is called for each
request after the last iteration.

However, the total communication time (including the final wait time) is
similar to that of the traditional blocking MPI_Allgatherv, and even slightly
higher.


Following is the test pseudo-code; the source code is attached.

===

Using MPI_Allgatherv:

for( i=0; i<8; i++ )
{
    // computation
    mytime( t_begin );
    computation;
    mytime( t_end );
    comp_time += (t_end - t_begin);

    // communication
    t_begin = t_end;
    MPI_Allgatherv();
    mytime( t_end );
    comm_time += (t_end - t_begin);
}


Using MPI_Iallgatherv:

for( i=0; i<8; i++ )
{
    // computation
    mytime( t_begin );
    computation;
    mytime( t_end );
    comp_time += (t_end - t_begin);

    // communication
    t_begin = t_end;
    MPI_Iallgatherv();
    mytime( t_end );
    comm_time += (t_end - t_begin);
}

// wait for the non-blocking allgathers to complete
mytime( t_begin );
for( i=0; i<8; i++ )
    MPI_Wait();
mytime( t_end );
wait_time = t_end - t_begin;

==

The results for Allgatherv are:
[cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2 --host gnode102,gnode103 ./Allgatherv 128 2 | grep time
Computation time  : 8481279 us
Communication time: 319803 us

The results for Iallgatherv are:
[cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2 --host gnode102,gnode103 ./Iallgatherv 128 2 | grep time
Computation time  : 8479177 us
Communication time: 199046 us
Wait time:  139841 us


So, does this mean that the current Open MPI implementation of MPI_Iallgatherv
doesn't support offloading collective communication to dedicated cores or to
the network interface?

Best regards,
Zehan
/*
 * Attached test program (Iallgatherv version; the blocking version is the
 * same except that it calls MPI_Allgatherv and needs no requests). The
 * archive stripped the header names and truncated the listing, so the
 * includes and everything from the recvcnts setup onward are reconstructed
 * from the pseudo-code above.
 */
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#define NS  8   // number of segments

struct timeval tv;
#define mytime(time)   do{  \
    gettimeofday(&tv,NULL);  \
    time=(unsigned long)tv.tv_sec*1000000+tv.tv_usec;  \
}while(0)

int main(int argc, char** argv)
{
    MPI_Init(&argc,&argv);

    int size;
    int rank;

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if(argc<2) {
        printf("Usage: ./allgather m [n]\n");
        printf("n=1: m is in KB; n=2: m is in MB\n");
        exit(-1);
    }

    int global_size;    // the amount of data to allgather
    int local_size;     // the amount of data that each process holds
    global_size = atoi(argv[1]);
    if(argc >= 3)
    {
        if(atoi(argv[2])==2) global_size = global_size*1024*1024;   // n=2, m MB
        if(atoi(argv[2])==1) global_size = global_size*1024;        // n=1, m KB
    }
    local_size = global_size/size;  // each process holds 1/size of the data

    int *global_buf = (int *) malloc(global_size*sizeof(int));  // recvbuf
    int *local_buf  = (int *) malloc(local_size*sizeof(int));   // sendbuf
    memset(global_buf,0,global_size*sizeof(int));
    memset(local_buf,0,local_size*sizeof(int));

    int i,j;

    /* The local data is split into NS equal segments and gathered one
     * segment per iteration. Each segment gets its own row of displs,
     * since MPI-3 forbids modifying the argument arrays while a
     * non-blocking collective is still in flight. */
    int seg_size  = local_size/NS;
    int *recvcnts = (int *) malloc(size*sizeof(int));       // recvcounts of MPI_Iallgatherv
    int *displs   = (int *) malloc(NS*size*sizeof(int));    // displs, one row per segment
    for(i=0; i<size; i++)
        recvcnts[i] = seg_size;
    for(j=0; j<NS; j++)
        for(i=0; i<size; i++)
            displs[j*size+i] = i*local_size + j*seg_size;

    MPI_Request req[NS];
    unsigned long t_begin, t_end;
    unsigned long comp_time = 0, comm_time = 0, wait_time = 0;

    for(j=0; j<NS; j++)
    {
        // computation (placeholder work on the local data)
        mytime( t_begin );
        for(i=0; i<local_size; i++)
            local_buf[i] += 1;
        mytime( t_end );
        comp_time += (t_end - t_begin);

        // communication: start gathering the j-th segment of every process
        t_begin = t_end;
        MPI_Iallgatherv(local_buf + j*seg_size, seg_size, MPI_INT,
                        global_buf, recvcnts, displs + j*size, MPI_INT,
                        MPI_COMM_WORLD, &req[j]);
        mytime( t_end );
        comm_time += (t_end - t_begin);
    }

    // wait for the non-blocking allgathers to complete
    mytime( t_begin );
    for(j=0; j<NS; j++)
        MPI_Wait(&req[j], MPI_STATUS_IGNORE);
    mytime( t_end );
    wait_time = t_end - t_begin;

    if(rank == 0) {
        printf("Computation time  : %lu us\n", comp_time);
        printf("Communication time: %lu us\n", comm_time);
        printf("Wait time: %lu us\n", wait_time);
    }

    free(global_buf); free(local_buf); free(recvcnts); free(displs);
    MPI_Finalize();
    return 0;
}

Re: [OMPI users] performance of MPI_Iallgatherv

2014-04-06 Thread Zehan Cui
Hi Matthieu,

Thanks for your suggestion. I tried MPI_Waitall(), but the results are
the same. It seems the communication didn't overlap with the computation.
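
For reference, the MPI_Waitall() variant simply replaces the per-request
wait loop at the end of the test with a single call; a minimal sketch,
using the req array from the attached program:

// wait for all outstanding allgathers with one call
mytime( t_begin );
MPI_Waitall( NS, req, MPI_STATUSES_IGNORE );
mytime( t_end );
wait_time = t_end - t_begin;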

Regards,
Zehan

On 4/5/14, Matthieu Brucher  wrote:
> Hi,
>
> Try waiting on all gathers at the same time, not one by one (this is
> what non blocking collectives are made for!)
>
> Cheers,
>
> Matthieu
>
> [...]


-- 
Best Regards
Zehan Cui(崔泽汉)
---
Institute of Computing Technology, Chinese Academy of Sciences.
No.6 Kexueyuan South Road Zhongguancun,Haidian District Beijing,China


Re: [OMPI users] performance of MPI_Iallgatherv

2014-04-08 Thread Zehan Cui
Thanks, it looks like I have to do the overlapping myself.
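
A single-threaded variant of what Matthieu describes below is to poll with
MPI_Test() between chunks of computation, instead of running the computation
in a separate thread; a sketch against the attached test program (the
chunked-computation helpers are hypothetical):

int j, done;
for( j=0; j<NS; j++ )
{
    /* compute in small chunks; between chunks, poll the gather started
       in the previous iteration so the library can make progress */
    done = (j == 0);                       /* nothing in flight yet */
    while( more_computation_left(j) )      /* hypothetical predicate */
    {
        compute_one_chunk(j);              /* hypothetical helper */
        if( !done )
            MPI_Test( &req[j-1], &done, MPI_STATUS_IGNORE );
    }
    MPI_Iallgatherv( local_buf + j*seg_size, seg_size, MPI_INT,
                     global_buf, recvcnts, displs + j*size, MPI_INT,
                     MPI_COMM_WORLD, &req[j] );
}
MPI_Waitall( NS, req, MPI_STATUSES_IGNORE );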


On Tue, Apr 8, 2014 at 5:40 PM, Matthieu Brucher  wrote:

> Yes, usually the MPI libraries don't allow that. You can launch
> another thread for the computation, make calls to MPI_Test during that
> time and join at the end.
>
> Cheers,
>
> [...]


Re: [OMPI users] how to get mpirun to scale from 16 to 64 cores

2014-06-16 Thread Zehan Cui
Hi Yuping,

Maybe using multiple threads inside a socket, and MPI among sockets, is a
better choice for such a NUMA platform.

The threads can exploit the benefit of shared memory, and MPI can
alleviate the cost of non-uniform memory accesses.
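
A minimal sketch of that hybrid layout, assuming OpenMP for the
intra-socket threads (names and output are illustrative):

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* FUNNELED is enough here: only the master thread makes MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        /* intra-socket parallelism over shared memory; no MPI in here */
        printf("rank %d, thread %d\n", rank, omp_get_thread_num());
    }

    MPI_Finalize();
    return 0;
}

Launched with one rank per socket (e.g. with --bysocket --bind-to-socket,
as in the mpiexec line quoted below) and OMP_NUM_THREADS set to the number
of cores per socket, each rank's threads stay on local memory and MPI
carries only the cross-socket traffic.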


regards,
Zehan




On Tue, Jun 17, 2014 at 6:19 AM, Yuping Sun  wrote:

> Dear All:
>
> I bought a 64 core workstation and installed NASA fun3d with open mpi
> 1.6.5. Then I started to test run fun3d using 16, 32, 48 cores. However the
> performance of the fun3d run is bad. I got data below:
>
> the run command is (it is for 32 core as an example)
> mpiexec -np 32 --bysocket --bind-to-socket
> ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi
> --time_timestep_loop --animation_freq -1 > screen.dump_bs30
>
> CPUs   time    iterations   time/it
>   60   678 s      30        22.61 s
>   48   702 s      30        23.40 s
>   32   734 s      30        24.50 s
>   16   894 s      30        29.80 s
>
> You can see using 60 cores, to run 30 iteration, FUN3D will complete in
> 678 seconds, roughly 22.61 second per iteration.
>
> Using 16 cores, to run 30 iteration, FUN3D will complete in 894 seconds,
> roughly 29.8 seconds per iteration.
>
> the data above shows FUN3D run using mpirun does not scale at all! I used
> to run fun3d with mpirun on a 8 core WS, and it scales well.
> The same job to run on a linux cluster scales well.
>
> Would you all give me some advice to improve the performance loss when I
> increase the use of more cores, or how to run mpirun with proper options to
> get a linear scaling when using 16 to 32 to 48 cores?
>
> Thank you.
>
> Yuping


[OMPI users] unknown option "--tree-spawn" with OpenMPI-1.7.1

2013-06-14 Thread Zehan Cui
Hi,

I have just installed Open MPI 1.7.1 and cannot get it running.

Here are the error messages:

[cmy@gLoginNode1 test_nbc]$ mpirun -n 4 -host gnode100 ./hello
[gnode100:31789] Error: unknown option "--tree-spawn"
input in flex scanner failed
[gLoginNode1:14920] [[62542,0],0] ORTE_ERROR_LOG: A message is attempting
to be sent to a process whose contact information is unknown in file
rml_oob_send.c at line 362
[gLoginNode1:14920] [[62542,0],0] attempted to send to [[62542,0],1]: tag 15
[gLoginNode1:14920] [[62542,0],0] ORTE_ERROR_LOG: A message is attempting
to be sent to a process whose contact information is unknown in file
base/grpcomm_base_xcast.c at line 166

I have run it on several nodes, and got the same messages.


- Zehan Cui


Re: [OMPI users] unknown option "--tree-spawn" with OpenMPI-1.7.1

2013-06-14 Thread Zehan Cui
I think the PATH setting is OK. I forgot to mention that it runs well on
the local machine.

The PATH setting on the local machine is

[cmy@gLoginNode1 ~]$ echo $PATH
/home/cmy/clc/benchmarks/nasm-2.09.10:*/home3/cmy/czh/opt/ompi-1.7.1/bin/*
:/home3/cmy/czh/opt/autoconf-2.69/bin/:/home3/cmy/czh/opt/mvapich2-1.9/bin/:/home/cmy/wr/local/ft-mvapich2-1.8a2/bin:/home/cmy/wr/local/mvapich2-1.8a2/bin:/usr/mpi/gcc/mvapich2-1.4.1/bin:/home3/cmy/czh/ompi/bin/:/home/cmy/huangyb/gem5/gcc/gcc-4.3/bin:/home/cmy/huangyb/gem5/swig/bin/:/home/cmy/huangyb/gem5/scons/bin::/home/cmy/huangyb/local/mercurial/bin:/home/cmy/huangyb/local/python-2.7.3/bin/:/home/SOFT/intel/Compiler/11.0/083/bin/intel64:/usr/mpi/gcc/openmpi-1.4.2/bin/:/home/SOFT/intel/Compiler/11.0/083/bin/intel64:/home/cmy/tgm/cmake/bin:/usr/local/mvapich2/bin:/usr/local/mpich-pgi/bin:/opt/pgi/linux86-64/7.0-2/bin:/usr/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/opt/gridviewnew/pbs//dispatcher-sched//bin:/opt/gridviewnew/pbs//dispatcher-sched//sbin:/opt/gridviewnew/pbs//dispatcher//bin:/opt/gridviewnew/pbs//dispatcher//sbin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/cmy/zxx/work_spring_2011/iaca-lin32/bin:/home/cmy/bin:/home/tgm/ljj/software/dmidecode-2.11/:/usr/local/oski_2007/include
[cmy@gLoginNode1 ~]$ echo $LD_LIBRARY_PATH
*/home3/cmy/czh/opt/ompi-1.7.1/lib/*
:/home3/cmy/czh/opt/mvapich2-1.9/lib/:/home/cmy/wr/local/ft-mvapich2-1.8a2/lib:/home/cmy/wr/local/mvapich2-1.8a2/lib:/usr/mpi/gcc/mvapich2-1.4.1/lib:/home3/cmy/czh/ompi/lib/:/home/cmy/huangyb/gem5/gcc/gcc-4.3/lib64:/home/cmy/huangyb/gem5/gcc/gcc-4.3/lib/:/home/cmy/huangyb/local/python-2.7.3/lib/:/usr/local/lib64:/usr/local/lib:/home/cmy/clc/DRAMSim2:/home/SOFT/intel/Compiler/11.0/083/lib/intel64:/home/cmy/zxx/oski-icc/lib/oski:/usr/mpi/gcc/openmpi-1.4.2/lib/:/usr/lib/python2.4/config:/home/SOFT/intel/Compiler/11.0/083/mkl/lib/em64t:/home/cmy/tgm/hpx/build/linux/lib:/home/cmy/yanjie/boost/lib:/usr/local/mvapich2/lib:/home/cmy/yanjie/qthread/lib:/opt/gridviewnew/pbs//dispatcher//lib::/usr/local/lib64:/usr/local/lib:/home/cmy/zxx/work_spring_2011/iaca-lin32/lib


The PATH setting on gnode100 is the same:

[cmy@gnode100 ~]$
[cmy@gnode100 ~]$ echo $PATH
/home/cmy/clc/benchmarks/nasm-2.09.10:*/home3/cmy/czh/opt/ompi-1.7.1/bin/*
:/home3/cmy/czh/opt/autoconf-2.69/bin/:/home3/cmy/czh/opt/mvapich2-1.9/bin/:/home/cmy/wr/local/ft-mvapich2-1.8a2/bin:/home/cmy/wr/local/mvapich2-1.8a2/bin:/usr/mpi/gcc/mvapich2-1.4.1/bin:/home3/cmy/czh/ompi/bin/:/home/cmy/huangyb/gem5/gcc/gcc-4.3/bin:/home/cmy/huangyb/gem5/swig/bin/:/home/cmy/huangyb/gem5/scons/bin::/home/cmy/huangyb/local/mercurial/bin:/home/cmy/huangyb/local/python-2.7.3/bin/:/home/SOFT/intel/Compiler/11.0/083/bin/intel64:/usr/mpi/gcc/openmpi-1.4.2/bin/:/home/SOFT/intel/Compiler/11.0/083/bin/intel64:/home/cmy/tgm/cmake/bin:/usr/local/mvapich2/bin:/usr/local/mpich-pgi/bin:/opt/pgi/linux86-64/7.0-2/bin:/usr/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/opt/gridviewnew/pbs//dispatcher-sched//bin:/opt/gridviewnew/pbs//dispatcher-sched//sbin:/opt/gridviewnew/pbs//dispatcher//bin:/opt/gridviewnew/pbs//dispatcher//sbin:/usr/local/bin:/bin:/usr/bin:/home/cmy/zxx/work_spring_2011/iaca-lin32/bin:/home/cmy/bin:/home/tgm/ljj/software/dmidecode-2.11/:/usr/local/oski_2007/include
[cmy@gnode100 ~]$
[cmy@gnode100 ~]$ echo $LD_LIBRARY_PATH
*/home3/cmy/czh/opt/ompi-1.7.1/lib/*
:/home3/cmy/czh/opt/mvapich2-1.9/lib/:/home/cmy/wr/local/ft-mvapich2-1.8a2/lib:/home/cmy/wr/local/mvapich2-1.8a2/lib:/usr/mpi/gcc/mvapich2-1.4.1/lib:/home3/cmy/czh/ompi/lib/:/home/cmy/huangyb/gem5/gcc/gcc-4.3/lib64:/home/cmy/huangyb/gem5/gcc/gcc-4.3/lib/:/home/cmy/huangyb/local/python-2.7.3/lib/:/usr/local/lib64:/usr/local/lib:/home/cmy/clc/DRAMSim2:/home/SOFT/intel/Compiler/11.0/083/lib/intel64:/home/cmy/zxx/oski-icc/lib/oski:/usr/mpi/gcc/openmpi-1.4.2/lib/:/usr/lib/python2.4/config:/home/SOFT/intel/Compiler/11.0/083/mkl/lib/em64t:/home/cmy/tgm/hpx/build/linux/lib:/home/cmy/yanjie/boost/lib:/usr/local/mvapich2/lib:/home/cmy/yanjie/qthread/lib:/opt/gridviewnew/pbs//dispatcher//lib::/usr/local/lib64:/usr/local/lib:/home/cmy/zxx/work_spring_2011/iaca-lin32/lib
[cmy@gnode100 ~]$

Best Regards
Zehan Cui(崔泽汉)
---
Institute of Computing Technology, Chinese Academy of Sciences.
No.6 Kexueyuan South Road Zhongguancun,Haidian District Beijing,China



On Fri, Jun 14, 2013 at 9:32 PM, Ralph Castain  wrote:

> You aren't setting the path correctly on your backend machines, and so
> they are picking up an older version of OMPI.
>
> On Jun 14, 2013, at 2:08 AM, Zehan Cui  wrote:
>
> > [...]

Re: [OMPI users] unknown option "--tree-spawn" with OpenMPI-1.7.1

2013-06-14 Thread Zehan Cui
Thanks.

That's exactly the problem. When I add the --prefix option to the mpirun
command, everything works fine.
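
For reference, the working invocation is of the form (install path as shown
in the earlier messages):

mpirun --prefix /home3/cmy/czh/opt/ompi-1.7.1 -n 4 -host gnode100 ./hello

With --prefix, mpirun sets PATH and LD_LIBRARY_PATH for the daemons it
launches on the remote nodes, so the non-interactive shell on gnode100 no
longer picks up the older Open MPI installation.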


- Zehan Cui



On Fri, Jun 14, 2013 at 10:25 PM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> Check the PATH you get when you run non-interactively on the remote
> machine:
>
> ssh gnode100 env | grep PATH
>
>
> On Jun 14, 2013, at 10:09 AM, Zehan Cui  wrote:
>
> > [...]

[OMPI users] MPI_Iallgatherv performance

2013-06-14 Thread Zehan Cui
Hi,

Open MPI 1.7.1 is announced to support MPI-3 functionality such as
non-blocking collectives.

I have tested MPI_Iallgatherv on an 8-node cluster; however, I got bad
performance. MPI_Iallgatherv blocks the program for even longer than the
traditional MPI_Allgatherv.

Following is the test pseudo-code and the results.

===

Using MPI_Allgatherv:

for( i=0; i<8; i++ )
{
    // computation
    mytime( t_begin );
    computation;
    mytime( t_end );
    comp_time += (t_end - t_begin);

    // communication
    t_begin = t_end;
    MPI_Allgatherv();
    mytime( t_end );
    comm_time += (t_end - t_begin);
}

result:
comp_time = 811,630 us
comm_time = 342,284 us



Using MPI_Iallgatherv:

for( i=0; i<8; i++ )
{
    // computation
    mytime( t_begin );
    computation;
    mytime( t_end );
    comp_time += (t_end - t_begin);

    // communication
    t_begin = t_end;
    MPI_Iallgatherv();
    mytime( t_end );
    comm_time += (t_end - t_begin);
}

// wait for the non-blocking allgathers to complete
mytime( t_begin );
for( i=0; i<8; i++ )
    MPI_Wait();
mytime( t_end );
wait_time = t_end - t_begin;

result:
comp_time = 817,397 us
comm_time = 1,183,511 us
wait_time = 1,294,330 us

==

From the results, we can tell that MPI_Iallgatherv blocks the program for
1,183,511 us, much longer than the 342,284 us of MPI_Allgatherv. Even
worse, it still takes 1,294,330 us to wait for the non-blocking
MPI_Iallgatherv to finish.


- Zehan Cui