Re: [OMPI users] Low performance of Open MPI-1.3 over Gigabit

2009-03-04 Thread Sangamesh B
Hi all,

Now LAM-MPI is also installed, and the Fortran application has been
tested by running it with LAM-MPI.

But LAM-MPI performs even worse than Open MPI:

No of nodes:3 cores per node:8  total core: 3*8=24

   CPU TIME :1 HOURS 51 MINUTES 23.49 SECONDS
   ELAPSED TIME :7 HOURS 28 MINUTES  2.23 SECONDS

No of nodes:6  cores used per node:4  total core: 6*4=24

   CPU TIME :0 HOURS 51 MINUTES 50.41 SECONDS
   ELAPSED TIME :6 HOURS  6 MINUTES 38.67 SECONDS

Any help/suggestions to diagnose this problem would be appreciated.

Thanks,
Sangamesh

On Wed, Feb 25, 2009 at 12:51 PM, Sangamesh B  wrote:
> Dear All,
>
>    A Fortran application is installed with Open MPI-1.3 + Intel
> compilers on a Rocks-4.3 cluster with dual-socket quad-core Intel Xeon
> processors @ 3 GHz (8 cores/node).
>
>    The times consumed for different tests over Gigabit-connected
> nodes are as follows (each node has 8 GB memory):
>
> No of Nodes used:6  No of cores used/node:4 total mpi processes:24
>       CPU TIME :    1 HOURS 19 MINUTES 14.39 SECONDS
>   ELAPSED TIME :    2 HOURS 41 MINUTES  8.55 SECONDS
>
> No of Nodes used:6  No of cores used/node:8 total mpi processes:48
>       CPU TIME :    4 HOURS 19 MINUTES 19.29 SECONDS
>   ELAPSED TIME :    9 HOURS 15 MINUTES 46.39 SECONDS
>
> No of Nodes used:3  No of cores used/node:8 total mpi processes:24
>       CPU TIME :    2 HOURS 41 MINUTES 27.98 SECONDS
>   ELAPSED TIME :    4 HOURS 21 MINUTES  0.24 SECONDS
>
> But the same application performs well on another Linux cluster with
> LAM-MPI-7.1.3
>
> No of Nodes used:6  No of cores used/node:4 total mpi processes:24
> CPU TIME :    1hours:30min:37.25s
> ELAPSED TIME  1hours:51min:10.00S
>
> No of Nodes used:12  No of cores used/node:4 total mpi processes:48
> CPU TIME :    0hours:46min:13.98s
> ELAPSED TIME  1hours:02min:26.11s
>
> No of Nodes used:6  No of cores used/node:8 total mpi processes:48
> CPU TIME :     1hours:13min:09.17s
> ELAPSED TIME  1hours:47min:14.04s
>
> So there is a huge difference between CPU TIME & ELAPSED TIME for Open MPI 
> jobs.
>
> Note: On the same cluster Open MPI gives better performance for
> InfiniBand nodes.
>
> What could be the problem for Open MPI over Gigabit?
> Any flags need to be used?
> Or is it not that good to use Open MPI on Gigabit?
>
> Thanks,
> Sangamesh
>



[OMPI users] metahosts (like in MP-MPICH)

2009-03-04 Thread Yury Tarasievich
I can't find this in the FAQ... Can I create a metahost in Open MPI (a la
MP-MPICH) to execute an MPI application simultaneously on several
physically different machines connected by TCP/IP?

--


Re: [OMPI users] metahosts (like in MP-MPICH)

2009-03-04 Thread Jeff Squyres

I'm not quite sure what an MP-MPICH meta host is.

Open MPI allows you to specify multiple hosts in a hostfile and run a  
single MPI job across all of them, assuming they're connected by at  
least some common TCP network.
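
For example, a minimal sketch (host names, slot counts, and the
application name are placeholders):

   # contents of a file named "myhosts"
   node1 slots=4
   node2 slots=4

   mpirun --hostfile myhosts -np 8 ./my_mpi_app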



On Mar 4, 2009, at 4:42 AM, Yury Tarasievich wrote:


Can't find this in FAQ... Can I create the metahost in OpenMPI (a la
MP-MPICH), to execute the MPI application simultaneously on several
physically different machines connected by TCP/IP?

--



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Jeff Squyres

Hmm; that's odd.

Is icc / icpc able to find libnuma with no -L, but ifort is unable to  
find it without a -L?


On Mar 3, 2009, at 10:00 PM, Terry Frankcombe wrote:

Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
merrily went off to compile my application.

In my final link with mpif90 I get the error:

/usr/bin/ld: cannot find -lnuma

Adding --showme reveals that

-I/home/terry/bin/Local/include -pthread -I/home/terry/bin/Local/lib

is added to the compile early in the aggregated ifort command, and

-L/home/terry/bin/Local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
-lopen-pal -lpbs -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

is added to the end.

I note that when compiling Open MPI -lnuma was visible in the gcc
arguments, with no added -L.

On this system libnuma.so exists in /usr/lib64.  My (somewhat long!)
configure command was

./configure --enable-static --disable-shared
--prefix=/home/terry/bin/Local --enable-picky --disable-heterogeneous
--without-slurm --without-alps --without-xgrid --without-sge
--without-loadleveler --without-lsf F77=ifort


Should mpif90 have bundled a -L/usr/lib64 in there somewhere?

Regards
Terry


--
Dr. Terry Frankcombe
Research School of Chemistry, Australian National University
Ph: (+61) 0417 163 509    Skype: terry.frankcombe




--
Jeff Squyres
Cisco Systems



Re: [OMPI users] MPI-IO Inconsistency over Lustre using OMPI 1.3

2009-03-04 Thread Jeff Squyres
Unfortunately, we don't have a whole lot of insight into how the  
internals of the IO support work -- we mainly bundle the ROMIO package  
from MPICH2 into Open MPI.  Our latest integration was the ROMIO from  
MPICH2 v1.0.7.


Do you see the same behavior if you run your application under MPICH2  
compiled with Lustre ROMIO support?
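
If it helps, building MPICH2 with Lustre support in ROMIO looks roughly
like this (a sketch only -- check the ROMIO README that ships with your
MPICH2 for the exact file-system list, and the prefix is a placeholder):

   ./configure --prefix=/opt/mpich2 --with-file-system=lustre+ufs+nfs
   make && make install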



On Mar 3, 2009, at 12:51 PM, Nathan Baca wrote:


Hello,

I am seeing inconsistent mpi-io behavior when writing to a Lustre  
file system using open mpi 1.3 with romio. What follows is a simple  
reproducer and output. Essentially one or more of the running  
processes does not read or write the correct amount of data to its  
part of a file residing on a Lustre (parallel) file system.


Any help figuring out what is happening is greatly appreciated.  
Thanks, Nate


program gcrm_test_io
  implicit none
  include "mpif.h"

  integer X_SIZE

  integer w_me, w_nprocs
  integer my_info

  integer i
  integer (kind=4) :: ierr
  integer (kind=4) :: fileID

  integer (kind=MPI_OFFSET_KIND) :: mylen
  integer (kind=MPI_OFFSET_KIND) :: offset
  integer status(MPI_STATUS_SIZE)
  integer count
  integer ncells
  real (kind=4), allocatable, dimension (:) :: array2
  logical sync

  call mpi_init(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, w_nprocs, ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, w_me, ierr)

  call mpi_info_create(my_info, ierr)
! optional ways to set things in mpi-io
! call mpi_info_set (my_info, "romio_ds_read" , "enable", ierr)
! call mpi_info_set (my_info, "romio_ds_write", "enable", ierr)
! call mpi_info_set (my_info, "romio_cb_write", "enable", ierr)

  x_size = 410011   ! A 'big' number; with bigger numbers it is more likely to fail

  sync = .true.     ! Extra file synchronization

  ncells = (X_SIZE * w_nprocs)

!  Use node zero to fill it with nines
  if (w_me .eq. 0) then
      call MPI_FILE_OPEN (MPI_COMM_SELF, "output.dat", &
           MPI_MODE_CREATE+MPI_MODE_WRONLY, my_info, fileID, ierr)

      allocate (array2(ncells))
      array2(:) = 9.0
      mylen = ncells
      offset = 0 * 4
      call MPI_FILE_SET_VIEW(fileID, offset, MPI_REAL, MPI_REAL, &
           "native", MPI_INFO_NULL, ierr)
      call MPI_File_write(fileID, array2, mylen, MPI_REAL, status, ierr)

      call MPI_Get_count(status, MPI_INTEGER, count, ierr)
      if (count .ne. mylen) print*, "Wrong initial write count:", count, mylen

      deallocate(array2)
      if (sync) call MPI_FILE_SYNC (fileID, ierr)
      call MPI_FILE_CLOSE (fileID, ierr)
  endif

!  All nodes now fill their area with ones
  call MPI_BARRIER(MPI_COMM_WORLD, ierr)
  allocate (array2(X_SIZE))
  array2(:) = 1.0
  offset = (w_me * X_SIZE) * 4   ! multiply by four, since it is real*4

  mylen = X_SIZE
  call MPI_FILE_OPEN (MPI_COMM_WORLD, "output.dat", MPI_MODE_WRONLY, &
       my_info, fileID, ierr)
  print*, "node", w_me, "starting", (offset/4) + 1, "ending", (offset/4) + mylen
  call MPI_FILE_SET_VIEW(fileID, offset, MPI_REAL, MPI_REAL, &
       "native", MPI_INFO_NULL, ierr)
  call MPI_File_write(fileID, array2, mylen, MPI_REAL, status, ierr)

  call MPI_Get_count(status, MPI_INTEGER, count, ierr)
  if (count .ne. mylen) print*, "Wrong write count:", count, mylen, w_me

  deallocate(array2)
  if (sync) call MPI_FILE_SYNC (fileID, ierr)
  call MPI_FILE_CLOSE (fileID, ierr)

!  Read it back on node zero to see if it is ok data
  if (w_me .eq. 0) then
      call MPI_FILE_OPEN (MPI_COMM_SELF, "output.dat", MPI_MODE_RDONLY, &
           my_info, fileID, ierr)

      mylen = ncells
      allocate (array2(ncells))
      call MPI_File_read(fileID, array2, mylen, MPI_REAL, status, ierr)

      call MPI_Get_count(status, MPI_INTEGER, count, ierr)
      if (count .ne. mylen) print*, "Wrong read count:", count, mylen

      do i = 1, ncells
         if (array2(i) .ne. 1) then
             print*, "ERROR", i, array2(i), ((i-1)*4), ((i-1)*4)/(1024d0*1024d0) ! Index, value, # of good bytes, MB
             goto 999
         end if
      end do
      print*, "All done with nothing wrong"
 999  deallocate(array2)
      call MPI_FILE_CLOSE (fileID, ierr)
      call MPI_file_delete ("output.dat", MPI_INFO_NULL, ierr)
  endif

  call mpi_finalize(ierr)

end program gcrm_test_io
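
For reference, a minimal way to build and run the reproducer (the file
name here is arbitrary, and this assumes the Open MPI wrapper compilers
are in your PATH):

   mpif90 gcrm_test_io.f90 -o gcrm_test_io
   mpirun -np 6 ./gcrm_test_io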

1.3 Open MPI
 node 0 starting       1 ending  410011
 node 1 starting  410012 ending  820022
 node 2 starting  820023 ending 1230033
 node 3 starting 1230034 ending 1640044
 node 4 starting 1640045 ending 2050055
 node 5 starting 2050056 ending 2460066

 All done with nothi

Re: [OMPI users] Calculation stuck in MPI

2009-03-04 Thread Jeff Squyres
No, it is not obvious, unfortunately.  Can you send all the  
information listed here:


http://www.open-mpi.org/community/help/


On Mar 3, 2009, at 5:22 AM, Ondrej Marsalek wrote:


Dear everyone,

I have a calculation (the CP2K program) using MPI over Infiniband and
it is stuck. All processes (16 on 4 nodes) are running, taking 100%
CPU. Attaching a debugger reveals this (only the end of the stack
shown here):

(gdb) backtrace
#0  0x2b3460916dbf in btl_openib_component_progress () from
/home/marsalek/opt/openmpi-1.3-intel/lib/openmpi/mca_btl_openib.so
#1  0x2b345c22c778 in opal_progress () from
/home/marsalek/opt/openmpi-1.3-intel/lib/libopen-pal.so.0
#2  0x2b345bd2d66d in ompi_request_default_wait_any () from
/home/marsalek/opt/openmpi-1.3-intel/lib/libmpi.so.0
#3  0x2b345bd6021a in PMPI_Waitany () from
/home/marsalek/opt/openmpi-1.3-intel/lib/libmpi.so.0
#4  0x2b345bae77f1 in pmpi_waitany__ () from
/home/marsalek/opt/openmpi-1.3-intel/lib/libmpi_f77.so.0

It has survived a restart of the IB switch, unlike "healthy" runs. My
question is: is it obvious at what level the problem is -- IB, Open
MPI, or the application? I would be glad to provide detailed information
if anyone is willing to help. I want to work on this, but unfortunately
I am not sure where to begin.

Best regards,
Ondrej Marsalek



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Lahey 64 bit and openmpi 1.3?

2009-03-04 Thread Jeff Squyres

On Mar 2, 2009, at 10:17 AM, Tiago Silva wrote:

Has anyone had success building openmpi with the 64 bit Lahey  
fortran compiler? I have seen a previous thread about the problems  
with 1.2.6 and am wondering if any progress has been made.


I can build individual libraries by removing -rpath and -soname, and
by compiling the respective objects with -KPIC. Nevertheless, I
couldn't come up with FCFLAGS and LDFLAGS that would both pass the
makefile tests and build successfully.


Unfortunately, I don't think any of us test with the Lahey compiler.   
So it's quite possible that there may be some issues there.


Do you know if GNU Libtool supports the Lahey compiler?  We basically  
support what Libtool supports because Libtool essentially *is* our  
building process.  So if Libtool doesn't support it, then we likely  
don't either.


How do I find the libtool generated script, as suggested in the  
previous thread?


I'm not sure which specific script you're referring to.  The "libtool"
script itself should be generated after you run "configure" -- it should
be in the top-level Open MPI directory.



openmpi 1.3
Lahey Linux64 8.10a
CentOS 5.2
Rocks 5.1
libtool 1.5.22


FWIW, the version of Libtool that you have installed on your system is  
likely not too important here.  Open MPI tarballs come bootstrapped  
with the Libtool that we used to build the tarball -- *that* included  
Libtool is used to build Open MPI, not the one installed on your  
system.  We use Libtool 2.2.6a to build Open MPI v1.3.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Prentice Bisbal
Terry Frankcombe wrote:
> Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
> merrily went off to compile my application.
> 
> In my final link with mpif90 I get the error:
> 
> /usr/bin/ld: cannot find -lnuma
> 
> Adding --showme reveals that
> 
> -I/home/terry/bin/Local/include -pthread -I/home/terry/bin/Local/lib
> 
> is added to the compile early in the aggregated ifort command, and 
> 
> -L/home/terry/bin/Local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
> -lopen-pal -lpbs -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
> 
> is added to the end.
> 
> I note that when compiling Open MPI -lnuma was visible in the gcc
> arguments, with no added -L.
> 
> On this system libnuma.so exists in /usr/lib64.  My (somewhat long!)
> configure command was
> 
> ./configure --enable-static --disable-shared
> --prefix=/home/terry/bin/Local --enable-picky --disable-heterogeneous
> --without-slurm --without-alps --without-xgrid --without-sge
> --without-loadleveler --without-lsf F77=ifort
> 
> 
> Should mpif90 have bundled a -L/usr/lib64 in there somewhere?
> 
> Regards
> Terry
> 
> 

I had exactly the same problem with my PGI compilers (no problems
reported yet with my Intel compilers). I have a fix for you.

You would think that the compiler would automatically look in
/usr/lib64, since that's one of the system's default lib directories,
but the PGI compilers don't for some reason.

A quick fix is to do

OMPI_LDFLAGS="-L/usr/lib64"

or

OMPI_MPIF90_LDFLAGS="-L/usr/lib64"
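
For example, with a Bourne-style shell you can export the variable and
then confirm that the flag shows up in the wrapper's link line (a
sketch; the variable name is the one mentioned above):

   export OMPI_MPIF90_LDFLAGS="-L/usr/lib64"
   mpif90 --showme:link   # -L/usr/lib64 should now appear in the output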

A more permanent fix is to edit
INSTALL_DIR/share/openmpi/mpif90-wrapper-data.txt

and change

linker_flags=

to

linker_flags=-L/usr/lib64

In my case, I also had to add the OpenMPI lib directory for the PGI
compilers. You may or may not need to add it:

linker_flags=-L/usr/lib64 -L/usr/local/openmpi/pgi/x86_64/lib

You may want to test all your compilers and make a similar change to all
of your *wrapper-data.txt files.

Not sure if this is a problem with the compilers not picking up the
system's lib dirs, or an OpenMPI configuration/build problem.

-- 
Prentice


Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Prentice Bisbal
Jeff,

See my reply to Dr. Frankcombe's original e-mail. I've experienced this
same problem with the PGI compilers, so this isn't limited to just the
Intel compilers. I provided a fix, but I think OpenMPI should be able to
figure out and add the correct linker flags during the
configuration/build stage.

Jeff Squyres wrote:
> Hmm; that's odd.
> 
> Is icc / icpc able to find libnuma with no -L, but ifort is unable to
> find it without a -L?
> 
> On Mar 3, 2009, at 10:00 PM, Terry Frankcombe wrote:
> 
>> Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
>> merrily went off to compile my application.
>>
>> In my final link with mpif90 I get the error:
>>
>> /usr/bin/ld: cannot find -lnuma
>>
>> Adding --showme reveals that
>>
>> -I/home/terry/bin/Local/include -pthread -I/home/terry/bin/Local/lib
>>
>> is added to the compile early in the aggregated ifort command, and
>>
>> -L/home/terry/bin/Local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
>> -lopen-pal -lpbs -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
>>
>> is added to the end.
>>
>> I note that when compiling Open MPI -lnuma was visible in the gcc
>> arguments, with no added -L.
>>
>> On this system libnuma.so exists in /usr/lib64.  My (somewhat long!)
>> configure command was
>>
>> ./configure --enable-static --disable-shared
>> --prefix=/home/terry/bin/Local --enable-picky --disable-heterogeneous
>> --without-slurm --without-alps --without-xgrid --without-sge
>> --without-loadleveler --without-lsf F77=ifort
>>
>>
>> Should mpif90 have bundled a -L/usr/lib64 in there somewhere?
>>
>> Regards
>> Terry
>>
>>
>> -- 
>> Dr. Terry Frankcombe
>> Research School of Chemistry, Australian National University
>> Ph: (+61) 0417 163 509    Skype: terry.frankcombe
> 


-- 
Prentice


Re: [OMPI users] Low performance of Open MPI-1.3 over Gigabit

2009-03-04 Thread Mattijs Janssens
Your Intel processors are, I assume, not the new Nehalem/i7 ones? The older
quad-core ones are seriously memory-bandwidth limited when running a
memory-intensive application. That might explain why using all 8 cores per
node slows down your calculation.

Why do you get such a large difference between CPU time and elapsed time? Is
your code doing any file I/O, or maybe waiting for one of the processors? Do
you use non-blocking communication wherever possible?

Regards,

Mattijs

On Wednesday 04 March 2009 05:46, Sangamesh B wrote:
> Hi all,
>
> Now LAM-MPI is also installed and tested the fortran application by
> running with LAM-MPI.
>
> But LAM-MPI is performing still worse than Open MPI
>
> No of nodes:3 cores per node:8  total core: 3*8=24
>
>CPU TIME :1 HOURS 51 MINUTES 23.49 SECONDS
>ELAPSED TIME :7 HOURS 28 MINUTES  2.23 SECONDS
>
> No of nodes:6  cores used per node:4  total core: 6*4=24
>
>CPU TIME :0 HOURS 51 MINUTES 50.41 SECONDS
>ELAPSED TIME :6 HOURS  6 MINUTES 38.67 SECONDS
>
> Any help/suggestions to diagnose this problem.
>
> Thanks,
> Sangamesh
>
> On Wed, Feb 25, 2009 at 12:51 PM, Sangamesh B  wrote:
> > Dear All,
> >
> >    A fortran application is installed with Open MPI-1.3 + Intel
> > compilers on a Rocks-4.3 cluster with Intel Xeon Dual socket Quad core
> > processor @ 3GHz (8cores/node).
> >
> >    The time consumed for different tests over a Gigabit connected
> > nodes are as follows: (Each node has 8 GB memory).
> >
> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
> >       CPU TIME :    1 HOURS 19 MINUTES 14.39 SECONDS
> >   ELAPSED TIME :    2 HOURS 41 MINUTES  8.55 SECONDS
> >
> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
> >       CPU TIME :    4 HOURS 19 MINUTES 19.29 SECONDS
> >   ELAPSED TIME :    9 HOURS 15 MINUTES 46.39 SECONDS
> >
> > No of Nodes used:3  No of cores used/node:8 total mpi processes:24
> >       CPU TIME :    2 HOURS 41 MINUTES 27.98 SECONDS
> >   ELAPSED TIME :    4 HOURS 21 MINUTES  0.24 SECONDS
> >
> > But the same application performs well on another Linux cluster with
> > LAM-MPI-7.1.3
> >
> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
> > CPU TIME :    1hours:30min:37.25s
> > ELAPSED TIME  1hours:51min:10.00S
> >
> > No of Nodes used:12  No of cores used/node:4 total mpi processes:48
> > CPU TIME :    0hours:46min:13.98s
> > ELAPSED TIME  1hours:02min:26.11s
> >
> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
> > CPU TIME :     1hours:13min:09.17s
> > ELAPSED TIME  1hours:47min:14.04s
> >
> > So there is a huge difference between CPU TIME & ELAPSED TIME for Open
> > MPI jobs.
> >
> > Note: On the same cluster Open MPI gives better performance for
> > InfiniBand nodes.
> >
> > What could be the problem for Open MPI over Gigabit?
> > Any flags need to be used?
> > Or is it not that good to use Open MPI on Gigabit?
> >
> > Thanks,
> > Sangamesh
>

-- 

Mattijs Janssens

OpenCFD Ltd.
9 Albert Road,
Caversham,
Reading RG4 7AN.
Tel: +44 (0)118 9471030
Email: m.janss...@opencfd.co.uk
URL: http://www.OpenCFD.co.uk



Re: [OMPI users] Low performance of Open MPI-1.3 over Gigabit

2009-03-04 Thread Ralph H. Castain
It would also help to have some idea of how you installed and ran this -
e.g., did you set mpi_paffinity_alone so that the processes would bind to
their processors? That could explain the CPU vs. elapsed time difference,
since binding helps keep the processes from being swapped out as much.
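
For example, binding can be enabled for a single run with something like
this (the executable name is a placeholder):

   mpirun --mca mpi_paffinity_alone 1 -np 24 ./your_app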

Ralph


> Your Intel processors are I assume not the new Nehalem/I7 ones? The older
> quad-core ones are seriously memory bandwidth limited when running a
> memory
> intensive application. That might explain why using all 8 cores per node
> slows down your calculation.
>
> Why do you get such a difference between cpu time and elapsed time? Is
> your
> code doing any file IO or maybe waiting for one of the processors? Do you
> use
> non-blocking communication wherever possible?
>
> Regards,
>
> Mattijs
>
> On Wednesday 04 March 2009 05:46, Sangamesh B wrote:
>> Hi all,
>>
>> Now LAM-MPI is also installed and tested the fortran application by
>> running with LAM-MPI.
>>
>> But LAM-MPI is performing still worse than Open MPI
>>
>> No of nodes:3 cores per node:8  total core: 3*8=24
>>
>>CPU TIME :1 HOURS 51 MINUTES 23.49 SECONDS
>>ELAPSED TIME :7 HOURS 28 MINUTES  2.23 SECONDS
>>
>> No of nodes:6  cores used per node:4  total core: 6*4=24
>>
>>CPU TIME :0 HOURS 51 MINUTES 50.41 SECONDS
>>ELAPSED TIME :6 HOURS  6 MINUTES 38.67 SECONDS
>>
>> Any help/suggestions to diagnose this problem.
>>
>> Thanks,
>> Sangamesh
>>
>> On Wed, Feb 25, 2009 at 12:51 PM, Sangamesh B 
>> wrote:
>> > Dear All,
>> >
>> >    A fortran application is installed with Open MPI-1.3 + Intel
>> > compilers on a Rocks-4.3 cluster with Intel Xeon Dual socket Quad core
>> > processor @ 3GHz (8cores/node).
>> >
>> >    The time consumed for different tests over a Gigabit connected
>> > nodes are as follows: (Each node has 8 GB memory).
>> >
>> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
>> >       CPU TIME :    1 HOURS 19 MINUTES 14.39 SECONDS
>> >   ELAPSED TIME :    2 HOURS 41 MINUTES  8.55 SECONDS
>> >
>> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
>> >       CPU TIME :    4 HOURS 19 MINUTES 19.29 SECONDS
>> >   ELAPSED TIME :    9 HOURS 15 MINUTES 46.39 SECONDS
>> >
>> > No of Nodes used:3  No of cores used/node:8 total mpi processes:24
>> >       CPU TIME :    2 HOURS 41 MINUTES 27.98 SECONDS
>> >   ELAPSED TIME :    4 HOURS 21 MINUTES  0.24 SECONDS
>> >
>> > But the same application performs well on another Linux cluster with
>> > LAM-MPI-7.1.3
>> >
>> > No of Nodes used:6  No of cores used/node:4 total mpi processes:24
>> > CPU TIME :    1hours:30min:37.25s
>> > ELAPSED TIME  1hours:51min:10.00S
>> >
>> > No of Nodes used:12  No of cores used/node:4 total mpi processes:48
>> > CPU TIME :    0hours:46min:13.98s
>> > ELAPSED TIME  1hours:02min:26.11s
>> >
>> > No of Nodes used:6  No of cores used/node:8 total mpi processes:48
>> > CPU TIME :     1hours:13min:09.17s
>> > ELAPSED TIME  1hours:47min:14.04s
>> >
>> > So there is a huge difference between CPU TIME & ELAPSED TIME for Open
>> > MPI jobs.
>> >
>> > Note: On the same cluster Open MPI gives better performance for
>> > InfiniBand nodes.
>> >
>> > What could be the problem for Open MPI over Gigabit?
>> > Any flags need to be used?
>> > Or is it not that good to use Open MPI on Gigabit?
>> >
>> > Thanks,
>> > Sangamesh
>>
>
> --
>
> Mattijs Janssens
>
> OpenCFD Ltd.
> 9 Albert Road,
> Caversham,
> Reading RG4 7AN.
> Tel: +44 (0)118 9471030
> Email: m.janss...@opencfd.co.uk
> URL: http://www.OpenCFD.co.uk
>
>



Re: [OMPI users] metahosts (like in MP-MPICH)

2009-03-04 Thread Yury Tarasievich

Jeff Squyres wrote:

I'm not quite sure what an MP-MPICH meta host is.

Open MPI allows you to specify multiple hosts in a hostfile and run a 
single MPI job across all of them, assuming they're connected by at 
least some common TCP network.


What I need is to run one MPI job, distributed across several actual
machines connected by TCP/IP (so, a kind of cluster computation). The
machines may have heterogeneous OSes on them (MP-MPICH accounts for that
with its HETERO option).

I'm somewhat new to MPI. It's possible that what I describe is an
inherent capability of MPI implementations. Please advise.


--



Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Ralph Castain
The problem is that some systems install both 32- and 64-bit support, and
build OMPI both ways. So we really can't just figure it out without
some help.


At our location, we simply take care to specify the -L flag to point  
to the correct version so we avoid any confusion.
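
For example, at application link time something like this works (a
sketch; the source file name is a placeholder and the path depends on
your system):

   mpif90 -L/usr/lib64 my_app.f90 -o my_app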



On Mar 4, 2009, at 8:37 AM, Prentice Bisbal wrote:


Jeff,

See my reply to Dr. Frankcombe's original e-mail. I've experienced this
same problem with the PGI compilers, so this isn't limited to just the
Intel compilers. I provided a fix, but I think OpenMPI should be able to
figure out and add the correct linker flags during the
configuration/build stage.

Jeff Squyres wrote:

Hmm; that's odd.

Is icc / icpc able to find libnuma with no -L, but ifort is unable to
find it without a -L?

On Mar 3, 2009, at 10:00 PM, Terry Frankcombe wrote:

Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
merrily went off to compile my application.

In my final link with mpif90 I get the error:

/usr/bin/ld: cannot find -lnuma

Adding --showme reveals that

-I/home/terry/bin/Local/include -pthread -I/home/terry/bin/Local/lib

is added to the compile early in the aggregated ifort command, and

-L/home/terry/bin/Local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
-lopen-pal -lpbs -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

is added to the end.

I note that when compiling Open MPI -lnuma was visible in the gcc
arguments, with no added -L.

On this system libnuma.so exists in /usr/lib64.  My (somewhat long!)
configure command was

./configure --enable-static --disable-shared
--prefix=/home/terry/bin/Local --enable-picky --disable-heterogeneous
--without-slurm --without-alps --without-xgrid --without-sge
--without-loadleveler --without-lsf F77=ifort


Should mpif90 have bundled a -L/usr/lib64 in there somewhere?

Regards
Terry


--
Dr. Terry Frankcombe
Research School of Chemistry, Australian National University
Ph: (+61) 0417 163 509    Skype: terry.frankcombe





--
Prentice




Re: [OMPI users] metahosts (like in MP-MPICH)

2009-03-04 Thread Jeff Squyres

On Mar 4, 2009, at 11:38 AM, Yury Tarasievich wrote:


I'm not quite sure what an MP-MPICH meta host is.
Open MPI allows you to specify multiple hosts in a hostfile and run  
a single MPI job across all of them, assuming they're connected by  
at least some common TCP network.


What I need is one MPI job put for distributed computation on  
several actual machines, connected by TCP/IP (so, kind of cluster  
computation). Machines may have heterogenous OSes on them (MP-MPICH  
accounts for that with its HETERO option).


I'm somewhat new to MPI. It's possible, that what I describe is an  
inherent option of MPI implementations. Please advise.



Yes, pretty much all MPI implementations support a single job spanning  
multiple hosts.


Open MPI also supports heterogeneity of data representation if you use  
the --enable-heterogeneous flag to OMPI's configure.


In general, you need both OMPI and your application compiled natively  
for each platform.  One easy way to do this is to install Open MPI  
locally on each node in the same filesystem location (e.g.,
/opt/openmpi-<version>).  You also want exactly the same version of Open
MPI on all nodes.
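
For example, a rough sketch (the prefix, hostfile, and application name
are placeholders; repeat the same configure/install on every node):

   ./configure --prefix=/opt/openmpi-1.3 --enable-heterogeneous
   make all install
   mpirun --hostfile myhosts -np 8 ./my_mpi_app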


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] metahosts (like in MP-MPICH)

2009-03-04 Thread Yury Tarasievich

Jeff Squyres wrote:

...
In general, you need both OMPI and your application compiled natively 
for each platform.  One easy way to do this is to install Open MPI 
locally on each node in the same filesystem location (e.g.,
/opt/openmpi-<version>).  You also want exactly the same version of
Open MPI on all nodes.



Thanks for the tip, I'll try this!

--



Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Joshua Bernstein



Terry Frankcombe wrote:

Having just downloaded and installed Open MPI 1.3 with ifort and gcc, I
merrily went off to compile my application.

In my final link with mpif90 I get the error:

/usr/bin/ld: cannot find -lnuma

Adding --showme reveals that

-I/home/terry/bin/Local/include -pthread -I/home/terry/bin/Local/lib

is added to the compile early in the aggregated ifort command, and 


-L/home/terry/bin/Local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
-lopen-pal -lpbs -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

is added to the end.

I note that when compiling Open MPI -lnuma was visible in the gcc
arguments, with no added -L.

On this system libnuma.so exists in /usr/lib64.  My (somewhat long!)
configure command was


You shouldn't have to. The runtime loader should look inside /usr/lib64 by
itself. Unless, of course, you've built either your application or OpenMPI
using a 32-bit Intel compiler instead (say fc instead of fce). In that case
the runtime loader would look inside /usr/lib to find libnuma, rather than
/usr/lib64.


Are you sure you are using the 64-bit version of the Intel compiler? If you
intend to use the 32-bit version of the compiler, and OpenMPI is 32-bit, you
may just need to install the numactl.i386 and numactl.x86_64 RPMs.


-Joshua Bernstein
Senior Software Engineer
Penguin Computing


Re: [OMPI users] openib RETRY EXCEEDED ERROR

2009-03-04 Thread Jeff Squyres

On Mar 1, 2009, at 7:24 PM, Brett Pemberton wrote:


I'd appreciate some advice on whether I'm using OFED correctly.

I'm running OFED 1.4, however not the kernel modules, just userland.
Is this a bad idea?



I believe so.  I'm not a kernel guy, but I've always used the userland  
bits matched with the corresponding kernel bits.  If nothing else,  
getting them to match would eliminate one possible source of errors.



Basically, I recompile the ofed src.rpms for:

dapl, libibcm, libibcommon, libibmad, libibumad, libibverbs, libmthca,
librdmacm, libsdp, mstflint

And install onto CentOS, upgrading the in-distro versions.
Should I also be compiling ofa_kernel ?
Could this be causing problems ?



...could be?  I don't really know.  That would be a better question  
for the gene...@lists.openfabrics.org list.



As explained off-list, I'm running the most recent firmware for my
cards, although the release is quite old:

hca_id: mthca0
 fw_ver: 1.2.0



I *believe* that's fairly ancient.  You might want to check the
Mellanox support web site and see if there's anything more recent for
your HCA.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] threading bug?

2009-03-04 Thread Jeff Squyres

On Feb 27, 2009, at 1:56 PM, Mahmoud Payami wrote:

I am using Intel lc_prof-11 (and its own MKL) and have built
openmpi-1.3.1 with the configure options "FC=ifort F77=ifort CC=icc
CXX=icpc". Then I built my application.
The Linux box is a 2x AMD64 quad-core. In the middle of a run of my
application (after some 15 iterations), I receive the message below and
it stops.
I tried to configure Open MPI using "--disable-mpi-threads", but it
automatically assumes "posix".


This doesn't sound like a threading problem, thankfully.  Open MPI has  
two levels of threading issues:


- whether MPI_THREAD_MULTIPLE is supported or not (which is what
--enable|disable-mpi-threads does)
- whether thread support is present at all on the system (e.g.,
solaris or posix threads)


You see "posix" in the configure output mainly because OMPI still  
detects that posix threads are available on the system.  It doesn't  
necessarily mean that threads will be used in your application's run.
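
If you want to see what your build reports, something like this shows
the thread-related settings (the exact wording of the output varies by
version):

   ompi_info | grep -i thread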



This problem does not happen in openmpi-1.2.9.
Any comment is highly appreciated.
Best regards,
mahmoud payami


[hpc1:25353] *** Process received signal ***
[hpc1:25353] Signal: Segmentation fault (11)
[hpc1:25353] Signal code: Address not mapped (1)
[hpc1:25353] Failing at address: 0x51
[hpc1:25353] [ 0] /lib64/libpthread.so.0 [0x303be0dd40]
[hpc1:25353] [ 1] /opt/openmpi131_cc/lib/openmpi/mca_pml_ob1.so [0x2e350d96]
[hpc1:25353] [ 2] /opt/openmpi131_cc/lib/openmpi/mca_pml_ob1.so [0x2e3514a8]
[hpc1:25353] [ 3] /opt/openmpi131_cc/lib/openmpi/mca_btl_sm.so [0x2eb7c72a]
[hpc1:25353] [ 4] /opt/openmpi131_cc/lib/libopen-pal.so.0(opal_progress+0x89) [0x2b42b7d9]
[hpc1:25353] [ 5] /opt/openmpi131_cc/lib/openmpi/mca_pml_ob1.so [0x2e34d27c]
[hpc1:25353] [ 6] /opt/openmpi131_cc/lib/libmpi.so.0(PMPI_Recv+0x210) [0x2af46010]
[hpc1:25353] [ 7] /opt/openmpi131_cc/lib/libmpi_f77.so.0(mpi_recv+0xa4) [0x2acd6af4]
[hpc1:25353] [ 8] /opt/QE131_cc/bin/pw.x(parallel_toolkit_mp_zsqmred_+0x13da) [0x513d8a]
[hpc1:25353] [ 9] /opt/QE131_cc/bin/pw.x(pcegterg_+0x6c3f) [0x6667ff]
[hpc1:25353] [10] /opt/QE131_cc/bin/pw.x(diag_bands_+0xb9e) [0x65654e]
[hpc1:25353] [11] /opt/QE131_cc/bin/pw.x(c_bands_+0x277) [0x6575a7]
[hpc1:25353] [12] /opt/QE131_cc/bin/pw.x(electrons_+0x53f) [0x58a54f]
[hpc1:25353] [13] /opt/QE131_cc/bin/pw.x(MAIN__+0x1fb) [0x458acb]
[hpc1:25353] [14] /opt/QE131_cc/bin/pw.x(main+0x3c) [0x4588bc]
[hpc1:25353] [15] /lib64/libc.so.6(__libc_start_main+0xf4) [0x303b21d8a4]
[hpc1:25353] [16] /opt/QE131_cc/bin/pw.x(realloc+0x1b9) [0x4587e9]
[hpc1:25353] *** End of error message ***
--
mpirun noticed that process rank 6 with PID 25353 on node hpc1 exited on signal 11 (Segmentation fault).
--


What this stack trace tells us is that Open MPI crashed somewhere  
while trying to use shared memory for message passing, but it doesn't  
really tell us much else.  It's not clear, either, whether this is  
OMPI's fault or your app's fault (or something else).


Can you run your application through a memory-checking debugger to see  
if anything obvious pops out?
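
For example, something like the following runs every rank under
Valgrind (a sketch, assuming Valgrind is installed; the application name
is a placeholder, and expect a large slowdown):

   mpirun -np 16 valgrind ./your_app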


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] mpirun problem

2009-03-04 Thread Jeff Squyres
Sorry for the delay in replying; the usual INBOX deluge keeps me from  
being timely in replying to all mails...  More below.


On Feb 24, 2009, at 6:52 AM, Jovana Knezevic wrote:


I'm new to MPI, so I'm going to explain my problem in detail
I'm trying to compile a simple application using mpicc (on SUSE 10.0)
and run it - compilation passes well, but mpirun is the problem.
So, let's say the program is called 1.c, I tried the following:

mpicc -o 1 1.c

(and, just in case, after the problems with mpirun, I tried the
following, too)


mpicc --showme:compile

mpicc --showme:link

mpicc -I/include -pthread 1.c -pthread -I/lib -lmpi -lopen-rte
-lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl -o 1

Both versions (with or without flags) produced executables as expected
(so, when I write ./1 it executes in the expected manner),


Good.


but when I try this:

mpirun -np 4 ./1,

it terminates giving the following message:


ssh: (none): Name or service not known
--
A daemon (pid 6877) died unexpectedly with status 255 while attempting
to launch so we are aborting.


That's fun; it seems like OMPI is not recognizing localhost properly.

Can you use the --debug-daemons and --leave-session-attached options  
to mpirun and see what output you get?
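
For example (reusing your executable from above):

   mpirun --debug-daemons --leave-session-attached -np 4 ./1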


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] mpirun problem

2009-03-04 Thread Ralph Castain
I suppose one initial question is: what version of Open MPI are you  
running? OMPI 1.3 should not be attempting to ssh a daemon on a local  
job like this - OMPI 1.2 -will-, so it is important to know which one  
we are talking about.


Just do "mpirun --version" and it should tell you.

Ralph


On Mar 4, 2009, at 1:09 PM, Jeff Squyres wrote:

Sorry for the delay in replying; the usual INBOX deluge keeps me  
from being timely in replying to all mails...  More below.


On Feb 24, 2009, at 6:52 AM, Jovana Knezevic wrote:


I'm new to MPI, so I'm going to explain my problem in detail
I'm trying to compile a simple application using mpicc (on SUSE 10.0)
and run it - compilation passes well, but mpirun is the problem.
So, let's say the program is called 1.c, I tried the following:

mpicc -o 1 1.c

(and, just for the case, after problems with mpirun, I tried the  
following, too)


mpicc --showme:compile

mpicc --showme:link

mpicc -I/include -pthread 1.c -pthread -I/lib -lmpi -lopen-rte
-lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl -o 1

Both versions (with or without flags) produced executables as expected
(so, when I write ./1 it executes in the expected manner),


Good.


but when I try this:

mpirun -np 4 ./1,

it terminates giving the following message:


ssh: (none): Name or service not known
--
A daemon (pid 6877) died unexpectedly with status 255 while  
attempting

to launch so we are aborting.


That's fun; it seems like OMPI is not recognizing localhost properly.

Can you use the --debug-daemons and --leave-session-attached options  
to mpirun and see what output you get?


--
Jeff Squyres
Cisco Systems





[OMPI users] RETRY EXCEEDED ERROR

2009-03-04 Thread Jan Lindheim
I found several reports on the Open MPI users mailing list from users
who needed to bump up the default value for btl_openib_ib_timeout.
We also have some applications on our cluster that have problems
unless we raise this value from the default of 10 to 15:

[24426,1],122][btl_openib_component.c:2905:handle_wc] from shc174 to: shc175
error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for 
wr_id 250450816 opcode 11048 qp_idx 3

This is seen with OpenMPI 1.3 and OpenFabrics 1.4.

Is this normal or is it an indicator of other problems, maybe related to
hardware?
Are there other parameters that need to be looked at too?

Thanks for any insight on this!

Regards,
Jan Lindheim


Re: [OMPI users] Gamess with openmpi

2009-03-04 Thread Jeff Squyres
Sorry for the delay in replying -- INBOX deluge makes me miss emails  
on the users list sometimes.


I'm unfortunately not familiar with gamess -- have you checked with  
their support lists or documentation?


Note that Open MPI's IB progression engine will spin hard to make  
progress for message passing.  Specifically, if you have processes  
that are "blocking" in message passing calls, those processes will  
actually be spinning trying to make progress (vs. actually blocking in  
the kernel).  So if you overload your hosts -- meaning that you run  
more Open MPI jobs than there are cores -- you could well experience  
dramatic slowdown in overall performance because every MPI job will be  
competing for CPU cycles.



On Feb 24, 2009, at 4:57 AM, Thomas Exner wrote:


Dear all:

Because I am new to this list, I would like to introduce myself as
Thomas Exner and please excuse silly questions, because I am only a  
chemist.


And now my problem, which I have been fiddling with for almost a week: I
try to use GAMESS with Open MPI on InfiniBand. There is a good
description of how to compile it with MPI, and it can be done, even if
it is not easy. But then at run time everything gets weird. The
specialty of GAMESS is that it runs twice as many MPI processes as are
used for the computation. The second half is used as data servers,
requiring data but with very little CPU load. Each one of these data
servers is connected to a specific compute job. Therefore, these two
corresponding jobs have to run on the same node. On one node everything
is fine (2x4-core machines in my case), because all the jobs are
guaranteed to run on this node. If I try two nodes, at the beginning
everything is also fine: 8 compute jobs and 8 data servers are running
on each machine. But after a short while, the entire set of processes
(16) on the first node starts to accumulate CPU time, with nothing
useful happening.  The second node's processes go entirely to sleep. Is
it possible that all the compute jobs have for some reason been
transferred to the first node? This would explain the load of 16 on the
first and 0 on the second node, because 16 compute jobs (100% CPU load)
and 16 data servers (almost 0% load) are running, respectively. The
strange thing is also that the same version runs fine on Gigabit and
Myrinet.

It would be great if somebody could help me with that. If you need more
information, I will be happy to share it with you.

Thanks very much.
Thomas





--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Bug reporting [was: OpenMPI 1.3]

2009-03-04 Thread Jeff Squyres
Sorry for the delay; a bunch of higher priority stuff got in the way  
of finishing this thread.  Anyhoo...



On Feb 24, 2009, at 4:24 AM, Olaf Lenz wrote:

I think it would be also sufficient to place a short text and link  
to the Trac page, so that the developers that want to use the "Bug  
Tracking" link to get to Trac do not have to click once more, but if  
this is OK for you, then it's fine.


This is ok for us.  I think most of us have Trac either bookmarked or
sufficiently active in our history that Firefox's awesome bar (or the
equivalent in other browsers) just picks it up and goes directly there
without going through the "Bug tracking" link on the OMPI web site.


Another option would be to simply give the link a less alluring  
name, like "Trac Bug Tracking System" or "Issue Tracker", or just  
"Trac".


Should bugs really be reported directly to the developer's list (as  
stated in 1. on the new page)? Or to the user's mailing list if they  
are not sure that it is a bug?



Good point; I softened the language in that first bullet.

--
Jeff Squyres
Cisco Systems



Re: [OMPI users] RETRY EXCEEDED ERROR

2009-03-04 Thread Jeff Squyres
This *usually* indicates a physical / layer 0 problem in your IB  
fabric.  You should do a diagnostic on your HCAs, cables, and switches.


Increasing the timeout value should only be necessary on very large IB  
fabrics and/or very congested networks.
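
For reference, the timeout can be raised for a single run via the MCA
parameter mentioned above, and ompi_info will show its current value
(the application name is a placeholder):

   ompi_info --param btl openib | grep ib_timeout
   mpirun --mca btl_openib_ib_timeout 15 -np 48 ./your_app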



On Mar 4, 2009, at 3:28 PM, Jan Lindheim wrote:


I found several reports on the openmpi users mailing list from users,
who need to bump up the default value for btl_openib_ib_timeout.
We also have some applications on our cluster, that have problems,
unless we set this value from the default 10 to 15:

[24426,1],122][btl_openib_component.c:2905:handle_wc] from shc174  
to: shc175
error polling LP CQ with status RETRY EXCEEDED ERROR status number  
12 for

wr_id 250450816 opcode 11048 qp_idx 3

This is seen with OpenMPI 1.3 and OpenFabrics 1.4.

Is this normal or is it an indicator of other problems, maybe  
related to

hardware?
Are there other parameters that need to be looked at too?

Thanks for any insight on this!

Regards,
Jan Lindheim



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] RETRY EXCEEDED ERROR

2009-03-04 Thread Jan Lindheim
On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> This *usually* indicates a physical / layer 0 problem in your IB  
> fabric.  You should do a diagnostic on your HCAs, cables, and switches.
> 
> Increasing the timeout value should only be necessary on very large IB  
> fabrics and/or very congested networks.

Thanks Jeff!
What is considered to be a very large IB fabric?
I assume that with just over 180 compute nodes,
our cluster does not fall into this category.

Jan

> 
> 
> On Mar 4, 2009, at 3:28 PM, Jan Lindheim wrote:
> 
> >I found several reports on the openmpi users mailing list from users,
> >who need to bump up the default value for btl_openib_ib_timeout.
> >We also have some applications on our cluster, that have problems,
> >unless we set this value from the default 10 to 15:
> >
> >[24426,1],122][btl_openib_component.c:2905:handle_wc] from shc174  
> >to: shc175
> >error polling LP CQ with status RETRY EXCEEDED ERROR status number  
> >12 for
> >wr_id 250450816 opcode 11048 qp_idx 3
> >
> >This is seen with OpenMPI 1.3 and OpenFabrics 1.4.
> >
> >Is this normal or is it an indicator of other problems, maybe  
> >related to
> >hardware?
> >Are there other parameters that need to be looked at too?
> >
> >Thanks for any insight on this!
> >
> >Regards,
> >Jan Lindheim
> 
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 
> 



Re: [OMPI users] RETRY EXCEEDED ERROR

2009-03-04 Thread Jeff Squyres

On Mar 4, 2009, at 4:16 PM, Jan Lindheim wrote:


On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> This *usually* indicates a physical / layer 0 problem in your IB
> fabric.  You should do a diagnostic on your HCAs, cables, and  
switches.

>
> Increasing the timeout value should only be necessary on very  
large IB

> fabrics and/or very congested networks.

Thanks Jeff!
What is considered to be very large IB fabrics?
I assume that with just over 180 compute nodes,
our cluster does not fall into this category.



I was a little misleading in my note -- I should clarify.  It's really  
congestion that matters, not the size of the fabric.  Congestion is  
potentially more likely to happen in larger fabrics, since packets may  
have to flow through more switches, there's likely more apps running  
on the cluster, etc.  But it's all very application/cluster-specific;  
only you can know if your fabric is heavily congested based on what  
you run on it, etc.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] RETRY EXCEEDED ERROR

2009-03-04 Thread Jan Lindheim
On Wed, Mar 04, 2009 at 04:34:49PM -0500, Jeff Squyres wrote:
> On Mar 4, 2009, at 4:16 PM, Jan Lindheim wrote:
> 
> >On Wed, Mar 04, 2009 at 04:02:06PM -0500, Jeff Squyres wrote:
> >> This *usually* indicates a physical / layer 0 problem in your IB
> >> fabric.  You should do a diagnostic on your HCAs, cables, and  
> >switches.
> >>
> >> Increasing the timeout value should only be necessary on very  
> >large IB
> >> fabrics and/or very congested networks.
> >
> >Thanks Jeff!
> >What is considered to be very large IB fabrics?
> >I assume that with just over 180 compute nodes,
> >our cluster does not fall into this category.
> >
> 
> I was a little misleading in my note -- I should clarify.  It's really  
> congestion that matters, not the size of the fabric.  Congestion is  
> potentially more likely to happen in larger fabrics, since packets may  
> have to flow through more switches, there's likely more apps running  
> on the cluster, etc.  But it's all very application/cluster-specific;  
> only you can know if your fabric is heavily congested based on what  
> you run on it, etc.
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 
> 

Thanks again Jeff!
Time to dig up the diagnostic tools and look at port statistics.

Jan


Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Terry Frankcombe

Thanks to everyone who contributed.

I no longer think this is Open MPI's problem.  This system is just
stupid.  Everything's 64 bit (which various probes with file confirm).

There's no icc, so I can't test with that.  gcc finds libnuma without
-L.  (Though a simple gcc -lnuma -Wl,-t reports that libnuma is found
through the rather convoluted
path /usr/lib64/gcc-lib/x86_64-suse-linux/3.3.4/../../../../lib64/libnuma.so.)

ifort -lnuma can't find libnuma.so, but then ifort -L/usr/lib64 -lnuma
can't find it either!  While everything else points to some mix up with
linking search paths, that last result confuses me greatly.  (Unless
there's some subtlety with libnuma.so being a link to libnuma.so.1.)

I can compile my app by replicating mpif90's --showme output directly on
the command line, with -lnuma replaced explicitly
with /usr/lib64/libnuma.so.  Then, even though I've told ifort -static,
ldd gives the three lines:

libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x2b3f58a3c000)
libc.so.6 => /lib64/tls/libc.so.6 (0x2b3f58b42000)
/lib/ld64.so.1 => /lib/ld64.so.1 (0x2b3f58925000)

While I don't understand what's going on here, I now have a working
binary.  It's the only app I use on this machine, so I'm no longer
concerned.  All other machines on which I use Open MPI work as expected
out of the box.  My workaround here is sufficient.

Once more, thanks for the suggestions.  I think this machine is just
pathological.

Ciao
Terry




Re: [OMPI users] libnuma under ompi 1.3

2009-03-04 Thread Doug Reeder

Terry,

Is there a libnuma.a on your system? If not, the -static flag to ifort
won't do anything because there isn't a static library for it to link
against.


Doug Reeder
On Mar 4, 2009, at 6:06 PM, Terry Frankcombe wrote:



Thanks to everyone who contributed.

I no longer think this is Open MPI's problem.  This system is just
stupid.  Everything's 64 bit (which various probes with file confirm).

There's no icc, so I can't test with that.  gcc finds libnuma without
-L.  (Though a simple gcc -lnuma -Wl,-t reports that libnuma is found
through the rather convoluted path
/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.4/../../../../lib64/libnuma.so.)

ifort -lnuma can't find libnuma.so, but then ifort -L/usr/lib64 -lnuma
can't find it either!  While everything else points to some mix up with
linking search paths, that last result confuses me greatly.  (Unless
there's some subtlety with libnuma.so being a link to libnuma.so.1.)

I can compile my app by replicating mpif90's --showme output directly on
the command line, with -lnuma replaced explicitly
with /usr/lib64/libnuma.so.  Then, even though I've told ifort -static,
ldd gives the three lines:

libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x2b3f58a3c000)
libc.so.6 => /lib64/tls/libc.so.6 (0x2b3f58b42000)
/lib/ld64.so.1 => /lib/ld64.so.1 (0x2b3f58925000)

While I don't understand what's going on here, I now have a working
binary.  It's the only app I use on this machine, so I'm no longer
concerned.  All other machines on which I use Open MPI work as expected
out of the box.  My workaround here is sufficient.

Once more, thanks for the suggestions.  I think this machine is just
pathological.

Ciao
Terry






[OMPI users] mlx4 error - looking for guidance

2009-03-04 Thread Jeff Layton
Evening everyone,

I'm running a CFD code on IB and I've encountered an error I'm not sure about 
and I'm looking for some guidance on where to start looking. Here's the error:

mlx4: local QP operation err (QPN 260092, WQE index 9a9e, vendor syndrome 6f, opcode = 5e)
[0,1,6][btl_openib_component.c:1392:btl_openib_component_progress] from compute-2-0.local to: compute-2-0.local error polling HP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 37742320 opcode 0
mpirun noticed that job rank 0 with PID 21220 on node compute-2-0.local exited on signal 15 (Terminated).
78 additional processes aborted (not shown)


This is openmpi-1.2.9rc2 (sorry - I need to upgrade to 1.3.0). The code works
correctly for smaller cases, but when I run larger cases I get this error.

I'm heading to bed but I'll check email tomorrow (sorry to sleep and run, but
it's been a long day).

TIA!

Jeff