Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-08 Thread Jeff Squyres (jsquyres)
Can you try building 1.5.4 with the same compilers?
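
A couple of hedged notes in the meantime (untested on a Cray): the first
failure is just pgcc rejecting -export-dynamic; newer libtool avoids that by
handing the flag to the linker (-Wl,--export-dynamic), which is exactly what
your second log shows.  The second failure, "attempted static link of dynamic
object", usually means the compiler wrapper is linking statically by default.
If your programming environment has a switch to make the wrapper link
dynamically, re-running the failing step along these lines might get past it
(the -dynamic flag below is an assumption -- check your site's docs):

   cc -dynamic -O -DNDEBUG -fPIC -o .libs/opal_wrapper opal_wrapper.o \
      -Wl,--export-dynamic ../../../opal/.libs/libopen-pal.so \
      -ldl -lnsl -lutil -Wl,-rpath,{my_installation_directory}/lib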

Sent from my phone. No type good. 

On Feb 7, 2012, at 3:14 PM, "Abhinav Sarje"  wrote:

> I am trying to build Open MPI 1.4.4 (the latest stable release from
> open-mpi.org) using PGI compilers on a Cray platform. The PGI compiler
> version is 11.9.0. I get the following error while building:
> -
> Making all in tools/wrappers
> make[2]: Entering directory `{my_installation_directory}/opal/tools/wrappers'
> source='opal_wrapper.c' object='opal_wrapper.o' libtool=no \
>DEPDIR=.deps depmode=none /bin/sh ../../../config/depcomp \
>cc "-DEXEEXT=\"\"" -I. -I../../../opal/include
> -I../../../orte/include -I../../../ompi/include
> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../..
> -D_REENTRANT  -O -DNDEBUG -fPIC  -c -o opal_wrapper.o opal_wrapper.c
> /bin/sh ../../../libtool --tag=CC   --mode=link cc  -O -DNDEBUG -fPIC
> -export-dynamic   -o opal_wrapper opal_wrapper.o
> ../../../opal/libopen-pal.la -lnsl -lutil
> libtool: link: cc -O -DNDEBUG -fPIC -o .libs/opal_wrapper
> opal_wrapper.o --export-dynamic  ../../../opal/.libs/libopen-pal.so
> -ldl -lnsl -lutil -rpath {my_installation_directory}/lib
> pgcc-Error-Unknown switch: --export-dynamic
> make[2]: *** [opal_wrapper] Error 1
> make[2]: Leaving directory `{my_installation_directory}/opal/tools/wrappers'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `{my_installation_directory}/opal'
> make: *** [all-recursive] Error 1
> -
> 
> I see that the libtool packaged with Open MPI is 2.2.6b.
> When I try to compile this particular part with libtool versions 2.2.6
> or 2.4, I get the following error:
> -
> $ libtool --tag=CC   --mode=link cc  -O -DNDEBUG -fPIC
> -export-dynamic   -o opal_wrapper opal_wrapper.o
> ../../../opal/libopen-pal.la -lnsl -lutil
> libtool: link: cc -O -DNDEBUG -fPIC -o .libs/opal_wrapper
> opal_wrapper.o -Wl,--export-dynamic
> ../../../opal/.libs/libopen-pal.so -ldl -lnsl -lutil -Wl,-rpath
> -Wl,{my_installation_directory}/lib
> /usr/bin/ld: attempted static link of dynamic object
> `../../../opal/.libs/libopen-pal.so'
> -
> 
> Looking at earlier posts, there apparently was a libtool bug a couple of
> years ago that caused the above error. It was fixed in newer releases,
> but I am still getting similar errors.
> 
> Does anyone have any information on how to fix this, or know whether I am
> doing something wrong here?
> 
> Thanks!
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] Cross-compiling openmpi

2012-02-08 Thread Hossein Talebi
Hello All,

I am trying to cross-compile Open MPI for Windows under Linux. I have
Ubuntu Linux 11.1 and the x86_64-w64-mingw32 toolchain. When I run the
configure command, I get the error message below:

*** Fortran 77 compiler
checking for x86_64-w64-mingw32-gfortran... x86_64-w64-mingw32-gfortran
checking whether we are using the GNU Fortran 77 compiler... yes
checking whether x86_64-w64-mingw32-gfortran accepts -g... yes
checking if Fortran 77 compiler works... links (cross compiling)
checking x86_64-w64-mingw32-gfortran external symbol convention...
single underscore
checking if C and Fortran 77 are link compatible... yes
checking to see if F77 compiler likes the C++ exception flags...
skipped (no C++ exceptions flags)
checking if Fortran 77 compiler supports LOGICAL... yes
checking size of Fortran 77 LOGICAL... configure: error: Can not
determine size of LOGICAL when cross-compiling


my configure options are:
 ./configure --prefix=/opt/mpich2.1.5_mingw64/
--host=x86_64-w64-mingw32  CXX=x86_64-w64-mingw32-g++
CC=x86_64-w64-mingw32-gcc  FC=x86_64-w64-mingw32-gfortran

Please let me know whether what I want to do is possible and, if so, how
I can fix this issue.
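
If the LOGICAL size simply cannot be probed while cross-compiling, would it
be acceptable to just disable the Fortran bindings?  A hedged sketch of what
I mean, assuming I do not actually need the F77/F90 MPI bindings on Windows:

 ./configure --prefix=/opt/mpich2.1.5_mingw64/ \
    --host=x86_64-w64-mingw32 \
    CXX=x86_64-w64-mingw32-g++ CC=x86_64-w64-mingw32-gcc \
    --disable-mpi-f77 --disable-mpi-f90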

Thank you.


Cheers
Hossein


Re: [OMPI users] Spawn_multiple with tight integration to SGE grid engine

2012-02-08 Thread Tom Bryan

On 2/6/12 5:10 PM, "Reuti"  wrote:

> Am 06.02.2012 um 22:28 schrieb Tom Bryan:
> 
>> On 2/6/12 8:14 AM, "Reuti"  wrote:
>> 
>>>> If I need MPI_THREAD_MULTIPLE, and openmpi is compiled with thread support,
>>>> it's not clear to me whether MPI::Init_Thread() and
>>>> MPI::Inint_Thread(MPI::THREAD_MULTIPLE) would give me the same behavior
>>>> from Open MPI.
>>> 
>>> If you need thread support, you will need MPI::Init_Thread and it needs one
>>> argument (or three).
>> 
>> Sorry, typo on my side.  I meant to compare
>> MPI::Init_thread(MPI::THREAD_MULTIPLE) and MPI::Init().  I think that your
>> first reply mentioned replacing MPI::Init_thread by MPI::Init.
> 
> Yes, if you don't need threads, I don't see any reason why it should add
> anything to the environment that you could make use of.

Got it.  Unfortunately, we *definitely* need THREAD_MULTIPLE in our case.
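
For reference, this is roughly what our init code does (a minimal sketch;
everything outside the MPI calls themselves is illustrative):

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char* argv[])
    {
        // Ask for full thread support; the library reports the level it can
        // actually provide, which may be lower than what was requested.
        int provided = MPI::Init_thread(argc, argv, MPI::THREAD_MULTIPLE);
        if (provided < MPI::THREAD_MULTIPLE) {
            std::cerr << "MPI_THREAD_MULTIPLE not provided (got "
                      << provided << ")" << std::endl;
        }
        // ... spawn children / talk to them from multiple threads only if
        //     provided == MPI::THREAD_MULTIPLE ...
        MPI::Finalize();
        return 0;
    }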

>>> Yes, this should work across multiple machines. And it's using `qrsh
>>> -inherit
>>> ...` so it's failing somewhere in Open MPI - is it working with 1.4.4?
>> 
>> I'm not sure.  We no longer have our 1.4 test environment, so I'm in the
>> process of building that now.  I'll let you know once I have a chance to run
>> that experiment.

You said that both of these cases worked for you in 1.4.  Were you running a
modified version that did not use THREAD_MULTIPLE?  I ask because I'm
getting worse errors in 1.4.  I'm using the same code that was working (in
some cases) with 1.5.4.

I built 1.4.4 with (among other options)
--with-threads=posix --enable-mpi-threads

I rebuilt my code against 1.4.4.

When I run my test "e" from before, which is basically just
mpiexec -np 1 ./mpitest
I get the following in the output file for the job.

Calling init_thread
[vxr-lnx-11.cisco.com:64618] [[32207,1],0] ORTE_ERROR_LOG: Data unpack would
read past end of buffer in file util/nidmap.c at line 398
[vxr-lnx-11.cisco.com:64618] [[32207,1],0] ORTE_ERROR_LOG: Data unpack would
read past end of buffer in file base/ess_base_nidmap.c at line 62
[vxr-lnx-11.cisco.com:64618] [[32207,1],0] ORTE_ERROR_LOG: Data unpack would
read past end of buffer in file ess_env_module.c at line 173
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_build_nidmap failed
  --> Returned value Data unpack would read past end of buffer (-26) instead
of ORTE_SUCCESS
--
[vxr-lnx-11.cisco.com:64618] [[32207,1],0] ORTE_ERROR_LOG: Data unpack would
read past end of buffer in file runtime/orte_init.c at line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Data unpack would read past end of buffer (-26) instead
of ORTE_SUCCESS
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Data unpack would read past end of buffer" (-26) instead of
"Success" (0)
--
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[vxr-lnx-11.cisco.com:64618] Abort before MPI_INIT completed successfully;
not able to guarantee that all other processes were killed!
--
mpiexec has exited due to process rank 0 with PID 64618 on
node vxr-lnx-11.cisco.com exiting improperly. There are two reasons this
could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termina

Re: [OMPI users] Spawn_multiple with tight integration to SGE grid engine

2012-02-08 Thread Tom Bryan
On 2/8/12 4:52 PM, "Tom Bryan"  wrote:

> Got it.  Unfortunately, we *definitely* need THREAD_MULTIPLE in our case.

> I rebuilt my code against 1.4.4.
> 
> When I run my test "e" from before, which is basically just
> mpiexec -np 1 ./mpitest
> I get the following [errors]

After talking to Jeff, it sounds like I should really stick with 1.5 since I'm
using THREAD_MULTIPLE.  There also seems to be a problem running ring_c on
my SGE grid, so we're going to investigate that first.  Perhaps there's a
problem with my SGE or Open MPI installation that is causing some of the
problems that I initially reported with mpitest.
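
For the ring_c check we'll just submit the example through the grid with a
minimal job script, something like the sketch below (the parallel environment
name "orte" and the slot count are placeholders for whatever the grid actually
uses):

    #!/bin/sh
    #$ -pe orte 4
    #$ -cwd
    mpiexec -np $NSLOTS ./ring_c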

I'll let you know what we find.

---Tom