Re: [OMPI users] Relocating an Installation

2006-12-27 Thread Allen Barnett
Upon reflecting on this more, I guess I see two issues. First, there's
the issue of allowing the user to install our software on the system
wherever they like. Some users may want to install it in their home
directories; others may have a sys admin install it in a common
location. This seems like a substantial reason to allow an OMPI
installation to be relocated. So, I would say this is a very important
capability.

On the other hand, I don't have access to all the third-party headers
and libraries which are necessary to build some of the more interesting
OMPI modules, such as the InfiniBand and Myrinet drivers and many of the
batch scheduling drivers (tm? LoadLeveler? PORTALS? Xgrid? I'm not sure
what these are. And maybe a related question: one customer uses NQS
(NQE?); can this be supported?). So, I would expect the user may want to
compile at least some of OMPI himself (or herself) in order to activate
these modules. Thus, perhaps I should supply a partially built
installation which completes compilation as part of the installation
process? This seems somewhat impractical since it would require
compilers, headers and libraries, etc, on every machine on which our
software is installed.

I also don't know what distribution restrictions are placed on all the
3rd party software OMPI can link against. This may limit what can be
redistributed with our product.

So, I guess I'm open to suggestions on how best to distribute our
software. Being able to relocate an installation and being able to build
specific modules at installation time would appear to be very helpful
capabilities.
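
To make the relocation half of this concrete, here is a minimal sketch of a launcher-side helper that derives its search paths from wherever the package was unpacked. It is an illustration only, not anything Open MPI provides: the function name `relocated_env` and the `bin`/`lib` layout are assumptions, and `LD_LIBRARY_PATH` is assumed to be the loader variable (other platforms use a different one). It does not address paths compiled into the Open MPI libraries themselves, which is the open issue in this thread.

```python
import os
from pathlib import Path


def relocated_env(prefix, base_env=None):
    """Return a copy of the environment with PATH and LD_LIBRARY_PATH
    pointing into `prefix` (hypothetical layout: prefix/bin, prefix/lib)."""
    env = dict(os.environ if base_env is None else base_env)
    prefix = Path(prefix)

    def prepend(var, directory):
        # Put the relocated directory first, keeping any existing entries.
        old = env.get(var, "")
        env[var] = str(directory) + (os.pathsep + old if old else "")

    prepend("PATH", prefix / "bin")
    prepend("LD_LIBRARY_PATH", prefix / "lib")  # assumed loader variable
    return env
```

A wrapper script could then pass the result as `env=` to `subprocess.run(["mpirun", ...])`, so the user can unpack the tree anywhere without editing shell startup files.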

Many thanks,
Allen

On Fri, 2006-12-15 at 19:45 -0500, Jeff Squyres wrote:
> Greetings Allen.
> 
> This problem has not yet been resolved, but I'm quite sure we have an  
> open ticket about this on our bug tracker.  I just replied to Patrick  
> on-list about a related issue (his use of --prefix); I'd have to  
> think about this a bit more, but a solution proposed by one of the  
> other OMPI developers in internal conversations may fix both issues.   
> It hasn't been coded up yet only because we didn't give it a high priority.
> 
> So my question to you is -- how high of a priority is this for you?   
> Part of what makes it into each OMPI release is driven by what users  
> want/need, so input like this helps us prioritize the work.
> 
> Thanks!
> 
> 
> On Dec 13, 2006, at 10:37 AM, Allen Barnett wrote:
> 
> > There was a thread back in November started by Patrick Jessee about
> > relocating an installation after it was built (the subject was:
> > removing hard-coded paths from OpenMPI shared libraries). I guess I'm
> > in the same boat now. I would like to distribute our OpenMPI-based
> > parallel solver; but I can't really dictate where a user will install
> > our software. Has anyone succeeded in building a version of OpenMPI
> > which can be relocated?
> 
-- 
Allen Barnett
Transpire, Inc.
E-Mail: al...@transpireinc.com
Ph: 518-887-2930



[OMPI users] openmpi / mpirun problem on aix: poll failed with errno=25, opal_event_loop: ompi_evesel->dispatch() failed.

2006-12-27 Thread Michael Marti
Dear All,

I am trying to get openmpi-1.1.2 to work on AIX 5.3 / power5.

:: Compilation seems to have worked with the following sequence:

    setenv OBJECT_MODE 64
    setenv CC xlc
    setenv CXX xlC
    setenv F77 xlf
    setenv FC xlf90
    setenv CFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
    setenv CXXFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
    setenv FFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
    setenv FCFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
    setenv LDFLAGS "-Wl,-brtl"
    ./configure --prefix=/ist/openmpi-1.1.2 \
        --disable-mpi-cxx \
        --disable-mpi-cxx-seek \
        --enable-mpi-threads \
        --enable-progress-threads \
        --enable-static \
        --disable-shared \
        --disable-io-romio

:: After the compilation I ran make check and all 11 tests passed successfully.

:: Now I'm trying to run the following command just as a test:

    # mpirun -hostfile /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc -np 2 /usr/bin/hostname

- The file /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc contains 4 hosts:

    r1blade041 slots=1
    r1blade042 slots=1
    r1blade043 slots=1
    r1blade044 slots=1

- The mpirun command eventually hangs with the following message:

    [r1blade041:418014] poll failed with errno=25
    [r1blade041:418014] opal_event_loop: ompi_evesel->dispatch() failed.

- In this state mpirun cannot be killed by hitting Ctrl-C; only a kill -9 will do the trick.
- While mpirun is still hanging I can see that "orted" has been launched on both requested hosts.

:: I turned on all debug options in openmpi-mca-params.conf. The output for the same call of mpirun is in the file mpirun-debug.txt.gz.
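
Since the only concrete clue in the message above is the raw number in "poll failed with errno=25", a first step is to turn it into a symbolic name on the affected machine. The snippet below is a small sketch for that using Python's standard `errno` and `os` modules; the helper name `describe_errno` is made up for illustration. Errno numbering is platform-specific, so it should be run on the AIX host itself. On many Unix-like systems 25 maps to ENOTTY, but that is an assumption to confirm there, not a diagnosis.

```python
import errno
import os


def describe_errno(code):
    """Map a numeric errno to its symbolic name and the local message text."""
    # errno.errorcode is the platform's own number-to-name table.
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


# The value reported by mpirun in the message above.
print(describe_errno(25))
```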

mpirun-debug.txt.gz
Description: GNU Zip compressed data
:: As suggested in the mailing list rules I include config.log (config.log.gz) and the output of ompi_info (ompi_info.txt.gz).

config.log.gz
Description: GNU Zip compressed data
 

ompi_info.txt.gz
Description: GNU Zip compressed data
:: As I am completely new to openmpi (I have some experience with lam), I am lost at this stage. I would really appreciate it if someone could give me some hints as to what is going wrong and where I could get more info.

Best regards,
Michael Marti.

-- 
Michael Marti
Centro de Fisica dos Plasmas
Instituto Superior Tecnico
Av. Rovisco Pais
1049-001 Lisboa
Portugal
Tel:    +351 218 419 379
Fax:    +351 218 464 455
Mobile: +351 968 434 327