[OMPI users] Timers

2009-09-11 Thread amjad ali
Hi all,
I want to get the elapsed time from start to end of my parallel program
(Open MPI based). It should always give the same time for the same problem,
irrespective of whether the nodes are running other programs or
only that program. How can I do this?

Regards.


Re: [OMPI users] undefined symbol error when built as a shared library

2009-09-11 Thread Jeff Squyres

On Sep 10, 2009, at 9:42 PM, Ashika Umanga Umagiliya wrote:


That fixed the problem !
You are indeed a voodoo master... could you explain the spell behind
your magic :)



The problem has to do with how plugins (aka dynamic shared objects,  
DSO's) are loaded.  When a DSO is loaded into a Linux process, it has  
the option of making all the public symbols in that DSO public to the  
rest of the process or private within its own scope.


Let's back up.  Remember that Open MPI is based on plugins (DSO's).   
It loads lots and lots of plugins during execution (mostly during  
MPI_INIT).  These plugins call functions in OMPI's public libraries  
(e.g., they call functions in libmpi.so).  Hence, when the plugin  
DSO's are loaded, they need to be able to resolve these symbols into  
actual code that can be invoked.  If the symbols cannot be resolved,  
the DSO load fails.


If libParallel.so is loaded into a private scope, then its linked  
libraries (e.g., libmpi.so) are also loaded into that same private  
scope.  Hence, all of libmpi.so's public symbols are only public  
within that single, private scope.  Then, when OMPI goes to load its  
own DSOs, since libmpi.so's public symbols are in a private scope,  
OMPI's DSO's can't find them -- and therefore they refuse to load.   
(private scopes are not inherited -- a new DSO load cannot "see"  
libParallel.so/libmpi.so's private scope).
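
The public/private choice described above corresponds to the RTLD_GLOBAL and RTLD_LOCAL flags of dlopen(3). The following is only a minimal sketch of how a host application could pick the scope and check what is visible in it; the webservice's real loader is not shown in this thread, and the helper names open_library and visible_in_global_scope are made up for illustration.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load a shared object either into the global scope
 * (RTLD_GLOBAL), where plugins loaded later can resolve against it, or
 * into a private scope (RTLD_LOCAL), where they cannot.
 * Passing path == NULL returns a handle for the main program itself. */
void *open_library(const char *path, int make_global)
{
    int flags = RTLD_NOW | (make_global ? RTLD_GLOBAL : RTLD_LOCAL);
    return dlopen(path, flags);
}

/* Hypothetical helper: return 1 if `name` can be resolved from the global
 * scope of the process (the main program, its dependencies, and anything
 * loaded with RTLD_GLOBAL), 0 otherwise. */
int visible_in_global_scope(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return 0;
    int found = (dlsym(self, name) != NULL);
    dlclose(self);
    return found;
}
```

Under these assumptions, a loader that calls open_library("libParallel.so", 1) would be expected to leave libmpi.so's symbols where Open MPI's plugins can find them, and visible_in_global_scope("mca_base_param_reg_int") could serve as a quick diagnostic. On older glibc versions the program must be linked with -ldl.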


It's an educated guess from your description that this is what was  
happening.


OMPI's --disable-dlopen configure option has Open MPI build in a  
different way.  Instead of building all of OMPI's plugins as DSOs,  
they are "slurped" up into libmpi.so (etc.).  So there's no "loading"  
of DSOs at MPI_INIT time -- the plugin code actually resides *in*  
libmpi.so itself.  Hence, resolution of all symbols is done when  
libParallel.so loads libmpi.so.  Additionally, there's no secondary  
private scope created when DSOs are loaded -- they're all self- 
contained within libmpi.so (etc.).  And therefore all the libmpi.so  
symbols that are required for the plugins are all able to be found/ 
resolved at load time.


Does that make sense?




Regards,
umanga


Jeff Squyres wrote:
> I'm guessing that this has to do with deep, dark voodoo involved with
> the run time linker.
>
> Can you try configuring/building Open MPI with --disable-dlopen
> configure option, and rebuilding your libParallel.so against the new
> libmpi.so?
>
> See if that fixes the problem for you.  If it does, I can explain in
> more detail (if you care).
>
>
> On Sep 10, 2009, at 3:24 AM, Ashika Umanga Umagiliya wrote:
>
>> Greetings all,
>>
>> My parallel application is built as a shared library (libParallel.so).
>> (I use Debian Lenny 64bit).
>> A webservice is used to dynamically load libParallel.so and in turn
>> execute the parallel process.
>>
>> But during runtime I get the error:
>>
>> webservicestub: symbol lookup error:
>> /usr/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol:
>> mca_base_param_reg_int
>>
>> which I cannot figure out. I checked everything with 'ldd' and 'nm' and
>> everything seems fine.
>> So I compiled and tested my parallel code as an executable and then it
>> worked fine.
>>
>> What could be the reason for this?
>>
>> Thanks in advance,
>> umanga
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>





--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] Timers

2009-09-11 Thread jody
Hi
I'm not sure if I completely understand your requirements,
but have you tried MPI_Wtime?

Jody
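
MPI_Wtime returns wall-clock time in seconds as a double. Wall-clock time also counts the periods in which the process waits for the CPU, so it will generally differ between a busy and an idle node. The sketch below is not from this thread; it contrasts a wall-clock timer with a CPU-time timer using plain POSIX calls, on the assumption that CPU time is the less load-sensitive of the two (it is still not perfectly reproducible). The function names wall_seconds and cpu_seconds are illustrative.

```c
#define _XOPEN_SOURCE 700
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Wall-clock seconds from a monotonic clock (arbitrary starting point).
 * Differences between two calls include time spent waiting for the CPU. */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* CPU seconds (user + system) consumed so far by the calling process.
 * Time during which other programs hold the CPU is not counted. */
double cpu_seconds(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec * 1e-6
         + (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec * 1e-6;
}
```

Taking both readings at the start and at the end of the program and reporting the two differences shows how much of the elapsed time was actually spent computing. In an MPI program each rank would record its own values.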

On Fri, Sep 11, 2009 at 7:54 AM, amjad ali  wrote:
> Hi all,
> I want to get the elapsed time from start to end of my parallel program
> (Open MPI based). It should always give the same time for the same problem,
> irrespective of whether the nodes are running other programs or
> only that program. How can I do this?
>
> Regards.
>


[OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
Hi!

The following code shows a bad behaviour when running over openib.

Openmpi: 1.3.3
With openib it dies with "error polling HP CQ with status WORK REQUEST
FLUSHED ERROR status number 5 ", with tcp or shmem it works as expected.


#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;
    int n;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Timestamp before the broadcast. */
    fprintf(stderr, "I am %d at %ld\n", rank, (long)time(NULL));
    fflush(stderr);

    n = 4;
    MPI_Bcast(&n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD);

    /* Timestamp after the broadcast. */
    fprintf(stderr, "I am %d at %ld\n", rank, (long)time(NULL));
    fflush(stderr);

    /* The root stays out of MPI for a minute before its next MPI call. */
    if (rank == 0) {
        sleep(60);
    }
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Finalize();
    exit(0);
}

I know the internal Open MPI reason why it behaves as it does.
But I think a program should be allowed to behave the way this one does.

This example is a bit engineered but there are codes where a similar
situation can occur, i.e. the Bcast sender doing lots of other work
after the Bcast before the next MPI call. VASP is a candidate for this.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] undefined symbol error when built as a shared library

2009-09-11 Thread Reuti

Am 11.09.2009 um 12:14 schrieb Jeff Squyres:


On Sep 10, 2009, at 9:42 PM, Ashika Umanga Umagiliya wrote:


That fixed the problem !
You are indeed a voodoo master... could you explain the spell behind
your magic :)



The problem has to do with how plugins (aka dynamic shared objects,  
DSO's) are loaded.  When a DSO is loaded into a Linux process, it  
has the option of making all the public symbols in that DSO public  
to the rest of the process or private within its own scope.


Let's back up.  Remember that Open MPI is based on plugins  
(DSO's).  It loads lots and lots of plugins during execution  
(mostly during MPI_INIT).  These plugins call functions in OMPI's  
public libraries (e.g., they call functions in libmpi.so).  Hence,  
when the plugin DSO's are loaded, they need to be able to resolve  
these symbols into actual code that can be invoked.  If the symbols  
cannot be resolved, the DSO load fails.


If libParallel.so is loaded into a private scope, then its linked  
libraries (e.g., libmpi.so) are also loaded into that same private  
scope.  Hence, all of libmpi.so's public symbols are only public  
within that single, private scope.  Then, when OMPI goes to load  
its own DSOs, since libmpi.so's public symbols are in a private  
scope, OMPI's DSO's can't find them -- and therefore they refuse to  
load.  (private scopes are not inherited -- a new DSO load cannot  
"see" libParallel.so/libmpi.so's private scope).


It's an educated guess from your description that this is what was  
happening.


OMPI's --disable-dlopen configure option has Open MPI build in a  
different way.


Aha - this might also explain what I faced some time ago. I tried to  
compile an application called Molpro with GlobalArrays which I  
compiled with Open MPI. I faced similar errors - the compilation  
worked without any problem, but I couldn't run the application, as it  
resulted in a similar error. Finally I gave up and stayed with mpich 
(1) for this.


I will try to build it with this switch in the next days - maybe it  
will also solve this issue.


-- Reuti


  Instead of building all of OMPI's plugins as DSOs, they are  
"slurped" up into libmpi.so (etc.).  So there's no "loading" of  
DSOs at MPI_INIT time -- the plugin code actually resides *in*  
libmpi.so itself.  Hence, resolution of all symbols is done when  
libParallel.so loads libmpi.so.  Additionally, there's no secondary  
private scope created when DSOs are loaded -- they're all self- 
contained within libmpi.so (etc.).  And therefore all the libmpi.so  
symbols that are required for the plugins are all able to be found/ 
resolved at load time.


Does that make sense?




Regards,
umanga


Jeff Squyres wrote:
> I'm guessing that this has to do with deep, dark voodoo involved with
> the run time linker.
>
> Can you try configuring/building Open MPI with --disable-dlopen
> configure option, and rebuilding your libParallel.so against the new
> libmpi.so?
>
> See if that fixes the problem for you.  If it does, I can explain in
> more detail (if you care).
>
>
> On Sep 10, 2009, at 3:24 AM, Ashika Umanga Umagiliya wrote:
>
>> Greetings all,
>>
>> My parallel application is built as a shared library (libParallel.so).
>> (I use Debian Lenny 64bit).
>> A webservice is used to dynamically load libParallel.so and in turn
>> execute the parallel process.
>>
>> But during runtime I get the error:
>>
>> webservicestub: symbol lookup error:
>> /usr/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol:
>> mca_base_param_reg_int
>>
>> which I cannot figure out. I checked everything with 'ldd' and 'nm' and
>> everything seems fine.
>> So I compiled and tested my parallel code as an executable and then it
>> worked fine.
>>
>> What could be the reason for this?
>>
>> Thanks in advance,
>> umanga
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>





--
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI users] undefined symbol error when built as a shared library

2009-09-11 Thread Jeff Squyres

On Sep 11, 2009, at 7:26 AM, Reuti wrote:


> OMPI's --disable-dlopen configure option has Open MPI build in a
> different way.

Aha - this might also explain what I faced some time ago. I tried to
compile an application called Molpro with GlobalArrays which I
compiled with Open MPI. I faced similar errors - the compilation
worked without any problem, but I couldn't run the application, as it
resulted in a similar error. Finally I gave up and stayed with mpich
(1) for this.




IMHO (and knowing very little about how linkers actually work), the  
problem is with linker namespaces.  If they could be inherited (e.g.,  
a *tree* of scopes could be private), then things might work.  It  
would probably be interesting to sit down with a run-time linker  
developer sometime and ask about this (I know that linkers are  
fantastically complicated; there might be Good reasons why such a  
scheme doesn't already exist).


--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Rolf Vandevaart
Hi, how exactly do you run this to get this error?  I tried and it 
worked for me.


burl-ct-x2200-16 50 =>mpirun -mca btl_openib_warn_default_gid_prefix 0 
-mca btl self,sm,openib -np 2 -host burl-ct-x2200-16,burl-ct-x2200-17 
-mca btl_openib_ib_timeout 16 a.out

I am 0 at 1252670691
I am 1 at 1252670559
I am 0 at 1252670692
I am 1 at 1252670559
 burl-ct-x2200-16 51 =>

Rolf

On 09/11/09 07:18, Ake Sandgren wrote:

Hi!

The following code shows a bad behaviour when running over openib.

Openmpi: 1.3.3
With openib it dies with "error polling HP CQ with status WORK REQUEST
FLUSHED ERROR status number 5 ", with tcp or shmem it works as expected.


#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
int  rank;
int  n;

MPI_Init( &argc, &argv );

MPI_Comm_rank( MPI_COMM_WORLD, &rank );

fprintf(stderr, "I am %d at %d\n", rank, time(NULL));
fflush(stderr);

n = 4;
MPI_Bcast(&n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD);
fprintf(stderr, "I am %d at %d\n", rank, time(NULL));
fflush(stderr);
if (rank == 0) {
sleep(60);
}
MPI_Barrier(MPI_COMM_WORLD);

MPI_Finalize( );
exit(0);
}

I know about the internal openmpi reason for it do behave as it does.
But i think that it should be allowed to behave as it does.

This example is a bit engineered but there are codes where a similar
situation can occur, i.e. the Bcast sender doing lots of other work
after the Bcast before the next MPI call. VASP is a candidate for this.




--

=
rolf.vandeva...@sun.com
781-442-3043
=


Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Jeff Squyres
Cisco is no longer an IB vendor, but I seem to recall that these kinds  
of errors typically indicated a fabric problem.  Have you run layer 0  
and 1 diagnostics to ensure that the fabric is clean?



On Sep 11, 2009, at 8:09 AM, Rolf Vandevaart wrote:


Hi, how exactly do you run this to get this error?  I tried and it
worked for me.

burl-ct-x2200-16 50 =>mpirun -mca btl_openib_warn_default_gid_prefix 0
-mca btl self,sm,openib -np 2 -host burl-ct-x2200-16,burl-ct-x2200-17
-mca btl_openib_ib_timeout 16 a.out
I am 0 at 1252670691
I am 1 at 1252670559
I am 0 at 1252670692
I am 1 at 1252670559
  burl-ct-x2200-16 51 =>

Rolf

On 09/11/09 07:18, Ake Sandgren wrote:
> Hi!
>
> The following code shows a bad behaviour when running over openib.
>
> Openmpi: 1.3.3
> With openib it dies with "error polling HP CQ with status WORK REQUEST
> FLUSHED ERROR status number 5 ", with tcp or shmem it works as expected.
>
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
> #include <unistd.h>
> #include "mpi.h"
>
> int main(int argc, char *argv[])
> {
> int  rank;
> int  n;
>
> MPI_Init( &argc, &argv );
>
> MPI_Comm_rank( MPI_COMM_WORLD, &rank );
>
> fprintf(stderr, "I am %d at %d\n", rank, time(NULL));
> fflush(stderr);
>
> n = 4;
> MPI_Bcast(&n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD);
> fprintf(stderr, "I am %d at %d\n", rank, time(NULL));
> fflush(stderr);
> if (rank == 0) {
>   sleep(60);
> }
> MPI_Barrier(MPI_COMM_WORLD);
>
> MPI_Finalize( );
> exit(0);
> }
>
> I know the internal Open MPI reason why it behaves as it does.
> But I think a program should be allowed to behave the way this one does.
>
> This example is a bit engineered but there are codes where a similar
> situation can occur, i.e. the Bcast sender doing lots of other work
> after the Bcast before the next MPI call. VASP is a candidate for this.
>


--

=
rolf.vandeva...@sun.com
781-442-3043
=




--
Jeff Squyres
jsquy...@cisco.com



[OMPI users] application hangs when checkpointing application

2009-09-11 Thread Jean Potsam
Hi Everyone,
    I wrote a small program with a function to trigger the 
checkpointing mechanism as follows:
 

 
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

void trigger_checkpoint(void);

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    trigger_checkpoint();
    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    printf("bye \n");

    MPI_Finalize();
    return 0;
}

/* Ask mpirun to checkpoint the running job from inside the application. */
void trigger_checkpoint(void)
{
    printf("hi\n");
    system("ompi-checkpoint -v `pidof mpirun` ");
}
#
   
 
The application works fine on my laptop with ubuntu as the OS. However, when I 
tried running it on one of the machines at my uni, with suse linux installed, 
the application hangs as soon as the ompi-checkpoint is triggered. This is what 
I get:
 
 
 
##
I am processor no 0 of a total of 1 procs 
hi
I am processor no 0 of a total of 1 procs 
[sun06:15426] orte_checkpoint: Checkpointing...
[sun06:15426]    PID 15411
[sun06:15426]    Connected to Mpirun [[12727,0],0]
[sun06:15426] orte_checkpoint: notify_hnp: Contact Head Node Process PID 15411

Does anyone have any ideas about this?

Thanks a lot
 
Jean.

 


  

Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
On Fri, 2009-09-11 at 13:18 +0200, Ake Sandgren wrote:
> Hi!
> 
> The following code shows a bad behaviour when running over openib.

Oops. Red Face big time.
I happened to run the IB test between two systems that don't have IB
connectivity.

Goes and hides in a dark corner...

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



[OMPI users] Application hangs when checkpointing application (update)

2009-09-11 Thread Jean Potsam
 
Hi Everyone,
  I noticed that it hangs just before displaying the following 
while trying to checkpoint the application.
 

[sun06:15252] orte_checkpoint: notify_hnp: Requested a checkpoint of jobid 
[INVALID] 
###
 
Can it be related to the above? 
 
Thanks
 
 
--
Hi Everyone,
    I wrote a small program with a function to trigger the 
checkpointing mechanism as follows:
 

 
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
void trigger_checkpoint();
int main(int argc, char **argv)
{
int rank,size;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("I am processor no %d of a total of %d procs \n", rank, size);
system("sleep 10");
trigger_checkpoint();
printf("I am processor no %d of a total of %d procs \n", rank, size);
system("sleep 10");
printf("I am processor no %d of a total of %d procs \n", rank, size);
system("sleep 10");
printf("bye \n");
MPI_Finalize();
return 0;
}
 
void trigger_checkpoint()
{
  printf("hi\n");
  system("ompi-checkpoint -v `pidof mpirun` ");
}
#
   
 
The application works fine on my laptop with ubuntu as the OS. However, when I 
tried running it on one of the machines at my uni, with suse linux installed, 
the application hangs as soon as the ompi-checkpoint is triggered. This is what 
I get:
 
 
 
##
I am processor no 0 of a total of 1 procs 
hi
I am processor no 0 of a total of 1 procs 
[sun06:15426] orte_checkpoint: Checkpointing...
[sun06:15426]    PID 15411
[sun06:15426]    Connected to Mpirun [[12727,0],0]
[sun06:15426] orte_checkpoint: notify_hnp: Contact Head Node Process PID 15411
###

 
Does anyone have any ideas about this?
 
Thanks a lot
 
Jean.


  

[OMPI users] OpenMPI on OS X - file is not of required architecture

2009-09-11 Thread Andreas Haselbacher
I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and the
Intel 10.1.006 Fortran compiler and gcc 4.0.  As far as I can tell, the
configure and make commands completed fine. There are some warnings, but
it's not clear to me that they are critical - or the explanation for
what's not working. After installing, I try to compile a simple F77 hello
world code. The output is:

% mpif77 helloworld_mpi.f -o helloworld_mpi
ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required architecture
Undefined symbols:
  "_mpi_init_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_comm_size_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_finalize_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_comm_rank_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
ld: symbol(s) not found

I don't know what the warning about the "required architecture" means and
cannot find any relevant info in the archives or with google. I'd
appreciate any help. More info is below, including the config.log file as
an attachment.

Here's my configure command:

./configure --prefix=/opt/openmpi --enable-static --disable-shared
CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort
FFLAGS=-assume nounderscore FCFLAGS=-assume nounderscore

The output of the ompi_info --all command is also attached.

Thanks,
Andreas

config.log.gz
Description: GNU Zip compressed data


open_info_all.out.gz
Description: GNU Zip compressed data
  

Re: [OMPI users] Disable use of Torque at run-time

2009-09-11 Thread jgans

Hi Ralph,

Thank you for your help. This is exactly what I wanted!

Regards,

Jason

Ralph Castain wrote:

Hmmm...well, here is one way to do it:

mpirun -n 1 -host n0 ./master_worker : -n N-1 -host +e ./master_worker

What this will do is put rank 0 on the first node in your allocation, 
and then all the remaining ranks on the remaining nodes in the 
allocation. All the ranks will be in the same comm_world.


Check out "man orte_hosts" for a detailed explanation (with examples) 
of this "relative node indexing" syntax.


HTH
Ralph

On Sep 10, 2009, at 3:57 PM, jgans wrote:


A single app:

mpirun -N ./master_worker

Regards,

Jason

Ralph Castain wrote:

Is the master a different app, or is the same app used?

In other words, do you run this as:

mpirun -n 1 ./master: -n N worker

or

mpirun -N ./master_worker

Either way, I can advise you on a better way to accomplish your goal

On Sep 10, 2009, at 2:58 PM, Jason D. Gans wrote:


Hi,

I have a master/worker bioinformatics application where the master has a
higher memory overhead than the workers. I want to restrict the master
node to a single slot (to prevent the master node from getting
oversubscribed and having workers compete for precious RAM), while all
other non-master nodes can be oversubscribed (infinite max_slot).

Regards,

Jason


I guess I'm puzzled, then. First, hostfile and Torque work fine
together in the 1.3 series - it was the 1.2 series that had the problem.

Second, the default max_slot setting is taken from the slots allocated
to you by Torque. I don't see the purpose in changing them - you can
always oversubscribe the node anyway.

Perhaps you could explain more about what you are trying to do? You
may find that there is a much simpler solution already in place.


On Sep 10, 2009, at 2:07 PM, Jason D. Gans wrote:


What OMPI version are you talking about?



version 1.3.1



On Sep 10, 2009, at 1:40 PM, Jason D. Gans wrote:


Hello,

I would like to use a custom hostfile (that changes the default max_slot
values for certain nodes). My understanding of the FAQ is that this is
*not* possible with Torque. Therefore, is it possible to disable use of
Torque at runtime (via an argument to mpirun), or do I need to recompile
to remove Torque support altogether?

Regards,

Jason Gans

Re: [OMPI users] OpenMPI on OS X - file is not of required architecture

2009-09-11 Thread Jeff Squyres

On Sep 11, 2009, at 10:05 AM, Andreas Haselbacher wrote:

I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and  
the Intel 10.1.006 Fortran compiler and gcc 4.0.  As far as I can  
tell, the configure and make commands completed fine. There are some  
warnings, but it's not clear to me that they are critical - or the  
explanation for what's not working. After installing, I try to  
compile a simple F77 hello world code. The output is:


% mpif77 helloworld_mpi.f -o helloworld_mpi
ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required architecture


This means that it skipped that library because it didn't match what  
you were trying to compile against.


Can you send the output of mpif77 --showme?


Undefined symbols:
  "_mpi_init_", referenced from:
  _MAIN__ in ifortIsUNoZ.o


None of these symbols were found because libmpi_f77.a was skipped.


Here's my configure command:

./configure --prefix=/opt/openmpi --enable-static --disable-shared
CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort
FFLAGS=-assume nounderscore FCFLAGS=-assume nounderscore


I do not have the intel compilers for Mac; do they default to  
producing 64 bit objects?  I ask because it looks like you forced the  
C and C++ compilers to produce 64 bit objects -- do you need to do the  
same with ifort?  (via the FCFLAGS and FFLAGS env variables)


Also, did you quote the "-assume nounderscore" arguments to
FFLAGS/FCFLAGS?  I.e., something like this:


"FFLAGS=-assume nounderscore"

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] OpenMPI on OS X - file is not of required architecture

2009-09-11 Thread Andreas Haselbacher
On Fri, Sep 11, 2009 at 5:10 PM, Jeff Squyres  wrote:

> On Sep 11, 2009, at 10:05 AM, Andreas Haselbacher wrote:
>
>  I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and the
>> Intel 10.1.006 Fortran compiler and gcc 4.0.  As far as I can tell, the
>> configure and make commands completed fine. There are some warnings, but
>> it's not clear to me that they are critical - or the explanation for what's
>> not working. After installing, I try to compile a simple F77 hello world
>> code. The output is:
>>
>> % mpif77 helloworld_mpi.f -o helloworld_mpi
>> ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required
>> architecture
>>
>
> This means that it skipped that library because it didn't match what you
> were trying to compile against.
>
> Can you send the output of mpif77 --showme?
>

ifort -I/opt/openmpi/include -L/opt/openmpi/lib -lmpi_f77 -lmpi -lopen-rte
-lopen-pal -lutil


>
>  Undefined symbols:
>>  "_mpi_init_", referenced from:
>>  _MAIN__ in ifortIsUNoZ.o
>>
>
> None of these symbols were found because libmpi_f77.a was skipped.
>

Right.


>
>  Here's my configure command:
>>
>> ./configure --prefix=/opt/openmpi --enable-static --disable-shared CC=gcc
>> CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort FFLAGS=-assume
>> nounderscore FCFLAGS=-assume nounderscore
>>
>
> I do not have the intel compilers for Mac; do they default to producing 64
> bit objects?  I ask because it looks like you forced the C and C++ compilers
> to produce 64 bit objects -- do you need to do the same with ifort?  (via
> the FCFLAGS and FFLAGS env variables)
>

If I remember correctly, I had to add those flags, otherwise configure
claimed that the compilers were not compatible. I can rerun configure if you
suspect that this is an issue.  I did not add these flags to the Fortran
variables because configure did not complain further, but I can see that
this might be an issue.


>
> Also, did you quote the "-assume nounderscore" arguments to FFLAGS/FCFLAGS?
>  I.e., something like this:
>
>"FFLAGS=-assume nounderscore"
>
>
Yes, I did.

Andreas


> --
> Jeff Squyres
> jsquy...@cisco.com
>
>


Re: [OMPI users] OpenMPI on OS X - file is not of required architecture

2009-09-11 Thread Doug Reeder

Andreas,

Have you checked that ifort is creating 64 bit objects? If I remember
correctly, with 10.1 the default was to create 32 bit objects.
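
One way to check is to look at the magic number at the start of an object file. The sketch below is an illustration only, not something used in this thread: it classifies a thin Mach-O object as 32 or 64 bit from its first four bytes, using the magic values from Apple's <mach-o/loader.h>. The function names are made up, and since libmpi_f77.a is an archive, the check would have to be applied to an object extracted from it. In practice the `file` or `lipo -info` tools report the same information.

```c
#include <stdint.h>
#include <stdio.h>

/* Mach-O magic numbers as defined in Apple's <mach-o/loader.h>. */
#define MACHO_MAGIC_32 0xfeedfaceu /* MH_MAGIC */
#define MACHO_MAGIC_64 0xfeedfacfu /* MH_MAGIC_64 */
#define MACHO_CIGAM_32 0xcefaedfeu /* MH_CIGAM, byte-swapped 32 bit */
#define MACHO_CIGAM_64 0xcffaedfeu /* MH_CIGAM_64, byte-swapped 64 bit */

/* Return 32 or 64 for a thin Mach-O magic number, 0 for anything else. */
int macho_bits(uint32_t magic)
{
    switch (magic) {
    case MACHO_MAGIC_32:
    case MACHO_CIGAM_32:
        return 32;
    case MACHO_MAGIC_64:
    case MACHO_CIGAM_64:
        return 64;
    default:
        return 0;
    }
}

/* Hypothetical helper: classify a file by its first four bytes.
 * Returns 32, 64, 0 (not a thin Mach-O object), or -1 if unreadable. */
int macho_bits_of_file(const char *path)
{
    uint32_t magic = 0;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(&magic, sizeof magic, 1, f);
    fclose(f);
    return n == 1 ? macho_bits(magic) : -1;
}
```

If the Fortran objects report 32 while the C objects built with -m64 report 64, that mismatch would be consistent with the "file is not of required architecture" warning.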


Doug Reeder
On Sep 11, 2009, at 3:25 PM, Andreas Haselbacher wrote:

On Fri, Sep 11, 2009 at 5:10 PM, Jeff Squyres   
wrote:

On Sep 11, 2009, at 10:05 AM, Andreas Haselbacher wrote:

I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and  
the Intel 10.1.006 Fortran compiler and gcc 4.0.  As far as I can  
tell, the configure and make commands completed fine. There are some  
warnings, but it's not clear to me that they are critical - or the  
explanation for what's not working. After installing, I try to  
compile a simple F77 hello world code. The output is:


% mpif77 helloworld_mpi.f -o helloworld_mpi
ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required architecture


This means that it skipped that library because it didn't match what  
you were trying to compile against.


Can you send the output of mpif77 --showme?


ifort -I/opt/openmpi/include -L/opt/openmpi/lib -lmpi_f77 -lmpi
-lopen-rte -lopen-pal -lutil



Undefined symbols:
 "_mpi_init_", referenced from:
 _MAIN__ in ifortIsUNoZ.o

None of these symbols were found because libmpi_f77.a was skipped.


Right.


Here's my configure command:

./configure --prefix=/opt/openmpi --enable-static --disable-shared
CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort
FFLAGS=-assume nounderscore FCFLAGS=-assume nounderscore


I do not have the intel compilers for Mac; do they default to  
producing 64 bit objects?  I ask because it looks like you forced  
the C and C++ compilers to produce 64 bit objects -- do you need to  
do the same with ifort?  (via the FCFLAGS and FFLAGS env variables)


If I remember correctly, I had to add those flags, otherwise  
configure claimed that the compilers were not compatible. I can  
rerun configure if you suspect that this is an issue.  I did not add  
these flags to the Fortran variables because configure did not  
complain further, but I can see that this might be an issue.



Also, did you quote the "-assume nounderscore" arguments to
FFLAGS/FCFLAGS?  I.e., something like this:


   "FFLAGS=-assume nounderscore"


Yes, I did.

Andreas

--
Jeff Squyres
jsquy...@cisco.com

