[OMPI users] Timers
Hi all,

I want to get the elapsed time from start to end of my parallel program (Open MPI based). It should always give the same time for the same problem, irrespective of whether the nodes are running other programs or only that program. How can I do this?

Regards.
Re: [OMPI users] undefined symbol error when built as a shared library
On Sep 10, 2009, at 9:42 PM, Ashika Umanga Umagiliya wrote:

> That fixed the problem! You are indeed a voodoo master... could you
> explain the spell behind your magic :)

The problem has to do with how plugins (aka dynamic shared objects, DSOs) are loaded. When a DSO is loaded into a Linux process, it has the option of making all the public symbols in that DSO public to the rest of the process or private within its own scope.

Let's back up. Remember that Open MPI is based on plugins (DSOs). It loads lots and lots of plugins during execution (mostly during MPI_INIT). These plugins call functions in OMPI's public libraries (e.g., they call functions in libmpi.so). Hence, when the plugin DSOs are loaded, they need to be able to resolve these symbols into actual code that can be invoked. If the symbols cannot be resolved, the DSO load fails.

If libParallel.so is loaded into a private scope, then its linked libraries (e.g., libmpi.so) are also loaded into that same private scope. Hence, all of libmpi.so's public symbols are only public within that single, private scope. Then, when OMPI goes to load its own DSOs, since libmpi.so's public symbols are in a private scope, OMPI's DSOs can't find them -- and therefore they refuse to load. (Private scopes are not inherited -- a new DSO load cannot "see" libParallel.so/libmpi.so's private scope.)

It's an educated guess from your description that this is what was happening.

OMPI's --disable-dlopen configure option has Open MPI build in a different way. Instead of building all of OMPI's plugins as DSOs, they are "slurped" up into libmpi.so (etc.). So there's no "loading" of DSOs at MPI_INIT time -- the plugin code actually resides *in* libmpi.so itself. Hence, resolution of all symbols is done when libParallel.so loads libmpi.so. Additionally, there's no secondary private scope created when DSOs are loaded -- they're all self-contained within libmpi.so (etc.). And therefore all the libmpi.so symbols that are required for the plugins are able to be found/resolved at load time.

Does that make sense?

> Regards,
> umanga
>
> Jeff Squyres wrote:
>> I'm guessing that this has to do with deep, dark voodoo involved with
>> the run time linker.
>>
>> Can you try configuring/building Open MPI with the --disable-dlopen
>> configure option, and rebuilding your libParallel.so against the new
>> libmpi.so?
>>
>> See if that fixes the problem for you. If it does, I can explain in
>> more detail (if you care).
>>
>> On Sep 10, 2009, at 3:24 AM, Ashika Umanga Umagiliya wrote:
>>
>>> Greetings all,
>>>
>>> My parallel application is built as a shared library (libParallel.so).
>>> (I use Debian Lenny 64bit.)
>>> A webservice is used to dynamically load libParallel.so and in turn
>>> execute the parallel process.
>>>
>>> But during runtime I get the error:
>>>
>>> webservicestub: symbol lookup error:
>>> /usr/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol:
>>> mca_base_param_reg_int
>>>
>>> which I cannot figure out. I followed every 'ldd' and 'nm' and it seems
>>> everything is fine.
>>> So I compiled and tested my parallel code as an executable and then it
>>> worked fine.
>>>
>>> What could be the reason for this?
>>>
>>> Thanks in advance,
>>> umanga
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
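The public/private scoping described in this message corresponds to the RTLD_GLOBAL and RTLD_LOCAL flags of dlopen(). As a minimal sketch, assuming the host process (the webservice in this thread) can be modified, one alternative to rebuilding with --disable-dlopen is to open libmpi.so with RTLD_GLOBAL before loading the library that depends on it, so that plugins loaded later can resolve its symbols. The function names and library paths below are hypothetical; only dlopen(), dlerror() and the RTLD_* flags are real API.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Open a shared library so that its public symbols join the process-wide
 * (global) scope and stay visible to DSOs that are loaded afterwards.
 * Returns the dlopen() handle, or NULL after reporting dlerror(). */
static void *open_in_global_scope(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
    }
    return handle;
}

/* Hypothetical host-side loader: make libmpi.so global first, then load
 * the application library (libParallel.so in the thread) privately. */
static void *load_parallel_plugin(const char *mpi_lib, const char *plugin)
{
    assert(mpi_lib != NULL && plugin != NULL);

    if (open_in_global_scope(mpi_lib) == NULL) {
        return NULL;               /* MPI library itself could not be loaded */
    }
    /* The plugin may stay private; libmpi.so's symbols are already global. */
    return dlopen(plugin, RTLD_NOW | RTLD_LOCAL);
}
```

A host would call something like load_parallel_plugin("libmpi.so", "libParallel.so"), linking with -ldl where the C library requires it. Whether this resolves the specific error in this thread has not been verified here.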
Re: [OMPI users] Timers
Hi

I'm not sure if I completely understand your requirements, but have you tried MPI_Wtime?

Jody

On Fri, Sep 11, 2009 at 7:54 AM, amjad ali wrote:
> Hi all,
> I want to get the elapsed time from start to end of my parallel program
> (OPENMPI based). It should give same time for the same problem always;
> irrespective of whether the nodes are running some other programs or they
> are running only that program. How to do this?
>
> Regards.
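MPI_Wtime() returns wall-clock time in seconds, so the difference between two calls grows when other programs compete for the node. CPU time is less sensitive to such load, although it is still not perfectly reproducible. The following is a minimal sketch that uses the standard C clock() function rather than any MPI call; busy_work() and time_busy_work() are hypothetical stand-ins for the real computation.

```c
#include <assert.h>
#include <time.h>

/* CPU seconds consumed so far by the calling process (standard C clock()). */
static double cpu_seconds(void)
{
    clock_t ticks = clock();
    assert(ticks != (clock_t)-1);   /* clock() reports failure as (clock_t)-1 */
    return (double)ticks / CLOCKS_PER_SEC;
}

/* Hypothetical workload standing in for the real computation:
 * sums 1/1 + 1/2 + ... + 1/iterations. */
static double busy_work(long iterations)
{
    volatile double sum = 0.0;      /* volatile keeps the loop from being removed */
    for (long i = 1; i <= iterations; ++i) {
        sum += 1.0 / (double)i;
    }
    return sum;
}

/* CPU time, in seconds, spent inside busy_work(iterations). */
static double time_busy_work(long iterations)
{
    double start = cpu_seconds();
    (void)busy_work(iterations);
    return cpu_seconds() - start;
}
```

In an MPI program the same bracketing pattern applies to MPI_Wtime(): take one reading after MPI_Init, one before MPI_Finalize, and subtract. Note that CPU time also counts cycles spent polling inside MPI calls, so it is an approximation of "time independent of other load", not a guarantee.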
[OMPI users] Bad MPI_Bcast behaviour when running over openib
Hi!

The following code shows bad behaviour when running over openib.

Openmpi: 1.3.3
With openib it dies with "error polling HP CQ with status WORK REQUEST FLUSHED ERROR status number 5", with tcp or shmem it works as expected.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;
    int n;

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* time_t is not necessarily an int, so cast it before printing */
    fprintf(stderr, "I am %d at %ld\n", rank, (long)time(NULL));
    fflush(stderr);

    n = 4;
    /* MPI_INT matches the C type int; MPI_INTEGER is the Fortran INTEGER type */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    fprintf(stderr, "I am %d at %ld\n", rank, (long)time(NULL));
    fflush(stderr);
    if (rank == 0) {
        /* the root stays outside MPI while the other ranks wait in the barrier */
        sleep(60);
    }
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Finalize();
    exit(0);
}

I know about the internal Open MPI reason for it to behave as it does. But I think that it should be allowed to behave as it does.

This example is a bit engineered but there are codes where a similar situation can occur, i.e. the Bcast sender doing lots of other work after the Bcast before the next MPI call. VASP is a candidate for this.

--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
Re: [OMPI users] undefined symbol error when built as a shared library
On 11.09.2009 at 12:14, Jeff Squyres wrote:

> It's an educated guess from your description that this is what was
> happening. OMPI's --disable-dlopen configure option has Open MPI
> build in a different way.

Aha - this might also explain what I faced some time ago. I tried to compile an application called Molpro with GlobalArrays which I compiled with Open MPI. I faced similar errors - the compilation worked without any problem, but I couldn't run the application, as it resulted in a similar error. Finally I gave up and stayed with mpich (1) for this.

I will try to build it with this switch in the next few days - maybe it will also solve this issue.

-- Reuti
Re: [OMPI users] undefined symbol error when built as a shared library
On Sep 11, 2009, at 7:26 AM, Reuti wrote:

>> OMPI's --disable-dlopen configure option has Open MPI build in a
>> different way.
>
> Aha - this might also explain what I faced some time ago. I tried to
> compile an application called Molpro with GlobalArrays which I
> compiled with Open MPI. I faced similar errors - the compilation
> worked without any problem, but I couldn't run the application, as it
> resulted in a similar error. Finally I gave up and stayed with mpich
> (1) for this.

IMHO (and knowing very little about how linkers actually work), the problem is with linker namespaces. If they could be inherited (e.g., a *tree* of scopes could be private), then things might work.

It would probably be interesting to sit down with a run-time linker developer sometime and ask about this (I know that linkers are fantastically complicated; there might be Good reasons why such a scheme doesn't already exist).

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib
Hi, how exactly do you run this to get this error? I tried and it worked for me.

burl-ct-x2200-16 50 =>mpirun -mca btl_openib_warn_default_gid_prefix 0 -mca btl self,sm,openib -np 2 -host burl-ct-x2200-16,burl-ct-x2200-17 -mca btl_openib_ib_timeout 16 a.out
I am 0 at 1252670691
I am 1 at 1252670559
I am 0 at 1252670692
I am 1 at 1252670559
burl-ct-x2200-16 51 =>

Rolf

On 09/11/09 07:18, Ake Sandgren wrote:
> Hi!
>
> The following code shows a bad behaviour when running over openib.
>
> Openmpi: 1.3.3
> With openib it dies with "error polling HP CQ with status WORK REQUEST
> FLUSHED ERROR status number 5 ", with tcp or shmem it works as expected.

--
=
rolf.vandeva...@sun.com
781-442-3043
=
Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib
Cisco is no longer an IB vendor, but I seem to recall that these kinds of errors typically indicated a fabric problem. Have you run layer 0 and 1 diagnostics to ensure that the fabric is clean?

On Sep 11, 2009, at 8:09 AM, Rolf Vandevaart wrote:

> Hi, how exactly do you run this to get this error? I tried and it
> worked for me.

--
Jeff Squyres
jsquy...@cisco.com
[OMPI users] application hangs when checkpointing application
Hi Everyone,

I wrote a small program with a function to trigger the checkpointing mechanism as follows:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

void trigger_checkpoint(void);

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    trigger_checkpoint();
    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    printf("I am processor no %d of a total of %d procs \n", rank, size);
    system("sleep 10");
    printf("bye \n");

    MPI_Finalize();
    return 0;
}

/* Ask ompi-checkpoint to checkpoint the job owned by the running mpirun. */
void trigger_checkpoint(void)
{
    printf("hi\n");
    system("ompi-checkpoint -v `pidof mpirun` ");
}

The application works fine on my laptop with Ubuntu as the OS. However, when I tried running it on one of the machines at my uni, with SUSE Linux installed, the application hangs as soon as ompi-checkpoint is triggered. This is what I get:

I am processor no 0 of a total of 1 procs
hi
I am processor no 0 of a total of 1 procs
[sun06:15426] orte_checkpoint: Checkpointing...
[sun06:15426] PID 15411
[sun06:15426] Connected to Mpirun [[12727,0],0]
[sun06:15426] orte_checkpoint: notify_hnp: Contact Head Node Process PID 15411

Does anyone have any ideas about this?

Thanks a lot

Jean.
Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib
On Fri, 2009-09-11 at 13:18 +0200, Ake Sandgren wrote:
> Hi!
>
> The following code shows a bad behaviour when running over openib.

Oops. Red Face big time.
I happened to run the IB test between two systems that don't have IB connectivity.

Goes and hides in a dark corner...

--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
[OMPI users] Application hangs when checkpointing application (update)
Hi Everyone,

I noticed that it hangs just before displaying the following while trying to checkpoint the application.

[sun06:15252] orte_checkpoint: notify_hnp: Requested a checkpoint of jobid [INVALID]

Can it be related to the above?

Thanks
[OMPI users] OpenMPI on OS X - file is not of required architecture
I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and the Intel 10.1.006 Fortran compiler and gcc 4.0. As far as I can tell, the configure and make commands completed fine. There are some warnings, but it's not clear to me that they are critical - or the explanation for what's not working. After installing, I try to compile a simple F77 hello world code. The output is:

% mpif77 helloworld_mpi.f -o helloworld_mpi
ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required architecture
Undefined symbols:
  "_mpi_init_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_comm_size_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_finalize_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
  "_mpi_comm_rank_", referenced from:
      _MAIN__ in ifortIsUNoZ.o
ld: symbol(s) not found

I don't know what the warning about the "required architecture" means and cannot find any relevant info in the archives or with google. I'd appreciate any help. More info is below, including the config.log file as an attachment.

Here's my configure command:

./configure --prefix=/opt/openmpi --enable-static --disable-shared CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort FFLAGS=-assume nounderscore FCFLAGS=-assume nounderscore

The output of the ompi_info --all command is also attached.

Thanks,
Andreas

config.log.gz
Description: GNU Zip compressed data

open_info_all.out.gz
Description: GNU Zip compressed data
Re: [OMPI users] Disable use of Torque at run-time
Hi Ralph,

Thank you for your help. This is exactly what I wanted!

Regards,
Jason

Ralph Castain wrote:

Hmmm...well, here is one way to do it:

mpirun -n 1 -host n0 ./master_worker : -n N-1 -host +e ./master_worker

What this will do is put rank 0 on the first node in your allocation, and then all the remaining ranks on the remaining nodes in the allocation. All the ranks will be in the same comm_world.

Check out "man orte_hosts" for a detailed explanation (with examples) of this "relative node indexing" syntax.

HTH
Ralph

On Sep 10, 2009, at 3:57 PM, jgans wrote:

A single app:

mpirun -N ./master_worker

Regards,
Jason

Ralph Castain wrote:

Is the master a different app, or is the same app used? In other words, do you run this as:

mpirun -n 1 ./master: -n N worker

or

mpirun -N ./master_worker

Either way, I can advise you on a better way to accomplish your goal

On Sep 10, 2009, at 2:58 PM, Jason D. Gans wrote:

Hi,

I have a master/worker bioinformatics application where the master has a higher memory overhead than the workers. I want to restrict the master node to a single slot (to prevent the master node from getting oversubscribed and having workers compete for precious RAM), while all other non-master nodes can be oversubscribed (infinite max_slot).

Regards,
Jason

I guess I'm puzzled, then. First, hostfile and Torque work fine together in the 1.3 series - it was the 1.2 series that had the problem. Second, the default max_slot setting is taken from the slots allocated to you by Torque. I don't see the purpose in changing them - you can always oversubscribe the node anyway.

Perhaps you could explain more about what you are trying to do? You may find that there is a much simpler solution already in place.

On Sep 10, 2009, at 2:07 PM, Jason D. Gans wrote:

What OMPI version are you talking about?

version 1.3.1

On Sep 10, 2009, at 1:40 PM, Jason D. Gans wrote:

Hello,

I would like to use a custom hostfile (that changes the default max_slot values for certain nodes). My understanding of the FAQ is that this is *not* possible with Torque. Therefore, is it possible to disable use of Torque at runtime (via an argument to mpirun), or do I need to recompile to remove Torque support altogether?

Regards,
Jason Gans
Re: [OMPI users] OpenMPI on OS X - file is not of required architecture
On Sep 11, 2009, at 10:05 AM, Andreas Haselbacher wrote:

> I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and the
> Intel 10.1.006 Fortran compiler and gcc 4.0. As far as I can tell, the
> configure and make commands completed fine. There are some warnings,
> but it's not clear to me that they are critical - or the explanation
> for what's not working. After installing, I try to compile a simple
> F77 hello world code. The output is:
>
> % mpif77 helloworld_mpi.f -o helloworld_mpi
> ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required
> architecture

This means that it skipped that library because it didn't match what you were trying to compile against.

Can you send the output of mpif77 --showme?

> Undefined symbols:
> "_mpi_init_", referenced from:
> _MAIN__ in ifortIsUNoZ.o

None of these symbols were found because libmpi_f77.a was skipped.

> Here's my configure command:
>
> ./configure --prefix=/opt/openmpi --enable-static --disable-shared
> CC=gcc CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort
> FFLAGS=-assume nounderscore FCFLAGS=-assume nounderscore

I do not have the Intel compilers for Mac; do they default to producing 64 bit objects? I ask because it looks like you forced the C and C++ compilers to produce 64 bit objects -- do you need to do the same with ifort? (via the FCFLAGS and FFLAGS env variables)

Also, did you quote the "-assume nounderscore" arguments to FFLAGS/FCFLAGS? I.e., something like this:

"FFLAGS=-assume nounderscore"

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] OpenMPI on OS X - file is not of required architecture
On Fri, Sep 11, 2009 at 5:10 PM, Jeff Squyres wrote:

> On Sep 11, 2009, at 10:05 AM, Andreas Haselbacher wrote:
>
>> I've built openmpi version 1.3.3 on a MacPro with OS X 10.5.8 and the
>> Intel 10.1.006 Fortran compiler and gcc 4.0. As far as I can tell, the
>> configure and make commands completed fine. There are some warnings, but
>> it's not clear to me that they are critical - or the explanation for what's
>> not working. After installing, I try to compile a simple F77 hello world
>> code. The output is:
>>
>> % mpif77 helloworld_mpi.f -o helloworld_mpi
>> ld: warning in /opt/openmpi/lib/libmpi_f77.a, file is not of required
>> architecture
>
> This means that it skipped that library because it didn't match what you
> were trying to compile against.
>
> Can you send the output of mpif77 --showme?

ifort -I/opt/openmpi/include -L/opt/openmpi/lib -lmpi_f77 -lmpi -lopen-rte -lopen-pal -lutil

>> Undefined symbols:
>> "_mpi_init_", referenced from:
>> _MAIN__ in ifortIsUNoZ.o
>
> None of these symbols were found because libmpi_f77.a was skipped.

Right.

>> Here's my configure command:
>>
>> ./configure --prefix=/opt/openmpi --enable-static --disable-shared CC=gcc
>> CFLAGS=-m64 CXX=g++ CXXFLAGS=-m64 F77=ifort FC=ifort FFLAGS=-assume
>> nounderscore FCFLAGS=-assume nounderscore
>
> I do not have the intel compilers for Mac; do they default to producing 64
> bit objects? I ask because it looks like you forced the C and C++ compilers
> to produce 64 bit objects -- do you need to do the same with ifort? (via
> the FCFLAGS and FFLAGS env variables)

If I remember correctly, I had to add those flags, otherwise configure claimed that the compilers were not compatible. I can rerun configure if you suspect that this is an issue. I did not add these flags to the Fortran variables because configure did not complain further, but I can see that this might be an issue.

> Also, did you quote the "-assume nounderscore" arguments to FFLAGS/FCFLAGS?
> I.e., something like this:
>
> "FFLAGS=-assume nounderscore"

Yes, I did.

Andreas

> --
> Jeff Squyres
> jsquy...@cisco.com
Re: [OMPI users] OpenMPI on OS X - file is not of required architecture
Andreas,

Have you checked that ifort is creating 64 bit objects? If I remember correctly, with 10.1 the default was to create 32 bit objects.

Doug Reeder

On Sep 11, 2009, at 3:25 PM, Andreas Haselbacher wrote:

> If I remember correctly, I had to add those flags, otherwise configure
> claimed that the compilers were not compatible. I can rerun configure
> if you suspect that this is an issue. I did not add these flags to the
> Fortran variables because configure did not complain further, but I
> can see that this might be an issue.