[OMPI users] dual cores
Dear Open MPI gurus: I have just installed Open MPI this evening. I have a dual-core laptop and I would like to have both cores running. Here is my my-hosts file:

localhost slots=2

and here is the command and output:

mpirun --hostfile my-hosts -np 4 --byslot hello_c | sort
Hello, world, I am 0 of 4
Hello, world, I am 1 of 4
Hello, world, I am 2 of 4
Hello, world, I am 3 of 4
hodgesse@erinstoy:~/Desktop/openmpi-1.2.8/examples>

How do I know if both cores are running, please?

thanks,
Erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodge...@uhd.edu
Re: [OMPI users] dual cores
Dear Erin, I'm nowhere near a guru, so I hope you don't mind what I have to say (it might be wrong...). But what I did was just put a long loop into the program and, while it was running, I opened another window and looked at the output of "top". Obviously, without the loop, the program would terminate too fast. If you have two CPUs and the total across the processes exceeds 100% (i.e., if you run with np=2, you might see 98% and 98%), then I would think that is enough proof that both cores are being used. I'm saying this on the list hoping that someone can correct my knowledge of it, too...

Ray
Re: [OMPI users] dual cores
This sounds great! Thanks for your help!

Sincerely,
Erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodge...@uhd.edu
Re: [OMPI users] dual cores
Run 'top'. For long-running applications you should see 4 processes, each at 50% (4 * 50 = 200%, i.e., two CPUs).

You are OK; your hello_c did what it should, and each of those 'hello's could have come from either of the two CPUs.

Also, if you're only running on your local machine, you don't need a hostfile, and --byslot is meaningless in this case;

mpirun -np 4 ./hello_c

would work just fine.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
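If you want something longer-running than hello_c to watch in top, a trivial compute loop is enough. The sketch below is only an illustration (the file name and loop bound are arbitrary, not part of the Open MPI examples):

/* busy_hello.c -- a hello_c variant that burns CPU long enough to be
 * visible in top/htop.
 * Build with:  mpicc busy_hello.c -o busy_hello
 * Run with:    mpirun -np 4 ./busy_hello                              */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i;
    volatile double x = 0.0;  /* volatile so the loop is not optimized away */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Spin for a while; with 4 ranks on 2 cores each process should sit
     * near 50% CPU in top, for a total of about 200%.                   */
    for (i = 0; i < 500000000L; ++i)
        x += 1.0 / (double)(i + 1);

    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

With only 2 ranks on a dual-core machine, each process should instead show close to 100%.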
Re: [OMPI users] dual cores
great!

Thanks,
Erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodge...@uhd.edu
Re: [OMPI users] dual cores
you can also press "f" while"top" is running and choose option "j" this way you will see what CPU is chosen under column P Lenny. On Mon, Nov 10, 2008 at 7:38 AM, Hodgess, Erin wrote: > great! > > Thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodge...@uhd.edu > > > > -Original Message- > From: users-boun...@open-mpi.org on behalf of Brock Palen > Sent: Sun 11/9/2008 11:21 PM > To: Open MPI Users > Subject: Re: [OMPI users] dual cores > > Run 'top' For long running applications you should see 4 processes > each at 50% (4*50=200% two cpus). > > You are ok, your hello_c did what it should, each of thoese 'hello's > could have came from any of the two cpus. > > Also if your only running on your local machine, you don't need a > hostfile, and -byslot is meaningless in this case, > > mpirun -np 4 ./hello_c > > Would work just fine. > > Brock Palen > www.umich.edu/~brockp > Center for Advanced Computing > bro...@umich.edu > (734)936-1985 > > > > On Nov 10, 2008, at 12:05 AM, Hodgess, Erin wrote: > > > Dear Open MPI gurus: > > > > I have just installed Open MPI this evening. > > > > I have a dual core laptop and I would like to have both cores running. > > > > Here is the following my-hosts file: > > localhost slots=2 > > > > and here is the command and output: > > mpirun --hostfile my-hosts -np 4 --byslot hello_c |sort > > Hello, world, I am 0 of 4 > > Hello, world, I am 1 of 4 > > Hello, world, I am 2 of 4 > > Hello, world, I am 3 of 4 > > hodgesse@erinstoy:~/Desktop/openmpi-1.2.8/examples> > > > > > > How do I know if both cores are running, please? > > > > thanks, > > Erin > > > > > > Erin M. Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodge...@uhd.edu > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] Open MPI programs with autoconf/automake?
On Mon 2008-11-10 12:35, Raymond Wan wrote:
> One thing I was wondering about was whether it is possible, through the
> use of #define's, to create code that is both multi-processor
> (MPI/mpic++) and single-processor (normal g++). That is, if users do
> not have any MPI installed, it compiles it with g++.
>
> With #define's and compiler flags, I think that can be easily done --
> was wondering if this is something that developers using MPI do and
> whether AC/AM supports it.

The normal way to do this is by building against a serial implementation of MPI. Lots of parallel numerical libraries bundle such an implementation, so you could just grab one of those. For example, see PETSc's mpiuni ($PETSC_DIR/include/mpiuni/mpi.h and $PETSC_DIR/src/sys/mpiuni/mpi.c), which implements many MPI calls as macros. Note that your serial implementation only needs to provide the subset of MPI that your program actually uses. For instance, if you never send messages to yourself, you can implement MPI_Send as MPI_Abort since it should never be called in serial.

Jed
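To make the idea concrete, here is a minimal sketch of such a serial stub. This is not PETSc's mpiuni and not a real MPI header -- just a hypothetical serial_mpi.h covering the handful of calls a simple program might use:

/* serial_mpi.h -- hypothetical single-process stand-in for a small subset
 * of MPI, in the spirit of PETSc's mpiuni.  Only enough to satisfy the
 * compiler and do the trivially correct thing for one process.          */
#ifndef SERIAL_MPI_H
#define SERIAL_MPI_H

#include <stdlib.h>

typedef int MPI_Comm;
typedef int MPI_Datatype;
#define MPI_COMM_WORLD  0
#define MPI_DOUBLE      0
#define MPI_SUCCESS     0

#define MPI_Init(argc, argv)        0
#define MPI_Finalize()              0
#define MPI_Comm_rank(comm, rank)   (*(rank) = 0, 0)
#define MPI_Comm_size(comm, size)   (*(size) = 1, 0)

/* This code base never sends messages to itself, so in serial a send is a
 * bug: fail loudly, as suggested above.                                  */
#define MPI_Send(buf, count, type, dest, tag, comm)  (abort(), 0)

#endif /* SERIAL_MPI_H */

The program then includes the real <mpi.h> or this stub depending on a configure-time switch.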
[OMPI users] Can I build development RPM from openmpi-1.2.8-1.src.rpm?
Hi, I would like to build Open MPI from openmpi-1.2.8-1.src.rpm. I've tried plain rpmbuild and rpmbuild ... --define 'build_all_in_one_rpm 1', but the resulting rpm doesn't contain any *.a libraries. I think this is a problem because I've straced mpif90 and discovered that ld, invoked from gfortran, only looks for libmpi_f90.a in response to the -lmpi_f90 introduced by mpif90.

WBR
Oleg V. Zhylin
o...@yahoo.com
Re: [OMPI users] Open MPI programs with autoconf/automake?
On Nov 10, 2008, at 6:41 AM, Jed Brown wrote:

>> With #define's and compiler flags, I think that can be easily done --
>> was wondering if this is something that developers using MPI do and
>> whether AC/AM supports it.

AC will allow you to #define whatever you want -- look at the documentation for AC_DEFINE and AC_DEFINE_UNQUOTED. You can also tell your configure script to accept various --with- and --enable- arguments; see the docs for AC_ARG_WITH and AC_ARG_ENABLE.

> The normal way to do this is by building against a serial implementation
> of MPI. Lots of parallel numerical libraries bundle such an implementation
> so you could just grab one of those. For example, see PETSc's mpiuni
> ($PETSC_DIR/include/mpiuni/mpi.h and $PETSC_DIR/src/sys/mpiuni/mpi.c)
> which implements many MPI calls as macros. Note that your serial
> implementation only needs to provide the subset of MPI that your program
> actually uses. For instance, if you never send messages to yourself, you
> can implement MPI_Send as MPI_Abort since it should never be called in
> serial.

This is one viable way to do it.

Another way that I have seen is to use #define's (via AC_DEFINE / AC_DEFINE_UNQUOTED) to both define BUILDING_WITH_MPI to 0 or 1 (or some variant) and conditionally use the MPI wrapper compilers (or not) to build your application. This technique is best suited to applications that are highly modular and can easily segregate all your calls to MPI in a single area that can be turned on / off with a #define.

To put this more concretely, you can have this:

./configure --with-mpi

that does two things:

1. Set CC=mpicc (and friends) before calling AC_PROG_CC (and friends). This will set up your app to be compiled with the MPI wrapper compilers, and therefore automatically link in libmpi, etc.

2. #define BUILDING_WITH_MPI to 1, so in your code you can do stuff like:

#if BUILDING_WITH_MPI
    MPI_Send(...);
#endif

If --with-mpi is not specified, the following will happen:

1. You don't set CC (and friends), so AC_PROG_CC will find the default compilers. Hence, your app will not be compiled and linked against the MPI libraries.

2. #define BUILDING_WITH_MPI to 0, so the code above will compile out the call to MPI_Send().

Both of these are valid techniques -- use whichever suits your app the best.

-- Jeff Squyres
Cisco Systems
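For anyone who wants to see roughly what the configure.ac side of this could look like, here is a sketch. The macro names are the real Autoconf ones mentioned above; the --with-mpi wiring, variable names, and defaults are just one plausible arrangement, not taken from any particular project:

# configure.ac fragment (sketch): optional MPI build
AC_ARG_WITH([mpi],
  [AS_HELP_STRING([--with-mpi], [compile and link with the MPI wrapper compilers])],
  [], [with_mpi=no])

AS_IF([test "x$with_mpi" = "xyes"],
  [# Let AC_PROG_CC pick up the wrapper so libmpi etc. are linked in
   CC=mpicc
   AC_DEFINE([BUILDING_WITH_MPI], [1], [Define to 1 to compile in MPI calls])],
  [AC_DEFINE([BUILDING_WITH_MPI], [0], [Define to 1 to compile in MPI calls])])

# Must come after CC has (optionally) been overridden above
AC_PROG_CC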
[OMPI users] /dev/shm
We are running OpenMPI 1.2.7. Now that we have been running for a while, we are getting messages of the sort:

node: Unable to allocate shared memory for intra-node messaging.
node: Delete stale shared memory files in /dev/shm.
MPI process terminated unexpectedly

If the user deletes the stale files, they can run.

-- Ray Muno
University of Minnesota
Aerospace Engineering and Mechanics
Re: [OMPI users] Can I build development RPM from openmpi-1.2.8-1.src.rpm?
On Nov 10, 2008, at 8:27 AM, Oleg V. Zhylin wrote:

> I would like to build Open MPI from openmpi-1.2.8-1.src.rpm. I've tried
> plain rpmbuild and rpmbuild ... --define 'build_all_in_one_rpm 1' but
> the resulting rpm doesn't contain any *.a libraries.

Right -- OMPI builds shared libraries by default.

> I think this is a problem because I've straced mpif90 and discovered
> that ld invoked from gfortran only looks for libmpi_f90.a in response
> to the -lmpi_f90 introduced by mpif90.

Really? That's odd -- our mpif90 simply links against -lmpi_f90, not specifically .a or .so. You can run "mpif90 --showme" to see the command that our wrapper *would* execute. You can also tweak the flags that OMPI passes to the wrapper compilers; see this FAQ entry:

http://www.open-mpi.org/faq/?category=mpi-apps#override-wrappers-after-v1.0

-- Jeff Squyres
Cisco Systems
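For reference, the wrapper introspection mentioned above looks something like this. The flags are standard Open MPI wrapper-compiler options, but the exact output naturally depends on the local installation:

mpif90 --showme            # print the full underlying compile/link command
mpif90 --showme:compile    # just the compile-time flags
mpif90 --showme:link       # just the linker flags (e.g. -L... -lmpi_f90 -lmpi ...)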
Re: [OMPI users] ompi_info hangs
If you're not using OpenFabrics-based networks, try configuring Open MPI --without-memory-manager and see if that fixes your problems.

On Nov 8, 2008, at 5:31 PM, Robert Kubrick wrote:

> George, I have a warning when running under the debugger: 'Lowest section in system-supplied DSO at 0xe000 is .hash at e0b4'. The program hangs in _int_malloc():
>
> (gdb) run
> Starting program: /opt/openmpi-1.2.7/bin/ompi_info
> warning: Lowest section in system-supplied DSO at 0xe000 is .hash at e0b4
> [Thread debugging using libthread_db enabled]
> [New Thread 0xf7b7d6d0 (LWP 16621)]
> 1.2.7
>
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 0xf7b7d6d0 (LWP 16621)]
> 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
> (gdb) where
> #0 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
> #1 0xf7e544e1 in malloc () from /opt/openmpi/lib/libopen-pal.so.0
> #2 0xf7db46c7 in operator new () from /usr/lib/libstdc++.so.6
> #3 0xf7d8e121 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
> #4 0xf7d8ee18 in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
> #5 0xf7d8fac8 in std::string::reserve () from /usr/lib/libstdc++.so.6
> #6 0xf7d8ff6a in std::string::append () from /usr/lib/libstdc++.so.6
> #7 0x08054f30 in ompi_info::out ()
> #8 0x08062a33 in ompi_info::show_ompi_version ()
> #9 0x080533a0 in main ()
>
> On Nov 8, 2008, at 12:33 PM, George Bosilca wrote:
>
>> I think we had a similar problem in the past. It has something to do with the atomics on this architecture. I don't have access to such an architecture. Can you provide us a stack trace when this happens?
>>
>> Thanks,
>> george.
>>
>> On Nov 8, 2008, at 12:14 PM, Robert Kubrick wrote:
>>
>>> I am having problems building OMPI 1.2.7 on an Intel Xeon quad-core 64-bit server. The compilation completes but ompi_info hangs after printing the OMPI version:
>>>
>>> # ompi_info
>>> 1.2.7
>>>
>>> I tried to run a few mpi applications on this same install and they do work fine. What can cause ompi_info to hang?

-- Jeff Squyres
Cisco Systems
Re: [OMPI users] /dev/shm
On most systems /dev/shm is limited to half the physical RAM. Was the user somehow filling up /dev/shm so there was no space?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
Re: [OMPI users] /dev/shm
Brock Palen wrote:
> On most systems /dev/shm is limited to half the physical RAM. Was the
> user somehow filling up /dev/shm so there was no space?

The problem is there is a large collection of stale files left in there by the users that have run on that node (Rocks-based cluster). I am trying to determine why they are left behind.

-- Ray Muno
University of Minnesota
Aerospace Engineering and Mechanics
Re: [OMPI users] dual cores
There's also a great project at SourceForge called "htop" that is a "better" version of top. It includes the ability to query for and set processor affinity for arbitrary processes, colorized output, tree-based output (showing process hierarchies), etc. It's pretty nice (IMHO):

http://htop.sf.net/

-- Jeff Squyres
Cisco Systems
Re: [OMPI users] Can I build development RPM from openmpi-1.2.8-1.src.rpm?
> Right -- OMPI builds shared libraries by default.

What is the proper way to build static libraries from the RPM? Or is the tarball the only option to accomplish this?

> Really? That's odd -- our mpif90 simply links against -lmpi_f90, not
> specifically .a or .so. You can run "mpif90 --showme" to see the command
> that our wrapper *would* execute. You can also tweak the flags that OMPI
> passes to the wrapper compilers; see this FAQ entry:

Well, I suppose removing -lmpi_f90 and the other MPI libs from the command line would defeat the purpose of building an MPI executable. Moreover, the ld manual page says that on platforms that support shared libraries it looks for .so first and .a after that. But I've tried Fedora Core 6 and 9, and both give the same result; on both, strace shows that ld doesn't attempt to look for libmpi_f90.so at all.

Does anyone have experience building MPI on Fedora? Are there any additional steps required other than yum install openmpi*?

WBR
Oleg V. Zhylin
o...@yahoo.com
Re: [OMPI users] Can I build development RPM from openmpi-1.2.8-1.src.rpm?
On Nov 10, 2008, at 2:18 PM, Oleg V. Zhylin wrote:

>> Right -- OMPI builds shared libraries by default.
>
> What is the proper way to build static libraries from the RPM? Or is
> the tarball the only option to accomplish this?

You can pass any options to OMPI's configure script through the rpmbuild interface, such as:

rpmbuild \
  --define 'configure_options CFLAGS=-g --with-openib=/usr/local/ofed --disable-shared --enable-static' ...

But be aware that static linking is not for the weak, especially if you're using high-speed networks. Check out both of these:

http://www.open-mpi.org/faq/?category=mpi-apps#static-mpi-apps
http://www.open-mpi.org/faq/?category=mpi-apps#static-ofa-mpi-apps

> Well, I suppose removing -lmpi_f90 and the other MPI libs from the
> command line would defeat the purpose of building an MPI executable.
> Moreover, the ld manual page says that on platforms that support shared
> libraries it looks for .so first and .a after that.

That's pretty standard behavior that has been around forever.

> But I've tried Fedora Core 6 and 9 and both give the same result, and on
> both strace shows that ld doesn't attempt to look for libmpi_f90.so at
> all.

Are you saying that you have libmpi_f90.so available and when you try to run, you get missing symbol errors? Or are you failing to compile/link at all?

> Does anyone have experience building MPI on Fedora?

FWIW: building on Fedora should be little different than building on other Linux systems.

> Are there any additional steps required other than yum install openmpi*?

I always build via source (but I'm a developer, so my bias is a little different ;-) ). I'm unfamiliar with Fedora's yum repositories...

-- Jeff Squyres
Cisco Systems
Re: [OMPI users] /dev/shm
That is odd. Is your user's app crashing or being forcibly killed? The ORTE daemon that is silently launched in v1.2 jobs should ensure that files under /tmp/openmpi-sessions-@ are removed.

-- Jeff Squyres
Cisco Systems
Re: [OMPI users] dual cores
I got "htop" and it's wonderful. Thanks for the suggestion. Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodge...@uhd.edu -Original Message- From: users-boun...@open-mpi.org on behalf of Jeff Squyres Sent: Mon 11/10/2008 1:14 PM To: Open MPI Users Subject: Re: [OMPI users] dual cores There's also a great project at SourceForge called "htop" that is a "better" version of top. It includes the ability to query for and set processor affinity for abitrary processes, colorized output, tree- based output (showing process hierarchies), etc. It's pretty nice (IMHO): http://htop.sf.net/ On Nov 10, 2008, at 3:03 AM, Lenny Verkhovsky wrote: > you can also press "f" while"top" is running and choose option "j" > this way you will see what CPU is chosen under column P > Lenny. > > On Mon, Nov 10, 2008 at 7:38 AM, Hodgess, Erin > wrote: > great! > > Thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodge...@uhd.edu > > > > -Original Message- > From: users-boun...@open-mpi.org on behalf of Brock Palen > Sent: Sun 11/9/2008 11:21 PM > To: Open MPI Users > Subject: Re: [OMPI users] dual cores > > Run 'top' For long running applications you should see 4 processes > each at 50% (4*50=200% two cpus). > > You are ok, your hello_c did what it should, each of thoese 'hello's > could have came from any of the two cpus. > > Also if your only running on your local machine, you don't need a > hostfile, and -byslot is meaningless in this case, > > mpirun -np 4 ./hello_c > > Would work just fine. > > Brock Palen > www.umich.edu/~brockp > Center for Advanced Computing > bro...@umich.edu > (734)936-1985 > > > > On Nov 10, 2008, at 12:05 AM, Hodgess, Erin wrote: > > > Dear Open MPI gurus: > > > > I have just installed Open MPI this evening. > > > > I have a dual core laptop and I would like to have both cores > running. > > > > Here is the following my-hosts file: > > localhost slots=2 > > > > and here is the command and output: > > mpirun --hostfile my-hosts -np 4 --byslot hello_c |sort > > Hello, world, I am 0 of 4 > > Hello, world, I am 1 of 4 > > Hello, world, I am 2 of 4 > > Hello, world, I am 3 of 4 > > hodgesse@erinstoy:~/Desktop/openmpi-1.2.8/examples> > > > > > > How do I know if both cores are running, please? > > > > thanks, > > Erin > > > > > > Erin M. Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodge...@uhd.edu > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres Cisco Systems ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users <>
Re: [OMPI users] /dev/shm
Yeah, if that gets full it is not going to work. We use /dev/shm for some FEA apps that have bad I/O patterns; I tend to keep it to just the most educated users. It just impacts others too much if not treated with respect.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
Re: [OMPI users] /dev/shm
Jeff Squyres wrote: That is odd. Is your user's app crashing or being forcibly killed? The ORTE daemon that is silently launched in v1.2 jobs should ensure that files under /tmp/openmpi-sessions-@ are removed. It looks like I see orphaned directories under /tmp/openmpi* as well. -- Ray Muno
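If the stale files do keep accumulating, one common workaround (not an Open MPI feature, just a sketch of a node-cleanup script you might run from cron or a batch-system epilogue) is to remove session files whose owners no longer have any processes on the node. The paths below are the ones mentioned in this thread; adjust them to whatever actually piles up on your nodes:

#!/bin/sh
# Sketch: clean up Open MPI session / shared-memory leftovers on a compute
# node.  Only removes files whose owner has no processes running here.
for f in /dev/shm/* /tmp/openmpi-sessions-*; do
    [ -e "$f" ] || continue
    owner=$(stat -c %U "$f")
    # Skip anything owned by a user who still has processes on this node.
    if ! pgrep -u "$owner" > /dev/null 2>&1; then
        rm -rf "$f"
    fi
done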
Re: [OMPI users] dual cores
Hi Erin,

> I have a dual core laptop and I would like to have both cores running.
>
> Here is the following my-hosts file:
> localhost slots=2

Be warned that, at least in the default config, running more MPI processes than you have cores results in dog-slow code. Single-core machine:

$ cat my-hosts
localhost slots=1
$ mpirun -np 1 -hostfile my-hosts ./sort
selectionsort 1024 1024 0.009905000 seconds
$ mpirun -np 2 -hostfile my-hosts ./sort
selectionsort 1024 1024 4.113605000 seconds

(On a dual core, both -np 1 and -np 2 run almost equally fast -- only a slight speedup, due to the poor algorithm, which was developed for demonstration purposes.)

Best regards
Fabian
Re: [OMPI users] Can I build development RPM from openmpi-1.2.8-1.src.rpm?
My goal is to run some software that uses MPI, so for now I want the most standard setup.

> Are you saying that you have libmpi_f90.so available and when you try to
> run, you get missing symbol errors? Or are you failing to compile/link at
> all?

The linking stage fails. When I use mpif90 to produce the actual executable, ld reports an error that it can't find -lmpi_f90. The libmpi_f90.so is in /usr/libs but, again, as I've discovered, ld doesn't even try to look for it. Maybe this is an ld problem, or ld in conjunction with gfortran...

> I always build via source (but I'm a developer, so my bias is a little
> different ;-) ). I'm unfamiliar with Fedora's yum repositories...

The yum repositories for FC9 provide rpms for openmpi 1.2.4-2, but a straightforward installation resulted in the same problem with -lmpi_f90. As I've said before, I've tried this on two machines and their configurations are not exotic ones. I suppose the predicament is something obvious, so I hope to hear from people with openmpi experience under Fedora.

WBR
Oleg V. Zhylin
o...@yahoo.com
Re: [OMPI users] ompi_info hangs
I rebuilt without the memory manager; now ompi_info crashes with this output:

./configure --prefix=/usr/local/openmpi --disable-mpi-f90 --disable-mpi-f77 --without-memory-manager

localhost:~/openmpi> ompi_info
Open MPI: 1.2.8
Open MPI SVN revision: r19718
Open RTE: 1.2.8
Open RTE SVN revision: r19718
OPAL: 1.2.8
OPAL SVN revision: r19718
Prefix: /usr/local/openmpi
Configured architecture: x86_64-unknown-linux-gnu
Configured by: root
Configured on: Tue Nov 11 04:08:47 CET 2008
Configure host: localhost
Built by: root
Built on: Tue Nov 11 04:13:01 CET 2008
Built host: localhost
C bindings: yes
C++ bindings: yes
Fortran77 bindings: no
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/bin/gfortran
Fortran90 compiler: none
Fortran90 compiler abs: none
C profiling: yes
C++ profiling: yes
Fortran77 profiling: no
Fortran90 profiling: no
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
*** glibc detected *** ompi_info: double free or corruption (fasttop): 0x006279e0 ***
=== Backtrace: =
/lib64/libc.so.6[0x2ae688b0621d]
/lib64/libc.so.6(cfree+0x76)[0x2ae688b07f76]
/usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c)[0x2ae6881b44bc]
ompi_info(_ZN9ompi_info15open_componentsEv+0x100)[0x405670]
ompi_info(main+0x11e7)[0x40b837]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2ae688ab5b54]
ompi_info(__gxx_personality_v0+0x121)[0x405249]
=== Memory map: [process memory map omitted; truncated in the archive] ===
Re: [OMPI users] Open MPI programs with autoconf/automake?
Hi Jed, Thank you for your post; I have to admit that I never thought of this as an option. As the "other way" [which Jeff has posted] is more natural to me, I will probably try for that first -- but I'll keep what you posted in the back of my mind. Thanks a lot!

Ray
Re: [OMPI users] Open MPI programs with autoconf/automake?
Hi Jeff,

Jeff Squyres wrote:
> On Nov 10, 2008, at 6:41 AM, Jed Brown wrote:
>> With #define's and compiler flags, I think that can be easily done --
>> was wondering if this is something that developers using MPI do and
>> whether AC/AM supports it.
>
> AC will allow you to #define whatever you want -- look at the
> documentation for AC_DEFINE and AC_DEFINE_UNQUOTED. You can also tell
> your configure script to accept various --with- and --enable- arguments;
> see the docs for AC_ARG_WITH and AC_ARG_ENABLE.

Thanks for this! I know "it's in the document", but I've been going through it with much difficulty. Definitely complete, but hard to get into and know what it is I need. So, some keywords to search for will definitely help!

> If --with-mpi is not specified, the following will happen:
>
> 1. You don't set CC (and friends), so AC_PROG_CC will find the default
> compilers. Hence, your app will not be compiled and linked against the
> MPI libraries.
>
> 2. #define BUILDING_WITH_MPI to 0, so the code above will compile out
> the call to MPI_Send().
>
> Both of these are valid techniques -- use whichever suits your app the
> best.

I see; thank you for giving me this second option. I guess I'm more attracted to this since it allows me to continue working with Open MPI. As I am writing the system [now], I'll have to keep in mind to make it modular so that parts can be #define'd in and out easily.

Thank you for your careful explanation!

Ray
[OMPI users] mpirun Only Works When Second Ethernet Interface Disabled
Ok, I'm totally flummoxed here. I'm an ISV delivering a C program that can use MPI for its inter-node communications. It has been deployed on a number (dozens) of small clusters and has been working pretty well over the last few months. That is, until someone tried to change the static IP address and netmask of the cluster's PUBLIC ethernet interface to a "special" address for their university. Now, my program "hangs" in some early MPI communications and I have to CTRL-C to get out of the process. I got things working again by specifying "--mca btl_tcp_if_include eth0" as an argument to mpiexec (eth0 = private TCP).

Any idea WHY changing the public address messes things up so badly? While I have a workaround, it kinda caught me by surprise, and usually that means there's something going on I don't understand. I thought I was being hit by this:

http://www.open-mpi.org/faq/?category=tcp#tcp-routability

But my process doesn't fail, it just gets... stuck.

Here's the routing table for the head node and a compute node:

Destination    Gateway        Genmask          Flags Metric Ref Use Iface
239.2.11.71    0.0.0.0        255.255.255.255  UH    0     0   0   eth0
128.0.0.0      0.0.0.0        255.255.255.0    U     0     0   0   eth1
172.76.76.240  0.0.0.0        255.255.255.240  U     0     0   0   eth0
169.254.0.0    0.0.0.0        255.255.0.0      U     0     0   0   eth1
224.0.0.0      0.0.0.0        240.0.0.0        U     0     0   0   eth0
0.0.0.0        128.0.0.1      0.0.0.0          UG    0     0   0   eth1
-
239.2.11.71    0.0.0.0        255.255.255.255  UH    0     0   0   eth0
172.76.76.240  0.0.0.0        255.255.255.240  U     0     0   0   eth0
169.254.0.0    0.0.0.0        255.255.0.0      U     0     0   0   eth0
224.0.0.0      0.0.0.0        240.0.0.0        U     0     0   0   eth0
0.0.0.0        172.76.76.254  0.0.0.0          UG    0     0   0   eth0

What I know so far:

- As I test this, there is nothing other than a switch plugged into eth1, and nothing else plugged into that (i.e., it gives me a link light, but no one to talk to).
- "mpiexec -np 2 -host master,node1 myProgram" hangs
- MPI Init is completing. I write out one log file per process, and messages from "mpiexec -d" seem to support that conclusion.
- I'm pretty sure my first Bcast works, but I seem to be getting stuck in my first Allreduce.
- If I run strace on a process, it looks like it is sitting in a poll loop.
- "mpiexec -np 2 -host master,node1 --mca btl_tcp_if_include eth0 myProgram" doesn't hang
- If I run mpiexec from the head node and just specify a host list that does NOT include the head node, things work just fine. "mpiexec -np 2 -host node1,node2 myProgram" doesn't hang
- I have strace outputs from each of these scenarios above from each node, but cannot make heads nor tails of them.
- If I take down the public interface (if-down eth1), things also work.
- I can ping and ssh from any node to any node without any problem, so I don't think it's network related.
- A non-MPI job launches and exits just fine ("mpiexec -np 2 -host master,node1 hostname" works).

Details:

- Open MPI 1.2.5, RedHat 4, 64-bit OS
- Gigabit Ethernet, no high-speed interfaces
- Original working public IP: 192.168.1.1 / 16
- Public IP address that breaks stuff: 128.0.0.1 / 24
- Internal address: 172.76.76.240 / 28, with the head node being .254 and the nodes being .241 .242 .243 and .244
- I built the 1.2.5 "multi" RPMs using the shell script and spec file on the openmpi site and installed the runtime using "rpm -Uvh ..."
- All addresses are static.
- Clusters are generally 5 nodes, master plus four compute nodes, but this shows up on just two.

Per the FAQ, here's my ifconfig and ompi_info...
[adminrig@vnode ~]$ /sbin/ifconfig
eth0  Link encap:Ethernet  HWaddr x
      inet addr:172.76.76.254  Bcast:172.76.76.255  Mask:255.255.255.240
      inet6 addr: fe80::/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:65860 errors:0 dropped:0 overruns:0 frame:0
      TX packets:51860 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:8077348 (7.7 MiB)  TX bytes:17135962 (16.3 MiB)
      Base address:0x2000 Memory:c820-c822

eth1  Link encap:Ethernet  HWaddr x
      inet addr:128.0.0.1  Bcast:128.0.0.255  Mask:255.255.255.0
      inet6 addr: fe80::/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:0 errors:0 dropped:0 overruns:0 frame:0
      TX packets:257 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:0 (0.0 b)  TX bytes:16880 (16.4 KiB)
      Base address:0x2020 Memory:c822-c824

lo    Link encap:Local Loopback
      inet addr:127.0.0.1  Mask:255.0.0.0
      inet6 addr: ::1/128 Scope:Host
      UP LOOPBACK RUNNING  MTU:16436  Metric:1
      RX packets:192297 errors:0 dropped:0 o
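As a side note on the workaround mentioned at the top of this message: instead of passing --mca btl_tcp_if_include eth0 on every mpiexec command line, the same setting can live in an MCA parameter file. This is only a sketch of that approach -- btl_tcp_if_include and btl_tcp_if_exclude are real Open MPI TCP BTL parameters, but check the FAQ for the exact semantics on your version:

# $HOME/.openmpi/mca-params.conf  (per-user MCA parameter file)
# Restrict MPI TCP traffic to the private cluster interface:
btl_tcp_if_include = eth0

# Alternatively, exclude the interfaces you do not want used
# (keep lo in the exclude list):
# btl_tcp_if_exclude = lo,eth1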