Currently we run our code on a cluster with distributed memory, and this code needs a lot of memory. Part of the data stored in memory is the same for every process; it is organized as one array, but we can split it if necessary. So far no magic has occurred for us. What do we need to do to make the magic work?
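(For reference, here is a minimal, untested sketch of the within-node approach Eugene outlines further down the thread: one process per node creates a POSIX shared-memory segment, its node-local peers attach to it, and everyone reads the common data from there. The segment name, the sizes, and the helper names local_rank() and node_shared() are only illustrative, not part of any existing code.)

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Find out which ranks are collocal by comparing hostnames, and return
       this rank's index among the ranks on its own node (0 = node "leader"). */
    static int local_rank(MPI_Comm comm)
    {
        char name[MPI_MAX_PROCESSOR_NAME];
        int len, rank, size, i, lrank = 0;

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        memset(name, 0, sizeof(name));
        MPI_Get_processor_name(name, &len);

        char *all = malloc((size_t)size * MPI_MAX_PROCESSOR_NAME);
        MPI_Allgather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                      all,  MPI_MAX_PROCESSOR_NAME, MPI_CHAR, comm);
        for (i = 0; i < rank; i++)
            if (strcmp(all + (size_t)i * MPI_MAX_PROCESSOR_NAME, name) == 0)
                lrank++;                /* count earlier ranks on this host */
        free(all);
        return lrank;
    }

    /* Create (leader) or attach to (peers) a node-wide shared segment.
       Error checking and shm_unlink() at shutdown are omitted here. */
    static void *node_shared(size_t bytes, int lrank, MPI_Comm comm)
    {
        const char *segname = "/mycode_shared";      /* illustrative name */
        int fd = -1;

        if (lrank == 0) {                            /* leader creates it */
            fd = shm_open(segname, O_CREAT | O_RDWR, 0600);
            ftruncate(fd, (off_t)bytes);
        }
        MPI_Barrier(comm);            /* peers wait until the segment exists */
        if (lrank != 0)
            fd = shm_open(segname, O_RDWR, 0600);

        void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                    /* mapping stays valid after close */
        return p;
    }

The leader (lrank == 0) would read the input file and fill the segment once, followed by another MPI_Barrier before the peers start reading. From then on every process on a node reads the same physical pages, so the ~3 GB block is stored once per node instead of once per process. On Linux, link with -lrt for shm_open().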
On Wed, Oct 6, 2010 at 12:43, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

Open MPI will use shared memory to communicate between peers on the same node - but that's hidden beneath the covers; it's not exposed via the MPI API. You just MPI-send, magic occurs, and the receiver gets the message.

On Oct 4, 2010, at 11:13 AM, "Andrei Fokau" <andrei.fo...@neutron.kth.se> wrote:

Does OMPI have shared-memory capabilities (as mentioned in MPI-2)? How can I use them?

On Sat, Sep 25, 2010 at 23:19, Andrei Fokau <andrei.fo...@neutron.kth.se> wrote:

Here are some more details about our problem. We use a dozen 4-processor nodes with 8 GB of memory on each node. The code we run needs about 3 GB per processor, so we can load only 2 processors out of 4. The vast majority of those 3 GB is the same for each processor and is accessed continuously during the calculation. In my original question I wasn't very clear when asking about the possibility of using shared memory with Open MPI - in our case we do not need remote access to the data, and it would be sufficient to share memory within each node only.

Of course, the possibility of accessing the data remotely (via mmap) is attractive because it would allow us to store much larger arrays (up to 10 GB) in one remote place, meaning higher accuracy for our calculations. However, I believe that the access time would be too long for data read so frequently, and therefore the performance would be lost.

I still hope that some of the subscribers to this mailing list have experience with Global Arrays. This library seems to be fine for our case, but I feel that there should be a simpler solution. Open MPI conforms to the MPI-2 standard, and the latter describes shared-memory applications. Do you see any other way for us to use shared memory (within a node) apart from using Global Arrays?

On Fri, Sep 24, 2010 at 19:03, Durga Choudhury <dpcho...@gmail.com> wrote:

I think the 'middle ground' approach can be simplified even further if the data file is on a shared device (e.g. an NFS/Samba mount) that can be mounted at the same location in the file system tree on all nodes. I have never tried it, though, and mmap()'ing a non-POSIX-compliant file system such as Samba might have issues I am unaware of.

However, I do not see why you should not be able to do this even if the file is being written to, as long as you call msync() before using the mapped pages.

Durga
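(For illustration, a minimal, untested sketch of that file-mapping idea: each process maps the input file read-only instead of reading a private copy into malloc()ed memory. The path handling is only an example, and this assumes the file is complete before it is mapped.)

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a data file (e.g. on an NFS mount visible at the same path on
       every node) read-only into this process's address space. */
    static const double *map_input(const char *path, size_t *len)
    {
        int fd = open(path, O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        *len = (size_t)st.st_size;

        const void *p = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);                    /* the mapping stays valid after close */
        return (const double *)p;
    }

Within one node the processes then share the same cached pages of the file; across nodes each node still keeps its own cached copy, and whether the access pattern is fast enough over NFS is exactly the open question raised above.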
On Fri, Sep 24, 2010 at 12:31 PM, Eugene Loh <eugene....@oracle.com> wrote:

It seems to me there are two extremes.

One is that you replicate the data for each process. This has the disadvantage of consuming lots of memory "unnecessarily."

Another extreme is that shared data is distributed over all processes. This has the disadvantage of making at least some of the data less accessible, whether in programming complexity and/or run-time performance.

I'm not familiar with Global Arrays. I was somewhat familiar with HPF. I think the natural thing to do with those programming models is to distribute data over all processes, which may relieve the excessive memory consumption you're trying to address but which may also just put you at a different "extreme" of this spectrum.

The middle ground I think might make the most sense would be to share data only within a node, but to replicate the data for each node. There are probably multiple ways of doing this -- possibly even GA, I don't know. One way might be to use one MPI process per node, with OMP multithreading within each process|node. Or (and I thought this was the solution you were looking for), have some idea of which processes are collocal. Have one process per node create and initialize some shared memory -- mmap, perhaps, or SysV shared memory. Then have its peers map the same shared memory into their address spaces.

You asked what source-code changes would be required. It depends. If you're going to mmap shared memory in on each node, you need to know which processes are collocal. If you're willing to constrain how processes are mapped to nodes, this could be easy (e.g., "every 4 processes are collocal"). If you want to discover dynamically at run time which are collocal, it would be harder. The mmap stuff could be in a stand-alone function of about a dozen lines. If the shared area is allocated as one piece, substituting a call to your mmap function for the single malloc() call should be simple. If you have many malloc()s you're trying to replace, it's harder.

Andrei Fokau wrote:

The data are read from a file and processed before the calculations begin, so I think that mapping will not work in our case.

Global Arrays look promising indeed. As I said, we need to put just a part of the data in the shared section. John, do you (or maybe other users) have experience working with GA?

http://www.emsl.pnl.gov/docs/global/um/build.html

When GA runs with MPI:

    MPI_Init(..)       ! start MPI
    GA_Initialize()    ! start global arrays
    MA_Init(..)        ! start memory allocator
    .... do work
    GA_Terminate()     ! tidy up global arrays
    MPI_Finalize()     ! tidy up MPI
                       ! exit program
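(A hedged C rendering of that skeleton might look like the following - untested, and the header names, the MA type constant, and the stack/heap sizes are assumptions to be checked against the GA manual for the installed version.)

    #include <mpi.h>
    #include "ga.h"
    #include "macdecls.h"

    int main(int argc, char **argv)
    {
        int stack = 4000000, heap = 4000000;  /* MA sizes in elements (illustrative) */

        MPI_Init(&argc, &argv);               /* start MPI */
        GA_Initialize();                      /* start Global Arrays */
        MA_init(C_DBL, stack, heap);          /* start the memory allocator */

        /* ... create the global array(s) here (e.g. with NGA_Create), fill
           the shared data once, then let every rank read it during the
           calculation ... */

        GA_Terminate();                       /* tidy up Global Arrays */
        MPI_Finalize();                       /* tidy up MPI */
        return 0;                             /* exit program */
    }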
On Fri, Sep 24, 2010 at 13:44, Reuti <re...@staff.uni-marburg.de> wrote:

> On 24.09.2010 at 13:26, John Hearns wrote:
>
> > On 24 September 2010 08:46, Andrei Fokau <andrei.fo...@neutron.kth.se> wrote:
> > > We use a C program which consumes a lot of memory per process (up to a
> > > few GB), 99% of the data being the same for each process. So for us it
> > > would be quite reasonable to put that part of the data in shared memory.
> >
> > http://www.emsl.pnl.gov/docs/global/
> >
> > Is this any help? Apologies if I'm talking through my hat.

I was also thinking of this when I read "data in a shared memory" (besides approaches like http://www.kerrighed.org/wiki/index.php/Main_Page). Wasn't this also one of the ideas behind "High Performance Fortran" - running in parallel across nodes without the program even knowing it is running across nodes, and accessing all data as if it were local?