Hi Eugene and Jody,

Thanks for the ideas and the detailed answers. I will look into SysV shared memory and mmap and see what works best; I have put rough sketches of my understanding at the bottom of this mail. I am not tied to PGAPack; there may be other PGA libraries too... But I gather MPI and SysV/mmap are not mutually exclusive - I just have to keep track of what is running locally and what is running remotely.
I will surely remember your help - and I am also thinking of making the simulator open source when it is ready, since there is currently no fast, distributed/parallel intraday forex simulator library available that is capable of walk-forward optimization and the like. So others will be able to take part in cashing in (or losing) those zillions. :)

Barnabas

On Tue, Apr 28, 2009 at 6:41 PM, Eugene Loh <eugene....@sun.com> wrote:
> Barnabas Debreczeni wrote:
>
>> I am using PGAPack as a GA library, and it uses MPI to parallelize
>> optimization runs. This is how I got to Open MPI.
>
> Let me see if I understand the underlying premise. You want to
> parallelize, but there are some large shared tables. There are many
> different parallelization models. E.g., there are certainly
> *shared-memory* parallel programming models such as OpenMP (which is
> totally different from Open MPI, despite the similar names). But you
> are using MPI (which doesn't really do shared memory) since you're
> trying to leverage PGAPack, which is nice for handling genetic
> algorithms but basically forces you to use MPI. (I suspect most GA
> algorithms map reasonably well to MPI. Your interest in shared
> tables gives your situation a different twist.)
>
>> My problem is, I'd like to share that 2 GB table (computed once at
>> the beginning, and read-only after that) between processes so I
>> don't have to use up 16 gigs of memory.
>>
>> How do you share data between processes locally?
>
> Are there shared-memory parallel GA packages that might make more
> sense to use here than PGAPack?
>
> If you want to stick with PGAPack/MPI, then you can set up shared
> memory among MPI processes by going outside of MPI. (You could use
> MPI calls to share data, including MPI_Get routines, but I'm
> guessing it's best just to add non-MPI code to do the sharing.) You
> can, for example, create a file that each process "mmap"s into its
> address space. There are also System V shared-memory calls like
> shmget/shmat/shmdt that allow you to share memory among processes.
>
> The main point: while MPI allows communication (and therefore "data
> sharing") among processes, you might be better off with non-MPI
> mechanisms here like mmap or SysV shared memory.
>
>> Later I will need to use other hosts too in the calculation. Will
>> the slaves on other hosts need to calculate their own tables and go
>> on from there and share them locally, or can I share these tables
>> on the master host with them?
>
> I think this is a performance-vs-memory question. If your
> interconnect is fast enough, or your performance requirement low
> enough and your memory constraints severe enough, then you can share
> common data among all your nodes. You'd probably want to use MPI
> calls to do so... possibly using one-sided MPI_Get routines,
> depending on what sort of cluster you're running on.
>
> But if your interconnect is not fast enough, or your performance
> requirement high enough, or your memory constraint not too severe,
> then just share within each node. And I could imagine you might have
> enough memory per node (a few Gbytes) that this will be your
> scenario. So, just replicate your mmap/SysV solution on each node.
>
> Short answer: you probably want to use non-MPI mechanisms to effect
> your shared memory.
>
> Most importantly, when your algorithm is successfully implemented
> and deployed and you're making millions of dollars, please remember
> us!
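PS: To check that I understood the mmap suggestion, here is the rough shape of it as I picture it - one process writes the precomputed table to a file, and every process on the node maps it read-only, so the kernel keeps a single physical copy per node in the page cache. The file path is a placeholder:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define TABLE_FILE "/tmp/forex_table.bin"   /* placeholder path */

/* Map the precomputed table read-only. With MAP_SHARED, every
 * process mapping this file on the node shares the same physical
 * pages, so the 2 GB table exists once per node, not once per
 * process. */
const double *map_table(size_t *len_out)
{
    int fd = open(TABLE_FILE, O_RDONLY);
    if (fd < 0) { perror("open"); return NULL; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return NULL; }

    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                        /* the mapping survives the close */
    if (p == MAP_FAILED) { perror("mmap"); return NULL; }

    *len_out = st.st_size / sizeof(double);
    return p;
}

The master on each node would write the file once before the slaves map it.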
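And the shmget/shmat variant, for when a file on disk is not wanted. The key and size are placeholders, fill_table() stands in for my table computation, and real code would also need a readiness flag or barrier so attachers don't read before the creator has finished filling:

#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define TABLE_KEY   ((key_t)0x54424C31)  /* placeholder key */
#define TABLE_BYTES (2UL << 30)          /* the ~2 GB table */

void fill_table(double *t);              /* placeholder: computes the table */

double *attach_table(void)
{
    /* Whoever creates the segment first gets to fill it;
     * everyone else just attaches to the existing one. */
    int id = shmget(TABLE_KEY, TABLE_BYTES, IPC_CREAT | IPC_EXCL | 0600);
    int creator = (id >= 0);
    if (!creator)
        id = shmget(TABLE_KEY, TABLE_BYTES, 0600);
    if (id < 0) { perror("shmget"); exit(1); }

    double *table = shmat(id, NULL, 0);
    if (table == (void *)-1) { perror("shmat"); exit(1); }

    if (creator)
        fill_table(table);  /* readers must wait for this in real code */
    return table;
}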
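Finally, my reading of the cross-node option Eugene mentions: rank 0 exposes its table through an MPI window and remote ranks pull just the slices they need with one-sided MPI_Get, instead of each node computing and holding its own copy. The element counts and offset are placeholders:

#include <mpi.h>
#include <stdlib.h>

#define TABLE_LEN (1 << 20)   /* placeholder element count */
#define SLICE_LEN 1024        /* placeholder slice size */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only rank 0 computes and holds the full table. */
    double *table = NULL;
    if (rank == 0)
        table = calloc(TABLE_LEN, sizeof(double));  /* fill for real here */

    MPI_Win win;
    MPI_Win_create(table, rank == 0 ? TABLE_LEN * sizeof(double) : 0,
                   sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank != 0) {
        /* Pull one slice on demand instead of replicating 2 GB. */
        double slice[SLICE_LEN];
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        MPI_Get(slice, SLICE_LEN, MPI_DOUBLE,
                0 /* target rank */, 0 /* offset */, SLICE_LEN,
                MPI_DOUBLE, win);
        MPI_Win_unlock(0, win);
        /* ... use slice ... */
    }

    MPI_Win_free(&win);
    free(table);
    MPI_Finalize();
    return 0;
}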