Hi Eugene and Jody,

thanks for the ideas and the detailed answers. I will look into SysV
and mmap and work something out. I am not tied to PGAPack; there may
be other PGA libraries too... But I guess MPI and SysV/mmap do not
rule each other out; I just have to know what is running locally and
what is running remotely.

I will certainly remember your help - and I am also thinking of making
the simulator open source when it is ready, as currently there is no
fast, distributed/parallel intraday forex simulator library available
that is capable of walk-forward optimization and the like. So others
will be able to take part in cashing in (or losing) those zillions. :)

Barnabas

On Tue, Apr 28, 2009 at 6:41 PM, Eugene Loh <eugene....@sun.com> wrote:
> Barnabas Debreczeni wrote:
>
>> I am using PGAPack as a GA library, and it uses MPI to parallelize
>> optimization runs. This is how I got to Open MPI.
>>
>
> Let me see if I understand the underlying premise.  You want to parallelize,
> but there are some large shared tables.  There are many different
> parallelization models.  E.g., there are certainly *shared-memory* parallel
> programming models such as OpenMP (which is totally different from Open MPI,
> despite the similar names).  But you are using MPI (which doesn't really do
> shared memory) since you're trying to leverage PGAPack, which is nice for
> handling genetic algorithms but basically forces you to use MPI.  (I suspect
> most GAs map reasonably well to MPI.  Your interest in shared
> tables gives your situation a different twist.)
>
>> My problem is, I'd like to share that 2 GB table (computed once at the
>> beginning, and is read-only after) between processes so I don't have
>> to use up 16 gigs of memory.
>>
>> How do you share data between processes locally?
>>
>
> Are there shared-memory parallel GA packages that might make more sense to
> use here than PGAPack?
>
> If you want to stick with PGAPack/MPI, then you can set up shared memory
> among MPI processes by going outside of MPI.  (You could use MPI calls to
> share data, including MPI_Get routines, but I'm guessing it's best just to
> add non-MPI code to do the sharing.)  For example, you can create a file that
> each process "mmap"s into its address space.  There are also System V
> shared-memory calls like shmget/shmat/shmdt that allow you to share memory
> among processes.
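>
> A minimal sketch of the mmap route (the file name, element type, and
> table size here are placeholders, nothing PGAPack-specific):
>
>     #include <fcntl.h>
>     #include <sys/mman.h>
>     #include <unistd.h>
>
>     /* Map a precomputed, read-only table into this process's address
>        space.  All processes on a node that map the same file share
>        one physical copy of the pages. */
>     double *map_table(const char *path, size_t bytes)
>     {
>         int fd = open(path, O_RDONLY);
>         if (fd < 0) return NULL;
>         void *p = mmap(NULL, bytes, PROT_READ, MAP_SHARED, fd, 0);
>         close(fd);   /* the mapping survives the close */
>         return (p == MAP_FAILED) ? NULL : (double *)p;
>     }
>
> One process would compute and write the file first; the others just
> map it after some kind of barrier.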
>
> The main point:  while MPI allows communication (and therefore "data
> sharing") among processes, you might be better off with non-MPI mechanisms
> here like mmap or SysV shared memory.
>
>> Later I will need to use other hosts too in the calculation. Will the
>> slaves on other hosts need to calculate their own tables, go on from
>> there, and share them locally, or can I share these tables on the
>> master host with them?
>>
>
> I think this is a performance-vs-memory question.  If your interconnect is
> fast enough or your performance requirement low enough and your memory
> constraints severe enough, then you can share common data among all your
> nodes.  You'd probably want to use MPI calls to do so... possibly using
> one-sided MPI_Get routines depending on what sort of cluster you're running
> on.
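>
> A rough sketch of that one-sided approach (assuming the table lives on
> rank 0; the names here are made up, and real code would create the
> window once rather than per fetch):
>
>     #include <mpi.h>
>
>     /* Rank 0 exposes its table in an RMA window; the other ranks
>        pull slices on demand with one-sided MPI_Get. */
>     void fetch_slice(double *table, MPI_Aint table_len, int rank,
>                      double *buf, int count, MPI_Aint offset)
>     {
>         MPI_Win win;
>         MPI_Win_create(rank == 0 ? table : NULL,
>                        rank == 0 ? table_len * sizeof(double) : 0,
>                        sizeof(double), MPI_INFO_NULL,
>                        MPI_COMM_WORLD, &win);
>         MPI_Win_fence(0, win);
>         if (rank != 0)
>             MPI_Get(buf, count, MPI_DOUBLE, 0, offset, count,
>                     MPI_DOUBLE, win);
>         MPI_Win_fence(0, win);
>         MPI_Win_free(&win);
>     }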
>
> But, if your interconnect is not fast enough or your performance requirement
> high enough or your memory constraint not too severe, then just share within
> each node.  And, I could imagine you might have enough memory per node (a
> few Gbytes) that this will be your scenario.  So, just replicate your
> mmap/SysV solution on each node.
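>
> For completeness, the System V flavor of that per-node sharing might
> look like this (the key and permissions are placeholders; one rank per
> node creates and fills the segment, the rest attach read-only):
>
>     #include <sys/ipc.h>
>     #include <sys/shm.h>
>
>     /* Create or attach a shared segment identified by an agreed-on
>        key (ftok() on a common file is the usual way to get one). */
>     double *attach_table(key_t key, size_t bytes, int creator)
>     {
>         int id = shmget(key, bytes, creator ? (IPC_CREAT | 0600) : 0);
>         if (id < 0) return NULL;
>         void *p = shmat(id, NULL, creator ? 0 : SHM_RDONLY);
>         return (p == (void *)-1) ? NULL : (double *)p;
>     }
>
> Remember to shmdt/IPC_RMID the segment when you're done, or it will
> outlive the job.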
>
> Short answer:  you probably want to use non-MPI mechanisms to effect your
> shared memory.
>
> Most importantly, when your algorithm is successfully implemented and
> deployed and you're making millions of dollars, please remember us!
