Juraj,

If I understand correctly, the "master" task calls MPI_Init() and then fork&execs matlab.

In some cases (lack of hardware support), fork cannot even work, but let's assume it is fine for now.

Then, if I read between the lines, matlab calls a mexFunction that calls MPI_Init().

As far as I am concerned, that cannot work.

The blocker is that a child process cannot call MPI_Init() if its parent has already called MPI_Init().


Fortunately, you have some options :-)
1) start matlab from mpirun.
for example, if you want one master, two slaves and matlab, you can do something like
mpirun -np 1 master : -np 1 matlab : -np 2 slave

2) MPI_Comm_spawn matlab
The master can MPI_Comm_spawn() matlab, and matlab can then merge the parent
intercommunicator (MPI_Comm_get_parent() followed by MPI_Intercomm_merge())
and communicate with the master and slaves.

3) use the approach suggested by Dmitry
/* this is specific to matlab, and i have no experience with it */

One last point: MPI_Init() can be invoked only once per task.
For example, if your mexFunction does
MPI_Init(); work(); MPI_Finalize();
then it can be invoked only once per mpirun.

Cheers,

Gilles

On 10/5/2016 6:41 PM, Dmitry N. Mikushin wrote:
Hi Juraj,

Although the MPI infrastructure may technically support forking, it is known that not all system resources replicate correctly into a forked process. For example, forking inside an MPI program with an active CUDA driver will result in a crash.

Why not compile the MATLAB code down into a native library and link it with the MPI application directly? E.g. like here: https://www.mathworks.com/matlabcentral/answers/98867-how-do-i-create-a-c-shared-library-from-mex-files-using-the-matlab-compiler?requestedDomain=www.mathworks.com

Kind regards,
- Dmitry Mikushin.


2016-10-05 11:32 GMT+03:00 juraj2...@gmail.com <juraj2...@gmail.com>:

    Hello,

    I have a C++ application (main.cpp) that is launched with
    multiple processes via mpirun. The master process calls matlab via
    system('matlab -nosplash -nodisplay -nojvm -nodesktop -r
    "interface"'), which executes a simple script, interface.m, that
    calls a mexFunction (mexsolve.cpp), from which I try to set up
    communication with the rest of the processes launched at the
    beginning together with the master process. When I run the
    application as listed below on two different machines, I experience:

    1) crash at MPI_Init() in the mexFunction() on cluster machine
    with Linux 4.4.0-22-generic

    2) error in MPI_Send() shown below on local machine with
    Linux 3.10.0-229.el7.x86_64
    [archimedes:31962] shmem: mmap: an error occurred while
    determining whether or not
    /tmp/openmpi-sessions-1007@archimedes_0/58444/1/shared_mem_pool.archimedes
    could be created.
    [archimedes:31962] create_and_attach: unable to create shared
    memory BTL coordinating structure :: size 134217728
    [archimedes:31962] shmem: mmap: an error occurred while
    determining whether or not
    /tmp/openmpi-sessions-1007@archimedes_0/58444/1/0/vader_segment.archimedes.0
    could be created.
    [archimedes][[58444,1],0][../../../../../opal/mca/btl/tcp/btl_tcp_endpoint.c:800:mca_btl_tcp_endpoint_complete_connect]
    connect() to <MY_IP> failed: Connection refused (111)

    I launch the application as follows:
    mpirun --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support
    1 -np 2 -npernode 1 ./main

    I have openmpi-2.0.1 configured with --prefix=${INSTALLDIR}
    --enable-mpi-fortran=all --with-pmi --disable-dlopen

    For more details, the code is here:
    https://github.com/goghino/matlabMpiC

    Thanks for any suggestions!

    Juraj

    _______________________________________________
    users mailing list
    users@lists.open-mpi.org
    https://rfd.newmexicoconsortium.org/mailman/listinfo/users



