Juraj,
If I understand correctly, the "master" task calls MPI_Init() and then
fork()s and exec()s matlab.
In some cases (lack of hardware support), fork() cannot work at all, but
let's assume it is fine for now.
Then, if I read between the lines, matlab calls a mexFunction that calls
MPI_Init().
As far as I can tell, that cannot work.
The blocker is that a child process cannot call MPI_Init() if its parent
has already called MPI_Init().
Fortunately, you have some options :-)
1) Start matlab from mpirun.
For example, if you want one master, two slaves and matlab, you can do
something like
mpirun -np 1 master : -np 1 matlab : -np 2 slave
2) MPI_Comm_spawn() matlab.
The master can MPI_Comm_spawn() matlab, and matlab can then merge the
parent communicator and communicate with the master and the slaves.
3) Use the approach suggested by Dmitry.
/* this is specific to matlab, and I have no experience with it */
One last point: MPI_Init() can be invoked only once per task.
For example, if your mexFunction does
MPI_Init(); work(); MPI_Finalize();
then it can be invoked only once per mpirun.
Cheers,
Gilles
On 10/5/2016 6:41 PM, Dmitry N. Mikushin wrote:
Hi Juraj,
Although the MPI infrastructure may technically support forking, it is
known that not all system resources replicate correctly into a forked
process. For example, forking inside an MPI program with an active
CUDA driver will result in a crash.
Why not compile the MATLAB code into a native library and link it
with the MPI application directly? E.g. like here:
https://www.mathworks.com/matlabcentral/answers/98867-how-do-i-create-a-c-shared-library-from-mex-files-using-the-matlab-compiler?requestedDomain=www.mathworks.com
Kind regards,
- Dmitry Mikushin.
2016-10-05 11:32 GMT+03:00 juraj2...@gmail.com:
Hello,
I have an application in C++ (main.cpp) that is launched with
multiple processes via mpirun. The master process calls matlab via
system('matlab -nosplash -nodisplay -nojvm -nodesktop -r
"interface"'), which executes a simple script, interface.m, that calls
a mexFunction (mexsolve.cpp), from which I try to set up
communication with the rest of the processes launched at the
beginning together with the master process. When I run the
application as listed below on two different machines, I experience:
1) a crash at MPI_Init() in the mexFunction() on a cluster machine
with Linux 4.4.0-22-generic
2) an error in MPI_Send(), shown below, on a local machine with
Linux 3.10.0-229.el7.x86_64
[archimedes:31962] shmem: mmap: an error occurred while
determining whether or not
/tmp/openmpi-sessions-1007@archimedes_0/58444/1/shared_mem_pool.archimedes
could be created.
[archimedes:31962] create_and_attach: unable to create shared
memory BTL coordinating structure :: size 134217728
[archimedes:31962] shmem: mmap: an error occurred while
determining whether or not
/tmp/openmpi-sessions-1007@archimedes_0/58444/1/0/vader_segment.archimedes.0
could be created.
[archimedes][[58444,1],0][../../../../../opal/mca/btl/tcp/btl_tcp_endpoint.c:800:mca_btl_tcp_endpoint_complete_connect]
connect() to <MY_IP> failed: Connection refused (111)
I launch the application as follows:
mpirun --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 1 -np 2 -npernode 1 ./main
I have openmpi-2.0.1 configured with --prefix=${INSTALLDIR}
--enable-mpi-fortran=all --with-pmi --disable-dlopen
For more details, the code is here:
https://github.com/goghino/matlabMpiC
Thanks for any suggestions!
Juraj
_______________________________________________
users mailing list
users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
<https://rfd.newmexicoconsortium.org/mailman/listinfo/users>