Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

Grzegorz Maj Fri, 23 Apr 2010 09:19:40 -0400

Thank you Ralph for your explanation.
And, apart from the file descriptor issue, is there any other way to
solve my problem, i.e. to start a number of processes separately,
without mpirun, and then collect them into an MPI intracommunicator?
If, for example, I needed to run some 'server process' (even using
mpirun) for this task, that would be OK. Any ideas?


Thanks,
Grzegorz Maj


2010/4/18 Ralph Castain <r...@open-mpi.org>:
> Okay, but here is the problem. If you don't use mpirun, and are not operating 
> in an environment we support for "direct" launch (i.e., starting processes 
> outside of mpirun), then every one of those processes thinks it is a 
> singleton - yes?
>
> What you may not realize is that each singleton immediately fork/exec's an 
> orted daemon that is configured to behave just like mpirun. This is required 
> in order to support MPI-2 operations such as MPI_Comm_spawn, 
> MPI_Comm_connect/accept, etc.
>
> So if you launch 64 processes that think they are singletons, then you have 
> 64 copies of orted running as well. This eats up a lot of file descriptors, 
> which is probably why you are hitting this 65-process limit - your system is
> probably running out of file descriptors. You might check your system limits
> and see if you can get them revised upward.
>
>
> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>
>> Yes, I know. The problem is that I have to launch my processes through a
>> special mechanism provided by the environment I'm working in, so
>> unfortunately I can't use mpirun.
>>
>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>> Guess I don't understand why you can't use mpirun - all it does is start 
>>> things, provide a means to forward io, etc. It mainly sits there quietly 
>>> without using any cpu unless required to support the job.
>>>
>>> Sounds like it would solve your problem. Otherwise, I know of no way to get 
>>> all these processes into comm_world.
>>>
>>>
>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>
>>>> Hi,
>>>> I'd like to dynamically create a group of processes communicating via
>>>> MPI. Those processes need to be started without mpirun and create an
>>>> intracommunicator after startup. Any ideas how to do this
>>>> efficiently?
>>>> I came up with a solution in which the processes connect one by
>>>> one using MPI_Comm_connect, but unfortunately all the processes that
>>>> are already in the group need to call MPI_Comm_accept. This means that
>>>> when the n-th process wants to connect, I need to gather all n-1
>>>> existing processes in the MPI_Comm_accept call. Once about 40
>>>> processes are running, every subsequent call takes more and more time,
>>>> which I'd like to avoid.
>>>> Another problem with this solution is that when I try to connect the
>>>> 66th process, the root of the existing group segfaults in
>>>> MPI_Comm_accept. Maybe it's my bug, but it's weird, as everything works
>>>> fine for up to 65 processes. Is there any limitation I don't know about?
>>>> My last question is about MPI_COMM_WORLD. When I run my processes
>>>> without mpirun their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is
>>>> there any way to change MPI_COMM_WORLD and set it to the
>>>> intracommunicator that I've created?
>>>>
>>>> Thanks,
>>>> Grzegorz Maj
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
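
The suggestion above to check the system limits can also be followed from
inside the program. What follows is only a minimal sketch, assuming a POSIX
system and assuming that file descriptors really are the bottleneck (the
thread only says "probably"). The helper names get_fd_limits and
raise_fd_limit_to_hard are made up for this illustration and are not part of
Open MPI.

```c
#include <sys/resource.h>

/* Hypothetical helper: read the soft and hard limits on the number of
 * open file descriptors for the calling process.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int get_fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Hypothetical helper: raise the soft limit up to the hard limit.
 * An unprivileged process may do this; going beyond the hard limit
 * needs administrator action. Some systems reject certain values
 * (for example an "infinite" limit), so the return value must be
 * checked. Returns 0 on success, -1 on failure. */
int raise_fd_limit_to_hard(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Resource limits are inherited by child processes, so calling
raise_fd_limit_to_hard() at the very start of main(), before MPI_Init, should
also cover any daemon the process starts afterwards. If the hard limit itself
is too low, it has to be raised by the system administrator.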

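
The original question notes that when the n-th process connects, all n-1
processes already in the group have to take part in MPI_Comm_accept. The
sketch below only works out that arithmetic. It counts calls, not time, and
says nothing about how long each call takes. The function names are made up
for this illustration.

```c
/* Number of existing group members that must call MPI_Comm_accept
 * when the n-th process joins: all n-1 processes already in the
 * group. The first process (n = 1) joins nobody. */
long accepts_for_join(long n)
{
    return n > 1 ? n - 1 : 0;
}

/* Total number of MPI_Comm_accept calls needed to build a group of
 * `total` processes one at a time:
 *   sum over n = 2..total of (n - 1) = total * (total - 1) / 2 */
long total_accepts(long total)
{
    return total > 1 ? total * (total - 1) / 2 : 0;
}
```

For example, connecting the 66th process needs 65 accept calls, and building
the working 65-process group one process at a time takes 2080 accept calls in
total, so the amount of coordination grows quadratically with the group size.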