1024 is not the problem: changing it to 2048 hasn't changed anything.
Following your advice I ran my process under gdb. Unfortunately I didn't get anything more than:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf7e4c6c0 (LWP 20246)]
0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
(gdb) bt
#0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
#1  0xf7e3ba95 in connect_accept () from /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
#2  0xf7f62013 in PMPI_Comm_connect () from /home/gmaj/openmpi/lib/libmpi.so.0
#3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43

What's more: when I added a breakpoint on ompi_comm_set in the 66th process and stepped through a couple of instructions, one of the other processes crashed (as usual in ompi_comm_set) before the 66th did.

Finally I decided to recompile Open MPI with the -g flag for gcc. With that build the 66-process issue is gone! I ran my applications exactly the same way as before (without even recompiling them) and successfully started over 130 processes. When I switch back to the Open MPI build without -g, it segfaults again.

Any ideas? I'm really confused.

2010/7/7 Ralph Castain <r...@open-mpi.org>:
> I would guess the #files limit of 1024. However, if it behaves the same way when spread across multiple machines, I would suspect it is somewhere in your program itself. Given that the segfault is in your process, can you use gdb to look at the core file and see where and why it fails?
>
> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>
>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>
>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>
>>>> Hi Ralph,
>>>> sorry for the late response, but I couldn't find free time to play with this. I've finally applied the patch you prepared. I launched my processes the way you described and I think it's working as you expected: none of my processes runs the orted daemon and they can perform MPI operations. Unfortunately I'm still hitting the 65-process issue :(
>>>> Maybe I'm doing something wrong.
>>>> I attach my source code. If anybody could have a look at it, I would be grateful.
>>>>
>>>> When I run that code with clients_count <= 65 everything works fine: all the processes create a common grid, exchange some information and disconnect.
>>>> When I set clients_count > 65, the 66th process crashes on MPI_Comm_connect (segmentation fault).
>>>
>>> I didn't have time to check the code, but my guess is that you are still hitting some kind of file descriptor or other limit. Check to see what your limits are - usually "ulimit" will tell you.
>>
>> My limits are:
>> time(seconds)        unlimited
>> file(blocks)         unlimited
>> data(kb)             unlimited
>> stack(kb)            10240
>> coredump(blocks)     0
>> memory(kb)           unlimited
>> locked memory(kb)    64
>> process              200704
>> nofiles              1024
>> vmemory(kb)          unlimited
>> locks                unlimited
>>
>> Which one do you think could be responsible for that?
>>
>> I tried running all 66 processes on one machine as well as spreading them across several machines, and it always crashes the same way on the 66th process.
>>
>>>
>>>> Another thing I would like to know is whether it's normal that any of my processes calling MPI_Comm_connect or MPI_Comm_accept eats up a full CPU while the other side is not ready.
>>>
>>> Yes - the waiting process is polling in a tight loop waiting for the connection to be made.
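[As an aside, not from the original thread: the "nofiles 1024" value quoted above is the per-process descriptor limit Ralph suspects. A process can also check that limit itself and raise the soft limit up to the hard cap before MPI_Init; a minimal POSIX sketch with a hypothetical helper name:]

/* Sketch only: inspect and raise RLIMIT_NOFILE before calling MPI_Init.
 * The soft limit can only be raised up to the hard limit; anything
 * beyond that needs "ulimit -n" / limits.conf from the administrator. */
#include <stdio.h>
#include <sys/resource.h>

static void bump_fd_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return;
    }
    printf("fd limit: soft=%llu hard=%llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;   /* raise the soft limit to the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");
}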
>>>
>>>> Any help would be appreciated,
>>>> Grzegorz Maj
>>>>
>>>> 2010/4/24 Ralph Castain <r...@open-mpi.org>:
>>>>> Actually, OMPI is distributed with a daemon that does pretty much what you want. Check out "man ompi-server". I originally wrote that code to support cross-application MPI publish/subscribe operations, but we can utilize it here too. You'll have to blame me for not making it more publicly known.
>>>>> The attached patch upgrades ompi-server and modifies the singleton startup to provide your desired support. The solution works in the following manner:
>>>>>
>>>>> 1. Launch "ompi-server -report-uri <filename>". This starts a persistent daemon called "ompi-server" that acts as a rendezvous point for independently started applications. The problem with starting different applications and wanting them to MPI connect/accept lies in the need to have the applications find each other. If they can't discover contact info for the other app, then they can't wire up their interconnects. The "ompi-server" tool provides that rendezvous point. I don't like that comm_accept segfaulted - it should have just errored out.
>>>>>
>>>>> 2. Set OMPI_MCA_orte_server=file:<filename> in the environment where you will start your processes. This will allow your singleton processes to find the ompi-server. I also automatically set the envar that connects the MPI publish/subscribe system for you.
>>>>>
>>>>> 3. Run your processes. Since they think they are singletons, they will detect the presence of the above envar and automatically connect themselves to the "ompi-server" daemon. This gives each process the ability to perform any MPI-2 operation.
>>>>>
>>>>> I tested this on my machines and it worked, so hopefully it will meet your needs. You only need to run one "ompi-server", period, as long as you locate it where all of the processes can find the contact file and can open a TCP socket to the daemon. There is a way to knit multiple ompi-servers into a broader network (e.g., to connect processes that cannot directly access a server due to network segmentation), but it's a tad tricky - let me know if you require it and I'll try to help.
>>>>> If you have trouble wiring them all into a single communicator, you might ask separately about that and see if one of our MPI experts can provide advice (I'm just the RTE grunt).
>>>>> HTH - let me know how this works for you and I'll incorporate it into future OMPI releases.
>>>>> Ralph
>>>>>
>>>>>
>>>>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>>>>>
>>>>> Hi Ralph,
>>>>> I'm Krzysztof and I'm working with Grzegorz Maj on this small project/experiment of ours.
>>>>> We would definitely like to give your patch a try. But could you please explain your solution a little more?
>>>>> Would you still start one mpirun per MPI grid, and then have the processes we start join the MPI communicator?
>>>>> That is a good solution, of course. But it would be especially preferable to have one daemon running persistently on our "entry" machine that can handle several MPI grid starts.
>>>>> Can your patch help us this way too?
>>>>> Thanks for your help!
>>>>> Krzysztof
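[A sketch of what step 3 above can look like from the application side - not part of the original thread; the service name "my_grid" and the two roles are invented for illustration. Both processes are started without mpirun, with OMPI_MCA_orte_server=file:<filename> exported in their environment, and find each other through the standard MPI-2 name service that ompi-server backs:]

/* Illustrative sketch only (not from the thread). */
#include <mpi.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init(&argc, &argv);

    if (argc > 1) {                      /* "server" role: accept one client */
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("my_grid", MPI_INFO_NULL, port);   /* via ompi-server */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Unpublish_name("my_grid", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {                             /* "client" role: look up and connect */
        MPI_Lookup_name("my_grid", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    }

    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}

[Run the "server" role with any extra argument and the "client" role with none; whether publish/lookup succeeds of course depends on the singletons actually reaching the ompi-server daemon.]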
>>>>>
>>>>> On 24 April 2010 03:51, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>
>>>>>> In thinking about this, my proposed solution won't entirely fix the problem - you'll still wind up with all those daemons. I believe I can resolve that one as well, but it would require a patch.
>>>>>>
>>>>>> Would you like me to send you something you could try? It might take a couple of iterations to get it right...
>>>>>>
>>>>>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>>>>>
>>>>>>> Hmmm....I -think- this will work, but I cannot guarantee it:
>>>>>>>
>>>>>>> 1. Launch one process (it can just be a spinner) using mpirun with the following option:
>>>>>>>
>>>>>>> mpirun -report-uri file
>>>>>>>
>>>>>>> where "file" is some filename that mpirun can create and write its contact info into. This can be a relative or absolute path. This process must remain alive throughout your application - it doesn't matter what it does. Its purpose is solely to keep mpirun alive.
>>>>>>>
>>>>>>> 2. Set OMPI_MCA_dpm_orte_server=FILE:file in your environment, where "file" is the filename given above. This will tell your processes how to find mpirun, which acts as a meeting place to handle the connect/accept operations.
>>>>>>>
>>>>>>> Now run your processes and have them connect/accept to each other.
>>>>>>>
>>>>>>> The reason I cannot guarantee this will work is that these processes will all have the same rank and name, since they all start as singletons. Hence, connect/accept is likely to fail.
>>>>>>>
>>>>>>> But it -might- work, so you might want to give it a try.
>>>>>>>
>>>>>>> On Apr 23, 2010, at 8:10 AM, Grzegorz Maj wrote:
>>>>>>>
>>>>>>>> To be more precise: by 'server process' I mean some process that I could run once on my system and that could help in creating those groups.
>>>>>>>> My typical scenario is:
>>>>>>>> 1. run N separate processes, each without mpirun
>>>>>>>> 2. connect them into an MPI group
>>>>>>>> 3. do some job
>>>>>>>> 4. exit all N processes
>>>>>>>> 5. goto 1
>>>>>>>>
>>>>>>>> 2010/4/23 Grzegorz Maj <ma...@wp.pl>:
>>>>>>>>> Thank you Ralph for your explanation.
>>>>>>>>> Apart from that descriptor issue, is there any other way to solve my problem, i.e. to separately run a number of processes without mpirun and then collect them into an MPI intracomm group? If, for example, I needed to run some 'server process' (even using mpirun) for this task, that's OK. Any ideas?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Grzegorz Maj
>>>>>>>>>
>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>> Okay, but here is the problem. If you don't use mpirun, and are not operating in an environment we support for "direct" launch (i.e., starting processes outside of mpirun), then every one of those processes thinks it is a singleton - yes?
>>>>>>>>>>
>>>>>>>>>> What you may not realize is that each singleton immediately fork/exec's an orted daemon that is configured to behave just like mpirun. This is required in order to support MPI-2 operations such as MPI_Comm_spawn, MPI_Comm_connect/accept, etc.
>>>>>>>>>>
>>>>>>>>>> So if you launch 64 processes that think they are singletons, then you have 64 copies of orted running as well. This eats up a lot of file descriptors, which is probably why you are hitting this 65-process limit - your system is probably running out of file descriptors. You might check your system limits and see if you can get them revised upward.
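[For completeness, not from the thread: the "spinner" in step 1 of the mpirun workaround quoted above is just a trivial keep-alive program launched once under mpirun with the -report-uri option; the program name below is illustrative, and it does not even need to call MPI itself.]

/* Hypothetical spinner.c: does no useful work; it only keeps mpirun
 * alive so mpirun can act as the connect/accept rendezvous point
 * described in the workaround above. */
#include <unistd.h>

int main(void)
{
    for (;;)                 /* kill the mpirun job when the experiment is done */
        sleep(60);
}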
>>>>>>>>>>
>>>>>>>>>> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, I know. The problem is that I need to use a special way of running my processes provided by the environment in which I'm working, and unfortunately I can't use mpirun.
>>>>>>>>>>>
>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>> Guess I don't understand why you can't use mpirun - all it does is start things, provide a means to forward io, etc. It mainly sits there quietly without using any cpu unless required to support the job.
>>>>>>>>>>>>
>>>>>>>>>>>> Sounds like it would solve your problem. Otherwise, I know of no way to get all these processes into comm_world.
>>>>>>>>>>>>
>>>>>>>>>>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> I'd like to dynamically create a group of processes communicating via MPI. Those processes need to be run without mpirun and create an intracommunicator after startup. Any ideas how to do this efficiently?
>>>>>>>>>>>>> I came up with a solution in which the processes connect one by one using MPI_Comm_connect, but unfortunately all the processes that are already in the group need to call MPI_Comm_accept. This means that when the n-th process wants to connect, I need to collect all the n-1 processes on the MPI_Comm_accept call. After I run about 40 processes, every subsequent call takes more and more time, which I'd like to avoid.
>>>>>>>>>>>>> Another problem with this solution is that when I try to connect the 66th process, the root of the existing group segfaults on MPI_Comm_accept. Maybe it's my bug, but it's weird, as everything works fine for at most 65 processes. Is there any limitation I don't know about?
>>>>>>>>>>>>> My last question is about MPI_COMM_WORLD. When I run my processes without mpirun, their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is there any way to change MPI_COMM_WORLD and set it to the intracommunicator that I've created?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Grzegorz Maj
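[For reference, not from the thread: the one-by-one scheme described above - each newcomer connects, and the resulting intercommunicator is merged into a growing intracommunicator - can be sketched roughly as follows. How the newcomer learns the port name (file, name service, ...) is deliberately left out, and the function names are illustrative.]

/* Illustrative sketch of the incremental join scheme described above.
 * "group" starts as MPI_COMM_SELF in the first process. */
#include <mpi.h>

/* Existing members: everyone in "group" collectively accepts the
 * newcomer, then merges it into the intracommunicator. */
static MPI_Comm accept_one(MPI_Comm group, const char *port_name)
{
    MPI_Comm inter, merged;
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, group, &inter);
    MPI_Intercomm_merge(inter, /* high = */ 0, &merged);
    MPI_Comm_disconnect(&inter);
    return merged;
}

/* Newcomer: connect as a singleton and merge into the group. */
static MPI_Comm join_group(const char *port_name)
{
    MPI_Comm inter, merged;
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    MPI_Intercomm_merge(inter, /* high = */ 1, &merged);
    MPI_Comm_disconnect(&inter);
    return merged;
}

[Note that standard MPI provides no way to replace MPI_COMM_WORLD itself; the merged intracommunicator has to be carried around explicitly and used in its place.]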
>>>>
>>>> <client.c><server.c>