Just so you don't have to wait for the 1.4.3 release, here is the patch (it doesn't include the prior patch).

Attachment: dpm.diff
Description: Binary data
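
For context, the bug described further down in this thread amounts to a fixed-size bookkeeping array that is appended to without a bounds check, so the 65th entry scribbles over neighbouring memory. A minimal sketch of that pattern and of the kind of check a fix needs (the names are illustrative, not the actual Open MPI symbols):

/* Illustrative only -- not the actual Open MPI source. */
#define MAX_JOBIDS 64               /* hypothetical hard-wired limit */

static int jobids[MAX_JOBIDS];
static int num_jobids = 0;

/* Buggy pattern: once the array is full, the next jobid is written past
 * its end, corrupting whatever happens to live next to it. */
void add_jobid_unchecked(int jobid)
{
    jobids[num_jobids++] = jobid;   /* no check against MAX_JOBIDS */
}

/* Shape of the fix: check (or grow the storage) before writing. */
int add_jobid_checked(int jobid)
{
    if (num_jobids >= MAX_JOBIDS) {
        return -1;                  /* report an error instead of corrupting memory */
    }
    jobids[num_jobids++] = jobid;
    return 0;
}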


On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:

> 2010/7/12 Ralph Castain <r...@open-mpi.org>:
>> Dug around a bit and found the problem!!
>> 
>> I have no idea who did this or why, but somebody set a limit of 64 separate
>> jobids in the dynamic init called by ompi_comm_set, which builds the
>> intercommunicator. Unfortunately, they hard-wired the array size but never
>> check that size before adding to it.
>>
>> So after 64 calls to connect_accept, you are overwriting other areas of
>> memory. As you found, hitting 66 causes it to segfault.
>> 
>> I'll fix this on the developer's trunk (I'll also add that original patch to 
>> it). Rather than my searching this thread in detail, can you remind me what 
>> version you are using so I can patch it too?
> 
> I'm using 1.4.2
> Thanks a lot, and I'm looking forward to the patch.
> 
>> 
>> Thanks for your patience with this!
>> Ralph
>> 
>> 
>> On Jul 12, 2010, at 7:20 AM, Grzegorz Maj wrote:
>> 
>>> 1024 is not the problem: changing it to 2048 hasn't changed anything.
>>> Following your advice I've run my process under gdb. Unfortunately I
>>> didn't get anything more than:
>>> 
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0xf7e4c6c0 (LWP 20246)]
>>> 0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>> 
>>> (gdb) bt
>>> #0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>> #1  0xf7e3ba95 in connect_accept () from
>>> /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
>>> #2  0xf7f62013 in PMPI_Comm_connect () from 
>>> /home/gmaj/openmpi/lib/libmpi.so.0
>>> #3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43
>>> 
>>> What's more: when I added a breakpoint on ompi_comm_set in the 66th
>>> process and stepped through a couple of instructions, one of the other
>>> processes crashed (as usual, on ompi_comm_set) before the 66th did.
>>> 
>>> Finally I decided to recompile Open MPI with gcc's -g flag. In this
>>> case the 66-process issue is gone! I was running my applications
>>> exactly the same way as previously (without even recompiling them) and
>>> I successfully ran over 130 processes.
>>> When switching back to the Open MPI build without -g, it segfaults
>>> again.
>>> 
>>> Any ideas? I'm really confused.
>>> 
>>> 
>>> 
>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>> I would guess the #files limit of 1024. However, if it behaves the same 
>>>> way when spread across multiple machines, I would suspect it is somewhere 
>>>> in your program itself. Given that the segfault is in your process, can 
>>>> you use gdb to look at the core file and see where and why it fails?
>>>> 
>>>> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>>>> 
>>>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>>>> 
>>>>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>>>> 
>>>>>>> Hi Ralph,
>>>>>>> sorry for the late response, but I couldn't find free time to play
>>>>>>> with this. I've finally applied the patch you prepared. I've launched
>>>>>>> my processes in the way you've described and I think it's working as
>>>>>>> you expected. None of my processes runs the orted daemon and they can
>>>>>>> perform MPI operations. Unfortunately I'm still hitting the 65-process
>>>>>>> issue :(
>>>>>>> Maybe I'm doing something wrong.
>>>>>>> I attach my source code. If anybody could have a look at this, I would
>>>>>>> be grateful.
>>>>>>> 
>>>>>>> When I run that code with clients_count <= 65 everything works fine:
>>>>>>> all the processes create a common grid, exchange some information and
>>>>>>> disconnect.
>>>>>>> When I set clients_count > 65 the 66th process crashes on
>>>>>>> MPI_Comm_connect (segmentation fault).
>>>>>> 
>>>>>> I didn't have time to check the code, but my guess is that you are still 
>>>>>> hitting some kind of file descriptor or other limit. Check to see what 
>>>>>> your limits are - usually "ulimit" will tell you.
>>>>> 
>>>>> My limits are:
>>>>> time(seconds)        unlimited
>>>>> file(blocks)         unlimited
>>>>> data(kb)             unlimited
>>>>> stack(kb)            10240
>>>>> coredump(blocks)     0
>>>>> memory(kb)           unlimited
>>>>> locked memory(kb)    64
>>>>> process              200704
>>>>> nofiles              1024
>>>>> vmemory(kb)          unlimited
>>>>> locks                unlimited
>>>>> 
>>>>> Which one do you think could be responsible for that?
>>>>> 
>>>>> I tried running all 66 processes on one machine and also spreading them
>>>>> across several machines, and it always crashes the same way on the 66th
>>>>> process.
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Another thing I would like to know: is it normal that when any of my
>>>>>>> processes calls MPI_Comm_connect or MPI_Comm_accept and the other side
>>>>>>> is not ready, it eats up a full CPU?
>>>>>> 
>>>>>> Yes - the waiting process is polling in a tight loop waiting for the 
>>>>>> connection to be made.
>>>>>> 
>>>>>>> 
>>>>>>> Any help would be appreciated,
>>>>>>> Grzegorz Maj
>>>>>>> 
>>>>>>> 
>>>>>>> 2010/4/24 Ralph Castain <r...@open-mpi.org>:
>>>>>>>> Actually, OMPI is distributed with a daemon that does pretty much what
>>>>>>>> you want. Check out "man ompi-server". I originally wrote that code to
>>>>>>>> support cross-application MPI publish/subscribe operations, but we can
>>>>>>>> utilize it here too. Have to blame me for not making it more publicly
>>>>>>>> known.
>>>>>>>> The attached patch upgrades ompi-server and modifies the singleton
>>>>>>>> startup to provide your desired support. This solution works in the
>>>>>>>> following manner:
>>>>>>>> 1. launch "ompi-server -report-uri <filename>". This starts a persistent
>>>>>>>> daemon called "ompi-server" that acts as a rendezvous point for
>>>>>>>> independently started applications. The problem with starting different
>>>>>>>> applications and wanting them to MPI connect/accept lies in the need to
>>>>>>>> have the applications find each other. If they can't discover contact
>>>>>>>> info for the other app, then they can't wire up their interconnects. The
>>>>>>>> "ompi-server" tool provides that rendezvous point. I don't like that
>>>>>>>> comm_accept segfaulted - it should have just errored out.
>>>>>>>> 2. set OMPI_MCA_orte_server=file:<filename> in the environment where you
>>>>>>>> will start your processes. This will allow your singleton processes to
>>>>>>>> find the ompi-server. I also automatically set the envar to connect the
>>>>>>>> MPI publish/subscribe system for you.
>>>>>>>> 3. run your processes. As they think they are singletons, they will
>>>>>>>> detect the presence of the above envar and automatically connect
>>>>>>>> themselves to the "ompi-server" daemon. This provides each process with
>>>>>>>> the ability to perform any MPI-2 operation.
>>>>>>>> I tested this on my machines and it worked, so hopefully it will meet
>>>>>>>> your needs. You only need to run one "ompi-server", period, so long as
>>>>>>>> you locate it where all of the processes can find the contact file and
>>>>>>>> can open a TCP socket to the daemon. There is a way to knit multiple
>>>>>>>> ompi-servers into a broader network (e.g., to connect processes that
>>>>>>>> cannot directly access a server due to network segmentation), but it's a
>>>>>>>> tad tricky - let me know if you require it and I'll try to help.
>>>>>>>> If you have trouble wiring them all into a single communicator, you
>>>>>>>> might ask separately about that and see if one of our MPI experts can
>>>>>>>> provide advice (I'm just the RTE grunt).
>>>>>>>> HTH - let me know how this works for you and I'll incorporate it into
>>>>>>>> future OMPI releases.
>>>>>>>> Ralph
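
For illustration, here is a minimal sketch of how two independently started singleton processes might rendezvous through ompi-server using the standard MPI-2 name-service calls, once OMPI_MCA_orte_server points at the URI file from step 1. The service name "my_rendezvous" and the argv-based role selection are made up for this example; this is a hedged sketch, not the code from the patch:

#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init(&argc, &argv);   /* each process starts as a singleton */

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* One side opens a port and publishes it via ompi-server. */
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("my_rendezvous", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Unpublish_name("my_rendezvous", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        /* The other side looks the port up and connects.  A real client
         * may need to retry until the name has been published. */
        MPI_Lookup_name("my_rendezvous", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    }

    /* ... use the intercommunicator ... */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}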
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>>>>>>>> 
>>>>>>>> Hi Ralph,
>>>>>>>> I'm Krzysztof and I'm working with Grzegorz Maj on this small
>>>>>>>> project/experiment of ours.
>>>>>>>> We would definitely like to give your patch a try. But could you please
>>>>>>>> explain your solution a little more?
>>>>>>>> You would still like to start one mpirun per MPI grid, and then have the
>>>>>>>> processes started by us join the MPI comm?
>>>>>>>> That is a good solution, of course.
>>>>>>>> But it would be especially preferable to have one daemon running
>>>>>>>> persistently on our "entry" machine that can handle several MPI grid
>>>>>>>> starts.
>>>>>>>> Can your patch help us this way too?
>>>>>>>> Thanks for your help!
>>>>>>>> Krzysztof
>>>>>>>> 
>>>>>>>> On 24 April 2010 03:51, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>> 
>>>>>>>>> In thinking about this, my proposed solution won't entirely fix the
>>>>>>>>> problem - you'll still wind up with all those daemons. I believe I can
>>>>>>>>> resolve that one as well, but it would require a patch.
>>>>>>>>> 
>>>>>>>>> Would you like me to send you something you could try? Might take a
>>>>>>>>> couple of iterations to get it right...
>>>>>>>>> 
>>>>>>>>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>>>>>>>> 
>>>>>>>>>> Hmmm....I -think- this will work, but I cannot guarantee it:
>>>>>>>>>> 
>>>>>>>>>> 1. launch one process (can just be a spinner) using mpirun that
>>>>>>>>>> includes the following option:
>>>>>>>>>> 
>>>>>>>>>> mpirun -report-uri file
>>>>>>>>>> 
>>>>>>>>>> where file is some filename that mpirun can create and insert its
>>>>>>>>>> contact info into. This can be a relative or absolute path. This
>>>>>>>>>> process must remain alive throughout your application - it doesn't
>>>>>>>>>> matter what it does. Its purpose is solely to keep mpirun alive.
>>>>>>>>>>
>>>>>>>>>> 2. set OMPI_MCA_dpm_orte_server=FILE:file in your environment, where
>>>>>>>>>> "file" is the filename given above. This will tell your processes how
>>>>>>>>>> to find mpirun, which is acting as a meeting place to handle the
>>>>>>>>>> connect/accept operations.
>>>>>>>>>> 
>>>>>>>>>> Now run your processes, and have them connect/accept to each other.
>>>>>>>>>> 
>>>>>>>>>> The reason I cannot guarantee this will work is that these processes
>>>>>>>>>> will all have the same rank && name since they all start as 
>>>>>>>>>> singletons.
>>>>>>>>>> Hence, connect/accept is likely to fail.
>>>>>>>>>> 
>>>>>>>>>> But it -might- work, so you might want to give it a try.
>>>>>>>>>> 
>>>>>>>>>> On Apr 23, 2010, at 8:10 AM, Grzegorz Maj wrote:
>>>>>>>>>> 
>>>>>>>>>>> To be more precise: by 'server process' I mean some process that I
>>>>>>>>>>> could run once on my system and that could help in creating those
>>>>>>>>>>> groups.
>>>>>>>>>>> My typical scenario is:
>>>>>>>>>>> 1. run N separate processes, each without mpirun
>>>>>>>>>>> 2. connect them into MPI group
>>>>>>>>>>> 3. do some job
>>>>>>>>>>> 4. exit all N processes
>>>>>>>>>>> 5. goto 1
>>>>>>>>>>> 
>>>>>>>>>>> 2010/4/23 Grzegorz Maj <ma...@wp.pl>:
>>>>>>>>>>>> Thank you Ralph for your explanation.
>>>>>>>>>>>> And, apart from that file descriptor issue, is there any other way
>>>>>>>>>>>> to solve my problem, i.e. to separately run a number of processes
>>>>>>>>>>>> without mpirun and then collect them into an MPI intracomm group?
>>>>>>>>>>>> If, for example, I needed to run some 'server process' (even using
>>>>>>>>>>>> mpirun) for this task, that would be OK. Any ideas?
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Grzegorz Maj
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>> Okay, but here is the problem. If you don't use mpirun, and are not
>>>>>>>>>>>>> operating in an environment we support for "direct" launch (i.e.,
>>>>>>>>>>>>> starting processes outside of mpirun), then every one of those
>>>>>>>>>>>>> processes thinks it is a singleton - yes?
>>>>>>>>>>>>>
>>>>>>>>>>>>> What you may not realize is that each singleton immediately
>>>>>>>>>>>>> fork/exec's an orted daemon that is configured to behave just like
>>>>>>>>>>>>> mpirun. This is required in order to support MPI-2 operations such
>>>>>>>>>>>>> as MPI_Comm_spawn, MPI_Comm_connect/accept, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So if you launch 64 processes that think they are singletons, then
>>>>>>>>>>>>> you have 64 copies of orted running as well. This eats up a lot of
>>>>>>>>>>>>> file descriptors, which is probably why you are hitting this
>>>>>>>>>>>>> 65-process limit - your system is probably running out of file
>>>>>>>>>>>>> descriptors. You might check your system limits and see if you can
>>>>>>>>>>>>> get them revised upward.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, I know. The problem is that I need to use a special way of
>>>>>>>>>>>>>> launching my processes that is provided by the environment in
>>>>>>>>>>>>>> which I'm working, and unfortunately I can't use mpirun.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>>>> Guess I don't understand why you can't use mpirun - all it does
>>>>>>>>>>>>>>> is start things, provide a means to forward io, etc. It mainly
>>>>>>>>>>>>>>> sits there quietly without using any cpu unless required to
>>>>>>>>>>>>>>> support the job.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Sounds like it would solve your problem. Otherwise, I know of no
>>>>>>>>>>>>>>> way to get all these processes into comm_world.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> I'd like to dynamically create a group of processes
>>>>>>>>>>>>>>>> communicating via MPI. Those processes need to be run without
>>>>>>>>>>>>>>>> mpirun and create an intracommunicator after startup. Any ideas
>>>>>>>>>>>>>>>> how to do this efficiently?
>>>>>>>>>>>>>>>> I came up with a solution in which the processes connect one by
>>>>>>>>>>>>>>>> one using MPI_Comm_connect, but unfortunately all the processes
>>>>>>>>>>>>>>>> that are already in the group need to call MPI_Comm_accept.
>>>>>>>>>>>>>>>> This means that when the n-th process wants to connect I need
>>>>>>>>>>>>>>>> to collect all the n-1 processes on the MPI_Comm_accept call.
>>>>>>>>>>>>>>>> After I run about 40 processes, every subsequent call takes
>>>>>>>>>>>>>>>> more and more time, which I'd like to avoid.
>>>>>>>>>>>>>>>> Another problem with this solution is that when I try to
>>>>>>>>>>>>>>>> connect the 66th process, the root of the existing group
>>>>>>>>>>>>>>>> segfaults on MPI_Comm_accept. Maybe it's my bug, but it's
>>>>>>>>>>>>>>>> weird, as everything works fine for at most 65 processes. Is
>>>>>>>>>>>>>>>> there any limitation I don't know about?
>>>>>>>>>>>>>>>> My last question is about MPI_COMM_WORLD. When I run my
>>>>>>>>>>>>>>>> processes without mpirun, their MPI_COMM_WORLD is the same as
>>>>>>>>>>>>>>>> MPI_COMM_SELF. Is there any way to change MPI_COMM_WORLD and
>>>>>>>>>>>>>>>> set it to the intracommunicator that I've created?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Grzegorz Maj
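
To make the join scheme described in the message above concrete, here is a rough C sketch (my own illustration, with made-up helper names) of the one-at-a-time pattern: every current member participates in the accept, and each side merges the resulting intercommunicator so the newcomer ends up in the common intracommunicator. Because the accept is collective over the whole existing group, joining the n-th process involves all n-1 members, which matches the growing cost reported above.

#include <mpi.h>

/* Existing group: every member calls this collectively; the port name
 * must have been opened by the root and shared out of band. */
MPI_Comm accept_one(MPI_Comm group, const char *port)
{
    MPI_Comm inter, merged;
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, group, &inter);
    MPI_Intercomm_merge(inter, 0, &merged);   /* existing members ordered "low" */
    MPI_Comm_disconnect(&inter);
    return merged;                            /* new, larger intracommunicator */
}

/* Newcomer: connects over MPI_COMM_SELF, then merges into the group. */
MPI_Comm join_group(const char *port)
{
    MPI_Comm inter, merged;
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    MPI_Intercomm_merge(inter, 1, &merged);   /* newcomer ordered "high" */
    MPI_Comm_disconnect(&inter);
    return merged;
}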
>>>>>>> <client.c><server.c>