Grzegorz: something occurred to me. When you start all these processes, how are you staggering their wireup? Are they flooding us, or are you time-shifting them a little?
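If flooding does turn out to be the issue, one crude way to time-shift the wire-up is a per-client delay before the connect. A rough sketch only: the client_id parameter, the 50 ms spacing and the jitter bounds are made up for illustration, not taken from the code discussed in this thread.

    #include <mpi.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Illustrative only: spread the clients' MPI_Comm_connect calls over a
     * window instead of firing them all at once.  'client_id' is whatever
     * unique id you already assign to each client. */
    static void staggered_connect(const char *port, int client_id, MPI_Comm *newcomm)
    {
        unsigned delay_ms;

        srand((unsigned)getpid());                       /* per-process jitter seed */
        delay_ms = (unsigned)client_id * 50u + (unsigned)(rand() % 50);

        sleep(delay_ms / 1000u);                         /* whole seconds */
        usleep((delay_ms % 1000u) * 1000u);              /* remaining milliseconds */

        MPI_Comm_connect((char *)port, MPI_INFO_NULL, 0, MPI_COMM_SELF, newcomm);
    }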
On Jul 19, 2010, at 10:32 AM, Edgar Gabriel wrote:

> Hm, so I am not sure how to approach this. First of all, the test case works for me. I used up to 80 clients, for both optimized and non-optimized compilation. I ran the tests with trunk (not with the 1.4 series, but the communicator code is identical in both cases). Clearly, the patch from Ralph is necessary to make it work.
>
> Additionally, I went through the communicator creation code for dynamic communicators trying to find spots that could create problems. The only place where I found the number 64 appear is the fortran-to-c mapping arrays (e.g. for communicators), where the initial size of the table is 64. I looked twice over the pointer-array code to see whether we could have a problem there (since it is a key piece of the cid allocation code for communicators), but I am fairly confident that it is correct.
>
> Note that we have other (non-dynamic) tests where comm_set is called 100,000 times, and the code per se does not seem to have a problem with being called that often. So I am not sure what else to look at.
>
> Edgar
>
> On 7/13/2010 8:42 PM, Ralph Castain wrote:
>> As far as I can tell, it appears the problem is somewhere in our communicator setup. The people knowledgeable on that area are going to look into it later this week.
>>
>> I'm creating a ticket to track the problem and will copy you on it.
>>
>> On Jul 13, 2010, at 6:57 AM, Ralph Castain wrote:
>>>
>>> On Jul 13, 2010, at 3:36 AM, Grzegorz Maj wrote:
>>>
>>>> Bad news..
>>>> I've tried the latest patch with and without the prior one, but it hasn't changed anything. I've also tried using the old code but with the OMPI_DPM_BASE_MAXJOBIDS constant changed to 80, but that didn't help either.
>>>> While looking through the sources of openmpi-1.4.2 I couldn't find any call of the function ompi_dpm_base_mark_dyncomm.
>>>
>>> It isn't directly called - it shows up in ompi_comm_set as ompi_dpm.mark_dyncomm. You were definitely overrunning that array, but I guess something else is also being hit. Have to look further...
>>>
>>>> 2010/7/12 Ralph Castain <r...@open-mpi.org>:
>>>>> Just so you don't have to wait for the 1.4.3 release, here is the patch (it doesn't include the prior patch).
>>>>>
>>>>> On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:
>>>>>
>>>>>> 2010/7/12 Ralph Castain <r...@open-mpi.org>:
>>>>>>> Dug around a bit and found the problem!!
>>>>>>>
>>>>>>> I have no idea who did this or why, but somebody set a limit of 64 separate jobids in the dynamic init called by ompi_comm_set, which builds the intercommunicator. Unfortunately, they hard-wired the array size but never check that size before adding to it.
>>>>>>>
>>>>>>> So after 64 calls to connect_accept, you are overwriting other areas of the code. As you found, hitting 66 causes it to segfault.
>>>>>>>
>>>>>>> I'll fix this on the developer's trunk (I'll also add that original patch to it). Rather than my searching this thread in detail, can you remind me which version you are using so I can patch it too?
>>>>>>
>>>>>> I'm using 1.4.2.
>>>>>> Thanks a lot; I'm looking forward to the patch.
>>>>>>
>>>>>>> Thanks for your patience with this!
>>>>>>> Ralph
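For anyone following along: the overrun Ralph describes above is the classic fixed-size-table pattern with no bounds check before the store. A rough sketch of the shape of the problem and the obvious guard, with made-up names (this is not the actual Open MPI code):

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for a hard-wired jobid table of 64 entries;
     * the real structures and names inside ompi/communicator differ. */
    typedef struct {
        uint32_t *jobids;
        int       size;    /* slots allocated (initially 64) */
        int       count;   /* slots used */
    } jobid_table_t;

    /* Grow-on-demand insert.  The bug discussed in this thread is the
     * absence of a check like this before writing past entry 64. */
    static int jobid_table_add(jobid_table_t *t, uint32_t jobid)
    {
        if (t->count == t->size) {
            int newsize = (t->size > 0) ? 2 * t->size : 64;
            uint32_t *tmp = realloc(t->jobids, newsize * sizeof(*tmp));
            if (NULL == tmp) {
                return -1;                 /* out of memory */
            }
            t->jobids = tmp;
            t->size   = newsize;
        }
        t->jobids[t->count++] = jobid;
        return 0;
    }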
>>>>>>>
>>>>>>> On Jul 12, 2010, at 7:20 AM, Grzegorz Maj wrote:
>>>>>>>
>>>>>>>> 1024 is not the problem: changing it to 2048 hasn't changed anything.
>>>>>>>> Following your advice I've run my process under gdb. Unfortunately I didn't get anything more than:
>>>>>>>>
>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>> [Switching to Thread 0xf7e4c6c0 (LWP 20246)]
>>>>>>>> 0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>>>>>>
>>>>>>>> (gdb) bt
>>>>>>>> #0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>>>>>> #1  0xf7e3ba95 in connect_accept () from /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
>>>>>>>> #2  0xf7f62013 in PMPI_Comm_connect () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>>>>>> #3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43
>>>>>>>>
>>>>>>>> What's more: when I added a breakpoint on ompi_comm_set in the 66th process and stepped a couple of instructions, one of the other processes crashed (as usual on ompi_comm_set) earlier than the 66th did.
>>>>>>>>
>>>>>>>> Finally I decided to recompile openmpi with the -g flag for gcc. In this case the 66-process issue is gone! I ran my applications exactly the same way as before (without even recompiling them) and successfully ran over 130 processes. When I switch back to the openmpi compilation without -g, it segfaults again.
>>>>>>>>
>>>>>>>> Any ideas? I'm really confused.
>>>>>>>>
>>>>>>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>> I would guess the #files limit of 1024. However, if it behaves the same way when spread across multiple machines, I would suspect it is somewhere in your program itself. Given that the segfault is in your process, can you use gdb to look at the core file and see where and why it fails?
>>>>>>>>>
>>>>>>>>> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>>>>>>>>>
>>>>>>>>>> 2010/7/7 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>
>>>>>>>>>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>> sorry for the late response, but I couldn't find free time to play with this. Finally I've applied the patch you prepared. I've launched my processes the way you described and I think it's working as you expected. None of my processes runs the orted daemon and they can perform MPI operations. Unfortunately I'm still hitting the 65-process issue :(
>>>>>>>>>>>> Maybe I'm doing something wrong. I attach my source code. If anybody could have a look at it, I would be grateful.
>>>>>>>>>>>>
>>>>>>>>>>>> When I run that code with clients_count <= 65 everything works fine: all the processes create a common grid, exchange some information and disconnect. When I set clients_count > 65, the 66th process crashes on MPI_Comm_connect (segmentation fault).
>>>>>>>>>>>
>>>>>>>>>>> I didn't have time to check the code, but my guess is that you are still hitting some kind of file descriptor or other limit. Check to see what your limits are - usually "ulimit" will tell you.
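If the nofiles limit does turn out to be the culprit, note that a process can also inspect and (up to the hard limit) raise its own descriptor limit at startup. A minimal POSIX sketch, independent of Open MPI:

    #include <stdio.h>
    #include <sys/resource.h>

    /* Print the file-descriptor limits and try to raise the soft limit to
     * the hard limit.  Going beyond the hard limit needs administrator
     * action (e.g. limits.conf), which is outside the scope of this note. */
    int main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("nofiles: soft=%llu hard=%llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);

        rl.rlim_cur = rl.rlim_max;          /* request the maximum allowed */
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("setrlimit");
        }
        return 0;
    }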
>>>>>>>>>>
>>>>>>>>>> My limits are:
>>>>>>>>>>   time (seconds)       unlimited
>>>>>>>>>>   file (blocks)        unlimited
>>>>>>>>>>   data (kb)            unlimited
>>>>>>>>>>   stack (kb)           10240
>>>>>>>>>>   coredump (blocks)    0
>>>>>>>>>>   memory (kb)          unlimited
>>>>>>>>>>   locked memory (kb)   64
>>>>>>>>>>   process              200704
>>>>>>>>>>   nofiles              1024
>>>>>>>>>>   vmemory (kb)         unlimited
>>>>>>>>>>   locks                unlimited
>>>>>>>>>>
>>>>>>>>>> Which one do you think could be responsible for that?
>>>>>>>>>>
>>>>>>>>>> I have tried running all 66 processes on one machine and spreading them across several machines, and it always crashes the same way on the 66th process.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Another thing I would like to know: is it normal that any of my processes calling MPI_Comm_connect or MPI_Comm_accept while the other side is not ready eats up a full CPU?
>>>>>>>>>>>
>>>>>>>>>>> Yes - the waiting process is polling in a tight loop waiting for the connection to be made.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Any help would be appreciated,
>>>>>>>>>>>> Grzegorz Maj
>>>>>>>>>>>>
>>>>>>>>>>>> 2010/4/24 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>> Actually, OMPI is distributed with a daemon that does pretty much what you want. Check out "man ompi-server". I originally wrote that code to support cross-application MPI publish/subscribe operations, but we can utilize it here too. Have to blame me for not making it more publicly known.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The attached patch upgrades ompi-server and modifies the singleton startup to provide your desired support. This solution works in the following manner:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Launch "ompi-server -report-uri <filename>". This starts a persistent daemon called "ompi-server" that acts as a rendezvous point for independently started applications. The problem with starting different applications and wanting them to MPI connect/accept lies in the need to have the applications find each other. If they can't discover contact info for the other app, then they can't wire up their interconnects. The "ompi-server" tool provides that rendezvous point. I don't like that comm_accept segfaulted - it should have just errored out.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. Set OMPI_MCA_orte_server=file:<filename> in the environment where you will start your processes. This will allow your singleton processes to find the ompi-server. I also automatically set the envar to connect the MPI publish/subscribe system for you.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. Run your processes. As they think they are singletons, they will detect the presence of the above envar and automatically connect themselves to the "ompi-server" daemon. This gives each process the ability to perform any MPI-2 operation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tested this on my machines and it worked, so hopefully it will meet your needs.
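Once the singletons can reach ompi-server, the connect/accept rendezvous itself is plain MPI-2. A minimal sketch of the two sides (the service name "grid", the use of MPI_COMM_SELF and the absence of error handling are all illustrative; call these after MPI_Init):

    #include <mpi.h>

    /* Accept side: open a port, publish it under a well-known name via the
     * name server, wait for one client, and merge into an intracommunicator. */
    static MPI_Comm accept_one_client(void)
    {
        char     port[MPI_MAX_PORT_NAME];
        MPI_Comm inter, intra;

        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("grid", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Intercomm_merge(inter, 0, &intra);     /* accepting side ranks first */
        MPI_Comm_free(&inter);
        return intra;
    }

    /* Connect side: look the port up by name and connect to it. */
    static MPI_Comm connect_to_grid(void)
    {
        char     port[MPI_MAX_PORT_NAME];
        MPI_Comm inter, intra;

        MPI_Lookup_name("grid", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Intercomm_merge(inter, 1, &intra);     /* connecting side ranks last */
        MPI_Comm_free(&inter);
        return intra;
    }

Whether MPI_Publish_name/MPI_Lookup_name resolve across independently started singletons is exactly what the ompi-server rendezvous described above is meant to provide.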
>>>>>>>>>>>>> You only need to run one "ompi-server", period, so long as you locate it where all of the processes can find the contact file and can open a TCP socket to the daemon. There is a way to knit multiple ompi-servers into a broader network (e.g., to connect processes that cannot directly access a server due to network segmentation), but it's a tad tricky - let me know if you require it and I'll try to help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you have trouble wiring them all into a single communicator, you might ask separately about that and see if one of our MPI experts can provide advice (I'm just the RTE grunt).
>>>>>>>>>>>>>
>>>>>>>>>>>>> HTH - let me know how this works for you and I'll incorporate it into future OMPI releases.
>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>> I'm Krzysztof and I'm working with Grzegorz Maj on this small project/experiment of ours. We definitely would like to give your patch a try. But could you please explain your solution a little more?
>>>>>>>>>>>>> You would still like us to start one mpirun per MPI grid, and then have the processes we start join the MPI comm? That is a good solution, of course. But it would be especially preferable to have one daemon running persistently on our "entry" machine that can handle several MPI grid starts. Can your patch help us this way too?
>>>>>>>>>>>>> Thanks for your help!
>>>>>>>>>>>>> Krzysztof
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 24 April 2010 03:51, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In thinking about this, my proposed solution won't entirely fix the problem - you'll still wind up with all those daemons. I believe I can resolve that one as well, but it would require a patch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would you like me to send you something you could try? It might take a couple of iterations to get it right...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hmmm....I -think- this will work, but I cannot guarantee it:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Launch one process (it can just be a spinner) using mpirun with the following option:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> mpirun -report-uri file
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> where "file" is some filename that mpirun can create and insert its contact info into. This can be a relative or absolute path. This process must remain alive throughout your application - it doesn't matter what it does. Its purpose is solely to keep mpirun alive.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Set OMPI_MCA_dpm_orte_server=FILE:file in your environment, where "file" is the filename given above. This will tell your processes how to find mpirun, which acts as a meeting place to handle the connect/accept operations.
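The "spinner" in step 1 can be as dumb as it sounds; a minimal sketch (the 60-second sleep interval is arbitrary):

    #include <mpi.h>
    #include <unistd.h>

    /* Placeholder process launched as:  mpirun -report-uri file -np 1 ./spinner
     * Its only job is to keep mpirun (and therefore the contact file) alive. */
    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        for (;;) {
            sleep(60);   /* idle instead of busy-spinning; kill the job to stop it */
        }
        /* never reached */
    }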
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now run your processes, and have them connect/accept to each other.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The reason I cannot guarantee this will work is that these processes will all have the same rank && name, since they all start as singletons. Hence, connect/accept is likely to fail.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But it -might- work, so you might want to give it a try.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Apr 23, 2010, at 8:10 AM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To be more precise: by 'server process' I mean some process that I could run once on my system and that could help in creating those groups.
>>>>>>>>>>>>>>>> My typical scenario is:
>>>>>>>>>>>>>>>> 1. run N separate processes, each without mpirun
>>>>>>>>>>>>>>>> 2. connect them into an MPI group
>>>>>>>>>>>>>>>> 3. do some job
>>>>>>>>>>>>>>>> 4. exit all N processes
>>>>>>>>>>>>>>>> 5. goto 1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2010/4/23 Grzegorz Maj <ma...@wp.pl>:
>>>>>>>>>>>>>>>>> Thank you Ralph for your explanation.
>>>>>>>>>>>>>>>>> Apart from that descriptor issue, is there any other way to solve my problem, i.e. to separately run a number of processes without mpirun and then collect them into an MPI intracomm group? If I needed to run some 'server process' for this task (even using mpirun), that would be OK. Any ideas?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Grzegorz Maj
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>>>>>>> Okay, but here is the problem. If you don't use mpirun, and are not operating in an environment we support for "direct" launch (i.e., starting processes outside of mpirun), then every one of those processes thinks it is a singleton - yes?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What you may not realize is that each singleton immediately fork/exec's an orted daemon that is configured to behave just like mpirun. This is required in order to support MPI-2 operations such as MPI_Comm_spawn, MPI_Comm_connect/accept, etc.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So if you launch 64 processes that think they are singletons, then you have 64 copies of orted running as well. This eats up a lot of file descriptors, which is probably why you are hitting this 65-process limit - your system is probably running out of file descriptors. You might check your system limits and see if you can get them revised upward.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yes, I know.
>>>>>>>>>>>>>>>>>>> The problem is that I need to use some special way of running my processes provided by the environment in which I'm working, and unfortunately I can't use mpirun.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2010/4/18 Ralph Castain <r...@open-mpi.org>:
>>>>>>>>>>>>>>>>>>>> Guess I don't understand why you can't use mpirun - all it does is start things, provide a means to forward io, etc. It mainly sits there quietly without using any cpu unless required to support the job.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sounds like it would solve your problem. Otherwise, I know of no way to get all these processes into comm_world.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>> I'd like to dynamically create a group of processes communicating via MPI. Those processes need to be run without mpirun and create an intracommunicator after startup. Any ideas how to do this efficiently?
>>>>>>>>>>>>>>>>>>>>> I came up with a solution in which the processes connect one by one using MPI_Comm_connect, but unfortunately all the processes that are already in the group need to call MPI_Comm_accept. This means that when the n-th process wants to connect I need to collect all the n-1 processes on the MPI_Comm_accept call. After I run about 40 processes, every subsequent call takes more and more time, which I'd like to avoid.
>>>>>>>>>>>>>>>>>>>>> Another problem with this solution is that when I try to connect the 66th process, the root of the existing group segfaults on MPI_Comm_accept. Maybe it's my bug, but it's weird, as everything works fine for at most 65 processes. Is there any limitation I don't know about?
>>>>>>>>>>>>>>>>>>>>> My last question is about MPI_COMM_WORLD. When I run my processes without mpirun, their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is there any way to change MPI_COMM_WORLD and set it to the intracommunicator that I've created?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Grzegorz Maj
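Coming back to the original question above: the incremental scheme being described - every current member joins the accept, the resulting intercommunicator is merged, and the merged intracommunicator is used for the next accept - looks roughly like this sketch (the names, the out-of-band port distribution and the missing error handling are illustrative):

    #include <mpi.h>

    /* Grow an intracommunicator one joining process at a time.  'port' was
     * obtained from MPI_Open_port on the root and distributed out of band;
     * 'nclients' is how many joiners to wait for.  Every current member of
     * 'group' must call this, which is exactly the growing collective cost
     * described in the original post. */
    static MPI_Comm grow_group(MPI_Comm start, const char *port, int nclients)
    {
        MPI_Comm group = start;            /* begins as MPI_COMM_SELF on the root */

        for (int i = 0; i < nclients; i++) {
            MPI_Comm inter, merged;

            /* Collective over 'group': all existing members participate. */
            MPI_Comm_accept((char *)port, MPI_INFO_NULL, 0, group, &inter);
            MPI_Intercomm_merge(inter, 0, &merged);
            MPI_Comm_free(&inter);
            if (group != start) {
                MPI_Comm_free(&group);
            }
            group = merged;
        }
        return group;
    }

Each joiner would call MPI_Comm_connect over MPI_COMM_SELF, perform the same MPI_Intercomm_merge (with high = 1), and then take part in the remaining accept calls; the merged intracommunicator is then used in place of MPI_COMM_WORLD, since the predefined MPI_COMM_WORLD itself cannot be swapped out.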