Hi Wolfgang,

Thanks for writing this up. I think that, in my circumstance, I will be able to get away with duplicating communicators, since the objects should exist for the entire program run (i.e., I might make a few dozen instances at most): e.g., each object can duplicate a communicator and then set up vectors and whatnot with it. I should be able to do all of this in my own code and not in deal.II, so I don't think I'll have to dig up any old patches to the library.
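For illustration, the pattern I have in mind is roughly the following. This is just a sketch, not deal.II or PETSc code; the class name, the starting tag, and the lack of error checking are all placeholders. Each object duplicates the communicator it is handed, stores a tag counter on the duplicate via MPI's attribute functions, and frees everything in its destructor:

#include <mpi.h>

// Hypothetical sketch: each instance owns a duplicated communicator and
// hands out message tags that are unique on that communicator.
class CommunicatorWrapper
{
public:
  explicit CommunicatorWrapper(MPI_Comm comm)
  {
    MPI_Comm_dup(comm, &duplicated_comm);        // private copy for this object

    // Attach a tag counter to the duplicate via MPI's attribute mechanism.
    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN,
                           &free_counter,        // called when the comm is freed
                           &tag_keyval,
                           nullptr);
    MPI_Comm_set_attr(duplicated_comm, tag_keyval, new int(100)); // arbitrary first tag
  }

  ~CommunicatorWrapper()
  {
    MPI_Comm_free(&duplicated_comm);             // also triggers free_counter()
    MPI_Comm_free_keyval(&tag_keyval);
  }

  // Return the next unused tag on this object's communicator.
  int get_new_tag() const
  {
    int *counter = nullptr;
    int  flag    = 0;
    MPI_Comm_get_attr(duplicated_comm, tag_keyval, &counter, &flag);
    return (*counter)++;
  }

  MPI_Comm comm() const { return duplicated_comm; }

private:
  MPI_Comm duplicated_comm;
  int      tag_keyval;

  static int free_counter(MPI_Comm, int, void *attribute_val, void *)
  {
    delete static_cast<int *>(attribute_val);
    return MPI_SUCCESS;
  }
};

Vectors and other objects would then be handed wrapper.comm() and could call wrapper.get_new_tag() whenever they need a fresh tag. (One simplification: this creates a keyval per object; a single keyval shared by all wrappers would be tidier, but that detail doesn't matter for the idea.)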
> It only manifested in a crash after a program had been running for many hours, and small changes to the program made the bug move to unrelated pieces of the code but did not make it appear any earlier in the run time of the program.

That's frightening, and now I'm not sure how PETSc avoids this problem; I am somewhat afraid to even look.

Best,
David

________________________________
From: dealii@googlegroups.com <dealii@googlegroups.com> on behalf of Wolfgang Bangerth <bange...@colostate.edu>
Sent: Thursday, February 18, 2021 12:19 PM
To: dealii@googlegroups.com <dealii@googlegroups.com>
Subject: Re: [deal.II] Tags and Communicators

> One fix that I have found (PETSc does this) is to assign every object its
> own duplicated communicator which can then keep track of its own tags with
> MPI's own get and set attributes functions.

Sometime back in the day (in the 2005-2010 range) I spent an inordinate amount of time going through every class that receives a communicator in some way or other, in the constructor or a reinit() call, and had each such class duplicate the communicator in the way you mention (Utilities::MPI::duplicate_communicator() does that). You can probably still find those patches if you look long enough.

The reason I did this was primarily that I wanted one added layer of security: if for some reason one process does not participate in a collective communication, you should get a deadlock, whereas you would get either funny errors or just completely unreliable results if that process proceeds to the next communication on the same communicator. Right now, in practice, all communication happens on MPI_COMM_WORLD.

But after spending a good amount of time duplicating all of these communicators (probably several days of work), I spent *an even larger amount of time* tracking down what was quite likely the worst-to-find bug I have ever worked on. It only manifested in a crash after a program had been running for many hours, and small changes to the program made the bug move to unrelated pieces of the code but did not make it appear any earlier in the run time of the program. In the end, what it boiled down to was that the MPI implementation I was using could only provide 64k different communicators, and if you asked for the 64k+1st communicator, it just crashed. In programs that do thousands of time steps and allocate a few vectors for temporary operations in each time step, you'd get there in a matter of hours. I had sort of expected that MPI implementations recycle released communicators, and I would expect that that's what they do today, but they didn't back then.

So after all of this time spent, I ripped out the duplication of communicators again. You can probably also find that patch in the repository and, with some luck, you might even be able to revert it if you allow some fuzz in indentation etc.

It would certainly be interesting to try this again. I'm still in favor of this approach: it's conceptually the right thing, it would help uncover bugs, and it is *necessary* if you want to do things multithreaded. (All of the MPI implementations have reentrant public interfaces these days, but we can't use them unless we also duplicate communicators somewhere.) But I would first want to try with a small test program whether that is scalable to very large numbers of communicators -- I think we would quite easily run into millions of communicators with this approach if programs run long enough, though of course only a rather small number would be live at any given time.

Best
 W.
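For reference, a minimal probe along those lines might look like the following. This is only a sketch: the cap of one million duplications is arbitrary, and MPI_ERRORS_RETURN is set so that a failing MPI_Comm_dup is reported instead of aborting the program (MPI errors are fatal by default). Freeing each duplicate immediately instead of keeping them all alive would instead test whether released communicators get recycled.

#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  // Report errors instead of aborting so we can see where duplication fails.
  MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

  std::vector<MPI_Comm> communicators;
  for (unsigned int i = 0; i < 1000000; ++i)      // arbitrary upper bound
    {
      MPI_Comm duplicate;
      if (MPI_Comm_dup(MPI_COMM_WORLD, &duplicate) != MPI_SUCCESS)
        {
          std::printf("MPI_Comm_dup failed with %u live communicators\n", i);
          break;
        }
      communicators.push_back(duplicate);
    }

  // Clean up whatever we managed to create.
  for (MPI_Comm &comm : communicators)
    MPI_Comm_free(&comm);

  MPI_Finalize();
  return 0;
}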
--
------------------------------------------------------------------------
Wolfgang Bangerth          email: bange...@colostate.edu
                           www:   http://www.math.colostate.edu/~bangerth/