On Mon, Jun 14, 2021 at 2:32 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> Why do you think we don't need to check index AM functions? Say we
> have an index expression that uses function and if its parallel safety
> is changed then probably that also impacts whether we can do insert in
> parallel. Because otherwise, we will end up executing some parallel
> unsafe function in parallel mode during index insertion.
I'm not saying that we don't need to check index expressions. I agree that we need to check those. The index AM functions are things like btint4cmp(). I don't think that a function like that should ever be parallel-unsafe.

> Yeah, this could be one idea but I think even if we use pg_proc OID,
> we still need to check all the rel cache entries to find which one
> contains the invalidated OID and that could be expensive. I wonder
> can't we directly find the relation involved and register invalidation
> for the same? We are able to find the relation to which trigger
> function is associated during drop function via findDependentObjects
> by scanning pg_depend. Assuming, we are able to find the relation for
> trigger function by scanning pg_depend, what kinds of problems do we
> envision in registering the invalidation for the same?

I don't think that finding the relation involved and registering an invalidation for it will work properly. Suppose there is a concurrently running transaction which has created a new table and attached a trigger function to it. You can't see any of the catalog entries for that relation yet, so you can't invalidate it, but invalidation needs to happen. Even if you used some snapshot that can see those catalog entries before they are committed, I doubt that fixes the race condition. You can't hold any lock on that relation, because the creating transaction holds AccessExclusiveLock, but the whole invalidation mechanism is built around the assumption that the sender puts messages into the shared queue first and then releases locks, while the receiver first acquires a conflicting lock and then processes messages from the queue. Without locks, that synchronization algorithm can't work reliably.

As a consequence of all that, I believe that, not just in this particular case but in general, the invalidation message needs to describe the thing that has actually changed, rather than any derived property.
We can make invalidations that say "some function's parallel-safety flag has changed" or "this particular function's parallel-safety flag has changed" or "this particular function has changed in some way" (this one we have already), but anything that involves trying to figure out what the consequences of such a change might be and saying "hey, you, please update XYZ because I changed something somewhere that could affect that" is not going to be correct.

> I think we probably need to worry about the additional cost to find
> dependent objects and if there are any race conditions in doing so as
> pointed out by Tom in his email [1]. The concern related to cost could
> be addressed by your idea of registering such an invalidation only
> when the user changes the parallel safety of the function which we
> don't expect to be a frequent operation. Now, here the race condition,
> I could think of could be that by the time we change parallel-safety
> (say making it unsafe) of a function, some of the other sessions might
> have already started processing insert on a relation where that
> function is associated via trigger or check constraint in which case
> there could be a problem. I think to avoid that we need to acquire an
> Exclusive lock on the relation as we are doing in Rename Policy kind
> of operations.

Well, the big issue here is that we don't actually lock functions while they are in use. So there's absolutely nothing that prevents a function from being altered in any arbitrary way, or even dropped, while code that uses it is running. I don't really know what happens in practice if you do that sort of thing: can you get the same query to run with one function definition for the first part of execution and some other definition for the rest of execution? I tend to doubt it, because I suspect we cache the function definition at some point.
If that's the case, caching the parallel-safety marking at the same point seems OK too, or at least no worse than what we're doing already. On the other hand, if it is possible for a query's notion of the function definition to shift while the query is in flight, then this is just another example of that and no worse than any other. Instead of changing the parallel-safety flag, somebody could redefine the function so that it divides by zero or produces a syntax error and, kaboom, running queries break. Either way, I don't see what the big deal is. As long as we make the handling of parallel-safety consistent with other ways the function could be concurrently redefined, it won't suck any more than the current system already does, or in any fundamentally new ways.

Even if this line of thinking is correct, there's a big issue for partitioning hierarchies, because there you need to know stuff about relations that you have no other reason to open. I'm just arguing that if there's no partitioning, the problem seems reasonably solvable: either you changed something about the relation, in which case you've got to lock it and issue invalidations, or you've changed something about the function, which could be handled via a new type of invalidation.

I don't really see why the cost would be particularly bad. Suppose that for every relation, you have a flag which is either PARALLEL_DML_SAFE, PARALLEL_DML_RESTRICTED, PARALLEL_DML_UNSAFE, or PARALLEL_DML_SAFETY_UNKNOWN. When someone sends a message saying "some existing function's parallel-safety changed!" you reset that flag for every relation in the relcache to PARALLEL_DML_SAFETY_UNKNOWN. Then if somebody does DML on that relation and we want to consider parallelism, it's got to recompute that flag. None of that sounds horribly expensive.
I mean, it could be somewhat annoying if you have 100k relations open and sit around all day flipping parallel-safety markings on and off and then doing a single-row insert after each flip, but if that's the only scenario where we incur significant extra overhead from this kind of design, it seems clearly better than forcing users to set a flag manually. Maybe it isn't, but I don't really see what the other problem would be right now. Except, of course, for partitioning, which I'm not quite sure what to do about.

-- 
Robert Haas
EDB: http://www.enterprisedb.com