On Wed, Feb 28, 2018 at 8:53 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Tue, Feb 27, 2018 at 5:07 PM, Peter Geoghegan <p...@bowt.ie> wrote: >> I now feel like Simon's suggestion of throwing an error in corner >> cases isn't so bad. It still seems like we could do better, but the >> more I think about it, the less that seems like a cop-out. My reasons >> are: > > I still think we really ought to try not to add a new class of error.
I do too. However, it's only fair to acknowledge that the lack of any strong counterproposal after a month or so tells us something. >> * We can all agree that *not* raising an error in the specific way >> Simon proposes is possible, somehow or other. We also all agree that >> avoiding the broader category of RC errors can only be taken so far >> (e.g. in any event duplicate violations errors are entirely possible, >> in RC mode, when a MERGE inserts a row). So this is a question of what >> exact middle ground to take. Neither of the two extremes (throwing an >> error on the first sign of a RC conflict, and magically preventing >> concurrency anomalies) are actually on the table. > > Just because there's no certainty about which behavior is best doesn't > mean that none of them are better than throwing an error. Bear in mind that the patch doesn't throw an error at the first sign of trouble. It throws an error after doing an EPQ style walk, which individually verifies the extra "WHEN ... AND" quals for every tuple in the update chain (ExecUpdate()/ExecDelete() return after first EPQ tuple check for MERGE, so that an ExecMerge() caller can do that extra part with special EPQ slot). The controversial serialization error only comes when the original basic assessment of MATCHED needs to change, without regard to extra "WHEN ... AND" quals (and only when the MERGE statement was written in such a way that that matters -- it must have a WHEN NOT MATCHED clause, too). AFAICT, the patch will only throw an error when the join quals no longer pass -- not when the extra "WHEN ... AND" quals no longer pass. That would be far worse, IMV. ExecMerge() will retry until it reaches the end of the update chain, possibly changing its mind about which particular WHEN case applies repeatedly, going back and forth between mergeActions (WHEN cases) as the update chain in walked. The patch can do a lot to roll with conflicts before it gives up. Conflicts are generally caused by concurrent DELETEs, so you could say that this offers a kind of symmetry with concurrent INSERTs causing duplicate violation errors. (Concurrent UPDATEs that change the join qual values could provoke the error too, but presumably we're joining on the PK in most cases, so that seems much less likely than a simple DELETE.) Bear in mind that both the SQL Server and Oracle implementations have many significant restrictions/caveats around doing something cute with the main join quals. It's not surprising that we'd have something like that, too. According to Rahila, Oracle doesn't allow MERGE to update the columns in the join qual at all, and only allows a single WHEN MATCHED case (a DELETE can only come after an UPDATE in Oracle, oddly enough). SQL Server's docs have a warning that states: "Do not attempt to improve query performance by filtering out rows in the target table in the ON clause, such as by specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results.". What does that even mean!? I am slightly concerned that the patch may still have livelock hazards in its approach to RC conflict handling. I haven't studied that aspect in enough detail, though. Since we always make forward progress by following an update chain, any problem here is probably fixable. -- Peter Geoghegan