po 17. 12. 2018 v 15:32 odesÃlatel Corey Huinker <corey.huin...@gmail.com> napsal:
> Back when Pg added statement-level triggers, I was interested in the > potential promise of moving referential integrity checks to statement-level > triggers. > > The initial conversation, along with Kevin Grittner's POC script (in SQL) > that showed a potential for a 98% reduction in time spent doing RI checks. > The original thread is here: > > > https://www.postgresql.org/message-id/CACjxUsM4s9%3DCUmPU4YFOYiD5f%3D2ULVDBjuFSo20Twe7KbUe8Mw%40mail.gmail.com > > I dug around in the code, and was rather surprised at how close we already > are to implementing this. The function RI_Initial_Check() already does a > left-join query via SPI to look for any invalid data, so if we could just > replace the near table with the transition table for inserted rows, we'd be > home free. The function SPI_register_trigger_data() makes the transition > tables visible to SPI, so I started to wonder why this hadn't be done > already. > > I approached Kevin and Thomas Munro seeking feedback on my approach. I > also made it into a session at the PgConf.ASIA un-conference, and then > later with Michael Paquier at that same conference, and the coalesced > feedback was this: > > - the overhead of registering the transition tables probably makes it > unprofitable for single row inserts > - the single row overhead is itself significant, so maybe the transition > tables aren't worse > - there has been talk of replacing transition tables with an in-memory > data structure that would be closer to "free" from a startup perspective > and might even coalesce the transition tables of multiple statements in the > same transaction > - because no declarative code changes, it's trivial to switch from row > level to statement level triggering via pg_upgrade > - assuming that transition tables are an overhead that only pays off when > > N rows have been updated, does it make sense to enforce RI with something > that isn't actually a trigger? > - there was also some mention that parallel query uses a queue mechanism > that might be leveraged to do row-level triggers for updates of <= N rows > and statement level for > N > > That's what I have so far. I'm going to be working on a POC patch so that > I can benchmark a pure-statement-level solution, which if nothing else will > let us know the approximate value of N. > > All suggestions are appreciated. > It is great. I though little bit about it - just theoretically. ROW trigger call RI check too often, and statement trigger too less. I think so ideal design can be call RI check every 10K rows. I think so can be unfriendly if somebody does very long import and it fails on the end. I don't think so there should not be any performance difference, if RI check is called per 1000 or more rows. Regards Pavel