Hi Jim, On Fri, 21 Jun 2019 08:41:11 -0700 (MST) Jim Finnerty <jfinn...@amazon.com> wrote:
> Hi Yugo, > > I'd like to compare the performance of your MV refresh algorithm versus > an approach that logs changes into an mv log table, and can then apply the > changes at some later point in time. I'd like to handle the materialized > join view (mjv) case first, specifically a 2-way left outer join, with a UDF > in the SELECT list of the mjv. Do you mean you have your implementation of IVM that using log tables? I'm so interested in this, and I would appreciate it if you explain the detail. > Does your refresh algorithm handle mjv's with connected join graphs that > consist entirely of inner and left outer joins? > If so, I'd like to measure the overhead of your refresh algorithm on > pgbench, modified to include an mjv, versus a (hand coded) incremental > maintenance algorithm that uses mv log tables populated by ordinary > triggers. We may also want to look at capturing the deltas using logical > replication, which ought to be faster than a trigger-based solution. In the current our implementation, outer joins is not yet supported though we plan to handle this in future. So,we would not be able to compare these directly in the same workload in the current status. However, the current our implementation supports only the way to update materialized views in a trigger, and the performance of modifying base tables will be lower than the approach which uses log tables. This is because queries to update materialized views are issued in the trigger. This is not only a overhead itself, but also takes a lock on a materialized view, which has an ] impact on concurrent execution performance. In the previous our PoC, we implemented IVM using log tables, in which logs are captured by triggers and materialized views are update incrementally by a user command[1]. However, to implement log table approach, we need a infrastructure to maintain these logs. For example, which logs are necessary and which logs can be discarded, etc. We thought this is not trivial work, so we decided to start from the current approach which doesn't use log tables. We are now preparing to implement this in the next step because this is also needed to support deferred maintenance of views. [1] https://www.postgresql.eu/events/pgconfeu2018/schedule/session/2195-implementing-incremental-view-maintenance-on-postgresql/ I agree that capturing the deltas using logical decoding will be faster than using a trigger although we haven't yet consider this well. Best regadrds, Yugo Nagata -- Yugo Nagata <nag...@sraoss.co.jp>