On Tue, Nov 21, 2017 at 4:36 AM, Peter Moser <pitiz...@gmail.com> wrote:
> Hi hackers,
> we would like to rethink our approach...
>
> For simplicity I'll drop ALIGN for the moment and focus solely on NORMALIZE:
>
> SELECT * FROM (R NORMALIZE S ON R.x = S.y WITH (R.time, S.time)) c;
>
> Our normalization executor node needs the following input (for now
> expressed in plain SQL):
>
> SELECT R.*, p1
> FROM (SELECT *, row_id() OVER () rn FROM R) R
> LEFT OUTER JOIN (
>     SELECT y, LOWER(time) p1 FROM S
>     UNION
>     SELECT y, UPPER(time) p1 FROM S
> ) S
> ON R.x = S.y AND p1 <@ R.time
> ORDER BY rn, p1;
>
> In other words:
> 1) The left subquery adds a unique ID to each tuple (i.e., rn).
> 2) The right subquery creates two results for each input tuple: one for
>    the upper and one for the lower bound of each input tuple's valid-time
>    column. The boundaries get put into a single (scalar) column, namely p1.
> 3) We join both subqueries if the normalization predicate holds (R.x = S.y)
>    and p1 is inside the time of the current outer tuple.
> 4) Finally, we sort the result by the unique ID (rn) and p1, and give all
>    columns of the outer relation, rn, and p1 back.
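To make sure I'm reading the rewrite correctly, here is a small Python model of it -- the data, names, and half-open-interval representation are all invented for illustration; this is not executor code:

```python
# Toy model of the rewritten query. Ranges are half-open tuples (lo, hi),
# so "p1 <@ R.time" becomes lo <= p1 < hi.
R = [{"x": 1, "time": (0, 10)}]
S = [{"y": 1, "time": (2, 5)},    # contained in R.time
     {"y": 1, "time": (8, 12)},   # overlaps R.time, not contained
     {"y": 2, "time": (0, 10)}]   # different key: no join partner

def contained(p, rng):
    lo, hi = rng
    return lo <= p < hi           # p1 <@ R.time for half-open ranges

# 1) Number the outer tuples (row_id() OVER () rn).
numbered_r = [dict(r, rn=i) for i, r in enumerate(R)]

# 2) UNION of lower and upper bounds of S.time into one scalar column p1.
bounds = {(s["y"], s["time"][i]) for s in S for i in (0, 1)}

# 3) LEFT OUTER JOIN on R.x = S.y AND p1 <@ R.time.
result = []
for r in numbered_r:
    p1s = sorted(p for (y, p) in bounds
                 if y == r["x"] and contained(p, r["time"]))
    if p1s:
        result.extend(dict(r, p1=p) for p in p1s)
    else:
        result.append(dict(r, p1=None))    # null-extended row

# 4) ORDER BY rn, p1 (already ordered per outer row here).
split_points = [t["p1"] for t in result]   # -> [2, 5, 8]
```

With these sample rows, the contained S tuple contributes both of its bounds, the partially overlapping one contributes only the bound that falls inside R.time, and the non-matching one contributes nothing.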
So, if I understand correctly, there are three possible outcomes. If S.time and R.time are non-overlapping, then a null-extended row is produced with the original data from R. If S.time overlaps R.time but is not contained within it, then this produces one result row, not null-extended: either the upper or the lower bound will be found to be contained within R.time, but the other will not. If S.time is contained within R.time, then this produces two result rows, one with p1 containing S.time's lower bound and one with p1 containing S.time's upper bound. Is that right? Then, after all this happens, the temporal normalizer code runs over the results and uses the p1 values as split points for the ranges from R.

I wonder if we could instead think about R NORMALIZE S ON R.x = S.y WITH (R.time, S.time) as an ordinary join on R.x = S.y with some extra processing afterwards. After finding all the join partners for a row in R, extract all the lower and upper bounds from the S.time fields of the join partners and use that to emit as many rows from R as may be necessary.

The main problem that I see with that approach is that it seems desirable to separate the extra-processing-afterwards step into a separate executor node, and there's no way for a separate type of plan node to know when the join advances the outer side of the join. That's the purpose that row_id() is serving as you have it; we'd have to invent some other kind of signalling mechanism, which might not be entirely trivial. :-(

If we could do it, though, it might be quite a bit more efficient, because it would avoid scanning S twice and performing a UNION on the results of those scans. Also, we wouldn't necessarily need to sort the whole set of rows from R, which I suspect is unavoidable in your implementation. We'd just need to sort the individual groups of rows from S, and my guess is that in many practical cases those groups are fairly small.

I wonder what identities hold for NORMALIZE.
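The per-row step I have in mind could be sketched like this (function name and interval representation invented here; only the bounds gathered from one outer row's join partners need sorting, not the whole of R):

```python
# Hypothetical sketch of the "extra processing afterwards" step: once all
# join partners for one outer row are known, split its half-open range
# (lo, hi) at every partner bound that falls strictly inside it.
def split_row(r_time, partner_times):
    lo, hi = r_time
    points = sorted({p for t in partner_times for p in t if lo < p < hi})
    edges = [lo] + points + [hi]
    return list(zip(edges, edges[1:]))

# R.time = [0, 10); partners [2, 5) and [8, 12) contribute points 2, 5, 8.
pieces = split_row((0, 10), [(2, 5), (8, 12)])
# -> [(0, 2), (2, 5), (5, 8), (8, 10)]
```

A row with no join partners simply comes back unsplit, matching the null-extended case above.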
It does not seem to be commutative, but it looks like it might be associative... i.e., (R NORMALIZE S) NORMALIZE T produces the same output as R NORMALIZE (S NORMALIZE T), perhaps?

> Our first attempt to understand the new approach would be as follows: The
> left base rel of the inner left-outer-join can be expressed as a WindowAgg
> node. However, the right query of the join is much more difficult to build
> (maybe through hash aggregates). Both queries could be put together with a
> MergeJoin, for instance. However, if we create the plan tree by hand and
> choose algorithms for it manually, how is it possible to have it optimized
> later? Or, if that is not possible, how do we choose the best algorithms
> for it?

You don't want to generate plan nodes directly like this, I think. Generally, you create paths, and the cheapest path wins at the end of planning. The thing that makes this tricky is that the temporal normalization phase can be separated from the inner left-outer-join by other joins, and the planner doesn't really have a good way of representing that concept.

More broadly, the very idea of creating plan nodes suggests that you are approaching this from the point of view of sticking the logic in near the end of the planning process, which I think is not going to work. Whatever is going to happen here needs to happen at path generation time at the latest, or maybe even earlier, before the optimizer even starts processing the join tree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company