Hi Peter et al,

On 9/1/23 12:56, Paul Jungwirth wrote:
On 9/1/23 11:30, Peter Eisentraut wrote:
I think the WITHOUT OVERLAPS clause should be per-column, so that something like UNIQUE (a WITHOUT OVERLAPS, b, c WITHOUT OVERLAPS) would be possible.  Then the WITHOUT OVERLAPS clause would directly correspond to the choice between equality or overlaps operator per column.
I think allowing multiple uses of `WITHOUT OVERLAPS` (and in any position) is a great recommendation that enables a lot of new functionality.

I've been working on implementing this, but I've come to think it is the wrong way to go.

If we support this in primary key and unique constraints, then we must also support it for foreign keys and UPDATE/DELETE FOR PORTION OF. But implementing that logic is pretty tricky. For example take a foreign key on (id, PERIOD valid_at, PERIOD asserted_at). We need to ensure the referenced two-dimensional time space `contains` the referencing two-dimensional space. You can visualize a rectangle in two-dimensional space for each referencing record (which we validate one at a time). The referenced records must be aggregated and so form a polygon (of all right angles). For example the referencing record may be (1, [0,2), [0,2)) with referenced records of (1, [0,2), [0,1)) and (1, [0,1), [1,2)). (I'm using intranges since they're easier to read, but you could imagine these as dateranges like [2000-01-01,2002-01-01).) Now the range_agg of their valid_ats is [0,2) and of their asserted_ats is [0,2). But the referenced 2d space still doesn't contain the referencing space. It's got one corner missing. This is a well-known problem among game developers. We're lucky not to have arbitrary polygons, but it's still a tough issue.

Besides `contains` we also need to compute `overlaps` and `intersects` to support these temporal features. Implementing that for 2d, 3d, etc looks very complicated, for something that is far outside the normal use case and also not part of the standard. It will cost a little performance for the normal 1d use case too.

I think a better approach (which I want to attempt as an add-on patch, not in this main series) is to support not just range types, but any type with the necessary operators. Then you could have an mdrange (multi-dimensional range) or potentially even an arbitrary n-dimensional polygon. (PostGIS has something like this, but its `contains` operator compares (non-concave) *bounding boxes*, so it would not work for the example above. Still the similarity between temporal and spatial data is striking. I'm going to see if I can get some input from PostGIS folks about how useful any of this is to them.) This approach would also let us use multiranges: not for multiple dimensions, but for non-contiguous time spans stored in a single row. This puts the complexity in the types themselves (which seems more appropriate) and is ultimately more flexible (supporting not just mdrange but also multirange, and other things too).

This approach also means that instead of storing a mask/list of which columns use WITHOUT OVERLAPS, I can just store one attnum. Again, this saves the common use-case from paying a performance penalty to support a much rarer one.

I've still got my multi-WITHOUT OVERLAPS work, but I'm going to switch gears to what I've described here. Please let me know if you disagree!

Yours,

--
Paul              ~{:-)
p...@illuminatedcomputing.com


Reply via email to