Re: SQL:2011 application time

Paul Jungwirth Tue, 10 Oct 2023 21:23:02 -0700

Hi Peter et al,

On 9/1/23 12:56, Paul Jungwirth wrote:

On 9/1/23 11:30, Peter Eisentraut wrote:
I think the WITHOUT OVERLAPS clause should be per-column, so that something like UNIQUE (a WITHOUT OVERLAPS, b, c WITHOUT OVERLAPS) would be possible. Then the WITHOUT OVERLAPS clause would directly correspond to the choice between equality or overlaps operator per column.
I think allowing multiple uses of `WITHOUT OVERLAPS` (and in any position) is a great recommendation that enables a lot of new functionality.

I've been working on implementing this, but I've come to think it is the wrong way to go.

If we support this in primary key and unique constraints, then we must also support it for foreign keys and UPDATE/DELETE FOR PORTION OF. But implementing that logic is pretty tricky. For example, take a foreign key on (id, PERIOD valid_at, PERIOD asserted_at). We need to ensure the referenced two-dimensional time space `contains` the referencing two-dimensional space. You can visualize a rectangle in two-dimensional space for each referencing record (which we validate one at a time). The referenced records must be aggregated and so form a polygon (of all right angles). For example the referencing record may be (1, [0,2), [0,2)) with referenced records of (1, [0,2), [0,1)) and (1, [0,1), [1,2)). (I'm using intranges since they're easier to read, but you could imagine these as dateranges like [2000-01-01,2002-01-01).) Now the range_agg of their valid_ats is [0,2) and of their asserted_ats is [0,2). But the referenced 2d space still doesn't contain the referencing space. It's got one corner missing. This is a well-known problem among game developers. We're lucky not to have arbitrary polygons, but it's still a tough issue.
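To make the missing corner concrete, here is a rough Python sketch (not code from the patch; the function names and the tuple representation of half-open ranges are made up for this illustration). It checks the example above two ways: per dimension, the way a range_agg on each column would, and as a true 2d coverage test that cuts the referencing rectangle into grid cells at every referenced boundary.

```python
from itertools import product

# A half-open range [lo, hi) is a (lo, hi) tuple.
# A rectangle is (valid_at, asserted_at), i.e. a pair of ranges.

def union_covers_1d(ranges, target):
    """True if the union of the ranges covers target (like range_agg @> target)."""
    lo, hi = target
    pos = lo  # everything in [lo, pos) is known to be covered
    for a, b in sorted(ranges):
        if a > pos:
            break  # gap starting at pos
        pos = max(pos, b)
    return pos >= hi

def union_covers_2d(rects, target):
    """True if the union of the rectangles covers the target rectangle."""
    (tx0, tx1), (ty0, ty1) = target
    # Cut the target at every referenced boundary that falls inside it.
    xs = sorted({tx0, tx1} | {v for x, _ in rects for v in x if tx0 < v < tx1})
    ys = sorted({ty0, ty1} | {v for _, y in rects for v in y if ty0 < v < ty1})
    # Each grid cell is either entirely inside or entirely outside any rectangle.
    for (x0, x1), (y0, y1) in product(zip(xs, xs[1:]), zip(ys, ys[1:])):
        if not any(rx0 <= x0 and x1 <= rx1 and ry0 <= y0 and y1 <= ry1
                   for (rx0, rx1), (ry0, ry1) in rects):
            return False
    return True

referencing = ((0, 2), (0, 2))
referenced = [((0, 2), (0, 1)), ((0, 1), (1, 2))]

valid_ok = union_covers_1d([r[0] for r in referenced], referencing[0])
asserted_ok = union_covers_1d([r[1] for r in referenced], referencing[1])
both_ok = union_covers_2d(referenced, referencing)
print(valid_ok, asserted_ok, both_ok)  # True True False
```

Each dimension passes on its own, but the cell [1,2) x [1,2) is covered by neither referenced record, so the 2d check fails. The grid approach only works because everything is axis-aligned, and its cost grows with each added dimension.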

Besides `contains` we also need to compute `overlaps` and `intersects` to support these temporal features. Implementing that for 2d, 3d, etc. looks very complicated, for something that is far outside the normal use case and also not part of the standard. It will cost a little performance for the normal 1d use case too.

I think a better approach (which I want to attempt as an add-on patch, not in this main series) is to support not just range types, but any type with the necessary operators. Then you could have an mdrange (multi-dimensional range) or potentially even an arbitrary n-dimensional polygon. (PostGIS has something like this, but its `contains` operator compares (non-concave) *bounding boxes*, so it would not work for the example above. Still the similarity between temporal and spatial data is striking. I'm going to see if I can get some input from PostGIS folks about how useful any of this is to them.) This approach would also let us use multiranges: not for multiple dimensions, but for non-contiguous time spans stored in a single row. This puts the complexity in the types themselves (which seems more appropriate) and is ultimately more flexible (supporting not just mdrange but also multirange, and other things too).
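As a rough sketch of what "any type with the necessary operators" could look like, here is a toy Python multirange (purely illustrative: `MultiRange`, `fk_satisfied`, and the method names are hypothetical and do not correspond to anything in the patch or in PostgreSQL's API). The foreign-key check only calls `union` and `contains` on whatever type it is given, so all the geometry lives in the type.

```python
from dataclasses import dataclass
from functools import reduce


def _normalize(ranges):
    """Sort half-open (lo, hi) ranges, drop empties, merge overlapping/adjacent ones."""
    merged = []
    for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return tuple(merged)


@dataclass(frozen=True)
class MultiRange:
    """Non-contiguous time span: disjoint, non-adjacent half-open ranges."""
    parts: tuple = ()

    @classmethod
    def of(cls, *ranges):
        return cls(_normalize(ranges))

    def union(self, other):
        return MultiRange.of(*self.parts, *other.parts)

    def contains(self, other):
        # Parts are normalized, so each part of `other` must fit in one part of self.
        return all(any(lo <= o_lo and o_hi <= hi for lo, hi in self.parts)
                   for o_lo, o_hi in other.parts)

    def overlaps(self, other):
        return any(lo < o_hi and o_lo < hi
                   for lo, hi in self.parts for o_lo, o_hi in other.parts)


def fk_satisfied(referencing, referenced_rows, empty):
    """Generic check: aggregate the referenced values, then test containment."""
    covered = reduce(lambda acc, row: acc.union(row), referenced_rows, empty)
    return covered.contains(referencing)


rows = [MultiRange.of((0, 4)), MultiRange.of((5, 6), (6, 8))]
ok = fk_satisfied(MultiRange.of((1, 3), (5, 7)), rows, MultiRange())
gap = fk_satisfied(MultiRange.of((3, 6)), rows, MultiRange())
print(ok, gap)
```

An mdrange type could plug into the same `fk_satisfied` by supplying its own `union` and `contains`, without the constraint code knowing how many dimensions are involved.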

This approach also means that instead of storing a mask/list of which columns use WITHOUT OVERLAPS, I can just store one attnum. Again, this saves the common use-case from paying a performance penalty to support a much rarer one.

I've still got my multi-WITHOUT OVERLAPS work, but I'm going to switch gears to what I've described here. Please let me know if you disagree!


Yours,

--
Paul              ~{:-)
p...@illuminatedcomputing.com
