Re: SQL:2011 application time

Paul A Jungwirth Wed, 21 May 2025 09:36:09 -0700

Thank you for the review! I think we covered everything here at the
PgConf.dev Advanced Patch Feedback session, but for the sake of the
list I'll reply to the specifics. I'll send a longer email summarizing
more from that session later, probably in a few days.

On Fri, May 9, 2025 at 2:50 PM Robert Haas <robertmh...@gmail.com> wrote:
> I think these patches would benefit from some work to make them more
> understandable for people who don't already know what they're intended
> to accomplish.

I'll improve the commit messages in the next version of the patches. I
agree they have been pretty brief. Also we agreed that there should be
a separate documentation chapter/section introducing the concepts
behind temporal tables and application time. That will help in
general, plus it will give the existing doc changes a place to
reference for new terms.

> The rest of the
> commit message for v51-0001 reads like this:
>
> The procs return SETOF their input type and work like minus but don't
> fail on splits. The results never contain empty elements.
>
> For someone familiar with this work, this probably makes sense, but
> I'm not and it doesn't. I don't even understand to what "work like
> minus but don't fail on splits" is intended to refer. I mean, is there
> some existing minus thing? I don't see it among the GIST_*_PROC
> symbols, and I don't know where else I'd go looking.

Yes, I think I will rename these functions to
{,multi}range_minus_multi. I'm trying to say that they work like
range_minus (typically used via the minus operator). But range_minus
raises an exception if you subtract from the middle of its first
parameter/operand, because the result would yield two separate ranges.
range_minus_multi is a set-returning function to avoid that.

You can't return an array of ranges, because where our polymorphic
type system sees an input parameter of anyrange with a return type of
anyarray, it treats them as range<T> -> array<T>, not range<T> ->
array<range<T>>.

I don't want to return a multirange here either because that design
locks us in to range types, and I'd rather leave the door open to
support arbitrary user-defined types. A SRF means the type can bring
its own function to generate as many "leftovers" as it needs.

Also it sounds like an opclass support proc is not needed here at all.
There is no index involved (or there could be several). Perhaps a type
support proc. But for now we can support just range & multirange with
neither.

> As I read through the documentation changes, I find that you use terms
> like "leftovers" or "leftover row" in various parts of the
> documentation that are quite far away from and not obviously linked to
> the documentation of FOR PORTION OF. I think that's going to be very
> confusing. In other places where we used specialized terms like this,
> we often using <xref> or <link> to reference the place where the term
> is defined. You don't really quite have a place where that happens,
> though, although maybe linking to the documentation of FOR PORTION OF
> would be good enough.

Okay, see above. We also discussed using a term that better signals
its context, like "temporal range leftovers". I think I like just
"temporal leftovers" better, or maybe "temporal update leftovers" or
"temporal delete leftovers", depending on the operation.

> The documentation of FOR PORTION OF makes more sense to me for range
> types than for multirange types.

Okay, this is good feedback. I'll work on it!

> I gather that if I have a range like
> [1,10] and I deleted from 3 to 6, I'm going to end up with two ranges.
> I'm not sure whether 3 or 6 are inclusive or exclusive bounds, so I
> don't know if I should expect to end up with [1,3) and (6,10] or [1,3]
> and [6,10], and I kind of wonder if I should get to say which one I
> want, but anyway now I have two records, my original one having been
> split. But if [1,10] is a multirange, it's unnecessary to split the
> record in two in order to accommodate a deletion: i can just include
> both sub-ranges in the original value. But the documentation doesn't
> seem to mention this one way or the other: will I still (needlessly?)
> create two records, or will I just update the one record I already and
> split the range? Since the documentation is already talking about a
> feature specific to range and multirange types, it seems like this
> kind of stuff should be mentioned.

A multirange always returns just 0 or 1 result from without_portion
aka multirange_minus_multi. As you say, 2 results are unneeded.

> Hmm. I guess this implies that we never do an update -- it's always a
> DELETE followed by zero, one, or two INSERTs. That seems like it's
> leaving quite a bit of efficiency on the table, because updating an
> existing row in place could potentially be a HOT update.
>
> Is this feature as per the SQL standard? And, assuming yes, does the
> SQL standard specify that permission checks should work as you
> describe here, or is that something we decided?

This is not quite right. We never delete then insert the existing row
(except in the usual MVCC sense). The standard is very clear here: a
temporal update always updates the existing row (and automatically
changes its start/end time to not extend beyond the targeted range),
and a temporal delete always deletes the existing row. But if the
row's start/end times extended beyond just the targeted range, there
is still history your update/delete wasn't supposed to touch. So the
update/delete should be followed by 0, 1, or 2 inserts to preserve the
row's original values for those portions of history. I'll include this
in the docs I write.

Where the standard is not very clear is how to fire triggers for the
inserts. I'll leave those details for the next email.

The standard doesn't say anything about permissions either. Personally
I feel that since the inserts aren't adding any history that wasn't
there already, we shouldn't require insert permission. Also if we did
require insert permission, there would be no way to grant someone
access to correct existing history without adding new history. But I
don't have a strong opinion here. As my comment mentioned, perhaps
there is a security flaw, since skipping the insert permission check
means insert triggers fire for unauthorized users. Whatever we decide,
permissions and triggers should ideally cooperate to yield a
consistent mental model.

For that next email I'll start a separate thread, since this one has
gotten so long. Also I think this project is overdue for a wiki page.

Yours,

--
Paul              ~{:-)
p...@illuminatedcomputing.co

Re: SQL:2011 application time

Reply via email to