Re: please define 'statement' in the glossary

2025-07-13 Thread Laurenz Albe
On Fri, 2025-07-11 at 10:34 +, PG Doc comments form wrote:
> the glossary (https://www.postgresql.org/docs/current/glossary.html) does
> not define 'statement' but a definition would be important to understand
> other chapters such as this sentence: "statement_timestamp() returns the
> start time of the current statement (more specifically, the time of receipt
> of the latest command message from the client)." The sentence is from
> https://www.postgresql.org/docs/17/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT.
> From the same page is this snippet: "start of current statement", in Table
> 9.33. Date/Time Functions.
> It seems 'statement' can mean different things such as 'one operation, such
> as a SELECT statement', but it can also mean 'all operations since the last
> command message from the client, which can be several SELECTs'.

In the setting of a relational database system like PostgreSQL, "statement"
means "SQL statement".  I don't think that that requires a glossary entry.
"SQL statement" is pretty well defined.

I guess your difficulties come from the remark in parentheses, which is
intended to clarify, but clearly does not quite succeed in that.

After looking at the code, I guess what made Tom add the remark in commit
eaf8f312c754 was the fact that an SQL statement is not necessarily processed
in a single go: with the extended query protocol (see chapter 52.2.3),
there is a "parse", a "bind" and an "execute" message from the client, and
each one sets the timestamp reported by statement_timestamp() to a new
value.  So, technically, statement_timestamp() has a different value when
the statement is parsed than when it is executed.

However, what matters to the client is the value when the statement starts
executing, because that's the value that will be reported.

So I'd argue that we should remove the parenthetical remark.  It confuses
more than it enlightens, and whoever needs to know that level of detail
had better read the code anyway.

Yours,
Laurenz Albe




Re: Unexpected behaviour: it was documented to return the same value

2025-07-13 Thread David G. Johnston
On Sunday, July 13, 2025, KES  wrote:

> I suppose that DEFAULT syntax should be used to set setting to its default
> value. Eg.
> set_config( 'my.option', DEFAULT, false )
> This is clear what should happen here. Also this syntax will correspond to
> other places, eg. INSERT INTO xx (name) VALUES (DEFAULT).
>
> Using NULL to set reset the value to the default is very very confusing.
> set_config( 'my.option', NULL, false )
> Expected: this call should make option unrecognizable again.
>

I don’t see us changing this even if someone were willing to write a
patch.  You will have to make do with the behavior that exists.

David J.


Re: please define 'statement' in the glossary

2025-07-13 Thread Tom Lane
Laurenz Albe  writes:
> After looking at the code, I guess what made Tom add the remark in commit
> eaf8f312c754 was the fact that an SQL statement is not necessarily processed
> in a single go: with the extended query protocol (see chapter 52.2.3),
> there is a "parse", a "bind" and an "execute" message from the client, and
> each one sets the timestamp reported by statement_timestamp() to a new
> value.  So, technically, statement_timestamp() has a different value when
> the statement is parsed than when it is executed.

> However, what matters to the client is the value when the statement starts
> executing, because that's the value that will be reported.

> So I'd argue that we should remove the parenthetical remark.  It confuses
> more than it enlightens, and whoever needs to know that level of detail
> had better read the code anyway.

After re-reading that text, I feel like the parenthetical remark is
fine, and the real problem is that I used "statement" and "command"
more or less interchangeably in successive sentences.  Perhaps
s/command/statement/g throughout the paragraph would improve matters?
Although "statement message" doesn't feel right, so maybe leave that
one alone.

regards, tom lane




Re: Unexpected behaviour: it was documented to return the same value

2025-07-13 Thread Tom Lane
KES  writes:
> I would prefer to remove the option from the configuration list
> instead of having it reset to empty string.

That is not the definition, and we'd be very unlikely to change
it to do that, because it would make no sense for built-in or
extension-defined settings.

regards, tom lane




Re: correction suggestion for https://www.postgresql.org/docs/17/auth-username-maps.html

2025-07-13 Thread Tom Lane
I figured out how to make the 003_peer.pl tests for \1 less
hacky, and pushed that.  Here's a proposed patch for the
documentation side of things, including your suggestion to
make pg_ident.conf.sample match up better.

regards, tom lane

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 832b616a7bb..51b95ed04f3 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -1003,8 +1003,9 @@ local   db1,db2,@demodbs  all   md5
the remainder of the field is treated as a regular expression.
(See  for details of
PostgreSQL's regular expression syntax.)  The regular
-   expression can include a single capture, or parenthesized subexpression,
-   which can then be referenced in the database-username
+   expression can include a single capture, or parenthesized subexpression.
+   The portion of the system user name that matched the capture can then
+   be referenced in the database-username
field as \1 (backslash-one).  This allows the mapping of
multiple user names in a single line, which is particularly useful for
simple syntax substitutions.  For example, these entries
@@ -1022,12 +1023,11 @@ mymap   /^(.*)@otherdomain\.com$   guest
   
If the database-username field starts with
a slash (/), the remainder of the field is treated
-   as a regular expression (see 
-   for details of PostgreSQL's regular
-   expression syntax). It is not possible to use \1
-   to use a capture from regular expression on
-   system-username for a regular expression
-   on database-username.
+   as a regular expression.
+   When the database-username field is a regular
+   expression, it is not possible to use \1 within it to
+   refer to a capture from the system-username
+   field.
   
 
   
diff --git a/src/backend/libpq/pg_ident.conf.sample b/src/backend/libpq/pg_ident.conf.sample
index f5225f26cdf..8ee6c0ba315 100644
--- a/src/backend/libpq/pg_ident.conf.sample
+++ b/src/backend/libpq/pg_ident.conf.sample
@@ -13,25 +13,25 @@
 # user names to their corresponding PostgreSQL user names.  Records
 # are of the form:
 #
-# MAPNAME  SYSTEM-USERNAME  PG-USERNAME
+# MAPNAME  SYSTEM-USERNAME  DATABASE-USERNAME
 #
 # (The uppercase quantities must be replaced by actual values.)
 #
 # MAPNAME is the (otherwise freely chosen) map name that was used in
 # pg_hba.conf.  SYSTEM-USERNAME is the detected user name of the
-# client.  PG-USERNAME is the requested PostgreSQL user name.  The
-# existence of a record specifies that SYSTEM-USERNAME may connect as
-# PG-USERNAME.
+# client.  DATABASE-USERNAME is the requested PostgreSQL user name.
+# The existence of a record specifies that SYSTEM-USERNAME may connect
+# as DATABASE-USERNAME.
 #
-# If SYSTEM-USERNAME starts with a slash (/), it will be treated as a
-# regular expression.  Optionally this can contain a capture (a
-# parenthesized subexpression).  The substring matching the capture
-# will be substituted for \1 (backslash-one) if present in
-# PG-USERNAME.
+# If SYSTEM-USERNAME starts with a slash (/), the rest of it will be
+# treated as a regular expression.  Optionally this can contain a capture
+# (a parenthesized subexpression).  The substring matching the capture
+# will be substituted for \1 (backslash-one) if that appears in
+# DATABASE-USERNAME.
 #
-# PG-USERNAME can be "all", a user name, a group name prefixed with "+", or
-# a regular expression (if it starts with a slash (/)).  If it is a regular
-# expression, the substring matching with \1 has no effect.
+# DATABASE-USERNAME can be "all", a user name, a group name prefixed with "+",
+# or a regular expression (if it starts with a slash (/)).  If it is a regular
+# expression, no substitution for \1 will occur.
 #
 # Multiple maps may be specified in this file and used by pg_hba.conf.
 #
@@ -69,4 +69,4 @@
 # Put your actual configuration here
 # --
 
-# MAPNAME   SYSTEM-USERNAME PG-USERNAME
+# MAPNAME   SYSTEM-USERNAME DATABASE-USERNAME


Re: correction suggestion for https://www.postgresql.org/docs/17/auth-username-maps.html

2025-07-13 Thread David G. Johnston
On Sun, Jul 13, 2025 at 11:30 AM Tom Lane  wrote:

> I figured out how to make the 003_peer.pl tests for \1 less
> hacky, and pushed that.  Here's a proposed patch for the
> documentation side of things, including your suggestion to
> make pg_ident.conf.sample match up better.
>
>
Thanks.  My goal of trying to be a bit more precise regarding the \1
reference is probably counter-productive.  The existing wording makes me
ask "why" to which the answer is "because if database-username is a regexp
a reference to \1 resolves to any capturing groups it defines; and if there
are none the regexp will be malformed and break when you attempt to
reload pg_hba.conf".  But that is a lot of words for something that is
unlikely to be encountered in practice and does distract the reader from
the main point.

(Likewise, the system-username regexp can contain more than one capture -
which are only available later in the system-username regexp - though again
it seems unlikely anyone is going to use that feature in this context.)

I'm good with this.

Thanks!

David J.


Re: correction suggestion for https://www.postgresql.org/docs/17/auth-username-maps.html

2025-07-13 Thread Tom Lane
"David G. Johnston"  writes:
> (Likewise, the system-username regexp can contain more than one capture -
> which are only available later in the system-username regexp - though again
> it seems unlikely anyone is going to use that feature in this context.)

Yeah, I was thinking about that while working on the multiple-\1
patch.  Sooner or later somebody is going to ask why they can't
use \2, \3, etc in the database-username.  I think it would be a
pretty minor finger exercise to make the new code do that, but
I refrained for now.

> I'm good with this.

Cool.  I'll push after the beta2 release freeze lifts.

regards, tom lane




Re: Unexpected behaviour: it was documented to return the same value

2025-07-13 Thread KES
I would prefer to remove the option from the configuration list instead of having it reset to empty string. Eg. set_config( 'my.option', NULL, false ) Having it to return the empty string cause differences in the behaviour when parameter is not recognized at all and when it setup to default. How to make it "unrecognizable" again using set_config/RESET? The exact problem I am trying to resolve was asked here: https://stackoverflow.com/q/50923911/4632019 
Кому: kes-...@yandex.ru (kes-...@yandex.ru), pgsql-docs@lists.postgresql.org (pgsql-docs@lists.postgresql.org);

Тема: Unexpected behaviour: it was documented to return the same value;
07.07.2025, 17:40, "David G. Johnston" :
On Mon, Jul 7, 2025 at 6:51 AM David G. Johnston  wrote:On Saturday, July 5, 2025, PG Doc comments form  wrote:The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/functions-admin.html
Description:Yeah, this clarification just wasn't back-patched.https://www.postgresql.org/docs/18/functions-admin.html#FUNCTIONS-ADMIN-SET"set_config accepts the NULL value for new_value, but as settings cannot be null, it is interpreted as a request to reset the setting to its default value."David J.


Re: Unexpected behaviour: it was documented to return the same value

2025-07-13 Thread KES
I suppose that DEFAULT syntax should be used to set setting to its default value. Eg.set_config( 'my.option', DEFAULT, false )This is clear what should happen here. Also this syntax will correspond to other places, eg. INSERT INTO xx (name) VALUES (DEFAULT). Using NULL to set reset the value to the default is very very confusing.set_config( 'my.option', NULL, false )Expected: this call should make option unrecognizable again.
Кому: David G. Johnston (david.g.johns...@gmail.com), pgsql-docs@lists.postgresql.org (pgsql-docs@lists.postgresql.org);

Тема: Unexpected behaviour: it was documented to return the same value;
13.07.2025, 16:34, "KES" :
I would prefer to remove the option from the configuration list instead of having it reset to empty string. Eg. set_config( 'my.option', NULL, false ) Having it to return the empty string cause differences in the behaviour when parameter is not recognized at all and when it setup to default. How to make it "unrecognizable" again using set_config/RESET? The exact problem I am trying to resolve was asked here: https://stackoverflow.com/q/50923911/4632019 
Кому: kes-...@yandex.ru (kes-...@yandex.ru), pgsql-docs@lists.postgresql.org (pgsql-docs@lists.postgresql.org);

Тема: Unexpected behaviour: it was documented to return the same value;
07.07.2025, 17:40, "David G. Johnston" :
On Mon, Jul 7, 2025 at 6:51 AM David G. Johnston  wrote:On Saturday, July 5, 2025, PG Doc comments form  wrote:The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/functions-admin.html
Description:Yeah, this clarification just wasn't back-patched.https://www.postgresql.org/docs/18/functions-admin.html#FUNCTIONS-ADMIN-SET"set_config accepts the NULL value for new_value, but as settings cannot be null, it is interpreted as a request to reset the setting to its default value."David J.


Re: please define 'statement' in the glossary

2025-07-13 Thread Laurenz Albe
On Sun, 2025-07-13 at 11:27 -0400, Tom Lane wrote:
> Laurenz Albe  writes:
> > After looking at the code, I guess what made Tom add the remark in commit
> > eaf8f312c754 was the fact that an SQL statement is not necessarily processed
> > in a single go: with the extended query protocol (see chapter 52.2.3),
> > there is a "parse", a "bind" and an "execute" message from the client, and
> > each one sets the timestamp reported by statement_timestamp() to a new
> > value.  So, technically, statement_timestamp() has a different value when
> > the statement is parsed than when it is executed.
> 
> > However, what matters to the client is the value when the statement starts
> > executing, because that's the value that will be reported.
> 
> > So I'd argue that we should remove the parenthetical remark.  It confuses
> > more than it enlightens, and whoever needs to know that level of detail
> > had better read the code anyway.
> 
> After re-reading that text, I feel like the parenthetical remark is
> fine, and the real problem is that I used "statement" and "command"
> more or less interchangeably in successive sentences.  Perhaps
> s/command/statement/g throughout the paragraph would improve matters?
> Although "statement message" doesn't feel right, so maybe leave that
> one alone.

Changing "command" to "statement" would be a good move.

I guess I get the remark now: it wants to say that
a) statement_timestamp() shows the time when the client message that
   started the execution of the current SQL statement reached the server
and
b) the timestamp isn't reset for nested statements

Perhaps the remark should say "protocol message" or "frontend-backend
protocol message" to make clear that we are not talking about an
SQL statement here.

Yours,
Laurenz Albe




Re: please define 'statement' in the glossary

2025-07-13 Thread Tom Lane
Laurenz Albe  writes:
> Perhaps the remark should say "protocol message" or "frontend-backend
> protocol message" to make clear that we are not talking about an
> SQL statement here.

Yeah, that seems like a good idea.

regards, tom lane




Re: please define 'statement' in the glossary

2025-07-13 Thread Tom Lane
... so concretely, about like this?

regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6b327d4fd81..ed30dbe7ccd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11244,13 +11244,13 @@ now()
 CURRENT_TIMESTAMP, but is named to clearly reflect
 what it returns.
 statement_timestamp() returns the start time of the current
-statement (more specifically, the time of receipt of the latest command
+statement (more specifically, the time of receipt of the latest protocol
 message from the client).
 statement_timestamp() and transaction_timestamp()
-return the same value during the first command of a transaction, but might
-differ during subsequent commands.
+return the same value during the first statement of a transaction, but might
+differ during subsequent statements.
 clock_timestamp() returns the actual current time, and
-therefore its value changes even within a single SQL command.
+therefore its value changes even within a single SQL statement.
 timeofday() is a historical
 PostgreSQL function.  Like
 clock_timestamp(), it returns the actual current time,


Re: please define 'statement' in the glossary

2025-07-13 Thread David G. Johnston
On Sun, Jul 13, 2025 at 2:57 PM Tom Lane  wrote:

> ... so concretely, about like this?
>
>
We seldom if ever resort to including descriptions involving the fe/be
protocol in the SQL portion of the documentation - rightly considering
(IMO) those to be implementation details (e.g., we don't even directly
mention simple protocol in "psql -c" - though we do link to it under
"multi-statement commands").  Is there no way to avoid that here?  I'd be
ok if we'd limit this to a distinction between the simple protocol and the
extended protocol since, as a volatile function, it isn't even like
statement_timestamp can be seen in extended protocol aside from when
execute is sent.  So the special case where it doesn't behave as expected
is a simple protocol multi-statement command.  An example in psql would
serve to make this much more clear than any wording can do.  Possibly added
here or as part of the existing documentation that 'psql -c' points to
[1].  Which probably could be pointed to from here as well.

Seems also like maybe SPI should be mentioned explicitly here since it
seems to act like a client in a relevant way.  I'm assuming a
statement_timestamp executed within a function will return the same
timestamp the calling statement would.

David J.

[1]
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT


Re: Data visibility for returning statement

2025-07-13 Thread David G. Johnston
On Sat, Apr 26, 2025 at 5:50 AM PG Doc comments form 
wrote:

> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/17/dml-returning.html
> Description:
>
> Today I found a pretty special use-case for the "RETURNING" functionality,
> which I cannot find documentation for?
> If you have a statement as follows:
> UPDATE persons SET name = 'Bob' WHERE id = 4
> RETURNING (SELECT name FROM persons WHERE id = 4)
> The returning data will be whatever the value was before the row was
> modified. This differs from if I were to "RETURNING name". I found this to
> be interesting and could possibly warrant some kind of explanation in the
> documentation?
>
>
I would not want to encourage that form of query.  The novelty is more
problematic than the brevity.  Plus, repetition.

with new_p as ( update persons set name = 'Bob' returning name where id = 4)
select
old_p.id as id,
old_p.name as old_name,
new_p.name as new_name
from persons as old_p
join new_p on new_p.id = old_p.id

And now that we actually allow references to "new" and "old" in v18 that
form is also obsolete and you can do this directly.

David J.


Re: please define 'statement' in the glossary

2025-07-13 Thread Laurenz Albe
On Sun, 2025-07-13 at 17:32 -0700, David G. Johnston wrote:
> On Sun, Jul 13, 2025 at 2:57 PM Tom Lane  wrote:
> > ... so concretely, about like this?

I am fine with the patch as it is.

> We seldom if ever resort to including descriptions involving the fe/be 
> protocol
> in the SQL portion of the documentation - rightly considering (IMO) those to 
> be
> implementation details (e.g., we don't even directly mention simple protocol 
> in
> "psql -c" - though we do link to it under "multi-statement commands").
> Is there no way to avoid that here?

Well, I would have gladly removed the parenthetical remark, thinking that if
somebody needed to know precisely, she'd read up in the code.
But there is also nothing evil about hints for the initiated, lest they are
of a kind that can confuse beginners.

>      I'd be ok if we'd limit this to a
> distinction between the simple protocol and the extended protocol since, as a
> volatile function, it isn't even like statement_timestamp can be seen in 
> extended
> protocol aside from when execute is sent.  So the special case where it 
> doesn't
> behave as expected is a simple protocol multi-statement command.

It is STABLE, not VOLATILE, as befits the name, but yes, I see your point.

>   An example 
> in
> psql would serve to make this much more clear than any wording can do.
> Possibly added here or as part of the existing documentation that 'psql -c'
> points to [1].  Which probably could be pointed to from here as well.

Perhaps - but I feel uneasy about adding even more documentation.  If we show
how statement_timestamp() does *not* work as expected with a multi-statement
command, we might confuse the reader even more.  With the improved parenthetical
remark, I'd expect anybody with superficial knowledge of PostgreSQL to just
skip over the remark, with little damage done ("Ah, some comment about internals
that they couldn't help making.").

But if we add examples, we should be ready to explain in depth why it is the way
it is, and then we would have to get even deeper into the discussion of the
protocol that you bemoaned at the beginning of your mail.

> Seems also like maybe SPI should be mentioned explicitly here since it seems 
> to
> act like a client in a relevant way.  I'm assuming a statement_timestamp 
> executed
> within a function will return the same timestamp the calling statement would.

Well, in this case it doesn't act like a client.  That would mean dragging up
even more details from a section of the documentation that is only of interest
to hackers.

I think we should let the lions sleep.  The documentation of the built-in
functions is mostly of interest to application developers and writers of SQL
and PL/pgSQL, and expanding on SPI and the client-server protocol isn't what's
asked for here.  The documentation should be detailed, but there is a fine
line that you shouldn't cross if you don't want to confuse the reader.

The parenthetical remark is hopefully enough to get the interested reader
on the right track.

Yours,
Laurenz Albe