Re: pg_attribute_noreturn(), MSVC, C11

2024-12-28 Thread Peter Eisentraut

On 14.12.24 18:18, Peter Eisentraut wrote:

On 13.12.24 20:54, Andres Freund wrote:
Another wrinkle: While __attribute__((noreturn)) works for function
pointers (or function pointer typedefs) _Noreturn doesn't. Gah.  We only
use it that way in two places, but still :(


Yeah, I wrote an experimental patch for noreturn support some years ago, 
and that was also my result back then.  (I assume you have a current 
patch, otherwise I can dig out that one.)  I had also written down that 
there were some problems with Perl and Tcl headers, FWIW.  Did you have 
any problems with those?


I think we can take a small loss here and move with the standard. Unless 
you can think of a way to define pg_noreturn_but_for_function_pointers 
in a systematic way.


I have a patch proposal here.  I discovered a few more complications 
that need to be paid attention to.


First, we can't use bare "noreturn".  There are third-party header files 
(such as Tcl) that use __attribute__((noreturn)), and that will get 
confused.  That's the same reason we don't use bare restrict but 
pg_restrict.


I suggest we define pg_noreturn as

1. If C11 is supported, then _Noreturn, else
2. If GCC-compatible, then __attribute__((noreturn)), else
3. If MSVC, then __declspec(noreturn)

When PostgreSQL starts requiring C11, then the latter two options can be 
dropped.
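
For illustration, a minimal sketch of such a definition (not the exact
patch text) could look like this:

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define pg_noreturn _Noreturn                   /* 1. C11 or later */
#elif defined(__GNUC__)
#define pg_noreturn __attribute__((noreturn))   /* 2. GCC-compatible */
#elif defined(_MSC_VER)
#define pg_noreturn __declspec(noreturn)        /* 3. MSVC */
#else
#define pg_noreturn                             /* no decoration at all */
#endif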


Note that this also fixes a possible conflict where some third-party 
code includes <stdnoreturn.h> and then includes our c.h.  I don't think 
this has been reported, but it's surely bound to happen.


For the function pointers, I don't think there is a good solution.  The 
current behavior is evidently inconsistent and incompatible and not well 
documented, and I don't see how it's going to get better in these 
aspects any time soon.  I think the way forward in the mid-term is to 
avoid designing more interfaces like that and provide wrapper functions 
like json_manifest_parse_failure() where you can enforce the return 
behavior in the normal way.
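
To sketch the wrapper idea with generic names (this is not the actual
json_manifest_* API, just the shape of it): the callback pointer stays an
ordinary function type, and the noreturn promise lives on a wrapper that
ends in abort() as a backstop:

#include <stdio.h>
#include <stdlib.h>

#define pg_noreturn _Noreturn   /* assuming C11 for this sketch */

/* Standard C gives us no way to mark the pointer type itself noreturn. */
typedef void (*error_callback) (const char *msg);

/* The wrapper carries the noreturn guarantee instead. */
pg_noreturn static void
report_error(error_callback cb, const char *msg)
{
    cb(msg);
    abort();                    /* callback must not return; enforce it here */
}

static void
my_error_callback(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    exit(1);
}

int
main(void)
{
    report_error(my_error_callback, "boom");
    return 0;
}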


Finally, while I was looking at this, I figured we could also add 
pg_nodiscard support to non-GCC compilers that support C23 (probably 
none right now, but eventually), by defining pg_nodiscard as 
[[nodiscard]].  But that revealed that clang requires these attributes 
to appear before the extern/static keywords, which is not how it's 
currently written.  So I changed that, too, and also wrote the 
pg_noreturn patch in the same style.  And then I also added a definition 
of pg_noreturn as [[noreturn]] (actually, [[__noreturn__]], for the 
reasons given earlier), for consistency, and also as a hedge in case 
some compiler drops C11 support in a few decades.  (I tried to 
language-lawyer my way through this, and I don't know that clang is 
correct here, but this issue is widely reported out there and everyone 
agrees that the fix is to just swap the things around.  At least we can 
enforce some stylistic consistency that way.)
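
For example, with illustrative declarations (not taken from the patch),
the reordered style puts the marker before the storage class, which works
whether pg_noreturn expands to _Noreturn, __attribute__((noreturn)), or a
C23-style [[noreturn]]:

#include <stdlib.h>

#define pg_noreturn _Noreturn   /* any of the spellings above would do */

/* marker first, then extern/static, then the rest of the declaration */
pg_noreturn extern void bail_out(int code);

pg_noreturn void
bail_out(int code)
{
    exit(code);
}

int
main(void)
{
    bail_out(0);
}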
From 1c85012cbb00cb9dc7216d98a6ffd056b2f3c3c7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 28 Dec 2024 11:00:24 +0100
Subject: [PATCH 1/4] pg_noreturn to replace pg_attribute_noreturn()

We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn", and also because it's not necessarily an
attribute anymore.

pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec).  (When PostgreSQL
requires C11, the latter two variants can be dropped.)

Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.

This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of

#define pg_attribute_noreturn() __attribute__((noreturn))

would cause an error.

Note that the C standard does not support a noreturn attribute on
function pointer types.  So we have to drop these here.  There are
only two instances at this time, so it's not a big loss.
---
 contrib/dblink/dblink.c   |  6 ++--
 contrib/pgcrypto/px.h |  2 +-
 src/backend/access/transam/xlogrecovery.c |  3 +-
 src/backend/backup/basebackup_incremental.c   |  6 ++--
 src/backend/postmaster/autovacuum.c   |  2 +-
 src/backend/postmaster/launch_backend.c   |  2 +-
 src/backend/postmaster/postmaster.c   |  2 +-
 src/backend/replication/logical/tablesync.c   |  3 +-
 src/backend/replication/walsender.c   |  2 +-
 src/backend/utils/adt/ri_triggers.c   |  8 ++---
 src/backend/utils/fmgr/dfmgr.c|  4 +--
 src/backend/utils/hash/dynahash.

Re: Re: proposal: schema variables

2024-12-28 Thread jian he
hi.

src9=# select 'XLogRecPtr'::regtype;
ERROR:  type "xlogrecptr" does not exist
LINE 1: select 'XLogRecPtr'::regtype;
   ^
so
+ varcreate_lsn XLogRecPtr
should be
varcreate_lsn pg_lsn
?

also
+ 
+  
+   varcreate_lsn XLogRecPtr
+  
+  
+   LSN of the transaction where the variable was created.
+   varcreate_lsn and
+   oid together form the all-time unique
+   identifier (oid alone is not enough, since
+   object identifiers can get reused).
+  
+ 
+
we have "pg_variable_oid_index" PRIMARY KEY, btree (oid)
for table pg_variable.
so I am confused by saying the column "oid" itself is not enough to
prove unique.

in let.sgml
session_variable
should be
session_variable

sql_expression
should be
sql_expression




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-28 Thread Tomas Vondra



On 12/28/24 13:36, Anton A. Melnikov wrote:
> Hi!
> 
> On 28.12.2024 04:48, Tomas Vondra wrote:
>> On 12/27/24 20:14, James Hunter wrote:
>>> Reviving this thread, because I am thinking about something related --
>>> please ignore the "On Fri, Dec 27, 2024" date, this seems to be an
>>> artifact of me re-sending the message, from the list archive. The
>>> original message was from January 28, 2024.
>>>
>>> On Fri, Dec 27, 2024 at 11:02 AM Tomas Vondra
>>>  wrote:

 Firstly, I agree with the goal of having a way to account for memory
 used by the backends, and also ability to enforce some sort of limit.
 It's difficult to track the memory at the OS level (interpreting RSS
 values is not trivial), and work_mem is not sufficient to enforce a
 backend-level limit, not even talking about a global limit.

 But as I said earlier, it seems quite strange to start by introducing
 some sort of global limit, combining memory for all backends. I do
 understand that the intent is to have such global limit in order to
 prevent issues with the OOM killer and/or not to interfere with other
 stuff running on the same machine. And while I'm not saying we should
 not have such limit, every time I wished to have a memory limit it
 was a
 backend-level one. Ideally a workmem-like limit that would "adjust" the
 work_mem values used by the optimizer (but that's not what this patch
 aims to do), or at least a backstop in case something goes wrong
 (say, a
 memory leak, OLTP application issuing complex queries, etc.).
>>>
>>> I think what Tomas suggests is the right strategy. 
> 
> I'm also interested in this topic. And agreed that it's best to move
> from the limit
> for a separate backend to the global one. In more details let me suggest
> the following steps or parts:
> 1) realize memory limitation for a separate backend independent from the
> work_mem GUC;
> 2) add workmem-like limit that would "adjust" the work_mem values used by
> the optimizer as Tomas suggested;
> 3) add global limit for all backends.
> 
> As for p.1 there is a patch that was originally suggested by my colleague
> Maxim Orlov  and which I modified for the current
> master.
> This patch introduces the only max_backend_memory GUC that specifies
> the maximum amount of memory that can be allocated to a backend.
> Zero value means no limit.
> If the allocated memory size is exceeded, a standard "out of memory"
> error will be issued.
> Also the patch introduces the
> pg_get_backend_memory_contexts_total_bytes() function,
> which allows to know how many bytes have already been allocated
> to the process in contexts. And in the build with asserts it adds
> the pg_get_backend_memory_allocation_stats() function that allows
> to get additional information about memory allocations for debug purposes.
> 

Not sure a simple memory limit like in the patch (which just adds memory
accounting + OOM after hitting the limit) can work as anything but the
last safety measure. It seems to me the limit would have to be set to a
value that's much higher than any backend would realistically need.

The first challenge I envision is that without any feedback (either to
the planner or executor), it may break queries quite easily. It just
takes the planner to add one more sort / hash join / materialize (which
it can do arbitrarily, as it has no concept of the memory limit), and
now the query can't run.

And secondly, there are allocations that we don't restrict by work_mem,
but this memory limit would include them, ofc. The main example that I
can think of is hash join, where we (currently) don't account for the
BufFile arrays, and that can already lead to annoying OOM issues, see
e.g. [1] [2] and [3] (I'm sure there are more threads about the issue).

It's wrong that we don't account for the BufFile arrays, so that memory is
not included in work_mem (or considered in some other way). And maybe we should
finally improve that (not the fault of this patch, ofc). But it's hard,
because as the amount of data grows, we have to add more batches - and
at some point that starts adding more memory than we save. Ultimately,
we end up breaking work_mem one way or the other - either we add more
batches, or allow the hash table to exceed work_mem.
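
To put rough numbers on it (back-of-the-envelope only: roughly one
BLCKSZ-sized buffer per BufFile, and an inner plus an outer array per
batch):

#include <stdio.h>

int
main(void)
{
    long        work_mem_kb = 4 * 1024; /* 4MB */
    long        nbatch;

    for (nbatch = 32; nbatch <= 4096; nbatch *= 2)
    {
        /* ~8kB buffer per BufFile, two arrays (inner + outer) */
        long        buffile_kb = 2 * nbatch * 8;

        printf("nbatch = %4ld: BufFile buffers ~%5ld kB vs work_mem %ld kB\n",
               nbatch, buffile_kb, work_mem_kb);
    }
    return 0;
}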

That's a preexisting issue, of course. But wouldn't this simple limit
make the situation worse? The query would likely complete OK (otherwise
we'd get many more reports about OOM), but with the new limit it would
probably fail with OOM. Maybe that's correct, and the response to that
is "Don't set the limit with such queries," although it's hard to say in
advance and it can happen randomly? Not sure.

What bothers me a bit is that users would likely try to reduce work_mem,
but that's the wrong thing to do in this case - it just increases the
number of batches, and thus makes the situation worse.

[1]
https://www.postgresql.org/message-id/20190504003414.bulcbnge3rhwhcsh%40development

[2] h

Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)

2024-12-28 Thread Shayon Mukherjee
On Fri, Dec 6, 2024 at 11:24 AM Shayon Mukherjee  wrote:

> The patch is also rebased against the latest master and passing in CI.
> Would love to receive any further feedback on it.
>
>
Rebased the last patch against the latest master from today as a v5. No
other changes since last post.

Shayon


v5-0001-Introduce-the-ability-to-enable-disable-indexes-u.patch
Description: Binary data


Re: which wire protocol fields are signed int and which ones are unsigned

2024-12-28 Thread Bruce Momjian
On Tue, Dec 10, 2024 at 11:57:07AM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/17/protocol-message-formats.html
> Description:
> 
> I don't see it clearly stated which Int fields are signed and which are
> unsigned. If it can be assumed that Int is signed and Byte is unsigned then
> the object id fields are wrong because they should be unsigned.

(Thread moved to hackers since it involves internals issues.)

This is a very good point.  I looked at the backend code and couldn't
find a clear answer.  For example, let's look at AuthenticationOk, which uses
src/backend/libpq/auth.c::sendAuthRequest().

You have "AuthRequest" passed in as an unsigned int32, stored in
StringInfoData.cursor as a signed int32, passed to pq_sendint32() as a
signed int32, which gets passed to pq_writeint32(), which casts it back
to an unsigned int32 to send on the socket.  For the length of
AuthenticationOk, it uses StringInfo.len, which is a signed int32, but
that is cast by /pgtop/src/backend/libpq/pqcomm.c::socket_putmessage()
to unsigned int32 before sending it on the socket.

Why is StringInfo.len a signed int?  Do we need to use -1 in that field?
Do we not expect that to exceed 2GB?
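
As a small aside, the casts themselves are value-preserving on the
two's-complement hardware we run on -- the bit pattern survives the round
trip, and an OID above 0x7FFFFFFF merely shows up as a negative int32:

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    uint32_t    oid = 3000000000u;      /* above INT32_MAX */
    int32_t     as_signed = (int32_t) oid;  /* implementation-defined
                                             * conversion pre-C23, but
                                             * bit-preserving on all
                                             * supported platforms */
    uint32_t    back = (uint32_t) as_signed;

    printf("%u -> %d -> %u\n", (unsigned) oid, (int) as_signed, (unsigned) back);
    return 0;
}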

A related issue is that the format our docs use for this is confusing.
Is Byte1 signed or unsigned?  Can we assume Int32 is signed since it is
in C?  (In C, the sign of "char" is undetermined.)  We display the data
type and value as "Int32(8)", which looks more like a function call than
what it represents.  Any ideas on how to improve that?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.






Re: Re: proposal: schema variables

2024-12-28 Thread Pavel Stehule
Hi

On Fri, Dec 27, 2024 at 16:20 jian he wrote:

> hi.
>
> + if (!OidIsValid(varid))
> + AcceptInvalidationMessages();
> + else if (OidIsValid(varid))
> + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
>
> we can change it to
> + if (!OidIsValid(varid))
> + AcceptInvalidationMessages();
> + else
> + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
>

done


>
> inval_count isn't explained at all, while other places do actually explain it.
> Can we add some text explaining inval_count? (I didn't fully understand
> this part, that is why I am asking..)
>

done


>
> It seems all three of these ereport(ERROR...) calls in IdentifyVariable don't
> have regress tests;
> I think we should have them. Am I missing something?
>

done

>
> create variable v2 as int;
> let v2.a = 1;
> ERROR:  type "integer" of target session variable "public.v2" is not a
> composite type
> LINE 1: let v2.a = 1;
> ^
> the error messages look weird.
> IMO, it should either be
> "type of session variable "public.v2" is not a composite type"
> or
> "session variable "public.v2" don't have attribute "a"
>

changed

I took inspiration from transformAssignmentIndirection now

(2024-12-28 16:07:57) postgres=# let x.a = 20;
ERROR:  cannot assign to field "a" of session variable "public.x" because
its type integer is not a composite type
LINE 1: let x.a = 20;
^

>
> also, the above can be as a regress test for:
> + if (attrname && !is_rowtype)
> + ereport(ERROR,
> + (errcode(ERRCODE_WRONG_OBJECT_TYPE),
> + errmsg("type \"%s\" of target session variable \"%s.%s\" is not a
> composite type",
> + format_type_be(typid),
> + get_namespace_name(get_session_variable_namespace(varid)),
> + get_session_variable_name(varid)),
> + parser_errposition(pstate, stmt->location)));
> since we don't have coverage tests for it.
>
>
done


>
> + if (coerced_expr == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_DATATYPE_MISMATCH),
> + errmsg("variable \"%s.%s\" is of type %s,"
> + " but expression is of type %s",
> + get_namespace_name(get_session_variable_namespace(varid)),
> + get_session_variable_name(varid),
> + format_type_be(typid),
> + format_type_be(exprtypid)),
> + errhint("You will need to rewrite or cast the expression."),
> + parser_errposition(pstate, exprLocation((Node *) expr))));
>
> generally, errmsg(...) should put it into one line for better grep-ability
> so we can change it to:
> +errmsg("variable \"%s.%s\" is of type %s, but expression is of type
> %s"...)
>

done


>
> also no coverage tests?
> simple test case for it:
> create variable v2 as int;
> let v2 = '1'::jsonb;
>

done


>
> ---<<<>>>--
> +let_target:
> + ColId opt_indirection
> + {
> + $$ = list_make1(makeString($1));
> + if ($2)
> +  $$ = list_concat($$,
> +   check_indirection($2, yyscanner));
> + }
> if you look closely, it seems the indentation level is wrong in
> line "$$ = list_concat($$,"?
>

let_target is not needed as a separate rule now, so I removed it and cleaned
it up a little bit (really only a little bit)


>
> ---<<<>>>--
> i did some refactoring in session_variables.sql for privilege check.
> make the tests more neat, please check attached.
>

merged with minor changes in comment's formatting

I'll send the patch set with next mail

Best regards

Pavel


Re: WARNING: missing lock on database "postgres" (OID 5) @ TID (0,4)

2024-12-28 Thread Noah Misch
On Tue, Dec 17, 2024 at 10:00:10AM +0500, Kirill Reshke wrote:
> PFA v5

Pushed as ff90ee6 with some cosmetic changes.  Thanks.




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-28 Thread Jeremy Schneider
On Sat, 28 Dec 2024 15:57:44 +0100
Tomas Vondra  wrote:
> 
> Not sure a simple memory limit like in the patch (which just adds
> memory accounting + OOM after hitting the limit) can work as anything
> but a the last safety measure. It seems to me the limit would have to
> be set to a value that's much higher than any backend would
> realistically need.
> 
> The first challenge I envision is that without any feedback (either to
> the planner or executor), it may break queries quite easily. It just
> takes the planner to add one more sort / hash join / materialize
> (which it can do arbitrarily, as it has no concept of the memory
> limit), and now the query can't run.
> 
> And secondly, there are allocations that we don't restrict by
> work_mem, but this memory limit would include them, ofc. The main
> example that I can think of is hash join, where we (currently) don't
> account for the BufFile arrays, and that can already lead to annoying
> OOM issues, see e.g. [1] [2] and [3] (I'm sure there are more threads
> about the issue).

Hi James!

While I don't have a detailed design in mind, I'd like to add a strong
+1 on the general idea that work_mem is hard to effectively use because
queries can vary so widely in how many nodes might need work memory.

I'd almost like to have two limits:

First, a hard per-connection limit which could be set very high - we
can track total memory usage across contexts inside of palloc and pfree
(and maybe this could also be exposed in pg_stat_activity for easy
visibility into a snapshot of memory use across all backends). If
palloc detects that an allocation would take the total over the hard
limit, then you just fail the palloc with an OOM. This protects
postgres from a memory leak in a single backend OOM'ing the whole
system and restarting the whole database; failing a single connection
is better than failing all of them.

Second, a soft per-connection "total_connection_work_mem_target" which
could be set lower. The planner can just look at the total number of
nodes that it expects to allocate work memory, divide the target by
this and then set the work_mem for that query.  There should be a
reasonable floor (minimum) for work_mem - maybe the value of work_mem
itself becomes this and the new target doesn't do anything besides
increasing runtime work_mem.
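
For illustration only (made-up numbers, hypothetical GUC name), the
division could be as simple as:

#include <stdio.h>

/*
 * Hypothetical: spread total_connection_work_mem_target over the nodes
 * the planner expects to need work memory, never going below work_mem.
 */
static int
per_node_work_mem(int target_kb, int n_mem_nodes, int work_mem_kb)
{
    int         share = (n_mem_nodes > 0) ? target_kb / n_mem_nodes : target_kb;

    return (share > work_mem_kb) ? share : work_mem_kb;
}

int
main(void)
{
    /* 256MB target, 8 sorts/hashes/materializes, work_mem = 4MB */
    printf("%d kB per node\n",
           per_node_work_mem(256 * 1024, 8, 4 * 1024));
    return 0;
}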

Maybe even could do a "total_instance_work_mem_target" where it's
divided by the number of average active connections or something.

In practice this target idea doesn't guarantee anything about total work
memory used, but it's the tool I'd most like as an admin. And the
per-connection hard limit is the tool I'd like to have for protecting
myself against memory leaks or individual connections going bonkers and
killing all my connections for an instance restart.

-Jeremy




Re: Document NULL

2024-12-28 Thread Marcos Pegoraro
The word "below" is commonly used in the docs, but I didn't find it as a link.
Wouldn't it be better to write what you are linking to, instead of the word
"below"?

(See Null-Valued Operands for
more explanation.)
instead of
(See below for more explanation.)

Multi-Element
Comparisons.
instead of
noted below.

This is just an extension of the multi-element testing behavior described
  Testing Multi-Element Values with
Null-Valued Elements  .
instead of
This is just an extension of the multi-element testing behavior described
  below.
regards
Marcos


Re: Re: proposal: schema variables

2024-12-28 Thread jian he
hi.

+ if (stmt->collClause)
+ collation = LookupCollation(pstate,
+ stmt->collClause->collname,
+ stmt->collClause->location);
+ else
+ collation = typcollation;;

Two semicolons; there should be only one.
--<<>>>---
+ /* locks the variable with an AccessShareLock */
+ varid = IdentifyVariable(names, &attrname, &not_unique, false);
+ if (not_unique)
+ ereport(ERROR,
+ (errcode(ERRCODE_AMBIGUOUS_PARAMETER),
+ errmsg("target \"%s\" of LET command is ambiguous",
+ NameListToString(names)),
+ parser_errposition(pstate, stmt->location)));

the following are tests for the above "LET command is ambiguous" error message.

create schema test;
CREATE TYPE test AS (test int);
CREATE variable test.test as test;
set search_path to test;
let test.test = 1;
--<<>>>---
+ else
+ {
+ /* the last field of list can be star too */
+ Assert(IsA(field2, A_Star));
+
+ /*
+ * In this case, the field1 should be variable name. But
+ * direct unboxing of composite session variables is not
+ * supported now, and then we don't need to try lookup
+ * related variable.
+ *
+ * Unboxing is supported by syntax (var).*
+ */
+ return InvalidOid;
+ }
I don't fully understand the above comments.
I added
`elog(INFO, "%s:%d called", __FILE__, __LINE__);` within the ELSE branch,
and then I found out the ELSE branch doesn't have coverage tests.

--<<>>>---
+ /*
+ * a.b.c can mean catalog.schema.variable or
+ * schema.variable.field.

+ /*
+ * a.b can mean "schema"."variable" or "variable"."field".
+ * Check both variants, and returns InvalidOid with
+ * not_unique flag, when both interpretations are
+ * possible.
+ */
Here we use the word "field", but in the comment above the function
IdentifyVariable we use the word "attribute"; for consistency's sake, we
should use "attribute" instead of "field".

+/* -
+ * IdentifyVariable - try to find a variable from a list of identifiers
+ *
+ * Returns the OID of the variable found, or InvalidOid.
+ *
+ * "names" is a list of up to four identifiers; possible meanings are:
+ * - variable  (searched on the search_path)
+ * - schema.variable
+ * - variable.attribute  (searched on the search_path)
+ * - schema.variable.attribute
+ * - database.schema.variable
+ * - database.schema.variable.attribute

--<<>>>---




Re: Re: proposal: schema variables

2024-12-28 Thread Pavel Stehule
Hi

On Sat, Dec 28, 2024 at 11:35 jian he wrote:

> hi.
>
> src9=# select 'XLogRecPtr'::regtype;
> ERROR:  type "xlogrecptr" does not exist
> LINE 1: select 'XLogRecPtr'::regtype;
>^
> so
> + varcreate_lsn XLogRecPtr
> should be
> varcreate_lsn pg_lsn
> ?
>

done


>
> also
> + 
> +  
> +   varcreate_lsn XLogRecPtr
> +  
> +  
> +   LSN of the transaction where the variable was created.
> +   varcreate_lsn and
> +   oid together form the all-time unique
> +   identifier (oid alone is not enough,
> since
> +   object identifiers can get reused).
> +  
> + 
> +
> we have "pg_variable_oid_index" PRIMARY KEY, btree (oid)
> for table pg_variable.
> so I am confused by saying the column "oid" itself is not enough to
> prove unique.
>

The session variable is stored in memory until the end of the session.
Theoretically, a session that has used session variables can stay open for
a very long time without any activity - so it cannot process sinval
messages. Other sessions can drop and create a lot of session variables
(this is very possible with temporary session variables). Oids in Postgres
can overflow, and Postgres can reuse the oids of dropped objects (an oid is
only a 32-bit integer). After some time, the session with the used variable
can become active again, and the session variable can be used. Before
usage, the session variable is rechecked against pg_variable, and
theoretically a variable with the same oid can be there (although it is a
different variable, possibly with a different type). The implementation
should protect against this scenario. The stored value must not be used in
this case - using the old value is dangerous - calculations with the
variable can lose their meaning or can crash Postgres. An LSN is forever
unique - it is a 64-bit integer - so it is our safeguard: we can detect
obsolete values (stored in memory) even when there is a variable with the
same oid.

The oid in pg_variable ensures a unique identifier for any session variable
at one moment. The compound key [oid, varcreate_lsn] is a unique identifier
forever.
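
A sketch of the idea with simplified types (this is not the patch code):
a value cached in backend memory is trusted only when both identifiers
still match the catalog row, so a recycled oid with a different create LSN
is rejected:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Oid;
typedef uint64_t XLogRecPtr;

typedef struct SVariableCacheEntry
{
    Oid         varid;          /* pg_variable oid when the value was stored */
    XLogRecPtr  create_lsn;     /* varcreate_lsn seen at that time */
    /* ... stored value would live here ... */
} SVariableCacheEntry;

static bool
entry_is_current(const SVariableCacheEntry *entry,
                 Oid catalog_oid, XLogRecPtr catalog_create_lsn)
{
    return entry->varid == catalog_oid &&
        entry->create_lsn == catalog_create_lsn;
}

int
main(void)
{
    SVariableCacheEntry cached = {.varid = 16384, .create_lsn = 12345678};

    /* same oid got reused by a newly created variable -> stale value */
    printf("still valid: %d\n", entry_is_current(&cached, 16384, 99999999));
    return 0;
}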



> in let.sgml
> session_variable
> should be
> session_variable
>

done


>
> sql_expression
> should be
> sql_expression
>

done


Re: Re: proposal: schema variables

2024-12-28 Thread jian he
On Sun, Dec 29, 2024 at 5:50 AM Pavel Stehule  wrote:
>
> Hi
>
>
>> --<<>>>---
>> + else
>> + {
>> + /* the last field of list can be star too */
>> + Assert(IsA(field2, A_Star));
>> +
>> + /*
>> + * In this case, the field1 should be variable name. But
>> + * direct unboxing of composite session variables is not
>> + * supported now, and then we don't need to try lookup
>> + * related variable.
>> + *
>> + * Unboxing is supported by syntax (var).*
>> + */
>> + return InvalidOid;
>> + }
>> I don't fully understand the above comments,
>
>
> The parser allows only two syntaxes - identifier.identifier or 
> identifier.star. Second
> syntax is not supported by session variables, and then I didn't try to search 
> for the variable.
> Some users can be confused by similar syntaxes identifier.* and 
> (identifier).* Only
> second syntax is composite unboxing, and only second syntax is supported for
> session variables.
>
> Maybe the note about unboxing is messy there?
>
>> add
>> `elog(INFO, "%s:%d called", __FILE__, __LINE__); ` within the ELSE branch.
>> Then I found out the ELSE branch doesn't have coverage tests.
>
>
> I don't understand this comment? I don't use elog(INFO anywhere
>
>

Sorry for the confusion. I mean:
I added the "elog(INFO" line and the regress tests are still successful,
therefore the above ELSE branch code doesn't have coverage tests.




Re: Re: proposal: schema variables

2024-12-28 Thread Pavel Stehule
Hi


>> + {
>> + /* the last field of list can be star too */
>> + Assert(IsA(field2, A_Star));
>> +
>> + /*
>> + * In this case, the field1 should be variable name. But
>> + * direct unboxing of composite session variables is not
>> + * supported now, and then we don't need to try lookup
>> + * related variable.
>> + *
>> + * Unboxing is supported by syntax (var).*
>> + */
>> + return InvalidOid;
>> + }
>> I don't fully understand the above comments,
>>
>
> The parser allows only two syntaxes - identifier.identifier or
> identifier.star. Second
> syntax is not supported by session variables, and then I didn't try to
> search for the variable.
> Some users can be confused by similar syntaxes identifier.* and
> (identifier).* Only
> second syntax is composite unboxing, and only second syntax is supported
> for
> session variables.
>
>
I changed this comment to

					/*
					 * The syntax ident.* is used only by table aliases,
					 * and then this identifier cannot be a reference to
					 * session variable.
					 */


Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-28 Thread Jim Nasby
On Dec 28, 2024, at 12:26 PM, Jeremy Schneider  wrote:
> 
> While I don't have a detailed design in mind, I'd like to add a strong
> +1 on the general idea that work_mem is hard to effectively use because
> queries can vary so widely in how many nodes might need work memory.
> 
> I'd almost like to have two limits:
> 
> First, a hard per-connection limit which could be set very high - we
> can track total memory usage across contexts inside of palloc and pfree
> (and maybe this could also be exposed in pg_stat_activity for easy
> visibility into a snapshot of memory use across all backends). If
> palloc detects that an allocation would take the total over the hard
> limit, then you just fail the palloc with an OOM. This protects
> postgres from a memory leak in a single backend OOM'ing the whole
> system and restarting the whole database; failing a single connection
> is better than failing all of them.
> 
> Second, a soft per-connection "total_connection_work_mem_target" which
> could be set lower. The planner can just look at the total number of
> nodes that it expects to allocate work memory, divide the target by
> this and then set the work_mem for that query.  There should be a
> reasonable floor (minimum) for work_mem - maybe the value of work_mem
> itself becomes this and the new target doesn't do anything besides
> increasing runtime work_mem.
> 
> Maybe even could do a "total_instance_work_mem_target" where it's
> divided by the number of average active connections or something.

IMHO none of this will be very sane until we actually have cluster-level 
limits. One sudden burst in active connections and you still OOM the instance. 
And while we could have such a mechanism do something clever like dynamically 
lowering every session's query_mem/work_mem/whatever, ultimately I think it 
would also need the ability to deny or delay sessions from starting new 
transactions.

Re: IANA timezone abbreviations versus timezone_abbreviations

2024-12-28 Thread Bruce Momjian
On Mon, Dec 16, 2024 at 02:57:59PM -0500, Tom Lane wrote:
> Andreas Karlsson  writes:
> > On 12/13/24 12:33 AM, Tom Lane wrote:
> >> What I think we should do about this is to teach timestamp
> >> input to look into the current IANA time zone to see if it
> >> knows the given abbreviation, and if so use that meaning
> >> regardless of what timezone_abbreviations might say.
> 
> > I am not convinced this is an improvement. While this patch removes the 
> > round-trip hazard it also makes it confusing to use the 
> > timezone_abbreviations GUC since it can be overridden by IANA data based 
> > on your current timezone. So you need to know all the, sometimes weird, 
> > names for your current timezone. Seems unnecessarily hard to reason 
> > about and wouldn't most people who use timezone_abbreviations rely on 
> > the current behavior?
> 
> Presumably they're not that weird to the locals?
> 
> I am not sure what you mean by "people who use
> timezone_abbreviations".  I think that's about everyone --- it's
> not like the default setting doesn't contain any abbreviations.
> (If it didn't then we'd not have such a problem...)
> 
> > But that said I personally only use ISO timestamps with numerical 
> > offsets. Partially to avoid all this mess.
> 
> If you only use ISO notation then this doesn't matter to you
> either way.

Yes, your patch seems like a big improvement.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.






Re: Document NULL

2024-12-28 Thread jian he
hi.

"SQL specification" should be "SQL standard"?

+   another null vale and unequal to all other values.
typo. should be "another null value"

Other places use "subsection", but many places in doc/src/sgml/nullvalues.sgml
use "sub-section".

+   Notice two important behaviors: first, even though we passed in a
null value to
+   to the set_config function, the
current_setting
there is a duplicated "to".

Gmail has grammar checks, and ChatGPT can also help find typos;
you can give them a try.


+   
+BEGIN;
+ALTER TABLE null_examples ADD CONSTRAINT value_not_1 CHECK (value != 1);
+ROLLBACK;
+   
+   
+BEGIN
+ERROR:  check constraint "value_not_1" of relation
"null_examples" is violated by some row
+ROLLBACK
+   
+   
+BEGIN;
+ALTER TABLE null_examples ADD CONSTRAINT value_not_10 CHECK (value != 10);
+ROLLBACK;
+   
+   
+BEGIN
+ALTER TABLE
+ROLLBACK
+   

I think this part, BEGIN ... ROLLBACK, is not necessary,
since even if we add these check constraints, it won't influence the
later (next) section.




Re: Connection limits/permissions, slotsync workers, etc

2024-12-28 Thread Tom Lane
"Zhijie Hou (Fujitsu)"  writes:
> On Saturday, December 28, 2024 1:31 AM Tom Lane  wrote:
>> Attached is an alternative proposal that groups the autovac launcher and
>> slotsync worker into a new category of "special workers" (better name
>> welcome).  I chose to put them into the existing autovacFreeProcs freelist,
>> partly because the autovac launcher lives there already but mostly because I
>> don't want to add another freelist in a patch we need to put into v17.  (As
>> written, your patch is an ABI break.

> Right, thanks for pointing it out. The new version patch looks good to me.

Pushed, thanks for looking at it.

regards, tom lane




Re: Query regarding pg_prewarm extension

2024-12-28 Thread Bruce Momjian
On Fri, Dec 13, 2024 at 05:20:05PM -0800, Jeremy Schneider wrote:
> On Fri, 13 Dec 2024 16:16:16 +0530
> Ayush Vatsa  wrote:
> 
> > How can I decide which range of pages to prewarm?
> > I assume that it is related to hot pages in the relation,
> > but how can I identify which pages are likely to be hot
> > before they are even in the buffer cache?
> > Additionally, since tuples within a page can move to
> > different pages over time (due to operations like VACUUM FULL or
> > REINDEX), how should I handle this when selecting the pages to
> > prewarm?
> 
> For my part, I've only used the block offsets when I wanted to fire off
> several jobs in parallel, attempting to prewarm a relation faster. I've
> never tried to track the location of specific rows for purposes of
> prewarming.
> 
> You might try the "autoprewarm" feature. After adding pg_prewarm to
> your shared_preload_libraries, it will automatically keep track of the
> contents of the buffer cache and after a restart it will automatically
> prewarm the buffer cache with the blocks that were there before.
> 
> https://www.enterprisedb.com/blog/autoprewarm-new-functionality-pgprewarm
> 
> Alternatively you could just prewarm a few of your most important hot
> tables and indexes with a script after restarts.
> 
> For most smaller databases, slightly slower performance for a short
> period after startup isn't a problem - while reading blocks from disk
> for the first time. After the first read, blocks that are frequently
> accessed will remain in the cache. The Postgres cache management
> algorithm works well in general.
> 
> This is my two cents, anyway

It feels like we should document what the block range is used for, so
attached is a doc patch to do that.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.


diff --git a/doc/src/sgml/pgprewarm.sgml b/doc/src/sgml/pgprewarm.sgml
index 75f45b91b67..6907f214eb8 100644
--- a/doc/src/sgml/pgprewarm.sgml
+++ b/doc/src/sgml/pgprewarm.sgml
@@ -35,8 +35,10 @@ pg_prewarm(regclass, mode text default 'buffer', fork text default 'main',
The fourth argument is the first block number to prewarm
(NULL is accepted as a synonym for zero).  The fifth
argument is the last block number to prewarm (NULL
-   means prewarm through the last block in the relation).  The return value
-   is the number of blocks prewarmed.
+   means prewarm through the last block in the relation).  The block
+   range allows a single relation to be loaded in parallel using multiple
+   concurrent function calls.  The return value is the number of blocks
+   prewarmed.
   
 
   


Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-12-28 Thread Anton A. Melnikov

Hi!

On 28.12.2024 04:48, Tomas Vondra wrote:

On 12/27/24 20:14, James Hunter wrote:

Reviving this thread, because I am thinking about something related --
please ignore the "On Fri, Dec 27, 2024" date, this seems to be an
artifact of me re-sending the message, from the list archive. The
original message was from January 28, 2024.

On Fri, Dec 27, 2024 at 11:02 AM Tomas Vondra
 wrote:


Firstly, I agree with the goal of having a way to account for memory
used by the backends, and also ability to enforce some sort of limit.
It's difficult to track the memory at the OS level (interpreting RSS
values is not trivial), and work_mem is not sufficient to enforce a
backend-level limit, not even talking about a global limit.

But as I said earlier, it seems quite strange to start by introducing
some sort of global limit, combining memory for all backends. I do
understand that the intent is to have such global limit in order to
prevent issues with the OOM killer and/or not to interfere with other
stuff running on the same machine. And while I'm not saying we should
not have such limit, every time I wished to have a memory limit it was a
backend-level one. Ideally a workmem-like limit that would "adjust" the
work_mem values used by the optimizer (but that's not what this patch
aims to do), or at least a backstop in case something goes wrong (say, a
memory leak, OLTP application issuing complex queries, etc.).


I think what Tomas suggests is the right strategy. 


I'm also interested in this topic. And agreed that it's best to move from the 
limit
for a separate backend to the global one. In more details let me suggest
the following steps or parts:
1) realize memory limitation for a separate backend independent from the 
work_mem GUC;
2) add workmem-like limit that would "adjust" the work_mem values used by
the optimizer as Tomas suggested;
3) add global limit for all backends.

As for p.1 there is a patch that was originally suggested by my colleague
Maxim Orlov  and which I modified for the current master.
This patch introduces the only max_backend_memory GUC that specifies
the maximum amount of memory that can be allocated to a backend.
Zero value means no limit.
If the allocated memory size is exceeded, a standard "out of memory" error will 
be issued.
Also the patch introduces the pg_get_backend_memory_contexts_total_bytes() 
function,
which allows to know how many bytes have already been allocated
to the process in contexts. And in the build with asserts it adds
the pg_get_backend_memory_allocation_stats() function that allows
to get additional information about memory allocations for debug purposes.
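
A very rough sketch of the general idea (not the actual patch, which hooks
into the memory context allocators in aset.c/generation.c/slab.c): keep a
per-backend running total and refuse an allocation that would push it past
the limit:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static size_t total_allocated = 0;
static size_t max_backend_memory = 0;   /* bytes; 0 means no limit */

static void *
accounted_alloc(size_t size)
{
    if (max_backend_memory != 0 &&
        total_allocated + size > max_backend_memory)
        return NULL;            /* caller reports "out of memory" */

    total_allocated += size;
    return malloc(size);
}

int
main(void)
{
    max_backend_memory = 1024;  /* tiny limit, just for the demo */

    if (accounted_alloc(2048) == NULL)
        fprintf(stderr, "ERROR:  out of memory\n");
    return 0;
}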


This strategy solves the ongoing problem of how to set work_mem, if
some queries have lots of operators and others don't -- now we just
set backend_work_mem, as a limit on the entire query's total work_mem.
And a bit of integration with the optimizer will allow us to
distribute the total backend_work_mem to individual execution nodes,
with the goal of minimizing spilling, without exceeding the
backend_work_mem limit.


As for p.2, maybe one can set a maximum number of parallel sort or
hash table operations before writing to disk, instead of an absolute
value in the work_mem GUC? E.g. introduce a max_query_operations GUC
or a variable, in such a way that the old work_mem will be equal
to max_backend_memory divided by max_query_operations.

What do you think about such an approach?


With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres CompanyFrom 8ec3548604bf6a1f5f4c290495843e7080d09660 Mon Sep 17 00:00:00 2001
From: "Anton A. Melnikov" 
Date: Sat, 28 Dec 2024 12:47:24 +0300
Subject: [PATCH] Limit backend heap memory allocation

This per-backend GUC can limit the amount of heap allocated
memory.  Upon reaching this limit "out of memory" error will
be triggered.

The GUC is disabled by default.

Authors: Maxim Orlov , Anton A. Melnikov 
---
 doc/src/sgml/config.sgml  |  17 +
 src/backend/utils/activity/backend_status.c   | 335 ++
 src/backend/utils/misc/guc_tables.c   |  10 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/backend/utils/mmgr/aset.c |  36 +-
 src/backend/utils/mmgr/generation.c   |  18 +-
 src/backend/utils/mmgr/slab.c |  17 +-
 src/include/catalog/pg_proc.dat   |  14 +
 src/include/utils/backend_status.h|  10 +
 src/test/modules/Makefile |   4 +
 .../modules/test_backend_memory/.gitignore|   4 +
 src/test/modules/test_backend_memory/Makefile |  14 +
 src/test/modules/test_backend_memory/README   |   1 +
 .../t/001_max_backend_memory.pl   |  85 +
 14 files changed, 540 insertions(+), 26 deletions(-)
 create mode 100644 src/test/modules/test_backend_memory/.gitignore
 create mode 100644 src/test/modules/test_backend_memory/Makefile
 create mode 100644 src/test/modules/test_