Re: Limitation relates to memory allocation
On 14.10.24 08:03, Ekaterina Kiryanova wrote: We encountered an issue related to internal memory allocation limit: ERROR: invalid memory alloc request size In the documentation on limits: https://www.postgresql.org/docs/17/ limits.html the description suggests that only a single field is limited to 1GB, which could imply that the total tuple size can be larger. So as we planned to store three columns of 1GB each in a table and attempted to insert this data, we got the error. Our research showed that the limit is imposed by the palloc() function, regardless of whether it is a tuple or not, and if the data is serialized or dumped, the effective limit can be even lower, typically around 512MB per row. So for allocations exceeding 1GB, the palloc_extended() function can be used. Please correct me if I'm wrong. I prepared a small patch for master, if it's worth clarifying, could you please review the attachment? The 1 GB limit in palloc() is a safety check, when the code shouldn't be allocating more than that. Code that legitimately wants to allocate more than 1 GB can use the MCXT_ALLOC_HUGE flag. If you see this error, then that could either be corruption somewhere (the kind of thing this safety check is meant to catch) or the code is buggy. In either case, I don't know that it is appropriate to document this as an externally visible system limitation.
Re: missing command?
> On 15 Oct 2024, at 00:57, PG Doc comments form wrote: > In this list of commands: > https://www.postgresql.org/docs/current/reference-client.html > Should this one be included? --> pg_ctl pg_ctl is a server application, it is included here: https://www.postgresql.org/docs/devel/reference-server.html -- Daniel Gustafsson
missing command?
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/17/reference-client.html Description: Hi, In this list of commands: https://www.postgresql.org/docs/current/reference-client.html Should this one be included? --> pg_ctl Saludos. Jose
CLUSTER command
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/17/sql-cluster.html Description: The documentation does not say what happens if you do "CLUSTER tablename" and omit the USING clause, which is shown as optional in the BNF, if I'm reading it right. Does CLUSTER use the PRIMARY KEY in that case? What if no PRIMARY KEY is specified?
Re: CLUSTER command
Hello, On 2024-Oct-15, PG Doc comments form wrote: > The documentation does not say what happens if you do "CLUSTER tablename" > and omit the USING clause, which is shown as optional in the BNF, if I'm > reading it right. Does CLUSTER use the PRIMARY KEY in that case? What if > no PRIMARY KEY is specified? The table is clustered on the index that was previously selected as the cluster index (either by running "CLUSTER table ON idx" or by doing ALTER TABLE tab CLUSTER ON idx"). If no index is selected, an error is thrown. The docs explain it this way: When a table is clustered, PostgreSQL remembers which index it was clustered by. The form CLUSTER table_name reclusters the table using the same index as before. You can also use the CLUSTER or SET WITHOUT CLUSTER forms of ALTER TABLE to set the index to be used for future cluster operations, or to clear any previous setting. I'm not sure if we need to make this clearer. It's perfectly clear to me, but then I already knew what I wanted to read ... -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was amazing when I first started using it at 7.2, and I'm continually astounded by learning new features and techniques made available by the continuing work of the development team." Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
Re: CLUSTER command
That makes perfect sense, now that you have explained it. I just didn't pick up on that from the text as written. -- D. Richard Hipp d...@sqlite.org On Wednesday, October 16th, 2024 at 9:22 AM, Alvaro Herrera wrote: > Hello, > > On 2024-Oct-15, PG Doc comments form wrote: > > > The documentation does not say what happens if you do "CLUSTER tablename" > > and omit the USING clause, which is shown as optional in the BNF, if I'm > > reading it right. Does CLUSTER use the PRIMARY KEY in that case? What if > > no PRIMARY KEY is specified? > > > The table is clustered on the index that was previously selected as the > cluster index (either by running "CLUSTER table ON idx" or by doing > ALTER TABLE tab CLUSTER ON idx"). If no index is selected, an error is > thrown. > > The docs explain it this way: > > When a table is clustered, PostgreSQL remembers which index it was clustered > by. The form CLUSTER table_name reclusters the table using the same index as > before. You can also use the CLUSTER or SET WITHOUT CLUSTER forms of ALTER > TABLE to set the index to be used for future cluster operations, or to clear > any previous setting. > > I'm not sure if we need to make this clearer. It's perfectly clear to > me, but then I already knew what I wanted to read ... > > -- > Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ > "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL > was > amazing when I first started using it at 7.2, and I'm continually astounded by > learning new features and techniques made available by the continuing work of > the development team." > Berend Tober, > http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
Re: A minor bug in doc. Hovering over heading shows # besides it.
On Thu, 17 Oct 2024 at 11:19, Bruce Momjian wrote: > Where are we on this? I still see "#". I was hoping for some voting for or against my patch or the general idea of it. Also, my web skills in this area amount to trial and error, so it would also be good to have someone who knows what they're doing have a look at it. David
Re: Limitation relates to memory allocation
On Mon, 14 Oct 2024 at 19:03, Ekaterina Kiryanova wrote: > Our research showed that the limit is imposed by the palloc() function, > regardless of whether it is a tuple or not, and if the data is > serialized or dumped, the effective limit can be even lower, typically > around 512MB per row. So for allocations exceeding 1GB, the > palloc_extended() function can be used. Please correct me if I'm wrong. I think it would be nice to document the row length limitation and also add a caveat to the "field size" row to mention that outputting bytea columns larger than 512MB can be problematic and storing values that size or above is best avoided. I don't think wording like: "The practical limit is less than 1 GB" is going to be good enough as it's just not specific enough. The other places that talk about practical limits on that page are mostly there because it's likely impossible that anyone could actually reach the actual limit. For example, 2^32 databases is likely a limit that nobody would be able to get close. It's pretty easy to hit the bytea limit, however: postgres=# create table b (a bytea); CREATE TABLE Time: 2.634 ms postgres=# insert into b values(repeat('a',600*1024*1024)::bytea); INSERT 0 1 Time: 9725.320 ms (00:09.725) postgres=# \o out.txt postgres=# select * from b; ERROR: invalid memory alloc request size 1258291203 Time: 209.082 ms that took me about 10 seconds, so I disagree storing larger bytea values is impractical. David
Re: Connection Info
On Mon, May 6, 2024 at 04:15:24PM +0200, Laurenz Albe wrote: > On Sun, 2024-05-05 at 20:25 +, PG Doc comments form wrote: > > On this page: > > https://www.postgresql.org/docs/14/server-start.html#CLIENT-CONNECTION-PROBLEMS > > > > It notes: " A common mistake is to forget to configure the server to > > allow TCP/IP connections." > > > > I would expect a hyperlink to a section explaining exactly HOW to "configure > > the server to allow TCP/IP connections" - from that note text. > > I agree. > > Perhaps it should say: > > A common mistake is to forget to configure "listen_addresses" so that the > server accepts remote TCP connections. > > Then "listen_addresses" could hyperlink to the parameter's documentation. Agreed, patch attached. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 2c4d5ef640d..933cac813c2 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -637,7 +637,8 @@ psql: error: connection to server at "server.joe.com" (123.123.123.123), port 54 This is the generic I couldn't find a server to talk to failure. It looks like the above when TCP/IP communication is attempted. A common mistake is to forget to - configure the server to allow TCP/IP connections. + configure so that the server + accepts remote TCP connections.
Re: Documentation of .pgpass for Unix is incomplete
On Mon, Aug 19, 2024 at 05:42:33PM -0700, David G. Johnston wrote: > On Mon, Aug 19, 2024 at 5:06 PM Bruce Momjian wrote: > Well, it is more complicated than checking just HOME because it calls > getpwuid_r() if HOME is not set: > > https://doxygen.postgresql.org/fe-connect_8c.html# > a3f49cbb20595c1765bd0db5ff434c9c3 > > Is it worth going into that detail in the docs? > > > > Yes, "the user's home directory" and the "HOME" environment variable are > distinct things. The current docs are wrong. > > The .pgpass file, located in $HOME (a.k.a. ~) on non-Microsoft Windows > systems, > can contain passwords... In the absence of the HOME environment variable, the > path recorded as the user's home directory in the operating system's passwd > file will be checked. This is not a fallback mechanism - if HOME is set, and > the file is not present there, this directory will not be checked). On > Microsoft Windows... Alternatively, the password file to use ... > > I"m somewhat loath to repeat that in: > https://www.postgresql.org/docs/16/libpq-connect.html#LIBPQ-CONNECT-PASSFILE > > passfile > Specifies the name of the file used to store passwords (see Section 34.16). > Defaults to ~/.pgpass, or %APPDATA%\postgresql\pgpass.conf on Microsoft > Windows. (No error is reported if this file does not exist.) > > So I'd suggest just removing the talk of defaults, changing it to: > > "Specifies the name of the file used to store passwords. See Section 34.16 > for > details, including the default file name and path resolution mechanics." I have written the attached patch to add the home directory details. I specified in one place and referenced it to two others. Did I miss any places? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index afc9346757a..bfefb1289e8 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -9256,7 +9256,9 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough) The file .pgpass in a user's home directory can contain passwords to be used if the connection requires a password (and no password has been - specified otherwise). On Microsoft Windows the file is named + specified otherwise). On Unix systems, the directory can be specified by + the HOME environment variable, or if undefined, the home + directory of the effective user. On Microsoft Windows the file is named %APPDATA%\postgresql\pgpass.conf (where %APPDATA% refers to the Application Data subdirectory in the user's profile). diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index 627bb5ab5cc..188e8f0b4d0 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -194,7 +194,9 @@ OPTIONS (ADD password_required 'false'); user can potentially use any client certificates, .pgpass, .pg_service.conf etc. in the unix home directory of the -system user the postgres server runs as. They can also use any trust +system user the postgres server runs as. (For details on how home +directories are found, see .) They can +also use any trust relationship granted by authentication modes like peer or ident authentication. diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index b825ca96a23..e42073ed748 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1048,7 +1048,8 @@ INSERT INTO tbls1 VALUES ($1, $2) \parse stmt1 Changes the current working directory to directory. Without argument, changes - to the current user's home directory. + to the current user's home directory. For details on how home + directories are found, see .
Re: Undocumented optionality of handler_statements
On Wed, Sep 11, 2024 at 03:37:17PM +0900, Michael Paquier wrote: > On Tue, Jul 23, 2024 at 01:25:39PM +0200, Philipp Salvisberg wrote: > > read "optional" as "mandatory". > > They're optional, like in empty being optional. If not specified, the > block goes to its END. > > > Therefore, I suggest to change this example by adding a NULL > > statement as in other examples. This change would make the > > documentation consistent and handle the optionality of > > handler_statements as an implementation detail. I created a patch > > for plpgsql.sgml based on the master branch, adding a NULL statement > > in empty exception handlers (see attached file > > doc_patch_using_null_stmt_instead_of_empty_exception_handler_v1.diff). > > These examples have been around for 20 years with, and I think that it > is helpful to show this pattern as well. So if I were to do something > about that, I would suggest the attached. Do we want to apply this patch? I added a comma to the text, attached. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" iff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 78e4983139..3a5e7bc296 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -2804,9 +2804,9 @@ BEGIN statements EXCEPTION WHEN condition OR condition ... THEN -handler_statements + handler_statements WHEN condition OR condition ... THEN - handler_statements + handler_statements ... END; @@ -2821,8 +2821,8 @@ END; abandoned, and control passes to the EXCEPTION list. The list is searched for the first condition matching the error that occurred. If a match is found, the - corresponding handler_statements are - executed, and then control passes to the next statement after + corresponding handler_statements, if + specified, are executed, and then control passes to the next statement after END. If no match is found, the error propagates out as though the EXCEPTION clause were not there at all: the error can be caught by an enclosing block with
Re: A minor bug in doc. Hovering over heading shows # besides it.
Where are we on this? I still see "#". --- On Mon, Aug 12, 2024 at 02:19:06PM +1200, David Rowley wrote: > On Fri, 19 Jul 2024 at 06:03, Jonathan S. Katz wrote: > > > > On 7/18/24 1:59 PM, Daniel Gustafsson wrote: > > > Looking a bit closer, the Python documentation does just this, a pilcrow > > > with a > > > tool-tip ("Link to this heading") when hovering over it > > > > I forgot why we went with the "#" and not the (TIL the name) pilcrow > > symbol, but I'm generally used to seeing the pilcrow when I browse docs > > and may have voiced that at the time (though the record may show > > otherwise). So +1 to that. > > I'm not really a web developer, but I did have a go at adjusting the > .css file so we show some relevant tooltip text. I was reminded about > this when looking at [1] this morning. That page is using '#' but the > purpose of it seems quite obvious when combined with the tooltip. > > I've attached a small patch to adjust the CSS with hopes that it might > inspire someone who actually knows what they're doing with CSS to make > it better. > > David > > [1] > https://stackoverflow.blog/2024/07/24/developers-want-more-more-more-the-2024-results-from-stack-overflow-s-annual-developer-survey/ > diff --git a/doc/src/sgml/stylesheet.css b/doc/src/sgml/stylesheet.css > index 86a8edb926..866636eccf 100644 > --- a/doc/src/sgml/stylesheet.css > +++ b/doc/src/sgml/stylesheet.css > @@ -175,6 +175,20 @@ acronym { font-style: inherit; } > a.id_link { > color: inherit; > visibility: hidden; > + text-decoration: none; > +} > + > +a.id_link:hover:after { > + content: "Pemalink to this heading"; > + font-size: 10px; > + font-weight: normal; > + text-decoration: none; > + border: 1px solid #44; > + text-align: center; > + border-radius: 5px 5px 5px 5px; > + padding: 5px 5px 5px 5px; > + position: relative; > + top: 20px; > } > > *:hover > a.id_link { -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?"
A minor bug in doc. Hovering over heading shows # besides it.
On Wednesday, October 16, 2024, Bruce Momjian wrote: > > Where are we on this? I still see "#". > > We should document what we are doing in Conventions. The hover stuff is a nice usability add, if a bit limited on mobile. My +1 as well to matching Python with the pilcrow symbol. David J.
Re: incorrect (incomplete) description for "alter domain"
On Mon, Jul 29, 2024 at 11:17:41AM -0400, Tom Lane wrote: > I wrote: > > I think the page is technically correct, but I'm inclined to duplicate > > this text from the CREATE DOMAIN page: > > > where domain_constraint is: > > [ CONSTRAINT constraint_name ] > > { NOT NULL | NULL | CHECK (expression) } > > > rather than making readers go look that up. > > Actually, there *is* a bug in the description, because experimentation > shows that CREATE DOMAIN accepts NULL in this syntax (as advertised) > but ALTER DOMAIN does not. We could alternatively decide that that's > a code bug and make ALTER DOMAIN take it, but I don't think it's worth > any effort (and this behavior may actually have been intentional, too). > I think we should just add > > where domain_constraint is: > > [ CONSTRAINT constraint_name ] > { NOT NULL | CHECK (expression) } > > to the ALTER DOMAIN page, and then remove the claim that it's > identical to CREATE DOMAIN. I have written the attached patch to document this. I assume this should be backpatched to PG 12. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" diff --git a/doc/src/sgml/ref/alter_domain.sgml b/doc/src/sgml/ref/alter_domain.sgml index f6704d7557a..7485517 100644 --- a/doc/src/sgml/ref/alter_domain.sgml +++ b/doc/src/sgml/ref/alter_domain.sgml @@ -41,6 +41,11 @@ ALTER DOMAIN name RENAME TO new_name ALTER DOMAIN name SET SCHEMA new_schema + +where domain_constraint is: + +[ CONSTRAINT constraint_name ] +{ NOT NULL | CHECK (expression) } @@ -79,8 +84,7 @@ ALTER DOMAIN name ADD domain_constraint [ NOT VALID ] - This form adds a new constraint to a domain using the same syntax as - CREATE DOMAIN. + This form adds a new constraint to a domain. When a new constraint is added to a domain, all columns using that domain will be checked against the newly added constraint. These checks can be suppressed by adding the new constraint using the
Re: Undocumented optionality of handler_statements
On Wednesday, October 16, 2024, Bruce Momjian wrote: > On Wed, Sep 11, 2024 at 03:37:17PM +0900, Michael Paquier wrote: > > On Tue, Jul 23, 2024 at 01:25:39PM +0200, Philipp Salvisberg wrote: > > > read "optional" as "mandatory". > > > > They're optional, like in empty being optional. If not specified, the > > block goes to its END. > > > > > Therefore, I suggest to change this example by adding a NULL > > > statement as in other examples. This change would make the > > > documentation consistent and handle the optionality of > > > handler_statements as an implementation detail. I created a patch > > > for plpgsql.sgml based on the master branch, adding a NULL statement > > > in empty exception handlers (see attached file > > > doc_patch_using_null_stmt_instead_of_empty_exception_handler_v1.diff). > > > > These examples have been around for 20 years with, and I think that it > > is helpful to show this pattern as well. So if I were to do something > > about that, I would suggest the attached. > > Do we want to apply this patch? I added a comma to the text, attached. > -1 for me. This establishes a policy change for documenting an empty set of statements without touching all places that would require changing to conform to the new policy. I’m weakly against changing the policy at this point; if we do the patch needs to touch all relevant places. David J.
Re: Logical replication - initial data synchronization
On Sat, May 18, 2024 at 09:02:11PM +, PG Doc comments form wrote: > The following documentation comment has been logged on the website: > > Page: https://www.postgresql.org/docs/16/logical-replication-subscription.html > Description: > > I'm reading up on Logical Replication and have been reading the pages in > order. > > The first 2 pages: > https://www.postgresql.org/docs/current/logical-replication.html and > https://www.postgresql.org/docs/current/logical-replication-publication.html > both speak of the requirement to set up a snapshot and explain that > publication will then send further updates as they happen to subscribers. > > But the 3rd page, > https://www.postgresql.org/docs/current/logical-replication-subscription.html > now mentions this: "Additional replication slots may be required for the > initial data synchronization of pre-existing table data and those will be > dropped at the end of data synchronization." > > For me, reading the first 2 pages implied that I would have to perform some > manual command that starts the creation of a snapshot of pre-existing table > data, and unpack this on the subscriber node somehow. > > The text on the "Subscription" page sounds to me like this is actually > something the publisher<-> subscriber model of the postgres software can > manage on its own. As opposed to a snapshot, which feels more like the > concept of a basebackup. > > Regardless of that being correct or not, my current impression is that the > description isn't consistent across pages. Maybe the text is obvious for > people who've performed setup of logical replication before, but I have > never done this. To me, the description on the first 2 pages seems > inconsistent with the description I just encountered on the 3rd page. I was > under the impression there was no such thing as "initial data > synchronization of pre-existing table data" in terms of postgres doing this > by itself. > > Am I missing something extremely simple, or can the description of the > involved operations be made more consistent across documentation pages? Is the attached patch an improvement? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index 98a7ad0c272..cba15fce908 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -24,9 +24,9 @@ - Logical replication of a table typically starts with taking a snapshot + Internally logical replication of a table starts by taking a snapshot of the data on the publisher database and copying that to the subscriber. - Once that is done, the changes on the publisher are sent to the subscriber + Once complete, the changes on the publisher are sent to the subscriber as they occur in real-time. The subscriber applies the data in the same order as the publisher so that transactional consistency is guaranteed for publications within a single subscription. This method of data replication @@ -165,7 +165,7 @@ The individual tables can be added and removed dynamically using ALTER PUBLICATION. Both the ADD TABLE and DROP TABLE operations are - transactional; so the table will start or stop replicating at the correct + transactional, so the table will start or stop replicating at the correct snapshot once the transaction has committed. @@ -1953,8 +1953,8 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER Architecture - Logical replication starts by copying a snapshot of the data on the - publisher database. Once that is done, changes on the publisher are sent + Internally logical replication starts by copying a snapshot of the data on the + publisher database. Once complete, changes on the publisher are sent to the subscriber as they occur in real time. The subscriber applies data in the order in which commits were made on the publisher so that transactional consistency is guaranteed for the publications within any