Re: list of TransactionIds
On 2022-May-14, Amit Kapila wrote:
> On Sat, May 14, 2022 at 1:57 AM Alvaro Herrera wrote:
> >
> > We didn't have any use of TransactionId as members of List, until
> > RelationSyncEntry->streamed_txns was introduced (464824323e57, pg14).
> > It's currently implemented as a list of int.  This is not wrong at
> > present, but it may soon be, and I'm sure it rubs some people the wrong
> > way.
> >
> > But is the rubbing way wrong enough to add support for TransactionId in
> > pg_list.h, including, say, T_XidList?
>
> +1. I don't know if we have a need for this at other places but I feel
> it is a good idea to make its current use better.

I hesitate to add this the day just before beta.  This is already in
pg14, so maybe it's not a big deal if pg15 remains the same for the time
being.  Or we can change it for beta2.  Or we could just punt until
pg16.  Any preferences?

(Adding this to pg14 seems out of the question.  It's probably okay
ABI-wise to add a new node tag at the end of the list, but I'm not sure
it's warranted.)

-- 
Álvaro Herrera  48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber que no hay
tonto que no le pueda enseñar algo." (Jean B. Say)
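The width concern behind the thread can be seen outside PostgreSQL too. A minimal Python sketch (purely illustrative, not PostgreSQL code) of why a signed C int is an awkward container for a 32-bit unsigned TransactionId:

```python
import struct

# TransactionId is an unsigned 32-bit value in PostgreSQL; a plain C
# "int" is signed 32-bit on common platforms.  An xid past 2^31 cannot
# be stored as a signed int directly:
xid = 3_000_000_000  # a valid xid, larger than INT32_MAX

try:
    struct.pack("<i", xid)        # signed 32-bit slot: overflows
    overflowed = False
except struct.error:
    overflowed = True

# Reinterpreting the unsigned bit pattern as signed "works", but yields
# a negative number -- which is what a list-of-int representation of
# xids silently relies on.
as_signed = struct.unpack("<i", struct.pack("<I", xid))[0]

print(overflowed)  # True
print(as_signed)   # -1294967296
```

This is why a list of int is "not wrong at present" (the bits round-trip) but fragile, and why a dedicated T_XidList would make the intent explicit.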
Make name optional in CREATE STATISTICS
Currently, CREATE STATS requires you to think of a name for each stats object, which is fairly painful, so users would prefer an automatically assigned name.

Attached patch allows this, which turns out to be very simple, since a name assignment function already exists.

The generated name is simple, but that's exactly what users do anyway, so it is not too bad.

Tests, docs included.

-- 
Simon Riggs
http://www.EnterpriseDB.com/

create_stats_opt_name.v3.patch
Description: Binary data
Re: [PATCH] New [relation] option engine
I'm sorry if you've already said this elsewhere, but can you please state what is the *intention* of this patchset?  If it's a pure refactoring (but I don't think it is), then it's a net loss, because after pgindent it summarizes as:

 58 files changed, 2714 insertions(+), 2368 deletions(-)

so we end up 500+ lines worse than where we started.  However, I suspect that's not the final situation, since I saw a comment somewhere that opclass options have to be rewritten to modify this mechanism, and I suspect that will remove a few lines.  And maybe you have a more ambitious goal.  But what is it?

Please pgindent your code for the next submission, making sure to add your new typedef(s) to typedefs.list so that it doesn't generate stupid spaces.  After pgindenting you'll notice the argument lists of some functions look bad (cf. commit c4f113e8fef9).  Please fix that too.

I notice that you kept the commentary about lock levels in the place where they were previously defined.  This is not good.  Please move each explanation next to the place where each option is defined.

For next time, please use "git format-patch" for submission, and write a tentative commit message.  The committer may or may not use your proposed commit message, but with it they will know what you're trying to achieve.

The translatability marker for detailmsg for enums is wrong AFAICT.  You need gettext_noop() around the strings themselves IIRC.  I think you need to get rid of the _() call around the variable that receives that value and use errdetail() instead of errdetail_internal(), to avoid double-translating it; but I'm not 100% sure.  Please experiment with "make update-po" until you get the messages in the .po file.

You don't need braces around single-statement blocks.

Thanks

-- 
Álvaro Herrera  PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan" (Andrew Morton)
Re: make MaxBackends available in _PG_init
On Fri, May 13, 2022 at 09:49:54AM -0400, Robert Haas wrote:
> Committed.

Thanks!  For the record, I submitted patches or pull requests this weekend for all repositories I was aware of to fix compatibility with this patch, hoping that it will save the packagers some time when they take care of the beta.
Re: Make name optional in CREATE STATISTICS
On Sun, May 15, 2022 at 14:20, Simon Riggs wrote:
> Currently, CREATE STATS requires you to think of a name for each stats
> object, which is fairly painful, so users would prefer an
> automatically assigned name.
>
> Attached patch allows this, which turns out to be very simple, since a
> name assignment function already exists.
>
> The generated name is simple, but that's exactly what users do anyway,
> so it is not too bad.

good idea

Pavel

> Tests, docs included.
>
> --
> Simon Riggs
> http://www.EnterpriseDB.com/
Re: Asynchronous and "direct" IO support for PostgreSQL.
On Wed, Sep 1, 2021 at 11:27 AM Andres Freund wrote:
>
> Hi,
>
> Attached is an updated patch AIO series. The major changes are:

Hi Andres, is there a plan to get the fallocate changes in alone first?  I think the fallocate API can help parallel inserts work (bulk relation extension currently writes zero-filled pages) and make pre-padding while allocating WAL files faster.

Regards,
Bharath Rupireddy.
Re: gitmaster access
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Bruce Momjian writes:
> > On Wed, May 11, 2022 at 08:59:26PM -0400, Bruce Momjian wrote:
> >> On Thu, May 12, 2022 at 09:04:38AM +0900, Tatsuo Ishii wrote:
> >>> The last line should be
> >>> "ssh://g...@gitmaster.postgresql.org/postgresql.git"?
>
> > I assume the URL list at:
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary
> > is for non-committers.
>
> Yeah, I agree with that.  If we advertise the gitmaster address here,
> the primary result will be that we get a lot of complaints from random
> people complaining that they can't access it.  A secondary result
> is likely to be an increase in attacks against that server.

I don't think we could change it very easily without some ugly hacking of the tool that generates that page too, which is no good...

We might be able to get rid of the ssh:// URL there though...  Will look into that.

> The onboarding process for new committers should include explaining
> about the separate master repo and how they can access it, but that
> is absolutely not something we should advertise widely.

It does.

Thanks,

Stephen

signature.asc
Description: PGP signature
Re: Reproducible coliisions in jsonb_hash
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas writes:
> > Here, that doesn't seem too likely. You could have a column that
> > contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth
> > and they all get mapped onto the same bucket and you're sad. But
> > probably not.
>
> Yeah, that might be a more useful way to think about it: is this likely
> to cause performance-critical collisions in practice?  I agree that
> that doesn't seem like a very likely situation, even given that you
> might be using json for erratically-structured data.

Particularly for something like jsonb (but maybe other things?), having a hash function that could be user-defined, or at least have some options, seems like it would be quite nice (similar to compression...).

If we were to go in the direction of changing this, I'd suggest that we try to make it something where the existing function could still be used while also allowing a new one to be used.  More flexibility would be even better, of course (column-specific hash functions come to mind...).

Agreed with the general conclusion here also, just wanted to share some thoughts on possible future directions to go in.

Thanks,

Stephen

signature.asc
Description: PGP signature
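For readers following along, the collision pattern Robert describes is easy to reproduce with a toy model. The sketch below (Python, illustrative only; not the actual jsonb_hash implementation) shows how folding element hashes into a container hash without any per-level mixing makes a scalar and its array wrappings collide:

```python
# Toy model of a container hash with no per-level mixing: the hash of a
# one-element array equals the hash of its element.  Function name and
# scheme are illustrative only, not PostgreSQL's jsonb_hash.

def toy_hash(value, seed=0):
    if isinstance(value, list):
        h = seed
        for elem in value:
            h = toy_hash(elem, h)  # fold elements in; no container marker
        return h
    return hash(value) ^ seed

# Any depth of single-element wrapping collides with the bare scalar:
collide = toy_hash("tom") == toy_hash(["tom"]) == toy_hash([[["tom"]]])
print(collide)  # True
```

Mixing a per-level constant into the seed before recursing is the kind of change that would break these collisions, at the cost of changing hash values for existing data.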
[RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog
Note: I am not (currently) planning on implementing this rough idea, just putting it up to share and document the idea, on request of Tomas (cc-ed).

The excellent pgconf.de presentation on PostgreSQL's extended statistics system by Tomas Vondra [0] talked about how the current default statistics assume the MCVs of columns to be fully independent, i.e. values of column A do not imply any value of columns B and C, and that for accurate data on correlated values the user needs to manually create statistics on the combined columns (by either STATISTICS or by INDEX).

This is said to be due to limitations in our statistics collector: to determine the fraction of the table that contains the value, we store the N most common values with the fraction of their occurrence in the table.  This value is quite exact, but combining these values proves difficult: there is nothing in the stored value that can confidently include or exclude parts of the table from a predicate using that MCV, so we can only assume that the values of two columns are independent.

After the presentation it came to me that if we were to add an estimator for the number of rows with that value to the MCV lists in the form of HLL sketches (in addition to or replacing the current most_common_elem_freqs fractions), we would be able to make better estimates for multi-column filters by combining the HLL row cardinality sketches for filters that filter on these MCVs.  This would remove the immediate need for manual statistics with a Cartesian product of the MCVs of those columns with their occurrence fractions, which significantly reduces the need for the creation of manual statistics - the need that exists due to planner mis-estimates in correlated columns.  Custom statistics will still be required for expression statistics, but column correlation estimation _within MCVs_ is much improved.
How I imagine this would work is that for each value in the MCV, an HLL is maintained that estimates the number of distinct tuples containing that value.  This can be h(TID) or h(PK), or anything else that would uniquely identify returned tuples.  Because the keyspaces of all HLLs that are generated are on the same table, you can apply join and intersection operations on the HLLs of the MCVs (for OR and AND operations respectively), and provide fairly accurate estimates for the number of tuples that would be returned by the filter on that table.

The required size of the HLL sketches can be determined by the number of tuples scanned during analyze, potentially reducing the size required to store these HLL sketches from the usual 1.5kB per sketch to something smaller - we'll only ever need to count nTuples distinct values, so low values for default_statistics_target would allow for smaller values for m in the HLL sketches, whilst still providing fairly accurate result estimates.

Kind regards,

Matthias van de Meent

PS: Several later papers correctly point out that HLL can only count up to 2^32 due to the use of a hash function that outputs only 32 bits, which is not enough for large tables.  HLL++ solves this by using a hash function that outputs 64 bits, and can thus be considered a better alternative which provides the same features.  But any other sketch that provides an accurate (though not necessarily perfect) count-distinct whose results can be combined should be fine as well.

[0] https://www.postgresql.eu/events/pgconfde2022/schedule/session/3704-an-overview-of-extended-statistics-in-postgresql/
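The combining rules the proposal relies on can be sketched with exact sets standing in for the HLL sketches (Python; all table and column names here are made up, and real HLLs support union natively and intersection only approximately via inclusion-exclusion):

```python
# Exact-set stand-in for the per-MCV sketches described above: each MCV
# entry keeps a "sketch" over a shared row-identifier space (h(TID) in
# the proposal).  Plain sets make the combining rules easy to verify.

rows = range(1000)
col_a = {r: r % 10 for r in rows}         # 10 distinct values in A
col_b = {r: (r // 10) % 5 for r in rows}  # 5 distinct values in B

sketch_a3 = {r for r in rows if col_a[r] == 3}  # rows WHERE a = 3
sketch_b2 = {r for r in rows if col_b[r] == 2}  # rows WHERE b = 2

# AND combines sketches by intersection; OR by union, which HLL gets
# via inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
est_and = len(sketch_a3 & sketch_b2)
est_or = len(sketch_a3) + len(sketch_b2) - est_and

print(est_and, est_or)  # 20 280
```

Under column independence the AND estimate would be 1000 * 0.1 * 0.2 = 20 as well for this synthetic data; the point of the sketches is that they keep giving the right answer when the columns are correlated and the product rule does not.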
Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog
On 5/15/22 21:55, Matthias van de Meent wrote:
> Note: I am not (currently) planning on implementing this rough idea,
> just putting it up to share and document the idea, on request of Tomas
> (cc-ed).
>
> The excellent pgconf.de presentation on PostgreSQL's extended
> statistics system by Tomas Vondra [0] talked about how the current
> default statistics assume the MCVs of columns to be fully independent,
> i.e. values of column A do not imply any value of columns B and C, and
> that for accurate data on correlated values the user needs to
> manually create statistics on the combined columns (by either
> STATISTICS or by INDEX).
>
> This is said to be due to limitations in our statistics collector: to
> determine the fraction of the table that contains the value, we store
> the N most common values with the fraction of their occurrence in the
> table.  This value is quite exact, but combining these values proves
> difficult: there is nothing in the stored value that can confidently
> include or exclude parts of the table from a predicate using that MCV,
> so we can only assume that the values of two columns are independent.
>
> After the presentation it came to me that if we were to add an
> estimator for the number of rows with that value to the MCV lists in
> the form of HLL sketches (in addition to or replacing the current
> most_common_elem_freqs fractions), we would be able to make better
> estimates for multi-column filters by combining the HLL row
> cardinality sketches for filters that filter on these MCVs.  This would
> remove the immediate need for manual statistics with a Cartesian
> product of the MCVs of those columns with their occurrence fractions,
> which significantly reduces the need for the creation of manual
> statistics - the need that exists due to planner mis-estimates in
> correlated columns.  Custom statistics will still be required for
> expression statistics, but column correlation estimation _within
> MCVs_ is much improved.
>
> How I imagine this would work is that for each value in the MCV, an
> HLL is maintained that estimates the number of distinct tuples
> containing that value.  This can be h(TID) or h(PK), or anything else
> that would uniquely identify returned tuples.  Because the keyspaces of
> all HLLs that are generated are on the same table, you can apply join
> and intersection operations on the HLLs of the MCVs (for OR and
> AND operations respectively), and provide fairly accurate estimates
> for the number of tuples that would be returned by the filter on that
> table.
>
> The required size of the HLL sketches can be determined by the number
> of tuples scanned during analyze, potentially reducing the size
> required to store these HLL sketches from the usual 1.5kB per sketch
> to something smaller - we'll only ever need to count nTuples distinct
> values, so low values for default_statistics_target would allow for
> smaller values for m in the HLL sketches, whilst still providing
> fairly accurate result estimates.

I think it's an interesting idea.  In principle it allows deducing the multi-column MCV for an arbitrary combination of columns, not determined in advance.  We'd have the MCV with HLL instead of frequencies for columns A, B and C:

  (a1, hll(a1))
  (a2, hll(a2))
  (...)
  (aK, hll(aK))

  (b1, hll(b1))
  (b2, hll(b2))
  (...)
  (bL, hll(bL))

  (c1, hll(c1))
  (c2, hll(c2))
  (...)
  (cM, hll(cM))

and from this we'd be able to build the MCV for any combination of those three columns.  And in some sense it might even be more efficient/accurate, because the MCV on (A,B,C) might have up to K*L*M items.  If there's 100 items in each column, that'd be 1,000,000 combinations, which we can't really store (the target is up to 10k).  And even if we could, it'd be 1M combinations with frequencies (so ~8-16B per combination).

While with the MCV/HLL, we'd have 300 items and HLL.  Assuming a 256-512B HLL would be enough, that's still way smaller than the multi-column MCV.
Even with target=10k it'd still be cheaper to store the separate MCVs with HLL values, if I count right, and there'd be no items omitted from the MCV.

> Kind regards,
>
> Matthias van de Meent
>
> PS: Several later papers correctly point out that HLL can only count
> up to 2^32 due to the use of a hash function that outputs only 32
> bits; which is not enough for large tables.  HLL++ solves this by using
> a hash function that outputs 64 bits, and can thus be considered a
> better alternative which provides the same features.  But, any other
> sketch that provides an accurate (but not necessarily: perfect)
> count-distinct of which results can be combined should be fine as
> well.

I don't think the 32-bit limitation is a problem for us, because we'd only ever build HLL on a sample, not the whole table.  And the samples are limited to 3M rows (with statistics target = 10k), so we're nowhere near the scale requiring 64-bit hashes.

Presumably the statistics target value would determine the necessary HLL parameters (and
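Tomas's size comparison can be checked with quick arithmetic (the per-item and per-sketch byte sizes below are the rough figures from the message, not measured values):

```python
# Back-of-the-envelope check of the storage comparison above.  The byte
# sizes (8-16B per MCV combination, 256-512B per HLL sketch) are the
# rough figures from the discussion, not measured values.

k = l = m = 100                     # MCV entries per column
combos = k * l * m                  # multi-column MCV on (A, B, C)
multi_mcv_bytes = combos * 8        # optimistic: 8 bytes per item

items = k + l + m                   # separate per-column MCV entries
hll_bytes = items * 512             # pessimistic: 512-byte sketches

print(combos)                       # 1000000
print(multi_mcv_bytes)              # 8000000
print(hll_bytes)                    # 153600
print(hll_bytes < multi_mcv_bytes)  # True
```

Even taking the cheapest assumption for the multi-column MCV and the most expensive one for the sketches, the per-column MCV-with-HLL layout comes out roughly 50x smaller here.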
Minor improvements to test log navigability
Hi,

Speaking as someone who regularly trawls through megabytes of build farm output:

1.  It seems a bit useless to have a load of "FATAL: the database system is in recovery mode" spam whenever the server crashes under src/test/regress.  Any reason not to just turn that off, as we do for the TAP tests?

2.  The TAP test logs are strangely named.  Any reason not to call them 001_testname.log, instead of regress_log_001_testname, so they appear next to the corresponding 001_testname_{primary,standby,xxx}.log in directory listings (CI) and dumps (build farm, presumably), and have a traditional .log suffix?

From 2ddc124126ad0bb635023b805dbb6ad1b30fc863 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Mon, 16 May 2022 10:34:04 +1200
Subject: [PATCH 1/2] Use restart_after_crash=off for regression tests.

Continuing to try to connect while the server is not accepting
connections in crash recovery produces a lot of useless extra log
material.  Let's not do that.
---
 src/test/regress/pg_regress.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 982801e029..9a04007651 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2376,6 +2376,7 @@ regression_main(int argc, char *argv[],
 		snprintf(buf, sizeof(buf),
 				 "\"%s%spostgres\" -D \"%s/data\" -F%s "
 				 "-c \"listen_addresses=%s\" -k \"%s\" "
+				 "-c \"restart_after_crash=off\" "
 				 "> \"%s/log/postmaster.log\" 2>&1",
 				 bindir ? bindir : "", bindir ? "/" : "",
-- 
2.36.0

From aa24fa3b8b25bfc237ca007f9986fc2b8c9c37d7 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Mon, 16 May 2022 10:36:30 +1200
Subject: [PATCH 2/2] Rename TAP test log output files.

Instead of "regress_log_001_testname", use "001_testname.log".  This
will cluster them with the corresponding server logs in sorted
directory listings (CI) and dumps (build farm).
---
 .cirrus.yml                            | 1 -
 src/test/perl/PostgreSQL/Test/Utils.pm | 2 +-
 src/test/perl/README                   | 2 +-
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index f23d6cae55..e7232b7a7c 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -33,7 +33,6 @@ on_failure: &on_failure
     paths:
       - "**/*.log"
       - "**/*.diffs"
-      - "**/regress_log_*"
     type: text/plain

 task:
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 1ca2cc5917..8d7e20b0eb 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -201,7 +201,7 @@ INIT
 	# Open the test log file, whose name depends on the test name.
 	$test_logfile = basename($0);
 	$test_logfile =~ s/\.[^.]+$//;
-	$test_logfile = "$log_path/regress_log_$test_logfile";
+	$test_logfile = "$log_path/$test_logfile.log";
 	open my $testlog, '>', $test_logfile
 	  or die "could not open STDOUT to logfile \"$test_logfile\": $!";

diff --git a/src/test/perl/README b/src/test/perl/README
index 4b160cce36..1227944132 100644
--- a/src/test/perl/README
+++ b/src/test/perl/README
@@ -19,7 +19,7 @@ make check-world PROVE_FLAGS='--verbose'
 When a test fails, the terminal output from 'prove' is usually not sufficient
 to diagnose the problem.  Look into the log files that are left under
-tmp_check/log/ to get more info.  Files named 'regress_log_XXX' are log
+tmp_check/log/ to get more info.  Files named '001_testname.log' are log
 output from the perl test scripts themselves, and should be examined first.
 Other files are postmaster logs, and may be helpful as additional data.
-- 
2.36.0
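The renaming rule in patch 2 is small enough to restate; a Python transliteration of the Perl (the helper name and default path are made up for illustration, not part of the patch):

```python
import os
import re

# Transliteration of the Perl renaming rule from patch 2: strip the
# test script's extension and append ".log", instead of prefixing
# "regress_log_".  Helper name and default path are illustrative only.

def log_name(script, log_path="tmp_check/log"):
    base = re.sub(r"\.[^.]+$", "", os.path.basename(script))
    return f"{log_path}/{base}.log"

print(log_name("t/001_testname.pl"))  # tmp_check/log/001_testname.log
```

The payoff is purely lexicographic: "001_testname.log" sorts next to "001_testname_primary.log" and friends, while "regress_log_001_testname" sorts far away from them.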
Re: gitmaster access
> Greetings,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Bruce Momjian writes:
>> > On Wed, May 11, 2022 at 08:59:26PM -0400, Bruce Momjian wrote:
>> >> On Thu, May 12, 2022 at 09:04:38AM +0900, Tatsuo Ishii wrote:
>> >>> The last line should be
>> >>> "ssh://g...@gitmaster.postgresql.org/postgresql.git"?
>>
>> > I assume the URL list at:
>> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary
>> > is for non-committers.
>>
>> Yeah, I agree with that.  If we advertise the gitmaster address here,
>> the primary result will be that we get a lot of complaints from random
>> people complaining that they can't access it.  A secondary result
>> is likely to be an increase in attacks against that server.
>
> I don't think we could change it very easily without some ugly hacking
> of the tool that generates that page too, which is no good...
>
> We might be able to get rid of the ssh:// URL there though...  Will look
> into that.

For postgresql.git, I agree.  But for other repositories, I do not agree.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: Making JIT more granular
> > 2. You calculate the cost to compare with jit_above_cost as:
> > plan->total_cost * plan->est_loops.
> > An alternative way might be to consider the rescan cost like
> > cost_rescan.  This should be closer to the final execution cost.
> > However, since it is hard to set a reasonable jit_above_cost,
> > I am feeling the current way is OK as well.

There are two observations after thinking more about this.

a). Due to the rescan cost reason, plan->total_cost * plan->est_loops might be greater than the whole plan's total_cost.  This may confuse users about why this change can make a plan that was not JITed in the past be JITed now.

explain analyze select * from t1, t2 where t1.a = t2.a;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..154.25 rows=100 width=16) (actual time=0.036..2.618 rows=100 loops=1)
   Join Filter: (t1.a = t2.a)
   Rows Removed by Join Filter: 9900
   ->  Seq Scan on t1  (cost=0.00..2.00 rows=100 width=8) (actual time=0.015..0.031 rows=100 loops=1)
   ->  Materialize  (cost=0.00..2.50 rows=100 width=8) (actual time=0.000..0.010 rows=100 loops=100)
         ->  Seq Scan on t2  (cost=0.00..2.00 rows=100 width=8) (actual time=0.007..0.023 rows=100 loops=1)
 Planning Time: 0.299 ms
 Execution Time: 2.694 ms
(8 rows)

The overall plan's total_cost is 154.25, but the Materialize's JIT cost is 2.5 * 100 = 250.

b). Since the total_cost for a plan counts all the costs of its children, if one child plan is JITed, all its parents would be JITed too.  Is this by design?

        QUERY PLAN
--------------------------
 Sort
   Sort Key: (count(*))
   ->  HashAggregate
         Group Key: a
         ->  Seq Scan on t1

(If the Seq Scan is JITed, both HashAggregate & Sort will be JITed.)

--
Best Regards
Andy Fan
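Observation (a) can be checked numerically from the figures in the EXPLAIN output above:

```python
# Numeric check of observation (a): costing JIT per node as
# total_cost * est_loops lets the Materialize node's JIT cost exceed
# the whole plan's total cost.  Figures come from the EXPLAIN output.

plan_total_cost = 154.25       # Nested Loop total cost

materialize_cost = 2.50        # Materialize node total cost
materialize_loops = 100        # executed once per outer row
jit_cost = materialize_cost * materialize_loops

print(jit_cost)                    # 250.0
print(jit_cost > plan_total_cost)  # True
```

So with jit_above_cost anywhere between 154.25 and 250, the per-node formula would JIT the Materialize even though the whole query used to fall below the threshold.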
RE: First draft of the PG 15 release notes
On Saturday, May 14, 2022 12:07 AM 'Bruce Momjian' wrote:
> On Fri, May 13, 2022 at 01:36:04AM +, osumi.takami...@fujitsu.com wrote:
> > > > This is enabled with the subscriber option "disable_on_error"
> > > > and avoids possible infinite loops during stream application.
> > >
> > Thank you !
> > In this last paragraph, how about replacing "infinite loops"
> > with "infinite error loops" ?  I think it makes the situation somewhat
> > clearer for readers.
>
> Agreed, done.

Thanks !

Best Regards,
Takamichi Osumi
Re: list of TransactionIds
On Sun, May 15, 2022 at 5:05 PM Alvaro Herrera wrote:
>
> On 2022-May-14, Amit Kapila wrote:
>
> > On Sat, May 14, 2022 at 1:57 AM Alvaro Herrera wrote:
> > >
> > > We didn't have any use of TransactionId as members of List, until
> > > RelationSyncEntry->streamed_txns was introduced (464824323e57, pg14).
> > > It's currently implemented as a list of int.  This is not wrong at
> > > present, but it may soon be, and I'm sure it rubs some people the wrong
> > > way.
> > >
> > > But is the rubbing way wrong enough to add support for TransactionId in
> > > pg_list.h, including, say, T_XidList?
> >
> > +1. I don't know if we have a need for this at other places but I feel
> > it is a good idea to make its current use better.
>
> I hesitate to add this the day just before beta.  This is already in
> pg14, so maybe it's not a big deal if pg15 remains the same for the time
> being.  Or we can change it for beta2.  Or we could just punt until
> pg16.  Any preferences?

I prefer to do this for pg16 unless we see some bug due to this.

--
With Regards,
Amit Kapila.
pgbench --partitions=0
Hi,

I noticed that $subject causes an error in HEAD:

$ pgbench -i --partitions=0
pgbench: error: --partitions must be in range 1..2147483647

However, it works in v13 and v14, assuming no partitions.  I think commit 6f164e6d17 may have unintentionally broken it, by introducing this hunk:

@@ -6135,12 +6116,9 @@ main(int argc, char **argv)
 				break;
 			case 11:			/* partitions */
 				initialization_option_set = true;
-				partitions = atoi(optarg);
-				if (partitions < 0)
-				{
-					pg_log_fatal("invalid number of partitions: \"%s\"", optarg);
+				if (!option_parse_int(optarg, "--partitions", 1, INT_MAX,
+									  &partitions))
 					exit(1);
-				}

Attached a patch to fix, with a test added.  cc'ing Michael who authored that commit.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

pgbench-accept-0-partitions.patch
Description: Binary data
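The regression is easy to model: the old code rejected only negative values, while the new range check starts at 1. A Python sketch (loosely modeled on pgbench's option_parse_int; the Python names are made up for illustration):

```python
# Minimal model of the regression: a range-validated integer option
# parser.  With the lower bound set to 1, the previously valid
# --partitions=0 is rejected; restoring 0 as the minimum fixes it.

INT_MAX = 2**31 - 1

def parse_int_option(value, name, min_val, max_val):
    n = int(value)
    if not (min_val <= n <= max_val):
        raise ValueError(f"{name} must be in range {min_val}..{max_val}")
    return n

# Behavior after commit 6f164e6d (lower bound 1): 0 is rejected.
try:
    parse_int_option("0", "--partitions", 1, INT_MAX)
    rejected = False
except ValueError:
    rejected = True

# With the lower bound back at 0, --partitions=0 means "no partitions".
partitions = parse_int_option("0", "--partitions", 0, INT_MAX)

print(rejected, partitions)  # True 0
```

This is a common hazard when replacing an ad-hoc "< 0" check with a generic range-validated parser: the effective lower bound silently moves from 0 to 1.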
Re: PostgreSQL 15 Beta 1 release announcement draft
On Sun, May 15, 2022 at 12:22 AM Jonathan S. Katz wrote:
>
> Please provide feedback no later than 2022-05-19 0:00 AoE[1].

> [`recovery_prefetch`](https://www.postgresql.org/docs/15/runtime-config-wal.html#GUC-RECOVERY-PREFETCH)
> that can help speed up all recovery operations by prefetching data blocks.

Is it okay to say that this feature speeds up *all* recovery operations?  See the discussion between Simon and Tomas [1] related to this.

[1] - https://www.postgresql.org/message-id/3f4d65c8-ad61-9f57-d5ad-6c1ea841e471%40enterprisedb.com

--
With Regards,
Amit Kapila.
RE: Skipping schema changes in publication
On Saturday, May 14, 2022 10:33 PM vignesh C wrote:
> Thanks for the comments, the attached v5 patch has the changes for the same.
> Also I have made the changes for SKIP Table based on the new syntax, the
> changes for the same are available in
> v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch.

Hi, thank you for updating the patch.  I'll share a few minor review comments on v5-0001.

(1) doc/src/sgml/ref/alter_publication.sgml

@@ -73,12 +85,13 @@ ALTER PUBLICATION name RENAME TO
    ADD ALL TABLES IN SCHEMA and
    SET ALL TABLES IN SCHEMA to a publication requires the
-   invoking user to be a superuser. To alter the owner, you must also be a
-   direct or indirect member of the new owning role. The new owner must have
-   CREATE privilege on the database. Also, the new owner
-   of a FOR ALL TABLES or FOR ALL TABLES IN
-   SCHEMA publication must be a superuser. However, a superuser can
-   change the ownership of a publication regardless of these restrictions.
+   invoking user to be a superuser. RESET of publication
+   requires the invoking user to be a superuser. To alter the owner, you must
...

I suggest combining the first part of your change with the existing sentence before your change, to keep our description concise.

FROM:
"The ADD ALL TABLES IN SCHEMA and SET ALL TABLES IN SCHEMA to a publication requires the invoking user to be a superuser. RESET of publication requires the invoking user to be a superuser."

TO:
"The ADD ALL TABLES IN SCHEMA, SET ALL TABLES IN SCHEMA to a publication and RESET of publication requires the invoking user to be a superuser."
(2) typo

+++ b/src/backend/commands/publicationcmds.c
@@ -53,6 +53,13 @@
 #include "utils/syscache.h"
 #include "utils/varlena.h"

+#define PUB_ATION_INSERT_DEFAULT true
+#define PUB_ACTION_UPDATE_DEFAULT true

Kindly change
FROM: "PUB_ATION_INSERT_DEFAULT"
TO: "PUB_ACTION_INSERT_DEFAULT"

(3) src/test/regress/expected/publication.out

+-- Verify that only superuser can reset a publication
+ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
+SET ROLE regress_publication_user2;
+ALTER PUBLICATION testpub_reset RESET; -- fail

We have "-- fail" for one case in this patch.  On the other hand, isn't it better to add "-- ok" (or "-- success") for the other successful statements, for consistency across the entire test descriptions?

Best Regards,
Takamichi Osumi
Re: Backends stunk in wait event IPC/MessageQueueInternal
On Sat, 14 May 2022 at 11:01, Thomas Munro wrote:
> On Sat, May 14, 2022 at 10:25 AM Thomas Munro wrote:
>> Japin, are you able to reproduce the problem reliably?  Did I guess
>> right, that you're on illumos?  Does this help?  I used
>> defined(__sun__) to select the option, but I don't remember if that's
>> the right way to detect that OS family, could you confirm that, or
>> adjust as required?
>
> Better version.  Now you can independently set -DWAIT_USE_{POLL,EPOLL}
> and -DWAIT_USE_{SELF_PIPE,SIGNALFD} for testing, and it picks a
> sensible default.

Thanks for your patch!  Illumos already defines the following macros:

$ gcc -dM -E -
Re: Possible corruption by CreateRestartPoint at promotion
On Mon, May 09, 2022 at 09:24:06AM +0900, Michael Paquier wrote:
> Okay, applied this one on HEAD after going back-and-forth on it for
> the last couple of days.

I have found myself shaping the patch in what looks like its simplest form, by applying the check based on an older checkpoint to all the fields updated in the control file, with the check on DB_IN_ARCHIVE_RECOVERY applying to the addition of DB_SHUTDOWNED_IN_RECOVERY (got initially surprised that this was having side effects on pg_rewind) and the minRecoveryPoint calculations.  It took me some time to make sure that this would be safe, but done now for all the stable branches.

> Now, it would be nice to get a test for this stuff, and we are going
> to need something cheaper than what's been proposed upthread.

This comes down to the point of being able to put a deterministic stop in a restart point while it is processing, meaning that we need to interact with one of the internal routines of CheckPointGuts().  One fancy way to do so would be to forcibly take a LWLock to stall the restart point until it is released.  Using a SQL function for that would be possible, if not artistic.  Perhaps we don't need such a function though, if we could arbitrarily stall the internals of a checkpoint?  Any ideas?

One thing that we could do here is to resurrect the patch that adds support for stop points in the code..
--
Michael

signature.asc
Description: PGP signature
Re: strange slow query - lost lot of time somewhere
On Fri, 6 May 2022 at 21:27, David Rowley wrote:
> I've attached a patch to fix.  I'll look at it in more detail after the
> weekend.

I've now pushed this fix to master and backpatched to 14.

David
Re: strange slow query - lost lot of time somewhere
On Mon, May 16, 2022 at 6:11, David Rowley wrote:
> On Fri, 6 May 2022 at 21:27, David Rowley wrote:
> > I've attached a patch to fix.  I'll look at it in more detail after the
> > weekend.
>
> I've now pushed this fix to master and backpatched to 14.

Thank you

Pavel

> David
Re: list of TransactionIds
On Mon, May 16, 2022 at 07:58:37AM +0530, Amit Kapila wrote:
> I prefer to do this for pg16 unless we see some bug due to this.

Agreed.  This does not seem worth taking any risk with after beta1, and v14 got released this way.

--
Michael

signature.asc
Description: PGP signature
Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set
On Sat, May 14, 2022 at 01:27:28AM -0700, Noah Misch wrote:
> Here, I requested the rationale for the differences you had just described.
> You made a choice to stop testing one list of database names and start
> testing a different list of database names.  Why?

Because the shape of the new names does not change the test coverage (the "regression" prefix or the addition of the double quotes with backslashes for all the database names), while keeping the code a bit simpler.  If you think that the older names are more adapted, I have no objections to using them, FWIW, which is something like the attached patch would achieve.  This uses the same convention as vcregress.pl before 322becb, but not the one of test.sh where "regression" was appended to the database names.

--
Michael

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 8372a85e6e..33a75991d8 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -13,18 +13,16 @@ use Test::More;
 # Generate a database with a name made of a range of ASCII characters.
 sub generate_db
 {
-	my ($node, $from_char, $to_char) = @_;
+	my ($node, $prefix, $from_char, $to_char, $suffix) = @_;
 
-	my $dbname = '';
+	my $dbname = $prefix;
 	for my $i ($from_char .. $to_char)
 	{
 		next if $i == 7 || $i == 10 || $i == 13;	# skip BEL, LF, and CR
 		$dbname = $dbname . sprintf('%c', $i);
 	}
 
-	# Exercise backslashes adjacent to double quotes, a Windows special
-	# case.
-	$dbname = '\\"\\' . $dbname . '"\\';
+	$dbname .= $suffix;
 	$node->command_ok([ 'createdb', $dbname ]);
 }
 
@@ -79,10 +77,12 @@ else
 {
 	# Default is to use pg_regress to set up the old instance.
 
-	# Create databases with names covering most ASCII bytes
-	generate_db($oldnode, 1, 45);
-	generate_db($oldnode, 46, 90);
-	generate_db($oldnode, 91, 127);
+	# Create databases with names covering most ASCII bytes.  The
+	# first name exercises backslashes adjacent to double quotes, a
+	# Windows special case.
+	generate_db($oldnode, "\\\"\\", 1, 45, "\"\\");
+	generate_db($oldnode, '', 46, 90, '');
+	generate_db($oldnode, '', 91, 127, '');
 
 	# Grab any regression options that may be passed down by caller.
 	my $extra_opts = $ENV{EXTRA_REGRESS_OPTS} || "";

signature.asc
Description: PGP signature
Re: Make relfile tombstone files conditional on WAL level
On Thu, May 12, 2022 at 4:27 PM Amul Sul wrote:
>

Hi Amul,

Thanks for the review.  Actually, based on some comments from Robert, we
have planned to make some design changes, so I am planning to work on that
for the July commitfest.  I will try to incorporate all your review
comments in the new version.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: pgbench --partitions=0
On Mon, May 16, 2022 at 11:34:41AM +0900, Amit Langote wrote:
> Attached a patch to fix with a test added.  cc'ing Michael who
> authored that commit.

Indeed, 6f164e6d got that incorrectly.  I don't really want to play with
the master branch at this stage, even if this is trivial, but I'll fix it
after the version is tagged.  Thanks for the report, Amit.
--
Michael

signature.asc
Description: PGP signature
Re: pgbench --partitions=0
On Mon, May 16, 2022 at 2:41 PM Michael Paquier wrote:
> On Mon, May 16, 2022 at 11:34:41AM +0900, Amit Langote wrote:
> > Attached a patch to fix with a test added.  cc'ing Michael who
> > authored that commit.
>
> Indeed, 6f164e6d got that incorrectly.  I don't really want to play
> with the master branch at this stage, even if this is trivial, but
> I'll fix it after the version is tagged.

Sounds good to me.
--
Thanks,
Amit Langote
EDB: http://www.enterprisedb.com
Re: pgbench --partitions=0
On Mon, May 16, 2022 at 02:44:47PM +0900, Amit Langote wrote:
> Sounds good to me.

(I have added an open item, just in case.)
--
Michael

signature.asc
Description: PGP signature
Re: bogus: logical replication rows/cols combinations
On Fri, May 13, 2022 at 11:32 AM houzj.f...@fujitsu.com wrote:
>
> On Thursday, May 12, 2022 2:45 PM Amit Kapila wrote:
> >
> > On Wed, May 11, 2022 at 12:55 PM houzj.f...@fujitsu.com wrote:
> >
> > Few comments:
> > ===
> > 1.
> >  initStringInfo(&cmd);
> > -appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
> > -    "  FROM pg_catalog.pg_publication_tables t\n"
> > +appendStringInfoString(&cmd,
> > +    "SELECT DISTINCT t.schemaname,\n"
> > +    "t.tablename,\n"
> > +    "(CASE WHEN (array_length(pr.prattrs, 1) = t.relnatts)\n"
> > +    "THEN NULL ELSE pr.prattrs END)\n"
> > +    "  FROM (SELECT P.pubname AS pubname,\n"
> > +    "  N.nspname AS schemaname,\n"
> > +    "  C.relname AS tablename,\n"
> > +    "  P.oid AS pubid,\n"
> > +    "  C.oid AS reloid,\n"
> > +    "  C.relnatts\n"
> > +    "  FROM pg_publication P,\n"
> > +    "  LATERAL pg_get_publication_tables(P.pubname) GPT,\n"
> > +    "  pg_class C JOIN pg_namespace N\n"
> > +    "  ON (N.oid = C.relnamespace)\n"
> > +    "  WHERE C.oid = GPT.relid) t\n"
> > +    "  LEFT OUTER JOIN pg_publication_rel pr\n"
> > +    "  ON (t.pubid = pr.prpubid AND\n"
> > +    "pr.prrelid = reloid)\n"
> >
> > Can we modify pg_publication_tables to get the row filter and column
> > list and then use it directly instead of constructing this query?
>
> Agreed. If we can get columnlist and rowfilter from pg_publication_tables,
> it will be more convenient. And I think users that want to fetch the
> columnlist and rowfilter of table can also benefit from it.
>

After this change, we will give an error on combining publications where
one of the publications specifies all columns in the table and the other
doesn't provide any columns.  We should not give an error, as both mean
all columns.

> Attach the new version patch which addressed these comments and update
> the document. 0001 patch is to extent the view and 0002 patch is to add
> restriction for column list.
>

Few comments:
=============
1.
postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename | columnlist | rowfilter
---------+------------+-----------+------------+-----------
 pub1    | public     | t1        |            |
 pub2    | public     | t1        | 1 2        | (c3 < 10)
(2 rows)

I think it is better to display column names for columnlist in the exposed
view, similar to attnames in the pg_stats_ext view.  I think that will make
it easier for users to understand this information.

2.
 { oid => '6119', descr => 'get OIDs of tables in a publication',
   proname => 'pg_get_publication_tables', prorows => '1000', proretset => 't',
-  provolatile => 's', prorettype => 'oid', proargtypes => 'text',
-  proallargtypes => '{text,oid}', proargmodes => '{i,o}',
-  proargnames => '{pubname,relid}', prosrc => 'pg_get_publication_tables' },
+  provolatile => 's', prorettype => 'record', proargtypes => 'text',
+  proallargtypes => '{text,oid,int2vector,pg_node_tree}', proargmodes => '{i,o,o,o}',

I think we should change the "descr" to something like: 'get information of
tables in a publication'.

3.
+
+	/*
+	 * We only throw a warning here so that the subcription can still be
+	 * created and let user aware that something is going to fail later and
+	 * they can fix the publications afterwards.
+	 */
+	if (list_member(tablelist, rv))
+		ereport(WARNING,
+				errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				errmsg("cannot use different column lists for table \"%s.%s\" in different publications",
+					   nspname, relname));

Can we extend this comment to explain the case where, after ALTER
PUBLICATION, if the user dumps and restores back the subscription, there is
a possibility that "CREATE SUBSCRIPTION" won't work if we give ERROR here
instead of WARNING?
--
With Regards,
Amit Kapila.
Re: First draft of the PG 15 release notes
Hi Bruce,

"Improve validation of ASCII and UTF-8 text by processing 16 bytes at a
time (John Naylor)"

The reader might assume here that ASCII is optimized regardless of
encoding, but it is only optimized in the context of UTF-8.  So I would
just mention UTF-8.

Thanks!
--
John Naylor
EDB: http://www.enterprisedb.com
Remove support for Visual Studio 2013
Hi all,

Cutting support for now-unsupported versions of Windows has been in the air
for a couple of months, and while looking at the code, a first cleanup that
looked rather obvious to me is the removal of support for VS 2013, as
something to do for v16~.  Microsoft's website only has documentation for
VS >= 2015 as far as I can see.  Note also that VS can be downloaded down
to 2012 on their official website, and that the buildfarm members only use
VS >= 2017.

The attached patch cleans up the following things specific to VS 2013:
- Locale handling.
- MIN_WINNT assignment.
- Some strtof() business, as of win32_port.h.
- Removal of _set_FMA3_enable() in main.c, related to floating-point
  operations.
- MSVC scripts, but that's less interesting considering the work done with
  meson.

A nice result is that this completely removes all the checks related to the
version number of _MSC_VER from the core code, making the code depend only
on the definition of the flag.

Thanks,
--
Michael

From 7502f8c9a114ac483e553f444301a23e5c65cc2c Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Mon, 16 May 2022 15:20:26 +0900
Subject: [PATCH] Remove support for VS 2013

---
 src/include/port/win32.h          | 10
 src/include/port/win32_port.h     |  7 +-
 src/backend/main/main.c           | 22 -
 src/backend/utils/adt/float.c     |  6 +
 src/backend/utils/adt/pg_locale.c | 40 ---
 src/port/chklocale.c              | 16 +
 doc/src/sgml/install-windows.sgml | 10
 src/tools/msvc/MSBuildProject.pm  | 27 +
 src/tools/msvc/README             |  8 +++
 src/tools/msvc/Solution.pm        | 28 --
 src/tools/msvc/VSObjectFactory.pm | 12 ++
 11 files changed, 19 insertions(+), 167 deletions(-)

diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index c6213c77c3..539f3ec6d1 100644
--- a/src/include/port/win32.h
+++ b/src/include/port/win32.h
@@ -11,12 +11,12 @@
 
 /*
  * Make sure _WIN32_WINNT has the minimum required value.
- * Leave a higher value in place.  When building with at least Visual
- * Studio 2015 the minimum requirement is Windows Vista (0x0600) to
- * get support for GetLocaleInfoEx() with locales.  For everything else
- * the minimum version is Windows XP (0x0501).
+ * Leave a higher value in place.  When building with Visual Studio the
+ * minimum requirement is Windows Vista (0x0600) to get support for
+ * GetLocaleInfoEx() with locales.  For everything else the minimum
+ * version is Windows XP (0x0501).
  */
-#if defined(_MSC_VER) && _MSC_VER >= 1900
+#ifdef _MSC_VER
 #define MIN_WINNT 0x0600
 #else
 #define MIN_WINNT 0x0501

diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index dbbf88f8e8..c0225603f2 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -531,13 +531,8 @@ typedef unsigned short mode_t;
 
 #endif							/* _MSC_VER */
 
-#if (defined(_MSC_VER) && (_MSC_VER < 1900)) || \
-	defined(__MINGW32__) || defined(__MINGW64__)
+#if defined(__MINGW32__) || defined(__MINGW64__)
 /*
- * VS2013 has a strtof() that seems to give correct answers for valid input,
- * even on the rounding edge cases, but which doesn't handle out-of-range
- * input correctly.  Work around that.
- *
 * Mingw claims to have a strtof, and my reading of its source code suggests
 * that it ought to work (and not need this hack), but the regression test
 * results disagree with me; whether this is a version issue or not is not

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index c43a527d3f..bb782fa1ec 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -30,11 +30,6 @@
 #include
 #endif
 
-#if defined(_M_AMD64) && _MSC_VER == 1800
-#include
-#include
-#endif
-
 #include "bootstrap/bootstrap.h"
 #include "common/username.h"
 #include "port/atomics.h"
@@ -290,23 +285,6 @@ startup_hacks(const char *progname)
 	_CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR);
 	_CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
 	_CrtSetReportFile(_CRT_WARN, _CRTDBG_FILE_STDERR);
-
-#if defined(_M_AMD64) && _MSC_VER == 1800
-
-	/*--
-	 * Avoid crashing in certain floating-point operations if we were
-	 * compiled for x64 with MS Visual Studio 2013 and are running on
-	 * Windows prior to 7/2008R2 SP1 on an AVX2-capable CPU.
-	 *
-	 * Ref: https://connect.microsoft.com/VisualStudio/feedback/details/811093/visual-studio-2013-rtm-c-x64-code-generation-bug-for-avx2-instructions
-	 *--
-	 */
-	if (!IsWindows7SP1OrGreater())
-	{
-		_set_FMA3_enable(0);
-	}
-#endif							/* defined(_M_AMD64) && _MSC_VER == 1800 */
-
 }
 #endif							/* WIN32 */

diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index 63bb0f2277..fc8f39a7a9 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -249,13 +249,9 @@ float4in(PG_FUNCTION_ARGS)
	 * precisi