Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Hi Dilip!

> On 17 Jul 2020, at 15:46, Dilip Kumar wrote:
>
> The attached patch allows the vacuum to continue by emitting a WARNING
> for the corrupted tuple instead of immediately erroring out, as discussed
> at [1].
>
> Basically, it provides a new GUC called vacuum_tolerate_damage, to
> control whether to continue the vacuum or to stop on the occurrence of
> a corrupted tuple. So if vacuum_tolerate_damage is set then, in
> all the cases in heap_prepare_freeze_tuple where a corrupted xid is
> detected, it will emit a warning and return that nothing is changed in
> the tuple, and 'tuple_totally_frozen' will also be set to false.
> Since we are returning false, the caller will not try to freeze such a
> tuple, and because tuple_totally_frozen is also set to false the page
> will not be marked all-frozen even if all other tuples on the page are
> frozen.
>
> Alternatively, we can try to freeze the other XIDs in the tuple which are
> not corrupted, but I don't think we will gain anything from this,
> because if one of xmin or xmax is wrong then the next time we run the
> vacuum we are going to get the same WARNING or ERROR.
> Is there any other opinion on this?

FWIW, AFAIK this ERROR was the reason why we had to use older versions of
heap_prepare_freeze_tuple() in our recovery kit [0]. So +1 from me.

But I do not think that just ignoring corruption here is sufficient. Soon
after this freeze problem the user will probably have to deal with an
absent CLOG. I think this GUC is only part of an incomplete solution.

Personally, I'd be happy if this were backported - our recovery kit would
be much smaller. But that does not seem like a valid reason.

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/dsarafan/pg_dirty_hands/blob/master/src/pg_dirty_hands.c#L443
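For readers skimming the thread, here is a rough sketch of the behavior Dilip
describes. This is an illustration only, not the actual patch: the GUC name
vacuum_tolerate_damage comes from the quoted mail, but the surrounding variable
names (xid, relfrozenxid, totally_frozen_p) are assumptions made for this
example.

	/*
	 * Hypothetical sketch (not the actual patch): inside
	 * heap_prepare_freeze_tuple(), a corrupted-xmin check could be relaxed
	 * from ERROR to WARNING when the proposed vacuum_tolerate_damage GUC
	 * is enabled.
	 */
	if (TransactionIdIsNormal(xid) &&
		TransactionIdPrecedes(xid, relfrozenxid))
	{
		if (vacuum_tolerate_damage)
		{
			ereport(WARNING,
					(errcode(ERRCODE_DATA_CORRUPTED),
					 errmsg_internal("found xmin %u from before relfrozenxid %u",
									 xid, relfrozenxid)));

			/*
			 * Report the tuple as unchanged and not totally frozen, so the
			 * caller neither freezes it nor marks the page all-frozen.
			 */
			*totally_frozen_p = false;
			return false;
		}

		ereport(ERROR,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg_internal("found xmin %u from before relfrozenxid %u",
								 xid, relfrozenxid)));
	}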
Re: Default setting for enable_hashagg_disk
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Jeff Davis writes:
> > On Fri, 2020-07-17 at 18:38 -0700, Peter Geoghegan wrote:
> >> There is also the separate question of what to do about the
> >> hashagg_avoid_disk_plan GUC (this is a separate open item that
> >> requires a separate resolution). Tom leans slightly towards removing
> >> it now. Is your position about the same as before?
>
> > Yes, I think we should have that GUC (hashagg_avoid_disk_plan) for at
> > least one release.
>
> You're being optimistic about it being possible to remove a GUC once
> we ship it. That seems to be a hard sell most of the time.

Agreed.

> I'm honestly a bit baffled about the level of fear being expressed
> around this feature. We have *frequently* made changes that would
> change query plans, perhaps not 100.00% for the better, and never
> before have we had this kind of bikeshedding about whether it was
> necessary to be able to turn it off. I think the entire discussion
> is way out ahead of any field evidence that we need such a knob.
> In the absence of evidence, our default position ought to be to
> keep it simple, not to accumulate backwards-compatibility kluges.

+100

> (The only reason I'm in favor of heap_mem[_multiplier] is that it
> seems like it might be possible to use it to get *better* plans
> than before. I do not see it as a backwards-compatibility knob.)

I still don't think a hash_mem-type thing is really the right direction
to go in, even if making a distinction between memory used for sorting
and memory used for hashing is, and I'm of the general opinion that we'd
be thinking about doing something better and more appropriate - except
for the fact that we're talking about adding this in during beta.

In other words, if we'd stop trying to shoehorn something in, which
we're doing because we're in beta, we'd very likely be talking about all
of this in a very different way and probably be contemplating something
like a query_mem that provides for an overall memory limit and which
favors memory for hashing over memory for sorting, etc.

Thanks,

Stephen
Re: Patch for nodeIncrementalSort comment correction.
On Saturday, July 18, 2020, vignesh C wrote: > Hi, > > One of the comments needs correction "sorting all tuples in the the > dataset" should have been "sorting all tuples in the dataset". > The Attached patch has the changes for the same. > > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com > Thanks for fixing this. Looks correct to me. James
Re: OpenSSL 3.0.0 compatibility
On 2020-07-16 13:45, Michael Paquier wrote: On Thu, Jul 16, 2020 at 10:58:58AM +0200, Peter Eisentraut wrote: if test "$with_openssl" = yes ; then dnl Order matters! + AC_DEFINE(OPENSSL_API_COMPAT, [10001], +[Define to the OpenSSL API version in use. This avoids deprecation warnings from newer OpenSSL versions.]) if test "$PORTNAME" != "win32"; then I think that you should additionally mention the version number directly in the description, so as when support for 1.0.1 gets removed it is possible to grep for it, and then adjust the number and the description. Good point. I have committed it with that adjustment. Also, I had the format of the version number wrong, so I changed that, too. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
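For context, the effect of that configure-time define is roughly equivalent to
the following sketch. The exact value shown is an assumption based on the
thread's mention of 1.0.1 (expressed in OpenSSL's hex version format); the
committed patch sets the macro from configure rather than in a source file.

	/*
	 * Declaring the oldest OpenSSL API level we target before including the
	 * OpenSSL headers tells newer OpenSSL releases not to emit deprecation
	 * warnings for calls that are still current at that level.
	 */
	#define OPENSSL_API_COMPAT 0x10001000L

	#include <openssl/ssl.h>
	#include <openssl/err.h>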
Re: Default setting for enable_hashagg_disk
Stephen Frost writes: > In other words, if we'd stop trying to shoehorn something in, which > we're doing because we're in beta, we'd very likely be talking about all > of this in a very different way and probably be contemplating something > like a query_mem that provides for an overall memory limit and which > favors memory for hashing over memory for sorting, etc. Even if we were at the start of the dev cycle rather than its end, I'm not sure I agree. Yes, replacing work_mem with some more-holistic approach would be great. But that's a research project, one that we can't be sure will yield fruit on any particular schedule. (Seeing that we've understood this to be a problem for *decades*, I would tend to bet on a longer not shorter time frame for a solution.) I think that if we are worried about hashagg-spill behavior in the near term, we have to have some fix that's not conditional on solving that very large problem. The only other practical alternative is "do nothing for v13", and I agree with the camp that doesn't like that. regards, tom lane
Re: WIP: BRIN multi-range indexes
On Wed, Jul 15, 2020 at 05:34:05AM +0300, Alexander Korotkov wrote: On Mon, Jul 13, 2020 at 5:59 PM Tomas Vondra wrote: >> > If we really want to print something nicer, I'd say it needs to be a >> > special function in the BRIN opclass. >> >> If that can be done, then +1. We just need to ensure that the function >> knows and can verify the type of index that the value comes from. I >> guess we can pass the index OID so that it can extract the opclass from >> catalogs to verify. > >+1 from me, too. Perhaps we can have it as optional. If a BRIN opclass >doesn't have it, the 'values' can be null. > I'd say that if the opclass does not have it, then we should print the bytea value (or whatever the opclass uses to store the summary) using the type functions. I've read the recent messages in this thread and I'd like to share my thoughts. I think the way brin_page_items() displays values is not really generic. It uses a range-like textual representation of an array of values, while that array doesn't necessarily have range semantics. However, I think it's good that brin_page_items() uses a type output function to display values. So, it's not necessary to introduce a new BRIN opclass function in order to get values displayed in a human-readable way. Instead, we could just make a standard of BRIN value to be human readable. I see at least two possibilities for that. 1. Use standard container data-types to represent BRIN values. For instance we could use an array of ranges instead of bytea for multirange. Not about how convenient/performant it would be. 2. Introduce new data-type to represent values in BRIN index. And for that type we can define output function with user-readable output. We did similar things for GiST. For instance, pg_trgm defines gtrgm type, which has no input and no output. But for BRIN opclass we can define type with just output. I think there's a number of weak points in this approach. Firstly, it assumes the summaries can be represented as arrays of built-in types, which I'm not really sure about. It clearly is not true for the bloom opclasses, for example. But even for minmax oclasses it's going to be tricky because the ranges may be on different data types so presumably we'd need somewhat nested data structure. Moreover, multi-minmax summary contains either points or intervals, which requires additional fields/flags to indicate that. That further complicates the things ... maybe we could decompose that into separate arrays or something, but honestly it seems somewhat premature - there are far more important aspects to discuss, I think (e.g. how the ranges are built/merged in multi-minmax, or whether bloom opclasses are useful at all). BTW, I've applied the patchset to the current master, but I got a lot of duplicate oids. Could you please resolve these conflicts. I think it would be good to use high oid numbers to evade conflicts during development/review, and rely on committer to set final oids (as discussed in [1]). Links 1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs with it, and it's already using quite high OIDs (part 4 uses >= 8000, part 5 uses >= 9000). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Default setting for enable_hashagg_disk
On Sun, Jul 19, 2020 at 4:38 AM Stephen Frost wrote:
> > (The only reason I'm in favor of heap_mem[_multiplier] is that it
> > seems like it might be possible to use it to get *better* plans
> > than before. I do not see it as a backwards-compatibility knob.)
>
> I still don't think a hash_mem-type thing is really the right direction
> to go in, even if making a distinction between memory used for sorting
> and memory used for hashing is, and I'm of the general opinion that we'd
> be thinking about doing something better and more appropriate- except
> for the fact that we're talking about adding this in during beta.
>
> In other words, if we'd stop trying to shoehorn something in, which
> we're doing because we're in beta, we'd very likely be talking about all
> of this in a very different way and probably be contemplating something
> like a query_mem that provides for an overall memory limit and which
> favors memory for hashing over memory for sorting, etc.

At minimum we'd need a patch we would be happy with dropping in should
there be user complaints. And once this conversation ends with that in
hand, I have my doubts whether there will be interest, or even project
desirability, in working toward a "better" solution should this one prove
itself "good enough". And since it seems unlikely that this patch would
foreclose on other promising solutions, combined with there being a
non-trivial behavioral change that we've made, I think we might as well
just deploy whatever short-term solution we come up with now.

As for hashagg_avoid_disk_plan...

The physical processes we are modelling here:

1. Processing D amount of records takes M amount of memory
2. Processing D amount of records in-memory takes T time per record,
   while doing the same on-disk takes V time per record
3. Processing D amount of records via some other plan has an effective
   cost U
4. V >> T (V is strictly greater than T)
5. Having chosen a value for M that ensures T, it is still possible for
   V to end up being used

Thus:

If we get D wrong, the user can still tweak the system by changing
hash_mem_multiplier (this is strictly better than v12, which used
work_mem).

Setting hashagg_avoid_disk_plan = off provides a means to move V
infinitely far away from T (set to on by default; off reverts to v12
behavior).

There is no way for the user to move V's relative position toward T
(n/a in v12).

The only way to move T is to make it infinitely large by setting
enable_hashagg = off (same as in v12).

Is hashagg_disk_cost_multiplier = [0.0, 1,000,000,000.0], i.e.,
(T * hashagg_disk_cost_multiplier == V), doable? It has a nice symmetry
with hash_mem_multiplier and can move V both toward and away from T. To
the extent T is tunable or not in v12, it can remain the same in v13.

David J.
Fix initdb's unsafe not-null-marking rule
Part of the blame for the pg_subscription.subslotname fiasco can be laid at the feet of initdb's default rule for marking columns NOT NULL; that rule is fairly arbitrary and does not guarantee to make safe choices. I propose that we change it so that it *is* safe, ie it will only mark fields NOT NULL if they'd certainly be safe to access as C struct fields. Keeping the end results the same requires a few more manual applications of BKI_FORCE_NOT_NULL than we had before. But I think that that's fine, because it reduces the amount of poorly-documented magic in this area. I note in particular that bki.sgml was entirely failing to tell the full truth. (Note: this would allow reverting the manual BKI_FORCE_NULL label that I just added to pg_subscription.subslotname, but I feel no great desire to do that.) I propose this only for HEAD, not the back branches. regards, tom lane diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml index 6776c4a3c1..5b871721d1 100644 --- a/doc/src/sgml/bki.sgml +++ b/doc/src/sgml/bki.sgml @@ -119,7 +119,8 @@ require all columns that should be non-nullable to be marked so in pg_attribute. The bootstrap code will automatically mark catalog columns as NOT NULL - if they are fixed-width and are not preceded by any nullable column. + if they are fixed-width and are not preceded by any nullable or + variable-width column. Where this rule is inadequate, you can force correct marking by using BKI_FORCE_NOT_NULL and BKI_FORCE_NULL annotations as needed. But note diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c index 5480a024e0..45b7efbe46 100644 --- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -770,25 +770,18 @@ DefineAttr(char *name, char *type, int attnum, int nullness) /* * Mark as "not null" if type is fixed-width and prior columns are - * too. This corresponds to case where column can be accessed - * directly via C struct declaration. - * - * oidvector and int2vector are also treated as not-nullable, even - * though they are no longer fixed-width. + * likewise fixed-width and not-null. This corresponds to case where + * column can be accessed directly via C struct declaration. */ -#define MARKNOTNULL(att) \ - ((att)->attlen > 0 || \ - (att)->atttypid == OIDVECTOROID || \ - (att)->atttypid == INT2VECTOROID) - - if (MARKNOTNULL(attrtypes[attnum])) + if (attrtypes[attnum]->attlen > 0) { int i; /* check earlier attributes */ for (i = 0; i < attnum; i++) { -if (!attrtypes[i]->attnotnull) +if (attrtypes[i]->attlen <= 0 || + !attrtypes[i]->attnotnull) break; } if (i == attnum) diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index b07537fbba..dc5f442397 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -713,8 +713,8 @@ sub gen_pg_attribute push @tables_needing_macros, $table_name; # Generate entries for user attributes. 
- my $attnum = 0; - my $priornotnull = 1; + my $attnum = 0; + my $priorfixedwidth = 1; foreach my $attr (@{ $table->{columns} }) { $attnum++; @@ -722,8 +722,12 @@ sub gen_pg_attribute $row{attnum} = $attnum; $row{attrelid} = $table->{relation_oid}; - morph_row_for_pgattr(\%row, $schema, $attr, $priornotnull); - $priornotnull &= ($row{attnotnull} eq 't'); + morph_row_for_pgattr(\%row, $schema, $attr, $priorfixedwidth); + + # Update $priorfixedwidth --- must match morph_row_for_pgattr + $priorfixedwidth &= + ($row{attnotnull} eq 't' + && ($row{attlen} eq 'NAMEDATALEN' || $row{attlen} > 0)); # If it's bootstrapped, put an entry in postgres.bki. print_bki_insert(\%row, $schema) if $table->{bootstrap}; @@ -765,13 +769,13 @@ sub gen_pg_attribute # Given $pgattr_schema (the pg_attribute schema for a catalog sufficient for # AddDefaultValues), $attr (the description of a catalog row), and -# $priornotnull (whether all prior attributes in this catalog are not null), +# $priorfixedwidth (all prior columns are fixed-width and not null), # modify the $row hashref for print_bki_insert. This includes setting data # from the corresponding pg_type element and filling in any default values. # Any value not handled here must be supplied by caller. sub morph_row_for_pgattr { - my ($row, $pgattr_schema, $attr, $priornotnull) = @_; + my ($row, $pgattr_schema, $attr, $priorfixedwidth) = @_; my $attname = $attr->{name}; my $atttype = $attr->{type}; @@ -801,19 +805,18 @@ sub morph_row_for_pgattr { $row->{attnotnull} = 'f'; } - elsif ($priornotnull) + elsif ($priorfixedwidth) { # attnotnull will automatically be set if the type is - # fixed-width and prior columns are all NOT NULL --- - # compare DefineAttr in bootstrap.c. oidvector and - # int2vector are also treated as not-nullable. + # fixed-width and prior columns
Re: [HACKERS] [PATCH] Generic type subscripting
On Sat, May 16, 2020 at 02:28:02PM +0200, Dmitry Dolgov wrote: > > On Mon, Mar 23, 2020 at 07:30:25AM +0100, Pavel Stehule wrote: > > ne 22. 3. 2020 v 20:41 odesílatel Tom Lane napsal: > > > Pavel Stehule writes: > > > > ne 22. 3. 2020 v 18:47 odesílatel Tom Lane napsal: > > > >> cfbot reports this as failing because of missing include files. > > > >> Somebody please post a complete patch set? > > > > > > > here it is > > One more rebase to prepare for 2020-07. I found some minor comment typos. I don't think it's enough to claim a beer: + * container. If you have read until this point, and will submit a meaningful + * review of this patch series, I'll owe you a beer at the next PGConfEU. I'm not sure I understand what this is saying: +* It's necessary only for field selection, since for +* subscripting it's custom code who should define types. should maybe say: "its custom code should define types." (no apostrophe is "possessive") -- Justin >From 6c3fad421619526765127571570e0bcbc7fe1679 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 19 Jul 2020 13:10:21 -0500 Subject: [PATCH 7/9] fix! Polymorphic subscripting --- src/backend/utils/adt/jsonfuncs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c index 77687e7f71..04d3078f68 100644 --- a/src/backend/utils/adt/jsonfuncs.c +++ b/src/backend/utils/adt/jsonfuncs.c @@ -1750,7 +1750,7 @@ push_null_elements(JsonbParseState **ps, int num) pushJsonbValue(ps, WJB_ELEM, &null); } -/* Perfrom one subscript assignment step */ +/* Perform one subscript assignment step */ static void jsonb_subscript_step_assignment(SubscriptingRefState *sbstate, Datum value, Oid typid, int num, bool isupper) @@ -1831,7 +1831,7 @@ jsonb_subscript_step_assignment(SubscriptingRefState *sbstate, Datum value, pushJsonbValue(&astate->ps, WJB_KEY, &key); } - /* If the value does exits, process and validate its type. */ + /* If the value does exist, process and validate its type. */ if (subscript[1].exists) { if (jbv.type == jbvArray) -- 2.17.0 >From df3d19b401943758577bf939b07b9f338fc81ff3 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sun, 19 Jul 2020 13:24:42 -0500 Subject: [PATCH 8/9] fix! Subscripting documentation --- doc/src/sgml/json.sgml| 6 +++--- doc/src/sgml/ref/create_type.sgml | 2 +- doc/src/sgml/xsubscripting.sgml | 8 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml index 3bffe8049b..5c538dca05 100644 --- a/doc/src/sgml/json.sgml +++ b/doc/src/sgml/json.sgml @@ -606,8 +606,8 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu jsonb Subscripting jsonb data type supports array-style subscripting expressions - to extract or update particular element. It's possible to use multiple - subscripting expressions to extract nested values. In this case a chain of + to extract or update particular elements. It's possible to use multiple + subscripting expressions to extract nested values. In this case, a chain of subscripting expressions follows the same rules as the path argument in jsonb_set function, e.g. 
in case of arrays it is a 0-based operation or that negative integers @@ -634,7 +634,7 @@ SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"'; There is no special indexing support for such kind of expressions, but you - always can create a functional index that includes it + can always create a functional index that includes it CREATE INDEX idx ON table_name ((jsonb_field['key'])); diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml index ec67761c66..a34df4d247 100644 --- a/doc/src/sgml/ref/create_type.sgml +++ b/doc/src/sgml/ref/create_type.sgml @@ -470,7 +470,7 @@ CREATE TYPE name and jsonb (jsonb_subscripting_handler) types in src/backend/utils/adt/arrayfuncs.c and - src/backend/utils/adt/jsonfuncs.c corresponding. + src/backend/utils/adt/jsonfuncs.c, respectively. diff --git a/doc/src/sgml/xsubscripting.sgml b/doc/src/sgml/xsubscripting.sgml index d701631223..7224e81fa2 100644 --- a/doc/src/sgml/xsubscripting.sgml +++ b/doc/src/sgml/xsubscripting.sgml @@ -7,8 +7,8 @@ custom subscripting - When you define a new base type, you can also specify a custom procedures to - handle subscripting expressions. They must contain logic for verification and + When you define a new base type, you can also specify a custom procedure to + handle subscripting expressions. It must contain logic for verification and evaluation of this expression, i.e. fetching or updating some data in this data type. For instance: @@ -63,12 +63,12 @@ custom_subscript_assign(Datum cont
Re: pg_subscription.subslotname is wrongly marked NOT NULL
I wrote: > (2) In pre-v13 branches, hack the JIT tuple deconstruction code > to be specifically aware that it can't trust attnotnull for > pg_subscription.subslotname. Yeah, it's ugly, but at least it's > not ugly going forwards. Concretely, as attached for v12. regards, tom lane diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c index 835aea83e9..4b2144e1a7 100644 --- a/src/backend/jit/llvm/llvmjit_deform.c +++ b/src/backend/jit/llvm/llvmjit_deform.c @@ -22,11 +22,24 @@ #include "access/htup_details.h" #include "access/tupdesc_details.h" +#include "catalog/pg_subscription.h" #include "executor/tuptable.h" #include "jit/llvmjit.h" #include "jit/llvmjit_emit.h" +/* + * Through an embarrassing oversight, pre-v13 installations may have + * pg_subscription.subslotname marked as attnotnull, which it should not be. + * To avoid possible crashes, use this macro instead of testing attnotnull + * directly. + */ +#define ATTNOTNULL(att) \ + ((att)->attnotnull && \ + ((att)->attrelid != SubscriptionRelationId || \ + (att)->attnum != Anum_pg_subscription_subslotname)) + + /* * Create a function that deforms a tuple of type desc up to natts columns. */ @@ -121,7 +134,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, * combination of attisdropped && attnotnull combination shouldn't * exist. */ - if (att->attnotnull && + if (ATTNOTNULL(att) && !att->atthasmissing && !att->attisdropped) guaranteed_column_number = attnum; @@ -419,7 +432,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, * into account, because if they're present the heaptuple's natts * would have indicated that a slot_getmissingattrs() is needed. */ - if (!att->attnotnull) + if (!ATTNOTNULL(att)) { LLVMBasicBlockRef b_ifnotnull; LLVMBasicBlockRef b_ifnull; @@ -586,6 +599,9 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, * to generate better code. Without that LLVM can't figure out that * the offset might be constant due to the jumps for previously * decoded columns. + * + * Note: these two tests on attnotnull don't need the ATTNOTNULL hack, + * because they are harmless on pg_subscription anyway. */ if (attguaranteedalign) { diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out index e7add9d2b8..3ba1e5dcdd 100644 --- a/src/test/regress/expected/subscription.out +++ b/src/test/regress/expected/subscription.out @@ -147,6 +147,13 @@ DROP SUBSCRIPTION regress_testsub; ERROR: DROP SUBSCRIPTION cannot run inside a transaction block COMMIT; ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE); +\dRs+ + List of subscriptions + Name | Owner| Enabled | Publication | Synchronous commit | Conninfo +-++-+-++-- + regress_testsub | regress_subscription_user2 | f | {testpub2,testpub3} | local | dbname=regress_doesnotexist2 +(1 row) + -- now it works BEGIN; DROP SUBSCRIPTION regress_testsub; diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql index 9e234ab8b3..1bc58887f7 100644 --- a/src/test/regress/sql/subscription.sql +++ b/src/test/regress/sql/subscription.sql @@ -109,6 +109,8 @@ COMMIT; ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE); +\dRs+ + -- now it works BEGIN; DROP SUBSCRIPTION regress_testsub;
Re: Default setting for enable_hashagg_disk
On Sat, 2020-07-18 at 21:15 -0400, Tom Lane wrote: > Jeff Davis writes: > > What is your opinion about pessimizing the HashAgg disk costs (not > > affecting HashAgg plans expected to stay in memory)? Tomas Vondra > > presented some evidence that Sort had some better IO patterns in > > some > > cases that weren't easily reflected in a principled way in the cost > > model. > > Hm, was that in some other thread? I didn't find any such info > in a quick look through this one. https://www.postgresql.org/message-id/2df2e0728d48f498b9d6954b5f9080a34535c385.camel%40j-davis.com Regards, Jeff Davis
Re: Default setting for enable_hashagg_disk
On Sat, Jul 18, 2020 at 3:04 PM Jeff Davis wrote: > > I think the entire discussion > > is way out ahead of any field evidence that we need such a knob. > > In the absence of evidence, our default position ought to be to > > keep it simple, not to accumulate backwards-compatibility kluges. > > Fair enough. I think that was where Stephen and Amit were coming from, > as well. > That would lessen the number of changed plans, but we could easily > remove the pessimization without controversy later if it turned out to > be unnecessary, or if we further optimize HashAgg IO. Does this mean that we've reached a final conclusion on hashagg_avoid_disk_plan for Postgres 13, which is that it should be removed? If so, I'd appreciate it if you took care of it. I don't think that we need to delay its removal until the details of the HashAgg cost pessimization are finalized. (I expect that that will be totally uncontroversial.) Thanks -- Peter Geoghegan
Re: pg_subscription.subslotname is wrongly marked NOT NULL
I wrote: >> It's also a bit annoying that we have no mechanized checks that >> would catch this inconsistency. If JIT is going to be absolutely >> dependent on NOT NULL markings being accurate, we can't really >> have such a laissez-faire attitude to C code getting it wrong. > It seems like at least in assert-enabled builds, we'd better have > a cross-check for that. I'm not sure where's the best place. I concluded that we should put this into CatalogTupleInsert and CatalogTupleUpdate. The bootstrap data path already has a check (see InsertOneNull()), and so does the executor, so we only need to worry about tuples that're built manually by catalog manipulation code. I think all of that goes through these functions. Hence, as attached. ... and apparently, I should have done this task first, because damn if it didn't immediately expose another bug of the same ilk. pg_subscription_rel.srsublsn also needs to be marked nullable. regards, tom lane diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index d63fcf58cf..fe277f3ad3 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -167,6 +167,43 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple) ExecDropSingleTupleTableSlot(slot); } +/* + * Subroutine to verify that catalog constraints are honored. + * + * Tuples inserted via CatalogTupleInsert/CatalogTupleUpdate are generally + * "hand made", so that it's possible that they fail to satisfy constraints + * that would be checked if they were being inserted by the executor. That's + * a coding error, so we only bother to check for it in assert-enabled builds. + */ +#ifdef USE_ASSERT_CHECKING + +static void +CatalogTupleCheckConstraints(Relation heapRel, HeapTuple tup) +{ + /* + * Currently, the only constraints implemented for system catalogs are + * attnotnull constraints. + */ + if (HeapTupleHasNulls(tup)) + { + TupleDesc tupdesc = RelationGetDescr(heapRel); + bits8 *bp = tup->t_data->t_bits; + + for (int attnum = 0; attnum < tupdesc->natts; attnum++) + { + Form_pg_attribute thisatt = TupleDescAttr(tupdesc, attnum); + + Assert(!(thisatt->attnotnull && att_isnull(attnum, bp))); + } + } +} + +#else /* !USE_ASSERT_CHECKING */ + +#define CatalogTupleCheckConstraints(heapRel, tup) ((void) 0) + +#endif /* USE_ASSERT_CHECKING */ + /* * CatalogTupleInsert - do heap and indexing work for a new catalog tuple * @@ -184,6 +221,8 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup) { CatalogIndexState indstate; + CatalogTupleCheckConstraints(heapRel, tup); + indstate = CatalogOpenIndexes(heapRel); simple_heap_insert(heapRel, tup); @@ -204,6 +243,8 @@ void CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate) { + CatalogTupleCheckConstraints(heapRel, tup); + simple_heap_insert(heapRel, tup); CatalogIndexInsert(indstate, tup); @@ -225,6 +266,8 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup) { CatalogIndexState indstate; + CatalogTupleCheckConstraints(heapRel, tup); + indstate = CatalogOpenIndexes(heapRel); simple_heap_update(heapRel, otid, tup); @@ -245,6 +288,8 @@ void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, CatalogIndexState indstate) { + CatalogTupleCheckConstraints(heapRel, tup); + simple_heap_update(heapRel, otid, tup); CatalogIndexInsert(indstate, tup);
Re: pg_subscription.subslotname is wrongly marked NOT NULL
I wrote: > pg_subscription_rel.srsublsn also needs to be marked nullable. Not only is it wrongly marked attnotnull, but two of the three places that read it are doing so unsafely (ie, as though it *were* non-nullable). So I think we'd better fix it as attached. regards, tom lane diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 80c183b235..13c871d2a8 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -7631,7 +7631,9 @@ SCRAM-SHA-256$:&l srsublsn pg_lsn - End LSN for s and r states. + Remote LSN of the state change used for synchronization coordination + when in s or r states, + otherwise null diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c index e6afb3203e..90bf5cf0c6 100644 --- a/src/backend/catalog/pg_subscription.c +++ b/src/backend/catalog/pg_subscription.c @@ -452,13 +452,20 @@ GetSubscriptionRelations(Oid subid) { Form_pg_subscription_rel subrel; SubscriptionRelState *relstate; + Datum d; + bool isnull; subrel = (Form_pg_subscription_rel) GETSTRUCT(tup); relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState)); relstate->relid = subrel->srrelid; relstate->state = subrel->srsubstate; - relstate->lsn = subrel->srsublsn; + d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup, + Anum_pg_subscription_rel_srsublsn, &isnull); + if (isnull) + relstate->lsn = InvalidXLogRecPtr; + else + relstate->lsn = DatumGetLSN(d); res = lappend(res, relstate); } @@ -504,13 +511,20 @@ GetSubscriptionNotReadyRelations(Oid subid) { Form_pg_subscription_rel subrel; SubscriptionRelState *relstate; + Datum d; + bool isnull; subrel = (Form_pg_subscription_rel) GETSTRUCT(tup); relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState)); relstate->relid = subrel->srrelid; relstate->state = subrel->srsubstate; - relstate->lsn = subrel->srsublsn; + d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup, + Anum_pg_subscription_rel_srsublsn, &isnull); + if (isnull) + relstate->lsn = InvalidXLogRecPtr; + else + relstate->lsn = DatumGetLSN(d); res = lappend(res, relstate); } diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h index c44df04c72..f384f4e7fa 100644 --- a/src/include/catalog/pg_subscription_rel.h +++ b/src/include/catalog/pg_subscription_rel.h @@ -33,8 +33,18 @@ CATALOG(pg_subscription_rel,6102,SubscriptionRelRelationId) Oid srsubid; /* Oid of subscription */ Oid srrelid; /* Oid of relation */ char srsubstate; /* state of the relation in subscription */ - XLogRecPtr srsublsn; /* remote lsn of the state change used for - * synchronization coordination */ + + /* + * Although srsublsn is a fixed-width type, it is allowed to be NULL, so + * we prevent direct C code access to it just as for a varlena field. + */ +#ifdef CATALOG_VARLEN /* variable-length fields start here */ + + XLogRecPtr srsublsn BKI_FORCE_NULL; /* remote LSN of the state change + * used for synchronization + * coordination, or NULL if not + * valid */ +#endif } FormData_pg_subscription_rel; typedef FormData_pg_subscription_rel *Form_pg_subscription_rel;
Re: Default setting for enable_hashagg_disk
On Sun, Jul 19, 2020 at 02:17:15PM -0700, Jeff Davis wrote:
> On Sat, 2020-07-18 at 21:15 -0400, Tom Lane wrote:
> > Jeff Davis writes:
> > > What is your opinion about pessimizing the HashAgg disk costs (not
> > > affecting HashAgg plans expected to stay in memory)? Tomas Vondra
> > > presented some evidence that Sort had some better IO patterns in some
> > > cases that weren't easily reflected in a principled way in the cost
> > > model.
> >
> > Hm, was that in some other thread? I didn't find any such info
> > in a quick look through this one.
>
> https://www.postgresql.org/message-id/2df2e0728d48f498b9d6954b5f9080a34535c385.camel%40j-davis.com

FWIW the two messages to look at are these two:

1) report with initial data
https://www.postgresql.org/message-id/20200519151202.u2p2gpiawoaznsv2%40development

2) updated stats, with the block pre-allocation and tlist projection
https://www.postgresql.org/message-id/20200521001255.kfaihp3afv6vy6uq%40development

But I'm not convinced we actually need to tweak the costing - we've ended
up fixing two things, and I think a lot of the differences in I/O patterns
disappeared thanks to this.

For sort, the stats of request sizes look like this:

 type |  bytes  | count |  pct
------+---------+-------+-------
 RA   |  131072 | 26034 | 59.92
 RA   |   16384 |  6160 | 14.18
 RA   |    8192 |  3636 |  8.37
 RA   |   32768 |  3406 |  7.84
 RA   |   65536 |  3270 |  7.53
 RA   |   24576 |   361 |  0.83
 ...
 W    | 1310720 |  8070 | 34.26
 W    |  262144 |  1213 |  5.15
 W    |  524288 |  1056 |  4.48
 W    | 1056768 |   689 |  2.93
 W    |  786432 |   292 |  1.24
 W    |  802816 |   199 |  0.84
 ...

And for the hashagg, it looks like this:

 type |  bytes  | count  |  pct
------+---------+--------+-------
 RA   |  131072 | 200816 | 70.93
 RA   |    8192 |  23640 |  8.35
 RA   |   16384 |  19324 |  6.83
 RA   |   32768 |  19279 |  6.81
 RA   |   65536 |  19273 |  6.81
 ...
 W    | 1310720 |  18000 | 65.91
 W    |  524288 |   2074 |  7.59
 W    | 1048576 |    660 |  2.42
 W    |    8192 |    409 |  1.50
 W    |  786432 |    354 |  1.30
 ...

so it's actually a tad better than sort, because larger proportion of both
reads and writes is in larger chunks (reads 128kB, writes 1280kB). I think
the device had default read-ahead setting, which I assume explains the
128kB.

For the statistics of deltas between requests - for sort

 type | block_delta | count |  pct
------+-------------+-------+-------
 RA   |         256 | 13432 | 30.91
 RA   |          16 |  3291 |  7.57
 RA   |          32 |  3272 |  7.53
 RA   |          64 |  3266 |  7.52
 RA   |         128 |  2877 |  6.62
 RA   |        1808 |  1278 |  2.94
 RA   |       -2320 |   483 |  1.11
 RA   |       28928 |   386 |  0.89
 ...
 W    |        2560 |  7856 | 33.35
 W    |        2064 |  4921 | 20.89
 W    |        2080 |   586 |  2.49
 W    |       30960 |   300 |  1.27
 W    |        2160 |   253 |  1.07
 W    |        1024 |   248 |  1.05
 ...

and for hashagg:

 type | block_delta | count  |  pct
------+-------------+--------+-------
 RA   |         256 | 180955 | 63.91
 RA   |          32 |  19274 |  6.81
 RA   |          64 |  19273 |  6.81
 RA   |         128 |  19264 |  6.80
 RA   |          16 |  19203 |  6.78
 RA   |       30480 |   9835 |  3.47

At first this might look worse than sort, but 256 sectors matches the
128kB from the request size stats, and it's good match (64% vs. 70%).

There's a minor problem here, though - these stats were collected before
we fixed the tlist issue, so hashagg was spilling about 10x the amount of
data compared to sort+groupagg. So maybe that's the first thing we should
do, before contemplating changes to the costing - collecting fresh data.
I can do that, if needed.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Libpq support to connect to standby server as priority
> Thanks, but now the tests no longer work as the nodes in the test suite are
> renamed. While simple enough for a committer to fix, it's always good to see
> the tests pass in the CFBot to make sure the variable name error isn't hiding
> an actual test error.

Rebased patch attached, all tests currently working as of Jul 19
(a766d6ca22ac7c233e69c896ae0c5f19de916db4).

v17-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch
Re: Patch for nodeIncrementalSort comment correction.
On Sun, Jul 19, 2020 at 7:25 PM James Coleman wrote: > > On Saturday, July 18, 2020, vignesh C wrote: >> >> Hi, >> >> One of the comments needs correction "sorting all tuples in the the >> dataset" should have been "sorting all tuples in the dataset". >> The Attached patch has the changes for the same. >> > > > Thanks for fixing this. Looks correct to me. > Yeah, looks like a typo will push in some time. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: Is it useful to record whether plans are generic or custom?
On 2020/07/17 16:25, Fujii Masao wrote: On 2020/07/16 11:50, torikoshia wrote: On 2020-07-15 11:44, Fujii Masao wrote: On 2020/07/14 21:24, torikoshia wrote: On 2020-07-10 10:49, torikoshia wrote: On 2020-07-08 16:41, Fujii Masao wrote: On 2020/07/08 10:14, torikoshia wrote: On 2020-07-06 22:16, Fujii Masao wrote: On 2020/06/11 14:59, torikoshia wrote: On 2020-06-10 18:00, Kyotaro Horiguchi wrote: + TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_plan", This could be a problem if we showed the last plan in this view. I think "last_plan_type" would be better. + if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_CUSTOM) + values[7] = CStringGetTextDatum("custom"); + else if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_GENERIC) + values[7] = CStringGetTextDatum("generic"); + else + nulls[7] = true; Using swith-case prevents future additional type (if any) from being unhandled. I think we are recommending that as a convension. Thanks for your reviewing! I've attached a patch that reflects your comments. Thanks for the patch! Here are the comments. Thanks for your review! + Number of times generic plan was choosen + Number of times custom plan was choosen Typo: "choosen" should be "chosen"? Thanks, fixed them. + + last_plan_type text + + + Tells the last plan type was generic or custom. If the prepared + statement has not executed yet, this field is null + Could you tell me how this information is expected to be used? I think that generic_plans and custom_plans are useful when investigating the cause of performance drop by cached plan mode. But I failed to get how much useful last_plan_type is. This may be an exceptional case, but I once had a case needed to ensure whether generic or custom plan was chosen for specific queries in a development environment. In your case, probably you had to ensure that the last multiple (or every) executions chose generic or custom plan? If yes, I'm afraid that displaying only the last plan mode is not enough for your case. No? So it seems better to check generic_plans or custom_plans columns in the view rather than last_plan_type even in your case. Thought? Yeah, I now feel last_plan is not so necessary and only the numbers of generic/custom plan is enough. If there are no objections, I'm going to remove this column and related codes. As mentioned, I removed last_plan column. Thanks for updating the patch! It basically looks good to me. I have one comment; you added the regression tests for generic and custom plans into prepare.sql. But the similar tests already exist in plancache.sql. So isn't it better to add the tests for generic_plans and custom_plans columns, into plancache.sql? Thanks for your comments! Agreed. I removed tests on prepare.sql and added them to plancache.sql. Thoughts? Thanks for updating the patch! I also applied the following minor changes to the patch. - Number of times generic plan was chosen + Number of times generic plan was chosen - Number of times custom plan was chosen + Number of times custom plan was chosen I got rid of one space character before those descriptions because they should start from the position of 7th character. 
-- but we can force a custom plan set plan_cache_mode to force_custom_plan; explain (costs off) execute test_mode_pp(2); +select name, generic_plans, custom_plans from pg_prepared_statements + where name = 'test_mode_pp'; In the regression test, I added the execution of pg_prepared_statements after the last execution of test query, to confirm that custom plan is used when force_custom_plan is set, by checking from pg_prepared_statements. I changed the status of this patch to "Ready for Committer" in CF. Barring any objection, I will commit this patch. Committed. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: pg_ctl behavior on Windows
On Sat, Jul 18, 2020 at 6:31 PM Chapman Flack wrote: > > On 07/18/20 05:46, Amit Kapila wrote: > > I don't think so. I think you can use 'pg_ctl start' to achieve that. > > I think the JOBS stuff is primarily required when we use 'register' > > operation (aka runs server via service). For example, if you see one > > of the Job options "JOB_OBJECT_LIMIT_DIE_ON_UNHANDLED_EXCEPTION", it > > suppresses dialog box for a certain type of errors and causes a > > termination of the process with the exception code as the exit status > > Thanks very much, that helps a lot. I still wonder, though, about some > of the other limits also placed on that job object, such as > JOB_OBJECT_SECURITY_NO_ADMIN | JOB_OBJECT_SECURITY_ONLY_TOKEN > > Those seem closely related to the purpose of CreateRestrictedProcess. > Does the NOTE! mean that, when not running as a service, the job object > disappears as soon as pg_ctl exits, > >From the comments in that part of code, it seems like Job object will be closed as soon as pg_ctl exits. However, as per my understanding of specs [1], it will be closed once the process with which it is associated is gone which in this case should be the new process created with "CreateProcessAsUser". This has been added by the below commit, so Magnus might remember something about this. commit a25cd81007e827684343a53a80e8bc90f585ca8e Author: Tom Lane Date: Fri Feb 10 22:00:59 2006 + Enable pg_ctl to give up admin privileges when starting the server under Windows (if newer than NT4, else works same as before). Magnus [1] - https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createjobobjecta -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
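To make the lifetime question above easier to picture, here is a rough sketch
(not the actual pg_ctl.c code; the helper name and parameters are invented for
this illustration) of how a job object with the limits mentioned is typically
created and attached to a child process:

	#include <windows.h>

	/*
	 * Hypothetical sketch: create an unnamed job object, apply the limits
	 * discussed above, and attach a child process to it.
	 */
	static void
	attach_restricted_job(HANDLE child_process, HANDLE restricted_token)
	{
		HANDLE		job = CreateJobObject(NULL, NULL);
		JOBOBJECT_BASIC_LIMIT_INFORMATION basic;
		JOBOBJECT_SECURITY_LIMIT_INFORMATION security;

		if (job == NULL)
			return;

		ZeroMemory(&basic, sizeof(basic));
		ZeroMemory(&security, sizeof(security));

		/* terminate on unhandled exceptions instead of showing a dialog box */
		basic.LimitFlags = JOB_OBJECT_LIMIT_DIE_ON_UNHANDLED_EXCEPTION;
		SetInformationJobObject(job, JobObjectBasicLimitInformation,
								&basic, sizeof(basic));

		/* no admin-level tokens in the job; force use of the restricted token */
		security.SecurityLimitFlags =
			JOB_OBJECT_SECURITY_NO_ADMIN | JOB_OBJECT_SECURITY_ONLY_TOKEN;
		security.JobToken = restricted_token;
		SetInformationJobObject(job, JobObjectSecurityLimitInformation,
								&security, sizeof(security));

		AssignProcessToJobObject(job, child_process);

		CloseHandle(job);
	}

Per the Win32 documentation, closing the creator's handle does not by itself
destroy the job: the job object goes away only when its last handle has been
closed and no process assigned to it remains.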
Re: max_slot_wal_keep_size and wal_keep_segments
On 2020/07/17 20:24, David Steele wrote: On 7/17/20 5:11 AM, Fujii Masao wrote: On 2020/07/14 20:30, David Steele wrote: On 7/14/20 12:00 AM, Fujii Masao wrote: The patch was no longer applied cleanly because of recent commit. So I updated the patch. Attached. Barring any objection, I will commit this patch. This doesn't look right: + the most recent megabytes + WAL files plus one WAL file are How about: + megabytes of + WAL files plus one WAL file are Thanks for the comment! Isn't it better to keep "most recent" part? If so, what about either of the followings? 1. megabytes of WAL files plus one WAL file that were most recently generated are kept all time. 2. megabytes + bytes of WAL files that were most recently generated are kept all time. "most recent" seemed implied to me, but I see your point. How about: + the most recent megabytes of + WAL files plus one additional WAL file are I adopted this and pushed the patch. Thanks! Also we need to update the release note for v13. What about adding the following? Rename configuration parameter wal_keep_segments to wal_keep_size. This allows how much WAL files to retain for the standby server, by bytes instead of the number of files. If you previously used wal_keep_segments, the following formula will give you an approximately equivalent setting: wal_keep_size = wal_keep_segments * wal_segment_size (typically 16MB) Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
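As a worked example of that conversion formula (the numbers here are purely
illustrative, not taken from the thread): a server that previously ran with
wal_keep_segments = 64 and the default 16MB segment size would use
wal_keep_size = 64 * 16MB = 1024MB, i.e. wal_keep_size = '1GB'.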
Re: psql - add SHOW_ALL_RESULTS option
On Sun, Jun 7, 2020 at 2:36 AM Fabien COELHO wrote: > In the meantime, here is a v9 which also fixes the behavior when using > \watch, so that now one can issue several \;-separated queries and have > their progress shown. I just needed that a few days ago and was > disappointed but unsurprised that it did not work. Hi Fabien, This seems to break the 013_crash_restart.pl test.
Re: Is it useful to record whether plans are generic or custom?
On 2020-07-20 11:57, Fujii Masao wrote: On 2020/07/17 16:25, Fujii Masao wrote: On 2020/07/16 11:50, torikoshia wrote: On 2020-07-15 11:44, Fujii Masao wrote: On 2020/07/14 21:24, torikoshia wrote: On 2020-07-10 10:49, torikoshia wrote: On 2020-07-08 16:41, Fujii Masao wrote: On 2020/07/08 10:14, torikoshia wrote: On 2020-07-06 22:16, Fujii Masao wrote: On 2020/06/11 14:59, torikoshia wrote: On 2020-06-10 18:00, Kyotaro Horiguchi wrote: + TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_plan", This could be a problem if we showed the last plan in this view. I think "last_plan_type" would be better. + if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_CUSTOM) + values[7] = CStringGetTextDatum("custom"); + else if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_GENERIC) + values[7] = CStringGetTextDatum("generic"); + else + nulls[7] = true; Using swith-case prevents future additional type (if any) from being unhandled. I think we are recommending that as a convension. Thanks for your reviewing! I've attached a patch that reflects your comments. Thanks for the patch! Here are the comments. Thanks for your review! + Number of times generic plan was choosen + Number of times custom plan was choosen Typo: "choosen" should be "chosen"? Thanks, fixed them. + role="column_definition"> + last_plan_type text + + + Tells the last plan type was generic or custom. If the prepared + statement has not executed yet, this field is null + Could you tell me how this information is expected to be used? I think that generic_plans and custom_plans are useful when investigating the cause of performance drop by cached plan mode. But I failed to get how much useful last_plan_type is. This may be an exceptional case, but I once had a case needed to ensure whether generic or custom plan was chosen for specific queries in a development environment. In your case, probably you had to ensure that the last multiple (or every) executions chose generic or custom plan? If yes, I'm afraid that displaying only the last plan mode is not enough for your case. No? So it seems better to check generic_plans or custom_plans columns in the view rather than last_plan_type even in your case. Thought? Yeah, I now feel last_plan is not so necessary and only the numbers of generic/custom plan is enough. If there are no objections, I'm going to remove this column and related codes. As mentioned, I removed last_plan column. Thanks for updating the patch! It basically looks good to me. I have one comment; you added the regression tests for generic and custom plans into prepare.sql. But the similar tests already exist in plancache.sql. So isn't it better to add the tests for generic_plans and custom_plans columns, into plancache.sql? Thanks for your comments! Agreed. I removed tests on prepare.sql and added them to plancache.sql. Thoughts? Thanks for updating the patch! I also applied the following minor changes to the patch. - Number of times generic plan was chosen + Number of times generic plan was chosen - Number of times custom plan was chosen + Number of times custom plan was chosen I got rid of one space character before those descriptions because they should start from the position of 7th character. 
-- but we can force a custom plan set plan_cache_mode to force_custom_plan; explain (costs off) execute test_mode_pp(2); +select name, generic_plans, custom_plans from pg_prepared_statements + where name = 'test_mode_pp'; In the regression test, I added the execution of pg_prepared_statements after the last execution of test query, to confirm that custom plan is used when force_custom_plan is set, by checking from pg_prepared_statements. I changed the status of this patch to "Ready for Committer" in CF. Barring any objection, I will commit this patch. Committed. Thanks! Thanks! As I proposed earlier in this thread, I'm now trying to add information about generic/cudstom plan to pg_stat_statements. I'll share the idea and the poc patch soon. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Re: Comment simplehash/dynahash trade-offs
On Fri, May 1, 2020 at 1:53 PM James Coleman wrote:
> In another thread [1] I'd mused that "there might be some value in a
> README or comments addition that would be a guide to what the various
> hash implementations are useful for...so that we have something to make
> the code base a bit more discoverable."

+1

> I'd solicited feedback from Andres (as the author of the simplehash
> implementation) and gotten further explanation from Tomas (both cc'd
> here) and have tried to condense that into the comment changes in this
> patch series.
>
> v1-0001-Summarize-trade-offs-between-simplehash-and-dynah.patch
> Contains the summaries mentioned above.

+ * - It supports partitioning, which is useful for shared memory access using

I wonder if we should say a bit more about the shared memory mode. Shared
memory dynahash tables are allocated in a fixed-size area at startup, and
are discoverable by name in other processes that need to get access to
them, while simplehash assumes that it can get memory from a MemoryContext
or an allocator with a malloc/free-style interface, which isn't very well
suited for use in shared memory. (I'm sure you can convince it to work in
shared memory with some work.)

> v1-0002-Improve-simplehash-usage-notes.patch

+ *    For convenience the hash table create functions accept a void pointer
+ *    will be stored in the hash table type's member private_data.

*that* will be stored?

> v1-0003-Show-sample-simplehash-method-signatures.patch
> I find it hard to read the macro code "templating" particularly for
> seeing what the available API is and so added sample method signatures
> in comments to the macro generated method signature defines.

I didn't double-check all the expansions of the macros but +1 for this
idea, it's very useful.
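As a concrete illustration of the "templating" being discussed, here is a
minimal sketch of how a simplehash table is typically instantiated and used.
It assumes the current simplehash.h parameter names (SH_PREFIX, SH_ELEMENT_TYPE
and friends); the IntCountEntry type and the "intcount" prefix are invented
for this example.

	#include "postgres.h"
	#include "common/hashfn.h"

	/* Element type: must contain the key field and a "status" member. */
	typedef struct IntCountEntry
	{
		uint32		key;			/* hash key (named by SH_KEY below) */
		uint64		count;			/* user payload */
		char		status;			/* entry state, managed by simplehash */
	} IntCountEntry;

	#define SH_PREFIX		intcount
	#define SH_ELEMENT_TYPE	IntCountEntry
	#define SH_KEY_TYPE		uint32
	#define SH_KEY			key
	#define SH_HASH_KEY(tb, key)	murmurhash32(key)
	#define SH_EQUAL(tb, a, b)		((a) == (b))
	#define SH_SCOPE		static inline
	#define SH_DECLARE
	#define SH_DEFINE
	#include "lib/simplehash.h"

	/* Usage: count occurrences of each value in an array. */
	static void
	count_values_demo(MemoryContext cxt, const uint32 *values, int nvalues)
	{
		intcount_hash *tab = intcount_create(cxt, 256, NULL);

		for (int i = 0; i < nvalues; i++)
		{
			bool		found;
			IntCountEntry *entry = intcount_insert(tab, values[i], &found);

			if (!found)
				entry->count = 0;	/* initialize payload of a new entry */
			entry->count++;
		}

		intcount_destroy(tab);
	}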
Re: pg_resetwal --next-transaction-id may cause database failed to restart.
> It may be OK actually; if you're doing multiple dangerous changes, you'd
> use --dry-run beforehand ... No? (It's what *I* would do, for sure.)

> Which in turn suggests that it would be good to ensure that --dry-run
> *also* emits a warning (not an error, so that any other warnings can
> also be thrown and the user gets the full picture).

Yes, that's true. I have changed the patch so that it emits a warning
rather than an error when the --dry-run option is given. I have also
reworked the code to make it clearer.

> I think adding multiple different --force switches makes the UI more
> complex for little added value.

I feel the same way, but I can't convince myself to use --force for this,
because --force is meant for when something is wrong with the pg_control
file. Let's hear other hackers' proposals.

Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

pg_resetwal_transaction_limit_v3.patch
Re: psql - add SHOW_ALL_RESULTS option
In the meantime, here is a v9 which also fixes the behavior when using \watch, so that now one can issue several \;-separated queries and have their progress shown. I just needed that a few days ago and was disappointed but unsurprised that it did not work. This seems to break the 013_crash_restart.pl test. Yes, indeed. I'm planning to investigate, hopefully this week. -- Fabien.