Re: ICU for global collation
On 2019-09-17 15:08, Daniel Verite wrote: When trying databases defined with ICU locales, I see that backends that serve such databases seem to have their LC_CTYPE inherited from the environment (as opposed to a per-database fixed value). fr-utf8=# select to_tsvector('été'); ERROR: invalid multibyte character for locale HINT: The server's LC_CTYPE locale is probably incompatible with the database encoding. I looked into this problem. The way to address this would be adding proper collation support to the text search subsystem. See the TODO markers in src/backend/tsearch/ts_locale.c for starting points. These APIs spread out to a lot of places, so it will take some time to finish. In the meantime, I'm pausing this thread and will set the CF entry as RwF. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: > > Fujii Masao writes: > > Currently CREATE OR REPLACE VIEW command fails if the column names > > are changed. > > That is, I believe, intentional. It's an effective aid to catching > mistakes in view redefinitions, such as misaligning the new set of > columns relative to the old. That's particularly important given > that we allow you to add columns during CREATE OR REPLACE VIEW. > Consider the oversimplified case where you start with > > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; > > and you want to add a column z, and you get sloppy and write > > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; > > If we did not throw an error on this, references that formerly > pointed to column y would now point to z (as that's still attnum 2), > which is highly unlikely to be what you wanted. This example makes me wonder if the addition of columns by CREATE OR REPLACE VIEW also has the same (or even worse) issue. That is, it may increase the opportunity for user mistakes. I'm thinking of the case where a user mistakenly adds a new column to the view when replacing the view definition. This mistake can happen because CREATE OR REPLACE VIEW allows new columns to be added. What's worse is that there is currently no way to drop a column from a view, short of recreating the view: neither CREATE OR REPLACE VIEW nor ALTER TABLE supports dropping a column from a view. So, to fix the mistake, users would need to drop the view itself and recreate it. If there are objects depending on the view, they might also need to be recreated. This is not good. Since the feature has long been supported, it's too late to say that, though... At least, support for ALTER VIEW DROP COLUMN might be necessary to alleviate that situation. Regards, -- Fujii Masao
Re: [PATCH] Do not use StdRdOptions in Access Methods
Hi Michael, On Wed, Oct 30, 2019 at 12:11 PM Michael Paquier wrote: > Looks fine. I have done some refinements as per the attached. Thanks. This stood out to me: + * The result is a structure containing all the parsed option values in + * text-array format. This sentence sounds wrong, because the result structure doesn't contain values in text-array format. Individual values in the struct would be in their native format (C bool for RELOPT_TYPE_BOOL, options, etc.). Maybe we don't need this sentence, because the first line already says we return a struct. > Running the regression tests of dummy_index_am has proved to be able > to break the assertion you have added. This breakage seems to have to do with the fact that the definition of DummyIndexOptions struct and the entries of relopt_parse_elt table don't agree? These are the last two members of DummyIndexOptions struct: int option_string_val_offset; int option_string_null_offset; } DummyIndexOptions; but di_relopt_tab's last two entries are these: add_string_reloption(di_relopt_kind, "option_string_val", "String option for dummy_index_am with non-NULL default", "DefaultValue", &validate_string_option, AccessExclusiveLock); di_relopt_tab[4].optname = "option_string_val"; di_relopt_tab[4].opttype = RELOPT_TYPE_STRING; di_relopt_tab[4].offset = offsetof(DummyIndexOptions, option_string_val_offset); /* * String option for dummy_index_am with NULL default, and without * description. */ add_string_reloption(di_relopt_kind, "option_string_null", NULL, /* description */ NULL, &validate_string_option, AccessExclusiveLock); di_relopt_tab[5].optname = "option_string_null"; di_relopt_tab[5].opttype = RELOPT_TYPE_STRING; di_relopt_tab[5].offset = offsetof(DummyIndexOptions, option_string_null_offset); If I fix the above code like this: @@ -113,7 +113,7 @@ create_reloptions_table(void) "DefaultValue", &validate_string_option, AccessExclusiveLock); di_relopt_tab[4].optname = "option_string_val"; - di_relopt_tab[4].opttype = RELOPT_TYPE_STRING; + di_relopt_tab[4].opttype = RELOPT_TYPE_INT; di_relopt_tab[4].offset = offsetof(DummyIndexOptions, option_string_val_offset); @@ -126,7 +126,7 @@ create_reloptions_table(void) NULL, &validate_string_option, AccessExclusiveLock); di_relopt_tab[5].optname = "option_string_null"; - di_relopt_tab[5].opttype = RELOPT_TYPE_STRING; + di_relopt_tab[5].opttype = RELOPT_TYPE_INT; di_relopt_tab[5].offset = offsetof(DummyIndexOptions, option_string_null_offset); } test passes. But maybe this Assert isn't all that robust, so I'm happy to take it out. > Also if some options are parsed options will never be NULL, so there > is no need to check for it before pfree()-ing it, no? I haven't fully read parseRelOptions(), but I will trust you. :) Thanks, Amit
Re: Allow cluster_name in log_line_prefix
On Mon, Oct 28, 2019 at 1:33 PM Craig Ringer wrote: > > Hi folks > > I was recently surprised to notice that log_line_prefix doesn't support a > cluster_name placeholder. I suggest adding one. If I don't hear objections > I'll send a patch. If we do this, should cluster_name also be included in csvlog output? Regards, -- Fujii Masao
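For illustration, a minimal sketch of what the proposal would enable; no escape letter has been chosen yet, so '%C' below is purely hypothetical:

    cluster_name = 'pg_prod_1'
    # '%C' is an invented placeholder standing in for the proposed escape
    log_line_prefix = '%m [%p] %C: '

Each log line would then carry the cluster name, which helps when several instances ship logs to a shared facility.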
Re: [PATCH] Do not use StdRdOptions in Access Methods
On Thu, Oct 31, 2019 at 04:38:55PM +0900, Amit Langote wrote: > On Wed, Oct 30, 2019 at 12:11 PM Michael Paquier wrote: > This sentence sounds wrong, because the result structure doesn't > contain values in text-array format. Individual values in the struct > would be in their native format (C bool for RELOPT_TYPE_BOOL, options, > etc.). > > Maybe we don't need this sentence, because the first line already says > we return a struct. Let's remove it then. > This breakage seems to have to do with the fact that the definition of > DummyIndexOptions struct and the entries of relopt_parse_elt table > don't agree? > > These are the last two members of DummyIndexOptions struct: > > @@ -126,7 +126,7 @@ create_reloptions_table(void) > NULL, &validate_string_option, > AccessExclusiveLock); > di_relopt_tab[5].optname = "option_string_null"; > - di_relopt_tab[5].opttype = RELOPT_TYPE_STRING; > + di_relopt_tab[5].opttype = RELOPT_TYPE_INT; > di_relopt_tab[5].offset = offsetof(DummyIndexOptions, > option_string_null_offset); > } > > test passes. > > But maybe this Assert isn't all that robust, so I'm happy to take it out. This one should remain a string reloption, and what's stored in the structure is an offset to get the string. See for example around RelationHasCascadedCheckOption before it got switched to an enum in 773df88 regarding the way to get the value. -- Michael signature.asc Description: PGP signature
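As a hedged illustration of the offset-based storage Michael describes (struct and field names are taken from the dummy_index_am example in the thread; the NULL convention mirrors PostgreSQL's GET_STRING_RELOPTION-style accessors):

    DummyIndexOptions *opts = (DummyIndexOptions *) indexRelation->rd_options;
    char *val;

    /*
     * For a string reloption the struct field holds an offset into the
     * same allocation where the string bytes were appended; an offset of
     * 0 means the option is NULL.
     */
    if (opts->option_string_val_offset == 0)
        val = NULL;
    else
        val = (char *) opts + opts->option_string_val_offset;

So the struct member is an int offset even though the option itself is a string, which is why switching the table entry to RELOPT_TYPE_INT only appeared to "fix" the assertion.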
Re: update ALTER TABLE with ATTACH PARTITION lock mode
Hello, On Tue, Oct 29, 2019 at 12:13 AM Alvaro Herrera wrote: > On 2019-Oct-28, Michael Paquier wrote: > > On Sun, Oct 27, 2019 at 07:12:07PM -0500, Justin Pryzby wrote: > > > commit #898e5e32 (Allow ATTACH PARTITION with only > > > ShareUpdateExclusiveLock) > > > updates ddl.sgml but not alter_table.sgml, which only says: > > > > > > https://www.postgresql.org/docs/12/release-12.html > > > |An ACCESS EXCLUSIVE lock is held unless explicitly noted. > > > > + > > + Attaching a partition acquires a SHARE UPDATE > > EXCLUSIVE > > + lock on the partitioned table, in addition to an > > + ACCESS EXCLUSIVE lock on the partition. > > + > > Updating the docs of ALTER TABLE sounds like a good idea. This > > sentence looks fine to me. Perhaps others have suggestions? > > Doesn't the command also acquire a lock on the default partition if > there is one? It sounds worth noting. > > > > Find attached patch, which also improve language in several related > > > places. > > > > Not sure that these are actually improvements. > > I think some of them (most?) are clear improvements. As someone who has written some of those lines, I agree that Justin's tweaks make them more readable, so +1 to apply 0002 patch too. Thanks, Amit
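To make the documented behavior concrete, a small sketch (table names are invented for this example):

    -- Takes SHARE UPDATE EXCLUSIVE on measurements (the partitioned parent),
    -- ACCESS EXCLUSIVE on measurements_2019 (the table being attached),
    -- and, per Alvaro's note, also locks the DEFAULT partition if one exists.
    ALTER TABLE measurements ATTACH PARTITION measurements_2019
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');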
Re: [PATCH] Do not use StdRdOptions in Access Methods
On Thu, Oct 31, 2019 at 4:49 PM Michael Paquier wrote: > On Thu, Oct 31, 2019 at 04:38:55PM +0900, Amit Langote wrote: > > On Wed, Oct 30, 2019 at 12:11 PM Michael Paquier > > wrote: > > This sentence sounds wrong, because the result structure doesn't > > contain values in text-array format. Individual values in the struct > > would be in their native format (C bool for RELOPT_TYPE_BOOL, options, > > etc.). > > > > Maybe we don't need this sentence, because the first line already says > > we return a struct. > > Let's remove it then. Removed in the attached. > > This breakage seems to have to do with the fact that the definition of > > DummyIndexOptions struct and the entries of relopt_parse_elt table > > don't agree? > > > > These are the last two members of DummyIndexOptions struct: > > > > @@ -126,7 +126,7 @@ create_reloptions_table(void) > > NULL, &validate_string_option, > > AccessExclusiveLock); > > di_relopt_tab[5].optname = "option_string_null"; > > - di_relopt_tab[5].opttype = RELOPT_TYPE_STRING; > > + di_relopt_tab[5].opttype = RELOPT_TYPE_INT; > > di_relopt_tab[5].offset = offsetof(DummyIndexOptions, > > option_string_null_offset); > > } > > > > test passes. > > > > But maybe this Assert isn't all that robust, so I'm happy to take it out. > > This one should remain a string reloption, and what's stored in the > structure is an offset to get the string. See for example around > RelationHasCascadedCheckOption before it got switched to an enum in > 773df88 regarding the way to get the value. I see. I didn't know that about STRING options when writing that Assert, so it was indeed broken to begin with. v5 attached. Thanks, Amit build_reloptions-v5.patch Description: Binary data
Re: Problem with synchronous replication
On Thu, Oct 31, 2019 at 11:12 AM Michael Paquier wrote: > > On Wed, Oct 30, 2019 at 05:43:04PM +0900, Kyotaro Horiguchi wrote: > > At Wed, 30 Oct 2019 17:21:17 +0900, Fujii Masao > > wrote in > >> This change causes every ending backends to always take the exclusive lock > >> even when it's not in SyncRep queue. This may be problematic, for example, > >> when terminating multiple backends at the same time? If yes, > >> it might be better to check SHMQueueIsDetached() again after taking the > >> lock. > >> That is, > > > > I'm not sure how much that harms but double-checked locking > > (releasing) is simple enough for reducing possible congestion here, I > > think. > > FWIW, I could not measure any actual difference with pgbench -C, up to > 500 sessions and an empty input file (just have one meta-command) and > -c 20. > > I have added some comments in SyncRepCleanupAtProcExit(), and adjusted > the patch with the suggestion from Fujii-san. Any comments? Thanks for the patch! Looks good to me. Regards, -- Fujii Masao
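For reference, a sketch of the double-checked pattern under discussion, using the existing SyncRep queue API; the exact placement inside SyncRepCleanupAtProcExit() follows Michael's description and is not shown here:

    /* Cheap unlocked test first: most exiting backends are not queued. */
    if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
    {
        LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
        /* Recheck under the lock, as suggested by Fujii-san. */
        if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
            SHMQueueDelete(&(MyProc->syncRepLinks));
        LWLockRelease(SyncRepLock);
    }

This keeps backends that were never in the queue from contending on the exclusive lock when many of them exit at once.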
Can we avoid list_copy in recomputeNamespacePath() conditionally?
Hi all, I wonder whether we can have a shortcut, somewhat similar to the following POC, in recomputeNamespacePath() for the case where the recomputed path is the same as the previous baseSearchPath/activeSearchPath: == POC patch == diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index e251f5a9fdc..b25ef489e47 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -3813,6 +3813,9 @@ recomputeNamespacePath(void) !list_member_oid(oidlist, myTempNamespace)) oidlist = lcons_oid(myTempNamespace, oidlist); + /* TODO: POC */ + if (equal(oidlist, baseSearchPath)) + return; /* * Now that we've successfully built the new list of namespace OIDs, save * it in permanent storage. == POC patch end == This would have two advantages: 1. It avoids an unnecessary list_copy() in TopMemoryContext, and 2. global pointers like activeSearchPath/baseSearchPath will not change if some code path ends up with a cascaded call to recomputeNamespacePath(). Thoughts/Comments? Regards, Amul
Re: MarkBufferDirtyHint() and LSN update
Tomas Vondra wrote: > On Wed, Oct 30, 2019 at 02:44:18PM +0100, Antonin Houska wrote: > >Please consider this scenario (race conditions): > > > >1. FlushBuffer() has written the buffer but hasn't yet managed to clear the > >BM_DIRTY flag (however BM_JUST_DIRTIED could be cleared by now). > > > >2. Another backend modified a hint bit and called MarkBufferDirtyHint(). > > > >3. In MarkBufferDirtyHint(), if XLogHintBitIsNeeded() evaluates to true > >(e.g. due to checksums enabled), new LSN is computed, however it's not > >assigned to the page because the buffer is still dirty: > > > > if (!(buf_state & BM_DIRTY)) > > { > > ... > > > > if (!XLogRecPtrIsInvalid(lsn)) > > PageSetLSN(page, lsn); > > } > > > >4. MarkBufferDirtyHint() completes. > > > >5. In the first session, FlushBuffer()->TerminateBufferIO() will not clear > >BM_DIRTY because MarkBufferDirtyHint() has eventually set > >BM_JUST_DIRTIED. Thus the hint bit change itself will be written by the next > >call of FlushBuffer(). However page LSN is hasn't been updated so the > >requirement that WAL must be flushed first is not met. > > > >I think that PageSetLSN() should be called regardless BM_DIRTY. Do I miss any > >subtle detail? > > > > Isn't this prevented by locking of the buffer header? Both FlushBuffer > and MarkBufferDirtyHint do obtain that lock. I see MarkBufferDirtyHint > does a bit of work before, but that's related to BM_PERMANENT. > > If there really is a race condition, it shouldn't be that difficult to > trigger it by adding a sleep or a breakpoint, I guess. Yes, I had verified it using gdb: inserted a row into a table, then attached gdb to checkpointer and stopped it when FlushBuffer() was about to call TerminateBufferIO(). Then, when scanning the table, I saw that MarkBufferDirtyHint() skipped the call of PageSetLSN(). Finally, before checkpointer unlocked the buffer header in TerminateBufferIO(), buf_state was 3553624066 ~ 0b110100010010. With BM_DIRTY defined as #define BM_DIRTY(1U << 23) the flag appeared to be set. -- Antonin Houska Web: https://www.cybertec-postgresql.com
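A sketch of the change Antonin is proposing for MarkBufferDirtyHint() — moving the PageSetLSN() call out of the !BM_DIRTY branch; simplified, with the surrounding buffer-header locking elided:

    /* ... lsn already computed via XLogSaveBufferForHint() ... */
    if (!(buf_state & BM_DIRTY))
    {
        /* mark the buffer dirty, as before */
    }

    /*
     * Set the page LSN even if the buffer was still flagged BM_DIRTY,
     * so that a hint-bit-only image written by the next FlushBuffer()
     * cannot reach disk before its WAL has been flushed.
     */
    if (!XLogRecPtrIsInvalid(lsn))
        PageSetLSN(page, lsn);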
Remove configure --disable-float4-byval and --disable-float8-byval
AFAICT, these build options were only useful to maintain compatibility for version-0 functions, but those are no longer supported, so these options can be removed. There is a fair amount of code all over the place to support these options, so the cleanup is quite significant. The current behavior became the default in PG9.3. Note that this does not affect on-disk storage. The only upgrade issue that I can see is that pg_upgrade refuses to upgrade incompatible clusters if you have contrib/isn installed. But hopefully everyone who is affected by that will have upgraded at least once since PG9.2 already. float4 is now always pass-by-value; the pass-by-reference code path is completely removed. float8 and related types are now hardcoded to pass-by-value or pass-by-reference depending on whether the build is 64- or 32-bit, as was previously also the default. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From e009755a160d3d67900c80c9bc17276e27f79baa Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Thu, 31 Oct 2019 09:21:38 +0100 Subject: [PATCH 1/3] Remove configure --disable-float4-byval and --disable-float8-byval These build options were only useful to maintain compatibility for version-0 functions, but those are no longer supported, so these options can be removed. float4 is now always pass-by-value; the pass-by-reference code path is completely removed. float8 and related types are now hardcoded to pass-by-value or pass-by-reference depending on whether the build is 64- or 32-bit, as was previously also the default. --- configure | 118 configure.in| 33 --- doc/src/sgml/func.sgml | 10 -- doc/src/sgml/installation.sgml | 28 -- src/backend/access/index/indexam.c | 5 - src/backend/access/transam/xlog.c | 35 --- src/backend/bootstrap/bootstrap.c | 2 +- src/backend/catalog/genbki.pl | 2 +- src/backend/commands/analyze.c | 2 +- src/backend/utils/adt/numeric.c | 5 +- src/backend/utils/fmgr/dfmgr.c | 18 src/backend/utils/fmgr/fmgr.c | 23 + src/backend/utils/misc/pg_controldata.c | 18 +--- src/bin/initdb/initdb.c | 3 - src/bin/pg_controldata/pg_controldata.c | 4 - src/bin/pg_resetwal/pg_resetwal.c | 6 -- src/bin/pg_upgrade/controldata.c| 7 ++ src/include/c.h | 15 +++ src/include/catalog/catversion.h| 2 +- src/include/catalog/pg_control.h| 6 +- src/include/catalog/pg_proc.dat | 6 +- src/include/catalog/pg_type.dat | 2 +- src/include/fmgr.h | 4 - src/include/pg_config.h.in | 15 --- src/include/postgres.h | 23 + src/tools/msvc/Solution.pm | 25 - src/tools/msvc/config_default.pl| 5 - 27 files changed, 40 insertions(+), 382 deletions(-) diff --git a/configure b/configure index 6b1c779ee3..b38ad1526d 100755 --- a/configure +++ b/configure @@ -866,8 +866,6 @@ with_system_tzdata with_zlib with_gnu_ld enable_largefile -enable_float4_byval -enable_float8_byval ' ac_precious_vars='build_alias host_alias @@ -1525,8 +1523,6 @@ Optional Features: --enable-cassertenable assertion checks (for debugging) --disable-thread-safety disable thread-safety in client libraries --disable-largefile omit support for large files - --disable-float4-byval disable float4 passed by value - --disable-float8-byval disable float8 passed by value Optional Packages: --with-PACKAGE[=ARG]use PACKAGE [ARG=yes] @@ -16801,120 +16797,6 @@ _ACEOF -# Decide whether float4 is passed by value: user-selectable, enabled by default -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with float4 passed by value" >&5 -$as_echo_n "checking whether to build with 
float4 passed by value... " >&6; } - - -# Check whether --enable-float4-byval was given. -if test "${enable_float4_byval+set}" = set; then : - enableval=$enable_float4_byval; - case $enableval in -yes) - -$as_echo "#define USE_FLOAT4_BYVAL 1" >>confdefs.h - - float4passbyval=true - ;; -no) - float4passbyval=false - ;; -*) - as_fn_error $? "no argument expected for --enable-float4-byval option" "$LINENO" 5 - ;; - esac - -else - enable_float4_byval=yes - -$as_echo "#define USE_FLOAT4_BYVAL 1" >>confdefs.h - - float4passbyval=true -fi - - -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $enable_float4_byval" >&5 -$as_echo "$enable_float4_byval" >&6; } - -cat >>confdefs.h <<_ACEOF -#define FLOAT4PASSBYVAL $float4passbyval -_ACEOF - - -# Decide whether float8 is passed by value. -# Note: this setting also controls int8 and related types such as
Re: [PATCH] Do not use StdRdOptions in Access Methods
On Thu, Oct 31, 2019 at 05:18:46PM +0900, Amit Langote wrote: > On Thu, Oct 31, 2019 at 4:49 PM Michael Paquier wrote: >> Let's remove it then. > > Removed in the attached. Thanks. I did exactly the same thing on my local branch. -- Michael
Re: update ALTER TABLE with ATTACH PARTITION lock mode (docs)
On Mon, Oct 28, 2019 at 10:56:33PM -0500, Justin Pryzby wrote: > I suppose it should say something other than partition(ed), since partitions can > be > partitioned, too... > > Attaching a partition acquires a SHARE UPDATE > EXCLUSIVE > lock on the parent table, in addition to > ACCESS EXCLUSIVE locks on the child table and the > DEFAULT partition (if any). In this context, "on the child table" sounds a bit confusing? Would it make more sense to say "on the table to be attached" instead? -- Michael
Re: Getting psql to redisplay command after \e
Hello Tom, psql=> select 1... psql-> I cannot move back with readline to edit further, I'm stuck there, which is strange. I don't follow. readline doesn't allow you to edit already-entered lines today, that is, after typing "select 1" you see regression=# select 1 regression-# and there isn't any way to move back and edit the already-entered line within readline. Yep. My point is to possibly not implicitly type <return> at the end of \e, but to behave as if we were moving in history, which allows editing the lines, so that you would get psql=> select 1 instead of the above. I agree it might be nicer if you could do that, but that's *far* beyond the scope of this patch. It would take entirely fundamental rethinking of our use of libreadline, if indeed it's possible at all. I also don't see how we could have syntax-aware per-line prompts if we were allowing readline to treat the whole query as one line. I was suggesting something much simpler than rethinking readline handling. That does not mean it is a good idea, but while testing the patch I would have liked the unfinished line to be in the current editing buffer, basically as if I had not typed <return>. ISTM more natural that \e behaves like history when coming back from editing, i.e. the \e-edited line is set as the current buffer for readline. In the larger picture, tinkering with how that works would affect every psql user at the level of "muscle memory" editing habits, and I suspect that their reactions would not be uniformly positive. What I propose here doesn't affect anyone who doesn't use \e at all. Even for \e users it doesn't have any effect on what you need to type. -- Fabien.
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
On Tue, Oct 29, 2019 at 7:29 PM Etsuro Fujita wrote: > On Fri, Oct 25, 2019 at 6:59 PM amul sul wrote: > > On Wed, Oct 16, 2019 at 6:20 PM Etsuro Fujita > > wrote: > >> So I'd like to propose to introduce separate functions like > >> process_outer_partition() and process_inner_partition() in the > >> attached, instead of handle_missing_partition(). (I added a fast path > >> to these functions that if both outer/inner sides have the default > >> partitions, give up on merging partitions. Also, I modified the > >> caller sites of proposed functions so that they call these if > >> necessary.) > > > Agree -- process_outer_partition() & process_inner_partition() approach > > looks > > much cleaner than before. > > Note that LHS numbers are the line numbers in your previously posted > > patch[1]. > > > > 455 + if (inner_has_default || > > 456 + jointype == JOIN_LEFT || > > 457 + jointype == JOIN_ANTI || > > 458 + jointype == JOIN_FULL) > > 459 + { > > 460 + if (!process_outer_partition(&outer_map, > > > > 512 + if (outer_has_default || jointype == JOIN_FULL) > > 513 + { > > 514 + if (!process_inner_partition(&outer_map, > > > > How about adding these conditions to the else block of > > process_outer_partition() > > & process_inner_partition() function respectively so that these functions > > can be > > called unconditionally? Thoughts/Comments? > > I'm not sure that's a good idea; these functions might be called many > times, so I just thought it would be better to call these functions > conditionally, to avoid useless function call overhead. The overhead might be small, but isn't zero, so I still think that we should call these functions if necessary. > > 456 + jointype == JOIN_LEFT || > > 457 + jointype == JOIN_ANTI || > > 458 + jointype == JOIN_FULL) > > > > Also, how about using IS_OUTER_JOIN() instead. But we need an assertion to > > restrict JOIN_RIGHT or something. > > Seems like a good idea. Done. > > For the convenience, I did both aforesaid changes in the attached delta > > patch that > > can be applied atop of your previously posted patch[1]. Kindly have a look > > & share > > your thoughts, thanks. > > Thanks for the patch! > > > 1273 + * *next_index is incremented when creating a new merged partition > > associated > > 1274 + * with the given outer partition. > > 1275 + */ > > > > Correction: s/outer/inner > > --- > > > > 1338 +* In range partitioning, if the given outer partition is > > already > > 1339 +* merged (eg, because we found an overlapping range earlier), > > we know > > 1340 +* where it fits in the join result; nothing to do in that > > case. Else > > 1341 +* create a new merged partition. > > > > Correction: s/outer/inner. > > --- > > > > 1712 /* > > 1713 * If the NULL partition was missing from the inner side of > > the join, > > > > s/inner side/outer side > > -- > > Good catch! Will fix. Done. I also added some assertions to process_outer_partition() and process_inner_partition(), including ones as proposed in your patch. Attached is an updated version. If no objections, I'll merge this with the main patch [1]. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAPmGK14WHKckT1P6UJV2B63TZAxPyMn8iZJ99XF=azunhg6...@mail.gmail.com modify-partbounds-changes-2.patch Description: Binary data
Postgres cache
Hi, I want to know, in depth, how Postgres stores catalog relations in its caches. Is there any documentation on that?
Re: allow online change primary_conninfo
Hello > So, I'd like to propose to move the stuff to the second switch(). > (See the attached incomplete patch.) This is rather similar to > Sergei's previous proposal, but the structure of the state > machine is kept. Very similar to my v4 proposal (also move RequestXLogStreaming call), but closer to currentSource reading. No objections from me, attached patch is changed this way. I renamed start_wal_receiver to startWalReceiver - this style looks more consistent to near code. > + /* > + * Else, check if WAL receiver is still > active. > + */ > + else if (!WalRcvStreaming()) I think we still need wait WalRcvStreaming after RequestXLogStreaming call. So I remove else branch and leave separate condition. > In ProcessStartupSigHup, conninfo and slotname don't need to be > retained until the end of the function. Agreed, I move pfree > The log message in the function seems to be too detailed. On the > other hand, if we changed primary_conninfo to '' (stop) or vise > versa (start), the message (restart) looks strange. I have no strong opinion here. These messages was changed many times during this thread lifetime, can be changed anytime. I think this is not issue since we have no consensus about overall design. Write detailed messages was proposed here: https://www.postgresql.org/message-id/20190216151025.GJ2240%40paquier.xyz > or vise versa (start) I explicitly check currentSource and WalRcvRunning, so we have no such messages if user had no walreceiver before. regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 0191ec84b1..587031dea8 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3983,9 +3983,15 @@ ANY num_sync ( @@ -4000,9 +4006,13 @@ ANY num_sync ( ). - This parameter can only be set at server start. + This parameter can only be set in the postgresql.conf + file or on the server command line. This setting has no effect if primary_conninfo is not - set. + set or the server is not in standby mode. + + + The WAL receiver is restarted after an update of primary_slot_name. diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2e3cc51006..8eee079eb2 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -803,6 +803,12 @@ static XLogSource readSource = 0; /* XLOG_FROM_* code */ static XLogSource currentSource = 0; /* XLOG_FROM_* code */ static bool lastSourceFailed = false; +/* + * Need for restart running WalReceiver due the configuration change. + * Suitable only for XLOG_FROM_STREAM source + */ +static bool pendingWalRcvRestart = false; + typedef struct XLogPageReadPrivate { int emode; @@ -11773,6 +11779,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, for (;;) { int oldSource = currentSource; + bool startWalReceiver = false; /* * First check if we failed to read from the current source, and @@ -11806,54 +11813,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, if (!StandbyMode) return false; - /* - * If primary_conninfo is set, launch walreceiver to try - * to stream the missing WAL. - * - * If fetching_ckpt is true, RecPtr points to the initial - * checkpoint location. In that case, we use RedoStartLSN - * as the streaming start position instead of RecPtr, so - * that when we later jump backwards to start redo at - * RedoStartLSN, we will have the logs streamed already. 
- */ - if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0) - { - XLogRecPtr ptr; - TimeLineID tli; - - if (fetching_ckpt) - { - ptr = RedoStartLSN; - tli = ControlFile->checkPointCopy.ThisTimeLineID; - } - else - { - ptr = RecPtr; - - /* - * Use the record begin position to determine the - * TLI, rather than the position we're reading. - */ - tli = tliOfPointInHistory(tliRecPtr, expectedTLEs); - - if (curFileTLI > 0 && tli < curFileTLI) -elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u", - (uint32) (tliRecPtr >> 32), - (uint32) tliRecPtr, - tli, curFileTLI); - } - curFileTLI = tli; - RequestXLogStreaming(tli, ptr, PrimaryConnInfo, - PrimarySlotName); - receivedUpto = 0; - } - /* * Move to XLOG_FROM_STREAM state in either case. We'll * get immediate failure if we didn't launch walreceiver, * and move on to the next state. */ currentSource = XLOG_FROM_STREAM; + startWalReceiver = true; break; case X
The command tag of "ALTER MATERIALIZED VIEW RENAME COLUMN"
Hi, The command tag of ALTER MATERIALIZED VIEW is basically "ALTER MATERIALIZED VIEW". For example, =# ALTER MATERIALIZED VIEW test ALTER COLUMN j SET STATISTICS 100; ALTER MATERIALIZED VIEW =# ALTER MATERIALIZED VIEW test OWNER TO CURRENT_USER; ALTER MATERIALIZED VIEW =# ALTER MATERIALIZED VIEW test RENAME TO hoge; ALTER MATERIALIZED VIEW This is OK and looks intuitive to users. But I found that the command tag of ALTER MATERIALIZED VIEW RENAME COLUMN is "ALTER TABLE", not "ALTER MATERIALIZED VIEW". =# ALTER MATERIALIZED VIEW hoge RENAME COLUMN j TO x; ALTER TABLE Is this intentional, or a bug? Regards, -- Fujii Masao
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 12:32 PM Fujii Masao wrote: > On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: > > > > Fujii Masao writes: > > > Currently CREATE OR REPLACE VIEW command fails if the column names > > > are changed. > > > > That is, I believe, intentional. It's an effective aid to catching > > mistakes in view redefinitions, such as misaligning the new set of > > columns relative to the old. That's particularly important given > > that we allow you to add columns during CREATE OR REPLACE VIEW. > > Consider the oversimplified case where you start with > > > > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; > > > > and you want to add a column z, and you get sloppy and write > > > > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; > > > > If we did not throw an error on this, references that formerly > > pointed to column y would now point to z (as that's still attnum 2), > > which is highly unlikely to be what you wanted. > > This example makes me wonder if the addtion of column by > CREATE OR REPLACE VIEW also has the same (or even worse) issue. > That is, it may increase the oppotunity for users' mistake. > I'm thinking the case where users mistakenly added new column > into the view when replacing the view definition. This mistake can > happen because CREATE OR REPLACE VIEW allows new column to > be added. But what's the worse is that, currently there is no way to > drop the column from the view, except recreation of the view. > Neither CREATE OR REPLACE VIEW nor ALTER TABLE support > the drop of the column from the view. So, to fix the mistake, > users would need to drop the view itself and recreate it. If there are > some objects depending the view, they also might need to be recreated. > This looks not good. Since the feature has been supported, > it's too late to say that, though... > > At least, the support for ALTER VIEW DROP COLUMN might be > necessary to alleviate that situation. > > - Is it intentional that the "RENAME COLUMN" statement is not implemented for VIEW, given that it is implemented for materialized views? I have made a similar change for views and it works. ALTER VIEW v RENAME COLUMN d to e; - "DROP COLUMN" for a VIEW throws an error: postgres=# alter view v drop column e; ERROR: "v" is not a table, composite type, or foreign table Regards, > > -- > Fujii Masao > > > -- Ibrar Ahmed 001_alter_view_rename_column_ibrar_v1.patch Description: Binary data
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 7:59 PM Ibrar Ahmed wrote: > > > > On Thu, Oct 31, 2019 at 12:32 PM Fujii Masao wrote: >> >> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: >> > >> > Fujii Masao writes: >> > > Currently CREATE OR REPLACE VIEW command fails if the column names >> > > are changed. >> > >> > That is, I believe, intentional. It's an effective aid to catching >> > mistakes in view redefinitions, such as misaligning the new set of >> > columns relative to the old. That's particularly important given >> > that we allow you to add columns during CREATE OR REPLACE VIEW. >> > Consider the oversimplified case where you start with >> > >> > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; >> > >> > and you want to add a column z, and you get sloppy and write >> > >> > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; >> > >> > If we did not throw an error on this, references that formerly >> > pointed to column y would now point to z (as that's still attnum 2), >> > which is highly unlikely to be what you wanted. >> >> This example makes me wonder if the addtion of column by >> CREATE OR REPLACE VIEW also has the same (or even worse) issue. >> That is, it may increase the oppotunity for users' mistake. >> I'm thinking the case where users mistakenly added new column >> into the view when replacing the view definition. This mistake can >> happen because CREATE OR REPLACE VIEW allows new column to >> be added. But what's the worse is that, currently there is no way to >> drop the column from the view, except recreation of the view. >> Neither CREATE OR REPLACE VIEW nor ALTER TABLE support >> the drop of the column from the view. So, to fix the mistake, >> users would need to drop the view itself and recreate it. If there are >> some objects depending the view, they also might need to be recreated. >> This looks not good. Since the feature has been supported, >> it's too late to say that, though... >> >> At least, the support for ALTER VIEW DROP COLUMN might be >> necessary to alleviate that situation. >> > > - Is this intentional not implemented the "RENAME COLUMN" statement for > VIEW because it is implemented for Materialized View? Not sure that, but Tom's suggestion to support ALTER VIEW RENAME COLUMN sounds reasonable whether we support the rename of columns when replacing the view definition, or not. Attached is the patch that adds support for ALTER VIEW RENAME COLUMN command. > I have made just a similar > change to view and it works. Yeah, ISTM that we made the same patch at the same time. You changed gram.y, but I added the following changes additionally. - Update the doc - Add HINT message emit when CRAETE OR REPLACE VIEW fails to rename the columns - Update tab-complete.c - Add regression test One issue I've not addressed yet is about the command tag of "ALTER VIEW RENAME COLUMN". Currently "ALTER TABLE" is returned as the tag like ALTER MATERIALIZED VIEW RENAME COLUMN, but ISTM that "ALTER VIEW" is better. I started the discussion about the command tag of "ALTER MATERIALIZED VIEW" at another thread. I will update the patch according to the result of that discussion. https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==p25wvub...@mail.gmail.com Regards, -- Fujii Masao alter_view_rename_column_v1.patch Description: Binary data
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 5:01 PM Fujii Masao wrote: > On Thu, Oct 31, 2019 at 7:59 PM Ibrar Ahmed wrote: > > > > > > > > On Thu, Oct 31, 2019 at 12:32 PM Fujii Masao > wrote: > >> > >> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: > >> > > >> > Fujii Masao writes: > >> > > Currently CREATE OR REPLACE VIEW command fails if the column names > >> > > are changed. > >> > > >> > That is, I believe, intentional. It's an effective aid to catching > >> > mistakes in view redefinitions, such as misaligning the new set of > >> > columns relative to the old. That's particularly important given > >> > that we allow you to add columns during CREATE OR REPLACE VIEW. > >> > Consider the oversimplified case where you start with > >> > > >> > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; > >> > > >> > and you want to add a column z, and you get sloppy and write > >> > > >> > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; > >> > > >> > If we did not throw an error on this, references that formerly > >> > pointed to column y would now point to z (as that's still attnum 2), > >> > which is highly unlikely to be what you wanted. > >> > >> This example makes me wonder if the addtion of column by > >> CREATE OR REPLACE VIEW also has the same (or even worse) issue. > >> That is, it may increase the oppotunity for users' mistake. > >> I'm thinking the case where users mistakenly added new column > >> into the view when replacing the view definition. This mistake can > >> happen because CREATE OR REPLACE VIEW allows new column to > >> be added. But what's the worse is that, currently there is no way to > >> drop the column from the view, except recreation of the view. > >> Neither CREATE OR REPLACE VIEW nor ALTER TABLE support > >> the drop of the column from the view. So, to fix the mistake, > >> users would need to drop the view itself and recreate it. If there are > >> some objects depending the view, they also might need to be recreated. > >> This looks not good. Since the feature has been supported, > >> it's too late to say that, though... > >> > >> At least, the support for ALTER VIEW DROP COLUMN might be > >> necessary to alleviate that situation. > >> > > > > - Is this intentional not implemented the "RENAME COLUMN" statement for > > VIEW because it is implemented for Materialized View? > > Not sure that, but Tom's suggestion to support ALTER VIEW RENAME COLUMN > sounds reasonable whether we support the rename of columns when replacing > the view definition, or not. Attached is the patch that adds support for > ALTER VIEW RENAME COLUMN command. > > > I have made just a similar > > change to view and it works. > > Yeah, ISTM that we made the same patch at the same time. You changed > gram.y, > but I added the following changes additionally. > > - Update the doc > - Add HINT message emit when CRAETE OR REPLACE VIEW fails to rename the > columns > - Update tab-complete.c > - Add regression test > > Oh, I just sent the patch to ask, good you do that in all the places. One issue I've not addressed yet is about the command tag of > "ALTER VIEW RENAME COLUMN". Currently "ALTER TABLE" is returned as the tag > like ALTER MATERIALIZED VIEW RENAME COLUMN, but ISTM that "ALTER VIEW" > is better. I started the discussion about the command tag of > "ALTER MATERIALIZED VIEW" at another thread. I will update the patch > according > to the result of that discussion. 
> > https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==p25wvub...@mail.gmail.com > > The attached patch contains a small change for ALTER MATERIALIZED VIEW. > Regards, > > -- > Fujii Masao > -- Ibrar Ahmed alter_view_rename_column_v2.patch Description: Binary data
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 5:11 PM Ibrar Ahmed wrote: > > > On Thu, Oct 31, 2019 at 5:01 PM Fujii Masao wrote: > >> On Thu, Oct 31, 2019 at 7:59 PM Ibrar Ahmed >> wrote: >> > >> > >> > >> > On Thu, Oct 31, 2019 at 12:32 PM Fujii Masao >> wrote: >> >> >> >> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: >> >> > >> >> > Fujii Masao writes: >> >> > > Currently CREATE OR REPLACE VIEW command fails if the column names >> >> > > are changed. >> >> > >> >> > That is, I believe, intentional. It's an effective aid to catching >> >> > mistakes in view redefinitions, such as misaligning the new set of >> >> > columns relative to the old. That's particularly important given >> >> > that we allow you to add columns during CREATE OR REPLACE VIEW. >> >> > Consider the oversimplified case where you start with >> >> > >> >> > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; >> >> > >> >> > and you want to add a column z, and you get sloppy and write >> >> > >> >> > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; >> >> > >> >> > If we did not throw an error on this, references that formerly >> >> > pointed to column y would now point to z (as that's still attnum 2), >> >> > which is highly unlikely to be what you wanted. >> >> >> >> This example makes me wonder if the addtion of column by >> >> CREATE OR REPLACE VIEW also has the same (or even worse) issue. >> >> That is, it may increase the oppotunity for users' mistake. >> >> I'm thinking the case where users mistakenly added new column >> >> into the view when replacing the view definition. This mistake can >> >> happen because CREATE OR REPLACE VIEW allows new column to >> >> be added. But what's the worse is that, currently there is no way to >> >> drop the column from the view, except recreation of the view. >> >> Neither CREATE OR REPLACE VIEW nor ALTER TABLE support >> >> the drop of the column from the view. So, to fix the mistake, >> >> users would need to drop the view itself and recreate it. If there are >> >> some objects depending the view, they also might need to be recreated. >> >> This looks not good. Since the feature has been supported, >> >> it's too late to say that, though... >> >> >> >> At least, the support for ALTER VIEW DROP COLUMN might be >> >> necessary to alleviate that situation. >> >> >> > >> > - Is this intentional not implemented the "RENAME COLUMN" statement for >> > VIEW because it is implemented for Materialized View? >> >> Not sure that, but Tom's suggestion to support ALTER VIEW RENAME COLUMN >> sounds reasonable whether we support the rename of columns when replacing >> the view definition, or not. Attached is the patch that adds support for >> ALTER VIEW RENAME COLUMN command. >> >> > I have made just a similar >> > change to view and it works. >> >> Yeah, ISTM that we made the same patch at the same time. You changed >> gram.y, >> but I added the following changes additionally. >> >> - Update the doc >> - Add HINT message emit when CRAETE OR REPLACE VIEW fails to rename the >> columns >> - Update tab-complete.c >> - Add regression test >> >> > Oh, I just sent the patch to ask, good you do that in all the places. > > One issue I've not addressed yet is about the command tag of >> "ALTER VIEW RENAME COLUMN". Currently "ALTER TABLE" is returned as the tag >> like ALTER MATERIALIZED VIEW RENAME COLUMN, but ISTM that "ALTER VIEW" >> is better. I started the discussion about the command tag of >> "ALTER MATERIALIZED VIEW" at another thread. I will update the patch >> according >> to the result of that discussion. 
>> >> https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==p25wvub...@mail.gmail.com >> >> Attached patch contain small change for ALTER MATERIALIZED VIEW. > > Hmm, my small change of "ALTER MATERIALIZED VIEW" does not work in some cases need more work on that. > > >> Regards, >> >> -- >> Fujii Masao >> > > > -- > Ibrar Ahmed > -- Ibrar Ahmed
ECPG: proposal for new DECLARE STATEMENT
Dear hackers, As announced last month, I am proposing the new ECPG grammar, DECLARE STATEMENT, again. This had been committed once, but it was removed from PG12 because of some problems. In this mail, I want to report the problems the previous implementation had, propose a new solution, and attach a WIP patch. [Basic function, Grammar, and Use case] This statement is used to designate a connection easily. Please see below: https://www.postgresql.org/message-id/flat/4E72940DA2BF16479384A86D54D0988A4D80D3C9@G01JPEXMBKW04 Oracle's manual will also help your understanding: https://docs.oracle.com/en/database/oracle/oracle-database/19/lnpcc/embedded-SQL-statements-and-directives.html#GUID-0A30B7B4-BD91-42EA-AACE-2E9CBF7E9C1A [Issues] These are the reasons why this feature was reverted. 1. The namespace of the identifier was not clear. If you used the same identifier for other SQL statements, they interfered with each other, and statements might be executed on an unexpected connection. 2. Declaring outside of functions was not allowed. This specification is quite different from the other declarative statements, so some users might be confused. For instance, the following example was rejected. ``` EXEC SQL DECLARE stmt STATEMENT; int main() { ... EXEC SQL DECLARE cur CURSOR FOR stmt; ... } ``` 3. These specifications were not compatible with other DBMSs. [Solutions] The namespace is now file-scoped. This follows other DBMSs. When the DECLARE STATEMENT statement is read, the name, identifier, and the related connection are recorded. And if you use the declared identifier to prepare a statement or declare a cursor, the fourth argument of ECPGdo() (it represents the connection) will be overwritten. This declaration is effective only in the precompile phase. [Limitations] The declaration must appear before it is used. This also follows the Pro*C precompiler. A WIP patch is attached. I confirmed that all ECPG tests pass; however, documentation is not included yet. It will be added later. I applied pgindent as a test, but it might have failed because this is my first time using it. Best regards Hayato Kuroda FUJITSU LIMITED E-Mail:kuroda.hay...@fujitsu.com DeclareStmt01.patch Description: DeclareStmt01.patch
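For readers unfamiliar with the feature, a minimal hedged sketch of the intended usage; the connection name and host variables are made up, and the exact grammar is as defined by the patch:

    EXEC SQL AT con1 DECLARE stmt STATEMENT;

    int main(void)
    {
        /* ... connect to con1, set up :query ... */

        /* Because of the file-scoped declaration above, the precompiler
         * routes statements that use "stmt" to connection con1. */
        EXEC SQL PREPARE stmt FROM :query;
        EXEC SQL DECLARE cur CURSOR FOR stmt;
        /* ... */
    }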
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 5:28 PM Ibrar Ahmed wrote: > > > On Thu, Oct 31, 2019 at 5:11 PM Ibrar Ahmed wrote: > >> >> >> On Thu, Oct 31, 2019 at 5:01 PM Fujii Masao >> wrote: >> >>> On Thu, Oct 31, 2019 at 7:59 PM Ibrar Ahmed >>> wrote: >>> > >>> > >>> > >>> > On Thu, Oct 31, 2019 at 12:32 PM Fujii Masao >>> wrote: >>> >> >>> >> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote: >>> >> > >>> >> > Fujii Masao writes: >>> >> > > Currently CREATE OR REPLACE VIEW command fails if the column names >>> >> > > are changed. >>> >> > >>> >> > That is, I believe, intentional. It's an effective aid to catching >>> >> > mistakes in view redefinitions, such as misaligning the new set of >>> >> > columns relative to the old. That's particularly important given >>> >> > that we allow you to add columns during CREATE OR REPLACE VIEW. >>> >> > Consider the oversimplified case where you start with >>> >> > >>> >> > CREATE VIEW v AS SELECT 1 AS x, 2 AS y; >>> >> > >>> >> > and you want to add a column z, and you get sloppy and write >>> >> > >>> >> > CREATE OR REPLACE VIEW v AS SELECT 1 AS x, 3 AS z, 2 AS y; >>> >> > >>> >> > If we did not throw an error on this, references that formerly >>> >> > pointed to column y would now point to z (as that's still attnum 2), >>> >> > which is highly unlikely to be what you wanted. >>> >> >>> >> This example makes me wonder if the addtion of column by >>> >> CREATE OR REPLACE VIEW also has the same (or even worse) issue. >>> >> That is, it may increase the oppotunity for users' mistake. >>> >> I'm thinking the case where users mistakenly added new column >>> >> into the view when replacing the view definition. This mistake can >>> >> happen because CREATE OR REPLACE VIEW allows new column to >>> >> be added. But what's the worse is that, currently there is no way to >>> >> drop the column from the view, except recreation of the view. >>> >> Neither CREATE OR REPLACE VIEW nor ALTER TABLE support >>> >> the drop of the column from the view. So, to fix the mistake, >>> >> users would need to drop the view itself and recreate it. If there are >>> >> some objects depending the view, they also might need to be recreated. >>> >> This looks not good. Since the feature has been supported, >>> >> it's too late to say that, though... >>> >> >>> >> At least, the support for ALTER VIEW DROP COLUMN might be >>> >> necessary to alleviate that situation. >>> >> >>> > >>> > - Is this intentional not implemented the "RENAME COLUMN" statement for >>> > VIEW because it is implemented for Materialized View? >>> >>> Not sure that, but Tom's suggestion to support ALTER VIEW RENAME COLUMN >>> sounds reasonable whether we support the rename of columns when replacing >>> the view definition, or not. Attached is the patch that adds support for >>> ALTER VIEW RENAME COLUMN command. >>> >>> > I have made just a similar >>> > change to view and it works. >>> >>> Yeah, ISTM that we made the same patch at the same time. You changed >>> gram.y, >>> but I added the following changes additionally. >>> >>> - Update the doc >>> - Add HINT message emit when CRAETE OR REPLACE VIEW fails to rename the >>> columns >>> - Update tab-complete.c >>> - Add regression test >>> >>> >> Oh, I just sent the patch to ask, good you do that in all the places. >> >> One issue I've not addressed yet is about the command tag of >>> "ALTER VIEW RENAME COLUMN". Currently "ALTER TABLE" is returned as the >>> tag >>> like ALTER MATERIALIZED VIEW RENAME COLUMN, but ISTM that "ALTER VIEW" >>> is better. 
I started the discussion about the command tag of >>> "ALTER MATERIALIZED VIEW" at another thread. I will update the patch >>> according >>> to the result of that discussion. >>> >>> https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==p25wvub...@mail.gmail.com >>> >>> Attached patch contain small change for ALTER MATERIALIZED VIEW. >> >> > Hmm, my small change of "ALTER MATERIALIZED VIEW" does not work in some > cases need more work on that. > > The AlterObjectTypeCommandTag function takes just one parameter, but to show "ALTER MATERIALIZED VIEW" instead of ALTER TABLE we need to pass "relationType = OBJECT_MATVIEW" along with "renameType = OBJECT_COLUMN" and handle that in the function. The "AlterObjectTypeCommandTag" function has many callers. Do you think we should make all these changes just for the command tag? > >> >>> Regards, >>> >>> -- >>> Fujii Masao >>> >> >> >> -- >> Ibrar Ahmed >> > > > -- > Ibrar Ahmed > -- Ibrar Ahmed
Application name for pg_basebackup and friends
Hi, The attached patch adds an -a / --appname command line switch to pg_basebackup, pg_receivewal and pg_recvlogical. This is useful when f.ex. pg_receivewal needs to connect as a synchronous client (synchronous_standby_names), pg_receivewal -h myhost -p 5432 -S replica1 -a replica1 --synchronous -D /wal I'll add the patch to the CommitFest for discussion, as there is overlap with the -d switch. Best regards, Jesper >From 3aee659423137a547ed178a1dab34fe3caf30702 Mon Sep 17 00:00:00 2001 From: jesperpedersen Date: Thu, 31 Oct 2019 08:34:41 -0400 Subject: [PATCH] Add an -a / --appname command line switch to control the application_name property. Author: Jesper Pedersen --- doc/src/sgml/ref/pg_basebackup.sgml| 11 +++ doc/src/sgml/ref/pg_receivewal.sgml| 11 +++ doc/src/sgml/ref/pg_recvlogical.sgml | 11 +++ src/bin/pg_basebackup/pg_basebackup.c | 7 ++- src/bin/pg_basebackup/pg_receivewal.c | 7 ++- src/bin/pg_basebackup/pg_recvlogical.c | 7 ++- src/bin/pg_basebackup/streamutil.c | 7 +++ src/bin/pg_basebackup/streamutil.h | 1 + 8 files changed, 59 insertions(+), 3 deletions(-) diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml index fc9e222f8d..28b5dee206 100644 --- a/doc/src/sgml/ref/pg_basebackup.sgml +++ b/doc/src/sgml/ref/pg_basebackup.sgml @@ -543,6 +543,17 @@ PostgreSQL documentation The following command-line options control the database connection parameters. + + -a application_name + --appname=application_name + + +Specifies the application name used to connect to the server. +See for more information. + + + + -d connstr --dbname=connstr diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml index 177e9211c0..0957e0a9f5 100644 --- a/doc/src/sgml/ref/pg_receivewal.sgml +++ b/doc/src/sgml/ref/pg_receivewal.sgml @@ -248,6 +248,17 @@ PostgreSQL documentation The following command-line options control the database connection parameters. + + -a application_name + --appname=application_name + + +Specifies the application name used to connect to the server. +See for more information. + + + + -d connstr --dbname=connstr diff --git a/doc/src/sgml/ref/pg_recvlogical.sgml b/doc/src/sgml/ref/pg_recvlogical.sgml index 4c79f90414..b2d9b35362 100644 --- a/doc/src/sgml/ref/pg_recvlogical.sgml +++ b/doc/src/sgml/ref/pg_recvlogical.sgml @@ -272,6 +272,17 @@ PostgreSQL documentation The following command-line options control the database connection parameters. + + -a application_name + --appname=application_name + + + Specifies the application name used to connect to the server. + See for more information. 
+ + + + -d database --dbname=database diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index a9d162a7da..237945f879 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -351,6 +351,7 @@ usage(void) " do not verify checksums\n")); printf(_(" -?, --help show this help, then exit\n")); printf(_("\nConnection options:\n")); + printf(_(" -a, --appname=NAME application name\n")); printf(_(" -d, --dbname=CONNSTR connection string\n")); printf(_(" -h, --host=HOSTNAMEdatabase server host or socket directory\n")); printf(_(" -p, --port=PORTdatabase server port number\n")); @@ -2031,6 +2032,7 @@ main(int argc, char **argv) {"label", required_argument, NULL, 'l'}, {"no-clean", no_argument, NULL, 'n'}, {"no-sync", no_argument, NULL, 'N'}, + {"appname", required_argument, NULL, 'a'}, {"dbname", required_argument, NULL, 'd'}, {"host", required_argument, NULL, 'h'}, {"port", required_argument, NULL, 'p'}, @@ -2070,7 +2072,7 @@ main(int argc, char **argv) atexit(cleanup_directories_atexit); - while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP", + while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:a:d:c:h:p:U:s:wWkvP", long_options, &option_index)) != -1) { switch (c) @@ -2176,6 +2178,9 @@ main(int argc, char **argv) exit(1); } break; + case 'a': +application_name = pg_strdup(optarg); +break; case 'd': connection_string = pg_strdup(optarg); break; diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index c0c8747982..863dcdb161 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -93,6 +93,7 @@ usage(void) printf(_(" -Z, --compress=0-9 compress logs with given compression level\n")); printf(_(" -?, --help show this
Re: Application name for pg_basebackup and friends
On 31/10/2019 14:52, Jesper Pedersen wrote: Hi, The attached patch adds an -a / --appname command line switch to pg_basebackup, pg_receivewal and pg_recvlogical. This is useful when f.ex. pg_receivewal needs to connect as a synchronous client (synchronous_standby_names), pg_receivewal -h myhost -p 5432 -S replica1 -a replica1 --synchronous -D /wal I'll add the patch to the CommitFest for discussion, as there is overlap with the -d switch. You can already set the application name with an environment variable or in the database connection string: pg_receivewal -D /wal -d "host=myhost application_name=myreceiver" I don't think we need a new command line switch for it. - Heikki
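For completeness, the environment-variable form Heikki mentions is libpq's PGAPPNAME, e.g.:

    PGAPPNAME=replica1 pg_receivewal -D /wal -h myhost -S replica1 --synchronous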
Re: Collation versioning
Hello Thomas, On Tue, May 28, 2019 at 9:00 PM Thomas Munro wrote: > > Since there's a chance of an "unconference" session on locale versions > tomorrow at PGCon, here's a fresh rebase of the patchset to add > per-database-object collation version tracking. It doesn't handle > default collations yet (not hard AFAIK, will try that soon), but it > does work well enough to demonstrate the general principle. I won't > attach the CHECK support just yet, because it needs more work, but the > point of it was to demonstrate that pg_depend can handle this for all > kinds of database objects in one standard way, rather than sprinkling > collation version stuff all over the place in pg_index, pg_constraint, > etc, and I think it did that already. Are you planning to continue working on it? For the record, that's needed to be able to implement a filter in the REINDEX command [1]. I'm not sending a review since the code isn't finished yet, but one issue with the current approach is that the WARNING message recommending a REINDEX can be emitted even while running the required REINDEX, which is at best unhelpful: # update pg_depend set refobjversion = 'a' || refobjversion where refobjversion != ''; # reindex table t1; WARNING: 01000: index "t1_val_idx" depends on collation 13330 version "a153.97.35.8", but the current version is "153.97.35.8" DETAIL: The index may be corrupted due to changes in sort order. HINT: REINDEX to avoid the risk of corruption. LOCATION: index_check_collation_version, index.c:1263 [1]: https://www.postgresql.org/message-id/a81069b1-fdaa-ff40-436e-7840bd639ccf%402ndquadrant.com
Re: Remove configure --disable-float4-byval and --disable-float8-byval
Peter Eisentraut writes:
> float4 is now always pass-by-value; the pass-by-reference code path is
> completely removed.

I think this is OK.

> float8 and related types are now hardcoded to pass-by-value or
> pass-by-reference depending on whether the build is 64- or 32-bit, as
> was previously also the default.

I'm less happy with doing this.  It makes it impossible to test the
pass-by-reference code paths without actually firing up a 32-bit
environment.  It'd be fine to document --disable-float8-byval as a
developer-only option (it might be so already), but I don't want to
lose it completely.  I fail to see any advantage in getting rid of it,
anyway, since we do still have to maintain both code paths.

			regards, tom lane
Re: Allow CREATE OR REPLACE VIEW to rename the columns
Fujii Masao writes:
> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote:
>> Fujii Masao writes:
>>> Currently CREATE OR REPLACE VIEW command fails if the column names
>>> are changed.
>>
>> That is, I believe, intentional.  It's an effective aid to catching
>> mistakes in view redefinitions, such as misaligning the new set of
>> columns relative to the old.  [example]
>
> This example makes me wonder if the addition of a column by
> CREATE OR REPLACE VIEW also has the same (or even worse) issue.
> That is, it may increase the opportunity for users' mistakes.

The idea in CREATE OR REPLACE VIEW is to allow addition of new columns
at the end, the same as you can do with tables.  Checking the column
name matchups is a way to ensure that you actually do add at the end,
rather than insert, which wouldn't act as you expect.  Admittedly it's
only heuristic.

We could, perhaps, have insisted that adding a column also requires
special syntax, but we didn't.  Consider for example a case where the
new column needs to come from an additionally joined table; then you
have to be able to edit the underlying view definition along with
adding the column.  So that seems like kind of a pain in the neck to
insist on.

> But what's worse is that, currently there is no way to
> drop the column from the view, except recreation of the view.

I think this has been discussed, as well.  It's not necessarily simple
to drop a view output column.  For example, if the view uses SELECT
DISTINCT, removing an output column would have semantic effects on the
set of rows that can be returned, since distinct-ness would now mean
something else than it did before.

It's conceivable that we could enumerate all such effects and then
allow DROP COLUMN (probably replacing the output column with a null
constant) if none of them apply, but I can't get terribly excited about
it.  The field demand for such a feature has not been high.  I'd be a
bit worried about bugs arising from failures to check attisdropped for
views, too; so the cost of getting this working might be greater than
it seems on the surface.

			regards, tom lane
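To make the SELECT DISTINCT point concrete, a minimal sketch (schema and
data invented for illustration):

create table orders (customer text, product text);
insert into orders values ('alice', 'apples'), ('alice', 'pears');

create view v as select distinct customer, product from orders;
-- v returns two rows: ('alice', 'apples') and ('alice', 'pears')

-- If the product column could simply be dropped from v, the view would
-- effectively become "select distinct customer from orders", which
-- returns a single row: a different result set, not merely a narrower one.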
Re: The command tag of "ALTER MATERIALIZED VIEW RENAME COLUMN"
Fujii Masao writes:
> ... I found that the command tag of
> ALTER MATERIALIZED VIEW RENAME COLUMN is "ALTER TABLE", not "ALTER VIEW".
>
> =# ALTER MATERIALIZED VIEW hoge RENAME COLUMN j TO x;
> ALTER TABLE
>
> Is this intentional? Or bug?

Seems like an oversight.

			regards, tom lane
function calls optimization
Hi,

I have almost finished a patch optimizing non-volatile function calls.

select f(t.n) from t where f(t.n) > 10 and f(t.n) < 100;

needs 3 calls of f() for each tuple; after applying the patch, only 1.

Any pros and cons?
Re: function calls optimization
Hi,

On October 31, 2019 7:06:13 AM PDT, Andrzej Barszcz wrote:
> Hi
>
> I almost finished patch optimizing non volatile function calls.
>
> select f(t.n) from t where f(t.n) > 10 and f(t.n) < 100; needs 3 calls
> of f() for each tuple,
> after applying patch only 1.
>
> Any pros and cons ?

Depends on the actual way of implementing this proposal. Think we need
more details than what you have here.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Fri, Sep 6, 2019 at 3:34 PM Smith, Peter wrote:
>
> -----Original Message-----
> From: Masahiko Sawada  Sent: Thursday, 15 August 2019 7:10 PM
>
> > BTW I've created PoC patch for cluster encryption feature. Attached
> > patch set has done some items of TODO list and some of them can be
> > used even for finer granularity encryption. Anyway, the implemented
> > components are followings:
>
> Hello Sawada-san,
>
> I guess your original patch code may be getting a bit out-dated by the
> ongoing TDE discussions, but I have done some code review of it anyway.
>
> Hopefully a few comments below can still be of use going forward:
>
> ---
>
> REVIEW COMMENTS
>
> * src/backend/storage/encryption/enc_cipher.c - For functions
> EncryptionCipherValue/String maybe should log warnings for unexpected
> values instead of silently assigning to default 0/"off".
>
> * src/backend/storage/encryption/enc_cipher.c - For function
> EncryptionCipherString, the purpose of returning "unknown" is unclear,
> because that will map back to "off" again anyway via
> EncryptionCipherValue. Why not just return "off" (with warning logged).
>
> * src/include/storage/enc_common.h - Typo in comment: "Encrypton".
>
> * src/include/storage/encryption.h - The macro DataEncryptionEnabled
> may be better to be using enum TDE_ENCRYPTION_OFF instead of magic
> number 0.
>
> * src/backend/storage/encryption/kmgr.c - Function BootStrapKmgr will
> report error if USE_OPENSSL is not defined. The check seems premature
> because it would fail even if the user is not using encryption.
> Shouldn't the lack of openssl be OK when the user is not using TDE at
> all (i.e. when encryption is "none")?
>
> * src/backend/storage/encryption/kmgr.c - In function BootStrapMgr
> suggest better to check if (bootstrap_data_encryption_cipher ==
> TDE_ENCRYPTION_OFF) using the enum instead of the magic number 0.
>
> * src/backend/storage/encryption/kmgr.c - The
> run_cluster_passphrase_command function seems mostly a clone of the
> existing run_ssl_passphrase_command function. Is it possible to
> refactor to share the common code?
>
> * src/backend/storage/encryption/kmgr.c - The function
> derive_encryption_key declares a char key_len. Why char? It seems int
> everywhere else.
>
> * src/backend/bootstrap/bootstrap.c - Suggest better if variable
> declaration bootstrap_data_encryption_cipher = 0 uses enum
> TDE_ENCRYPTION_OFF instead of magic number 0.
>
> * src/backend/utils/misc/guc.c - It looks like the default value for
> GUC variable data_encryption_cipher is AES128. Wouldn't "off" be the
> more appropriate default value? Otherwise it seems inconsistent with
> the logic of initdb (which insists that the -e option is mandatory if
> you wanted any encryption).
>
> * src/backend/utils/misc/guc.c - There is a missing entry in the
> config_group_names[]. The patch changed the config_group[] in
> guc_tables.h, so I think there needs to be a matching item in the
> config_group_names.
>
> * src/bin/initdb/initdb.c - The function check_encryption_cipher would
> disallow an encryption value of "none". Although maybe it is not very
> useful to say -e none, it does seem inconsistent to reject it, given
> that "none" was a valid value for the GUC variable
> data_encryption_cipher.
>
> * contrib/bloom/blinsert.c - In function btbuildempty the arguments
> for PageEncryptionInPlace seem in the wrong order (forknum should be
> 2nd).
>
> * src/backend/access/hash/hashpage.c - In function _hash_alloc_buckets
> the arguments for PageEncryptionInPlace seem in the wrong order
> (forknum should be 2nd).
>
> * src/backend/access/spgist/spginsert.c - In function spgbuildempty
> the arguments for PageEncryptionInPlace seem in the wrong order
> (forknum should be 2nd). This error looks repeated 3X.
>
> * in multiple files - The encryption enums have equivalent strings
> ("off", "aes-128", "aes-256") but they are scattered as string
> literals in many places (e.g. pg_controldata.c, initdb.c, guc.c,
> enc_cipher.c). Suggest it would be better to declare those as string
> constants in one place only.

Thank you for reviewing this patch.

I've updated the TDE patches. These patches implement the key system,
buffer encryption and WAL encryption. Please refer to the ToDo of
cluster-wide encryption for more details of the design and components.
It lacks temporary file encryption and frontend tools encryption.

For temporary file encryption, we are discussing which files should be
encrypted on another thread, and I thought that temporary file
encryption might be related to that. So I'm currently studying the
temporary encryption patch that Antonin already submitted [1], but some
changes might be needed based on that discussion.

For frontend tool support, Shawn will share his patch that is built on
my patch.

I haven't changed the usage of this feature. Please refer to the
email [2] for how to set up an encrypted database cluster.

[1] https://www.postgresql.org/message-id/7082.1562337694%40localh
Re: function calls optimization
Andres Freund writes:
> On October 31, 2019 7:06:13 AM PDT, Andrzej Barszcz wrote:
>> I almost finished patch optimizing non volatile function calls.
>>
>> select f(t.n) from t where f(t.n) > 10 and f(t.n) < 100; needs 3 calls
>> of f() for each tuple,
>> after applying patch only 1.
>>
>> Any pros and cons ?

> Depends on the actual way of implementing this proposal. Think we need
> more details than what you have here.

We've typically supposed that the cost of searching for duplicate
subexpressions would outweigh the benefits of sometimes finding them.

			regards, tom lane
Re: function calls optimization
Hi,

On October 31, 2019 7:45:26 AM PDT, Tom Lane wrote:
> Andres Freund writes:
>> On October 31, 2019 7:06:13 AM PDT, Andrzej Barszcz wrote:
>>> I almost finished patch optimizing non volatile function calls.
>>>
>>> select f(t.n) from t where f(t.n) > 10 and f(t.n) < 100; needs 3
>>> calls of f() for each tuple,
>>> after applying patch only 1.
>>>
>>> Any pros and cons ?
>
>> Depends on the actual way of implementing this proposal. Think we
>> need more details than what you have here.
>
> We've typically supposed that the cost of searching for duplicate
> subexpressions would outweigh the benefits of sometimes finding them.

Based on profiles I've seen, I'm not sure that's the right choice. Both
for when the calls are expensive (say PostGIS stuff), and for when a lot
of rows are processed.

I think one part of doing this in a realistic manner is an efficient
search for redundant expressions. The other, also non-trivial, part is
how to even represent references to the results of expressions in other
parts of the expression tree / other expressions.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: pgbench - extend initialization phase control
Hello Masao-san,

>> +	snprintf(sql, sizeof(sql),
>> +			 "insert into pgbench_branches(bid,bbalance) "
>> +			 "select bid, 0 "
>> +			 "from generate_series(1, %d) as bid", scale);
>
> "scale" should be "nbranches * scale".

Yep, even if nbranches is 1, it should be there.

>> +	snprintf(sql, sizeof(sql),
>> +			 "insert into pgbench_accounts(aid,bid,abalance,filler) "
>> +			 "select aid, (aid - 1) / %d + 1, 0, '' "
>> +			 "from generate_series(1, %d) as aid", naccounts, scale * naccounts);
>
> Like client-side data generation, INT64_FORMAT should be used here
> instead of %d?

Indeed.

> If a large scale factor is specified, the query for generating
> pgbench_accounts data can take a very long time. While that query is
> running, operators may be likely to do Ctrl-C to cancel the data
> generation. In this case, IMO pgbench should cancel the query, i.e.,
> call PQcancel(). Otherwise, the query will keep running to the end.

Hmmm. Why not. Now the infra to do that already seems to exist twice,
once in "src/bin/psql/common.c" and once in "src/bin/scripts/common.c".
I cannot say I'm thrilled to replicate this once more. I think that the
reasonable option is to share this in fe-utils and then to reuse it from
there. However, ISTM that such a restructuring patch does not belong to
this feature.

> -	for (step = initialize_steps; *step != '\0'; step++)
> +	for (const char *step = initialize_steps; *step != '\0'; step++)

Per PostgreSQL basic coding style, C99 (20 years ago) is now the norm,
and this style is now allowed; there are over a hundred instances of it
already. I tend to use it where appropriate.

> -	fprintf(stderr, "unrecognized initialization step \"%c\"\n",
> +	fprintf(stderr,
> +			"unrecognized initialization step \"%c\"\n"
> +			"Allowed step characters are: \"" ALL_INIT_STEPS "\".\n",
> 			*step);
> -	fprintf(stderr, "allowed steps are: \"d\", \"t\", \"g\", \"v\", \"p\", \"f\"\n");
>
> The original message seems better to me. So what about just appending
> "G" into the above latter message? That is,
> "allowed steps are: \"d\", \"t\", \"g\", \"G\", \"v\", \"p\", \"f\"\n"

I needed this list in several places, so it makes sense to share the
definition, and frankly the list of half a dozen comma-separated chars
does not strike me as much better than just giving the allowed chars
directly. So the simpler the better, from my point of view.

> Isn't it better to explain a bit more what "client-side / server-side
> data generation" is? For example, something like

Ok.

Attached v7 does most of the above, except the list-of-chars message and
the signal handling. The first does not look really better to me, and
the second belongs to a restructuring patch that I'll try to submit.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index e3a0abb4c7..7be9c81c43 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -193,12 +193,25 @@ pgbench options d
-        g (Generate data)
+        g or G (Generate data, client-side or server-side)
         Generate data and load it into the standard tables,
         replacing any data already present.
+
+        With g (client-side data generation),
+        data is generated in pgbench client and then
+        sent to the server. This uses the client/server bandwidth
+        extensively through a COPY.
+
+        With G (server-side data generation),
+        only limited queries are sent from pgbench
+        client and then data is actually generated in the server.
+        No significant bandwidth is required for this variant, but
+        the server will do more work.
+
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 03bcd22996..7f5c1d00c8 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -131,8 +131,14 @@ static int pthread_join(pthread_t th, void **thread_return);
 /*
  * some configurable parameters
  */
+/*
+ * we do all data generation in one transaction to enable the backend's
+ * data-loading optimizations
+ */
 #define DEFAULT_INIT_STEPS "dtgvp"	/* default -I setting */
+#define ALL_INIT_STEPS "dtgGvpf"	/* all possible steps */
+
 #define LOG_STEP_SECONDS 5	/* seconds between log messages */
 #define DEFAULT_NXACTS 10	/* default nxacts */
@@ -627,7 +633,7 @@ usage(void)
 		   " %s [OPTION]... [DBNAME]\n"
 		   "\nInitialization options:\n"
 		   "  -i, --initialize         invokes initialization mode\n"
-		   "  -I, --init-steps=[dtgvpf]+ (default \"dtgvp\")\n"
+		   "  -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
 		   "                           run selected initialization steps\n"
 		   "  -F, --fillfactor=NUM     set fill factor\n"
 		   "  -n, --no-vacuum          do not run VACUUM du
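For readers skimming the thread, the server-side variant boils down to a
few INSERT ... SELECT statements over generate_series(), roughly like
this (a simplified sketch of the queries quoted in the review above; the
real patch substitutes the scale and naccounts values at run time):

insert into pgbench_branches(bid, bbalance)
  select bid, 0
  from generate_series(1, 10) as bid;            -- 10 = nbranches * scale

insert into pgbench_accounts(aid, bid, abalance, filler)
  select aid, (aid - 1) / 100000 + 1, 0, ''
  from generate_series(1, 10 * 100000) as aid;   -- 100000 accounts per branch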
Re: [BUG] Partition creation fails after dropping a column and adding a partial index
On Thu, Oct 31, 2019 at 9:45 AM Michael Paquier wrote:
> On Tue, Oct 29, 2019 at 01:16:58PM +0900, Michael Paquier wrote:
>> Yes, something looks wrong with that. I have not looked at it in
>> details yet though. I'll see about that tomorrow.
>
> So.. When building the attribute map for a cloned index (with either
> LIKE during the transformation or for partition indexes), we store
> each attribute number with 0 used for dropped columns. Unfortunately,
> if you look at the way the attribute map is built in this case, the
> code correctly generates the mapping with convert_tuples_by_name_map.
> But the length of the mapping used is incorrect, as this makes use of
> the number of attributes for the newly-created child relation, and not
> the parent, which includes the dropped column in its count. So the
> answer is simply to use the parent as reference for the mapping
> length.
>
> The patch is rather simple, as per the attached, with extended
> regression tests included. I have not checked on back-branches yet,
> but that's visibly wrong since 8b08f7d, down to v11 (will do that when
> back-patching).
>
> There could be a point in changing convert_tuples_by_name_map & co so
> that they return the length of the map on top of the map itself, to
> avoid such mistakes in the future. That's a more invasive patch, not
> really adapted for a back-patch, but we could do that on HEAD once
> this bug is fixed. I have also checked other calls of this API and
> the handling is done correctly.
>
> The patch works for me on master and on 12.

I have rebased the patch for version 11.

> Wyatt, what do you think?
> --
> Michael

--
Ibrar Ahmed

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b53f6ed3ac..9a1ef78a1f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -961,7 +961,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 										gettext_noop("could not convert row type"));
 			idxstmt =
 				generateClonedIndexStmt(NULL, RelationGetRelid(rel), idxRel,
-										attmap, RelationGetDescr(rel)->natts,
+										attmap, RelationGetDescr(parent)->natts,
 										&constraintOid);
 			DefineIndex(RelationGetRelid(rel),
 						idxstmt,
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index c25117e47c..d20ba3dbdf 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -1006,3 +1006,52 @@ CONTEXT:  SQL statement "create table tab_part_create_1 partition of tab_part_cr
 PL/pgSQL function func_part_create() line 3 at EXECUTE
 drop table tab_part_create;
 drop function func_part_create();
+-- tests of column drop with partition tables and indexes using
+-- predicates and expressions.
+create table part_column_drop (
+  useless_1 int,
+  id int,
+  useless_2 int,
+  d int,
+  b int,
+  useless_3 int
+) partition by range (id);
+alter table part_column_drop drop column useless_1;
+alter table part_column_drop drop column useless_2;
+alter table part_column_drop drop column useless_3;
+create index part_column_drop_b_pred on part_column_drop(b) where b = 1;
+create index part_column_drop_b_expr on part_column_drop((b = 1));
+create index part_column_drop_d_pred on part_column_drop(d) where d = 2;
+create index part_column_drop_d_expr on part_column_drop((d = 2));
+create table part_column_drop_1_10 partition of
+  part_column_drop for values from (1) to (10);
+\d part_column_drop
+        Table "public.part_column_drop"
+ Column |  Type   | Collation | Nullable | Default
+--------+---------+-----------+----------+---------
+ id     | integer |           |          |
+ d      | integer |           |          |
+ b      | integer |           |          |
+Partition key: RANGE (id)
+Indexes:
+    "part_column_drop_b_expr" btree ((b = 1))
+    "part_column_drop_b_pred" btree (b) WHERE b = 1
+    "part_column_drop_d_expr" btree ((d = 2))
+    "part_column_drop_d_pred" btree (d) WHERE d = 2
+Number of partitions: 1 (Use \d+ to list them.)
+
+\d part_column_drop_1_10
+      Table "public.part_column_drop_1_10"
+ Column |  Type   | Collation | Nullable | Default
+--------+---------+-----------+----------+---------
+ id     | integer |           |          |
+ d      | integer |           |          |
+ b      | integer |           |          |
+Partition of: part_column_drop FOR VALUES FROM (1) TO (10)
+Indexes:
+    "part_column_drop_1_10_b_idx" btree (b) WHERE b = 1
+    "part_column_drop_1_10_d_idx" btree (d) WHERE d = 2
+    "part_column_drop_1_10_expr_idx" btree ((b = 1))
+    "part_column_drop_1_10_expr_idx1" btree ((d = 2))
+
+drop table part_column_drop;
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 27e6f59e87..601a4b9dc3 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -814,3 +814,26 @@ create trigger trig_part_create before
Re: function calls optimization
Hi,

On October 31, 2019 7:53:20 AM PDT, Andres Freund wrote:
> On October 31, 2019 7:45:26 AM PDT, Tom Lane wrote:
>> We've typically supposed that the cost of searching for duplicate
>> subexpressions would outweigh the benefits of sometimes finding them.
>
> Based on profiles I've seen I'm not sure that's the right choice. Both
> for when the calls are expensive (say PostGIS stuff), and for when a
> lot of rows are processed.
>
> I think one part of doing this in a realistic manner is an efficient
> search for redundant expressions.

One way to improve the situation - which is applicable in a lot of
situations, e.g. setrefs.c etc. - would be to compute hashes for
(sub-)expression trees. That makes it a lot easier to bail out early
when trees are not the same, and also easier to build an efficient way
to find redundant expressions.

There's some complexity in invalidating such hashes once computed, and
in when to first compute them, obviously.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: function calls optimization
On 10/31/19 3:45 PM, Tom Lane wrote:
> Andres Freund writes:
>> On October 31, 2019 7:06:13 AM PDT, Andrzej Barszcz wrote:
>>> Any pros and cons ?
>>
>> Depends on the actual way of implementing this proposal. Think we
>> need more details than what you have here.
>
> We've typically supposed that the cost of searching for duplicate
> subexpressions would outweigh the benefits of sometimes finding them.

That is an important concern, but given how SQL does not make it
convenient to re-use partial results of calculations, I think such
queries are quite common in real-world workloads. So if we can make it
cheap enough, I think that it is a worthwhile optimization.

Andreas
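For context, the usual manual workaround today is to materialize the
call once, e.g. with a CTE - a sketch using the f(t.n) example from
upthread (before v12 a plain CTE is always materialized; from v12 on the
MATERIALIZED keyword forces it):

with fx as materialized (
    select n, f(n) as fn from t
)
select fn
from fx
where fn > 10 and fn < 100;   -- f() is now evaluated once per row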
Re: function calls optimization
Andres Freund writes:
> On October 31, 2019 7:45:26 AM PDT, Tom Lane wrote:
>> We've typically supposed that the cost of searching for duplicate
>> subexpressions would outweigh the benefits of sometimes finding them.

> Based on profiles I've seen I'm not sure that's the right choice. Both
> for when the calls are expensive (say PostGIS stuff), and for when a
> lot of rows are processed.

Yeah, if your mental model of a function call is some remarkably
expensive PostGIS geometry manipulation, it's easy to justify doing a
lot of work to try to detect duplicates.  But most functions in most
queries are more like int4pl or btint8cmp, and it's going to be
extremely remarkable if you can make back the planner costs of checking
for duplicate usages of those.

Possibly this could be finessed by only trying to find duplicates of
functions that have high cost estimates.  Not sure how high is high
enough.

> I think one part of doing this in a realistic manner is an efficient
> search for redundant expressions. The other, also non-trivial, part is
> how to even represent references to the results of expressions in
> other parts of the expression tree / other expressions.

Yup, both of those would be critical to do right.

			regards, tom lane
Re: function calls optimization
Hi,

On October 31, 2019 8:06:50 AM PDT, Tom Lane wrote:
> Andres Freund writes:
>> Based on profiles I've seen I'm not sure that's the right choice.
>> Both for when the calls are expensive (say PostGIS stuff), and for
>> when a lot of rows are processed.
>
> Yeah, if your mental model of a function call is some remarkably
> expensive PostGIS geometry manipulation, it's easy to justify doing a
> lot of work to try to detect duplicates.  But most functions in most
> queries are more like int4pl or btint8cmp, and it's going to be
> extremely remarkable if you can make back the planner costs of
> checking for duplicate usages of those.

Well, if it's an expression containing those individually cheap calls on
a seqscan on a large table below an aggregate, it'd likely still be a
win. But we don't, to my knowledge, really have a good way to model
optimizations like this that should only be done if the expression is
either expensive or has a high loop count.

I guess one ugly way to deal with this would be to eliminate
redundancies very late, e.g. during setrefs (where a better data
structure for matching expressions would be good anyway), as we already
know all the costs. But ideally we would want to be able to take such
savings into account earlier, when considering different paths. I
suspect that it might be a good enough vehicle to tackle the rest of the
work, however, at least initially.

We could also "just" do such elimination during expression
"compilation", but it'd be better to not have to do something as
complicated as this for every execution of a prepared statement.

>> I think one part of doing this in a realistic manner is an efficient
>> search for redundant expressions. The other, also non-trivial, part
>> is how to even represent references to the results of expressions in
>> other parts of the expression tree / other expressions.
>
> Yup, both of those would be critical to do right.

Potentially related note: for nodes like seqscan, combining the qual and
projection processing into one expression seems to be a noticeable win
(at least when taking care to emit two different sets of deform
expression steps). I wonder if something like that would take care of
avoiding the need for cross-expression value passing in enough places.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Resume vacuum and autovacuum from interruption and cancellation
On Thu, Sep 26, 2019 at 1:53 AM Alvaro Herrera wrote:
> Apparently this patch now has a duplicate OID.  Please do use random
> OIDs >8000 as suggested by the unused_oids script.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I have updated the patch using OIDs > 8000.

--
Ibrar Ahmed

v4-0001-Add-RESUME-option-to-VACUUM-and-autovacuum.patch
Description: Binary data
ssl passphrase callback
This is the first of a number of patches to enhance SSL functionality,
particularly w.r.t. passphrases.

This patch provides a hook for a function that can supply an SSL
passphrase. The hook can be filled in by a shared preloadable module. In
order for that to be effective, the startup order is modified slightly.
There is a test attached that builds and uses one trivial
implementation, which just takes a configuration setting and rot13's it
before supplying the result as the passphrase.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 629919cc6e..bf4493c94a 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -45,6 +45,9 @@
 #include "utils/memutils.h"
 
+ssl_passphrase_func_cb ssl_passphrase_function = NULL;
+bool ssl_passphrase_function_supports_reload = false;
+
 static int	my_sock_read(BIO *h, char *buf, int size);
 static int	my_sock_write(BIO *h, const char *buf, int size);
 static BIO_METHOD *my_BIO_s_socket(void);
@@ -124,12 +127,16 @@ be_tls_init(bool isServerStart)
 	 */
 	if (isServerStart)
 	{
-		if (ssl_passphrase_command[0])
+		if (ssl_passphrase_function)
+			SSL_CTX_set_default_passwd_cb(context, ssl_passphrase_function);
+		else if (ssl_passphrase_command[0])
 			SSL_CTX_set_default_passwd_cb(context, ssl_external_passwd_cb);
 	}
 	else
 	{
-		if (ssl_passphrase_command[0] && ssl_passphrase_command_supports_reload)
+		if (ssl_passphrase_function && ssl_passphrase_function_supports_reload)
+			SSL_CTX_set_default_passwd_cb(context, ssl_passphrase_function);
+		else if (ssl_passphrase_command[0] && ssl_passphrase_command_supports_reload)
 			SSL_CTX_set_default_passwd_cb(context, ssl_external_passwd_cb);
 		else
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5f30359165..18dd8578d9 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -972,17 +972,6 @@ PostmasterMain(int argc, char *argv[])
 	 */
 	LocalProcessControlFile(false);
 
-	/*
-	 * Initialize SSL library, if specified.
-	 */
-#ifdef USE_SSL
-	if (EnableSSL)
-	{
-		(void) secure_initialize(true);
-		LoadedSSL = true;
-	}
-#endif
-
 	/*
 	 * Register the apply launcher.  Since it registers a background worker,
 	 * it needs to be called before InitializeMaxBackends(), and it's probably
@@ -996,6 +985,17 @@ PostmasterMain(int argc, char *argv[])
 	 */
 	process_shared_preload_libraries();
 
+	/*
+	 * Initialize SSL library, if specified.
+	 */
+#ifdef USE_SSL
+	if (EnableSSL)
+	{
+		(void) secure_initialize(true);
+		LoadedSSL = true;
+	}
+#endif
+
 	/*
 	 * Now that loadable modules have had their chance to register background
 	 * workers, calculate MaxBackends.
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 541f970f99..dd2b5fc6c8 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -287,6 +287,17 @@ extern void be_tls_get_peer_serial(Port *port, char *ptr, size_t len);
 extern char *be_tls_get_certificate_hash(Port *port, size_t *len);
 #endif
 
+/*
+ * ssl_passphrase_function can be filled in by a shared preloaded module
+ * to supply a passphrase for a key file, as can the flag noting whether the
+ * function supports reloading.
+ */
+
+typedef int (*ssl_passphrase_func_cb) (char *buf, int size, int rwflag,
+									   void *userdata);
+extern ssl_passphrase_func_cb ssl_passphrase_function;
+extern bool ssl_passphrase_function_supports_reload;
+
 #endif							/* USE_SSL */
 
 #ifdef ENABLE_GSS
diff --git a/src/test/modules/ssl_passphrase_callback/.gitignore b/src/test/modules/ssl_passphrase_callback/.gitignore
new file mode 100644
index 00..1dbadf7baf
--- /dev/null
+++ b/src/test/modules/ssl_passphrase_callback/.gitignore
@@ -0,0 +1 @@
+tmp_check
diff --git a/src/test/modules/ssl_passphrase_callback/Makefile b/src/test/modules/ssl_passphrase_callback/Makefile
new file mode 100644
index 00..876ede2f21
--- /dev/null
+++ b/src/test/modules/ssl_passphrase_callback/Makefile
@@ -0,0 +1,22 @@
+# ssl_passphrase Makefile
+
+MODULE_big = ssl_passphrase_func
+OBJS = ssl_passphrase_func.o $(WIN32RES)
+PGFILEDESC = "callback function to provide a passphrase"
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/ssl_passphrase_callback
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check: prove-check
+
+prove-check: | temp-install
+	@echo running prove ...
+	$(prove_check)
diff --git a/src/test/modules/ssl_passphrase_callback/ssl_passphrase_func.c b/src/test/modules/ssl_passphrase_callback/ssl_passphrase_func.c
new file mode 100644
index 00..e3e433b6f0
--- /dev/null
+++ b/
Re: function calls optimization
Andres Freund writes:
> Potentially related note: for nodes like seqscan, combining the qual
> and projection processing into one expression seems to be a noticeable
> win (at least when taking care to emit two different sets of deform
> expression steps).

There's just one problem: that violates SQL semantics, and not in
a small way.

	select 1/x from tab where x <> 0

			regards, tom lane
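A minimal sketch of the hazard (table invented for illustration): if the
projection 1/x were evaluated together with - or before - the qual, the
x = 0 row would raise a division-by-zero error that the standard says
must not happen:

create table tab (x int);
insert into tab values (0), (5);

-- must return one row and no error: the qual has to filter out the
-- x = 0 row before 1/x is ever computed for it
select 1/x from tab where x <> 0;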
Re: function calls optimization
Hi,

On October 31, 2019 8:45:26 AM PDT, Tom Lane wrote:
> Andres Freund writes:
>> Potentially related note: for nodes like seqscan, combining the qual
>> and projection processing into one expression seems to be a
>> noticeable win (at least when taking care to emit two different sets
>> of deform expression steps).
>
> There's just one problem: that violates SQL semantics, and not in
> a small way.
>
> 	select 1/x from tab where x <> 0

The expression would obviously have to return early, before projecting,
when not matching the qual? I'm basically just thinking of first
executing the steps for the qual, and in the success case executing the
projection steps before returning the qual's positive result.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: function calls optimization
x <> 0 is evaluated first, 1/x only when x <> 0, isn't it?

On Thu, Oct 31, 2019 at 16:45 Tom Lane wrote:
> Andres Freund writes:
>> Potentially related note: for nodes like seqscan, combining the qual
>> and projection processing into one expression seems to be a
>> noticeable win (at least when taking care to emit two different sets
>> of deform expression steps).
>
> There's just one problem: that violates SQL semantics, and not in
> a small way.
>
> 	select 1/x from tab where x <> 0
>
> 			regards, tom lane
Adding percentile metrics to pg_stat_statements module
Hi everyone,

I was taking a look at the pg_stat_statements module and noticed that it
does not collect any percentile metrics. I believe that it would be
really handy to have those available, and I'd love to contribute this
feature.

The basic idea is to accumulate the query execution times using an
approximation structure like q-digest or t-digest and add those results
to the pg_stat_statements view as fixed columns. Something like this:

p90_time:
p95_time:
p99_time:
p70_time:
...

Another solution is to persist the digest structure in a binary column
and use a function to extract the desired quantile, like this:

SELECT approx_quantile(digest_times, 0.99) FROM pg_stat_statements

What do you guys think?

Cheers,
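For comparison, exact percentiles are easy if every sample is kept
around - the point of a digest structure is precisely to avoid that
storage and memory cost. A sketch of the exact approach, with a
hypothetical capture table invented for illustration:

-- hypothetical: one row logged per query execution
create table query_exec_times (queryid bigint, exec_time_ms float8);

select percentile_cont(array[0.90, 0.95, 0.99])
       within group (order by exec_time_ms)
from query_exec_times
where queryid = 42;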
Re: function calls optimization
Hi,

On October 31, 2019 8:51:11 AM PDT, Andrzej Barszcz wrote:
> x <> 0 is evaluated first, 1/x only when x <> 0, isn't it?
>
> On Thu, Oct 31, 2019 at 16:45 Tom Lane wrote:
>> There's just one problem: that violates SQL semantics, and not in
>> a small way.
>>
>> 	select 1/x from tab where x <> 0

On the postgres lists the policy is to reply below the quoted bit, and
to trim the quote appropriately.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Allow cluster_name in log_line_prefix
Hi,

On 2019-10-28 12:33:00 +0800, Craig Ringer wrote:
> I was recently surprised to notice that log_line_prefix doesn't
> support a cluster_name placeholder. I suggest adding one. If I don't
> hear objections I'll send a patch.
>
> Before anyone asks "but why?!":
>
> * A constant (short) string in log_line_prefix is immensely useful
> when working with logs from multi-node systems. Whether that's
> physical streaming replication, logical replication, Citus, whatever,
> it doesn't matter. It's worth paying the small storage price for
> sanity when looking at logs.
>
> * Yes, you can embed it directly into log_line_prefix. But then it
> gets copied by pg_basebackup or whatever you're using to clone
> standbys etc., so you can easily land up with multiple instances
> reporting the same name. This rather defeats the purpose.

+1. For a while this was part of the patch that added cluster_name
(possibly worthwhile digging it up from that thread), but some people
thought it was unnecessary, so it was excised from the patch to get the
basic feature...

Greetings,

Andres Freund
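A sketch of the status quo Craig describes, shown via ALTER SYSTEM for
concreteness (there is no %-escape for cluster_name today, so the name
must be duplicated into the prefix by hand, and the prefix then travels
with the configuration when a standby is cloned):

alter system set cluster_name = 'node1';
-- the name has to be repeated literally:
alter system set log_line_prefix = 'node1 %m [%p] ';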
Re: Adding percentile metrics to pg_stat_statements module
On Thu, Oct 31, 2019 at 16:51, Igor Calabria wrote:
> Hi everyone,
>
> I was taking a look at the pg_stat_statements module and noticed that
> it does not collect any percentile metrics. I believe that it would be
> really handy to have those available, and I'd love to contribute this
> feature.
>
> The basic idea is to accumulate the query execution times using an
> approximation structure like q-digest or t-digest and add those
> results to the pg_stat_statements view as fixed columns. Something
> like this:
>
> p90_time:
> p95_time:
> p99_time:
> p70_time:
> ...
>
> Another solution is to persist the digest structure in a binary column
> and use a function to extract the desired quantile, like this:
> SELECT approx_quantile(digest_times, 0.99) FROM pg_stat_statements
>
> What do you guys think?
>
> Cheers,

+1 to the idea. But I am not sure about the performance and memory
overhead.

Pavel
idea - proposal - defining own psql commands
Hi,

For a long time we have been thinking about how to allow adding custom
commands to psql. I had the following idea.

1. psql can have a special buffer for custom queries. This buffer can be
filled by a special command \gdefcq. This command will have two
parameters - a name and the number of arguments. Something like:

select * from pg_class where relname = :'_first'
\gdefcq m1 1

select * from pg_class where relnamespace = :_first::regnamespace and relname = :'_second'
\gdefcq m1 2

The custom queries can then be executed via a doubled backslash:

\\m1 pg_proc
\\m1 pg_catalog pg_proc

The runtime will count the number of parameters and choose the variant
with the selected name and the same number of arguments. Next, it saves
the parameters to variables like _first, _second. The last step is query
execution.

What do you think about this?

Regards,

Pavel
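For reference, the existing building block this would extend is psql's
variable interpolation - you can already parameterize ad-hoc SQL with
\set, but there is no way to bind a stored query to fresh parameters at
each invocation, which is the gap the proposal fills. A sketch of what
works in today's psql:

\set _first pg_proc
select * from pg_class where relname = :'_first';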
Re: RFC: split OBJS lines to one object per line
On 10/29/19 11:32 PM, Andres Freund wrote:
> Hi,
>
> On 2019-10-29 16:31:11 -0400, Tom Lane wrote:
>> Andres Freund writes:
>>> one of the most frequent conflicts I see is that two patches add
>>> files to OBJS (or one of its other spellings), and there are
>>> conflicts because another file has been added. ... Now, obviously
>>> these types of conflicts are easy enough to resolve, but it's still
>>> annoying.
>>> It seems that this would be substantially less often a problem if we
>>> just split such lines to one file per line. We did something similar
>>> not too long ago in configure.in (bfa6c5a0c), and it seems to have
>>> helped.
>>
>> +1
>
> Cool. Any opinion on whether to go for
>
> OBJS = \
> 	dest.o \
> 	fastpath.o \
> 	...
>
> or
>
> OBJS = dest.o \
> 	fastpath.o \
> 	...
>
> I'm mildly inclined to go for the former.

+1 for the former.

> Greetings,
>
> Andres Freund
TestLib::command_fails_like enhancement
This small patch, authored by my colleague Craig Ringer, enhances
TestLib's command_fails_like by allowing the passing of extra keyword
type arguments. The keyword initially recognized is 'extra_ipcrun_opts'.
The value for this keyword needs to be an array, and is passed through
to the call to IPC::Run.

Some patches I will be submitting shortly rely on this enhancement.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 905d0d178f..5264111d00 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -771,10 +771,16 @@ the given regular expression.
 sub command_fails_like
 {
 	local $Test::Builder::Level = $Test::Builder::Level + 1;
-	my ($cmd, $expected_stderr, $test_name) = @_;
+	my ($cmd, $expected_stderr, $test_name, %kwargs) = @_;
+	my @extra_ipcrun_opts = ();
+	if (defined($kwargs{'extra_ipcrun_opts'}))
+	{
+		push(@extra_ipcrun_opts, @{$kwargs{'extra_ipcrun_opts'}});
+	}
 	my ($stdout, $stderr);
 	print("# Running: " . join(" ", @{$cmd}) . "\n");
-	my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
+	my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr,
+		@extra_ipcrun_opts;
 	ok(!$result, "$test_name: exit code not 0");
 	like($stderr, $expected_stderr, "$test_name: matches");
 	return;
fe-utils - share query cancellation code
Hello Devs,

This patch moves duplicated query cancellation code from psql & scripts
to fe-utils, so that it is shared and may be used by other commands.

This is because Masao-san suggested adding a query cancellation feature
to pgbench for long queries (server-side data generation being
discussed, but possibly pk and fk could use that as well).

--
Fabien.

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index b981ae81ff..f1d9e0298a 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -29,6 +29,7 @@
 #include "libpq-fe.h"
 #include "pqexpbuffer.h"
 #include "common/logging.h"
+#include "fe_utils/cancel.h"
 #include "fe_utils/print.h"
 #include "fe_utils/string_utils.h"
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 90f6380170..a00990d214 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -24,6 +24,7 @@
 #include "copy.h"
 #include "crosstabview.h"
 #include "fe_utils/mbprint.h"
+#include "fe_utils/cancel.h"
 #include "fe_utils/string_utils.h"
 #include "portability/instr_time.h"
 #include "settings.h"
@@ -222,7 +223,7 @@ NoticeProcessor(void *arg, const char *message)
 	pg_log_info("%s", message);
 }
 
-
+#ifndef WIN32
 
 /*
  * Code to support query cancellation
@@ -241,7 +242,7 @@ NoticeProcessor(void *arg, const char *message)
  *
  * SIGINT is supposed to abort all long-running psql operations, not only
  * database queries.  In most places, this is accomplished by checking
- * cancel_pressed during long-running loops.  However, that won't work when
+ * CancelRequested during long-running loops.  However, that won't work when
 * blocked on user input (in readline() or fgets()).  In those places, we
 * set sigint_interrupt_enabled true while blocked, instructing the signal
 * catcher to longjmp through sigint_interrupt_jmp.  We assume readline and
@@ -249,37 +250,11 @@ NoticeProcessor(void *arg, const char *message)
 * not work on win32, so control-C is less useful there)
 */
 volatile bool sigint_interrupt_enabled = false;
-
 sigjmp_buf	sigint_interrupt_jmp;
 
-static PGcancel *volatile cancelConn = NULL;
-
-#ifdef WIN32
-static CRITICAL_SECTION cancelConnLock;
-#endif
-
-/*
- * Write a simple string to stderr --- must be safe in a signal handler.
- * We ignore the write() result since there's not much we could do about it.
- * Certain compilers make that harder than it ought to be.
- */
-#define write_stderr(str) \
-	do { \
-		const char *str_ = (str); \
-		int			rc_; \
-		rc_ = write(fileno(stderr), str_, strlen(str_)); \
-		(void) rc_; \
-	} while (0)
-
-
-#ifndef WIN32
-
 static void
-handle_sigint(SIGNAL_ARGS)
+psql_sigint_callback(void)
 {
-	int			save_errno = errno;
-	char		errbuf[256];
-
 	/* if we are waiting for input, longjmp out of it */
 	if (sigint_interrupt_enabled)
 	{
@@ -288,74 +263,19 @@ handle_sigint(SIGNAL_ARGS)
 	}
 
 	/* else, set cancel flag to stop any long-running loops */
-	cancel_pressed = true;
-
-	/* and send QueryCancel if we are processing a database query */
-	if (cancelConn != NULL)
-	{
-		if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
-			write_stderr("Cancel request sent\n");
-		else
-		{
-			write_stderr("Could not send cancel request: ");
-			write_stderr(errbuf);
-		}
-	}
-
-	errno = save_errno;			/* just in case the write changed it */
-}
-
-void
-setup_cancel_handler(void)
-{
-	pqsignal(SIGINT, handle_sigint);
-}
-#else							/* WIN32 */
-
-static BOOL WINAPI
-consoleHandler(DWORD dwCtrlType)
-{
-	char		errbuf[256];
-
-	if (dwCtrlType == CTRL_C_EVENT ||
-		dwCtrlType == CTRL_BREAK_EVENT)
-	{
-		/*
-		 * Can't longjmp here, because we are in wrong thread :-(
-		 */
-
-		/* set cancel flag to stop any long-running loops */
-		cancel_pressed = true;
-
-		/* and send QueryCancel if we are processing a database query */
-		EnterCriticalSection(&cancelConnLock);
-		if (cancelConn != NULL)
-		{
-			if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
-				write_stderr("Cancel request sent\n");
-			else
-			{
-				write_stderr("Could not send cancel request: ");
-				write_stderr(errbuf);
-			}
-		}
-		LeaveCriticalSection(&cancelConnLock);
-
-		return TRUE;
-	}
-	else
-		/* Return FALSE for any signals not being handled */
-		return FALSE;
+	CancelRequested = true;
 }
+#endif
 
 void
-setup_cancel_handler(void)
+psql_setup_cancel_handler(void)
 {
-	InitializeCriticalSection(&cancelConnLock);
-
-	SetConsoleCtrlHandler(consoleHandler, TRUE);
+#ifndef WIN32
+	setup_cancel_handler(psql_sigint_callback);
+#else
+	setup_cancel_handler();
+#endif							/* WIN32 */
 }
-#endif							/* WIN32 */
 
 /* ConnectionUp
@@ -369,7 +289,6 @@ ConnectionUp(void)
 }
 
-
 /* CheckConnection
 *
 * Verify that we still have a good connection to the backend, and if not,
@@ -428,62 +347,6 @@ CheckConnection(void)
 
-/*
- * SetCancelConn
- *
- * Set cancelConn to point to the current database connection.
- */
-void
-SetCancelConn(void)
-{
-	PGcancel   *oldCancelConn;
-
-#ifdef WIN32
-	EnterCriticalSection(&cancel
Removing alignment padding for byval types
Hi,

We currently align byval types such as int4/8, float4/8, timestamp *,
date etc., even though we mostly don't need to. When tuples are
deformed, all byval types are copied out from the tuple data into the
corresponding Datum array, therefore the original alignment in the tuple
data doesn't matter. This is different from byref types, where the Datum
formed will often be a pointer into the tuple data.

While there are some older systems where it could be a bit slower to
copy data out from unaligned positions into the datum array, this is
more than bought back by the next point:

The fact that these types are aligned has substantial costs:

For one, we often waste substantial amounts of space inside tables with
alignment padding. It's not uncommon to see about 30% or more of space
wasted (especially when taking alignment of the first column into
account).

For another, and this I think is less obvious, we actually waste
substantial amounts of CPU maintaining the alignment. This is primarily
the case because we have to perform extra work to align the pointer to
the next field during tuple [de]forming. Those instructions [1] have to
be executed, taking time, but what's worse, they also reduce the ability
of out-of-order execution to hide latencies. There's a hard dependency
on knowing the offset to the next column to be able to continue with the
next column.

There's two reasons why we can't just set the alignment for these types
to 'c':
1) pg_upgrade, for fairly obvious reasons
2) We map catalog table rows to structs, in a *lot* of places.

It seems to me that, despite the above, it's still worth trying to
improve upon the current state, to benefit from reduced space and CPU
usage.

As it turns out we already separate out the alignment for a type and a
column, between pg_type.typalign and pg_attribute.attalign. One way to
tackle this would be to allow specifying, for byval types only, at
column creation time whether a column uses a 'struct-mappable' alignment
or not. When set, set attalign to pg_type.typalign for alignment; when
not, to 'c'. By changing pg_dump in binary upgrade mode to emit the
necessary options, and by adding such options during bki processing,
we'd solve 1) and 2), but otherwise gain the benefits.

Alternatively we could declare such a property on the table level, but
that seems more restrictive, without a corresponding upside.

It's possible that we should do something related with a few varlena
datatypes. We currently use intalign for types like text, json, and as
far as I can tell that does not make all that much sense. They're not
struct-mappable *anyway* (and if they were, they'd need to be 8 byte
aligned on common platforms, __alignof__(void*) is 8). We'd have to take
a bit of care to treat the varlena header as unaligned - but we need to
do so anyway, because of 1-byte varlenas. Short varlenas seem to make it
less crucial to pursue this, as the average datum that'd benefit is long
enough to make padding a non-issue. So I don't think it'd make sense to
tackle this project at the same time.

To fully benefit from the increased tuple deforming speed, it might be
beneficial to branch very early between two different versions within
slot_deform_heap_tuple, having determined whether there are any byval
columns with alignment requirements at slot creation /
ExecSetSlotDescriptor() time (or even set a different getsomeattrs
callback, but that's a bit more complicated).

Thoughts?
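To see the space cost concretely, a minimal sketch (table names
invented) comparing the same columns in padding-hostile and
padding-friendly order on a current server:

-- each int8 is aligned to an 8-byte boundary, so the int2 columns
-- in front of them cost 6 bytes of padding apiece
create table pad_heavy (a int2, b int8, c int2, d int8);
-- same columns, wide-to-narrow: no padding between columns
create table pad_light (b int8, d int8, a int2, c int2);

insert into pad_heavy select 1, 1, 1, 1 from generate_series(1, 100000);
insert into pad_light select 1, 1, 1, 1 from generate_series(1, 100000);

select pg_relation_size('pad_heavy') as heavy,
       pg_relation_size('pad_light') as light;  -- heavy is noticeably larger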
Independent of the above, I think it might make sense to replace
pg_attribute.attalign with a smallint or such. It's a bit absurd that we
need code like

#define att_align_nominal(cur_offset, attalign) \
( \
	((attalign) == 'i') ? INTALIGN(cur_offset) : \
	 (((attalign) == 'c') ? (uintptr_t) (cur_offset) : \
	  (((attalign) == 'd') ? DOUBLEALIGN(cur_offset) : \
	   ( \
			AssertMacro((attalign) == 's'), \
			SHORTALIGN(cur_offset) \
	   ))) \
)

instead of just using TYPEALIGN(). There's no need to adapt CREATE TYPE,
or pg_type - we should just store the number in pg_attribute.attalign.
That keeps CREATE TYPE 32/64bit/arch independent, doesn't require
reconstructing c/s/i/d in pg_dump, simplifies the generated code [1],
and would also "just" work for what I described earlier in this email.

Greetings,

Andres Freund

[1] E.g. as a function of (void *ptr, char attalign) this ends up with
assembly like

	.globl	alignme
	.type	alignme, @function
alignme:
.LFB210:
	.cfi_startproc
	movq	%rdi, %rax
	cmpb	$105, %sil
	je	.L496
	cmpb	$99, %sil
	je	.L492
	addq	$7, %rax
	leaq	1(%rdi), %rdi
	andq	$-8, %rax
	andq	$-2, %rdi
	cmpb	$100, %sil
	cmovne	%rdi, %rax
.L492:
	ret
	.p2align 4,,10
	.p2align 3
.L496:
	addq	$3, %rax
	andq	$-4, %rax
	ret
	.cfi_endproc

using
Re: Proposal: Global Index
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Wed, Oct 30, 2019 at 9:23 AM Tom Lane wrote:
>> Well, the *effects* of the feature seem desirable, but that doesn't
>> mean that we want an implementation that actually has a shared index.
>> As soon as you do that, you've thrown away most of the benefits of
>> having a partitioned data structure in the first place.
>
> Right, but that's only the case for the global index. Global indexes
> are useful when used judiciously. They enable the use of partitioning
> for use cases where not being able to enforce uniqueness across all
> partitions happens to be a deal breaker. I bet that this is fairly
> common.

Absolutely- our lack of such is a common point of issue when folks are
considering using or migrating to PostgreSQL.

Thanks,

Stephen

signature.asc
Description: PGP signature
Re: Proposal: Global Index
On Thu, 31 Oct 2019 at 14:50, Stephen Frost wrote:
> Greetings,
>
> * Peter Geoghegan (p...@bowt.ie) wrote:
> []
>
> Absolutely- our lack of such is a common point of issue when folks are
> considering using or migrating to PostgreSQL.

Not sure how similar my situation really is, but I find myself wanting
to have indices that cross non-partition members of an inheritance
hierarchy:

create table t (
    id int,
    primary key (id)
);

create table t1 (
    a text
) inherits (t);

create table t2 (
    b int,
    c int
) inherits (t);

So "t"s are identified by an integer; and one kind of "t" has a single
text attribute while a different kind of "t" has 2 int attributes. The
idea is that there is a single primary key constraint on the whole
hierarchy that ensures only one record with a particular id can exist in
all the tables together. I can imagine wanting to do this with other
unique constraints also.

At present I don't actually use inheritance; instead I put triggers on
the child tables that do an insert on the parent table, which has the
effect of enforcing the uniqueness I want.
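A sketch of what such a trigger arrangement can look like (names
invented; this is one plausible reading of the scheme described above,
where a separate table exists only to hold the unique ids):

create table t_ids (id int primary key);
create table t1 (id int, a text);

create function claim_id() returns trigger language plpgsql as $$
begin
    -- fails with a unique violation if any table already took this id;
    -- the unique index on t_ids.id enforces this even under concurrency
    insert into t_ids (id) values (new.id);
    return new;
end $$;

create trigger t1_claim_id before insert on t1
    for each row execute function claim_id();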
Re: pgbench - extend initialization phase control
Hello Masao-san,

>> If a large scale factor is specified, the query for generating
>> pgbench_accounts data can take a very long time. While that query is
>> running, operators may be likely to do Ctrl-C to cancel the data
>> generation. In this case, IMO pgbench should cancel the query, i.e.,
>> call PQcancel(). Otherwise, the query will keep running to the end.
>
> Hmmm. Why not. Now the infra to do that already seems to exist twice,
> once in "src/bin/psql/common.c" and once in "src/bin/scripts/common.c".
> I cannot say I'm thrilled to replicate this once more. I think that
> the reasonable option is to share this in fe-utils and then to reuse
> it from there. However, ISTM that such a restructuring patch does not
> belong to this feature.

I just did a patch to share the code:

https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1910311939430.27369@lancre
https://commitfest.postgresql.org/25/2336/

--
Fabien.
Re: Removing alignment padding for byval types
On Thu, Oct 31, 2019 at 11:48:21AM -0700, Andres Freund wrote:
> Hi,
>
> We currently align byval types such as int4/8, float4/8, timestamp *,
> date etc., even though we mostly don't need to. When tuples are
> deformed, all byval types are copied out from the tuple data into the
> corresponding Datum array, therefore the original alignment in the
> tuple data doesn't matter. This is different from byref types, where
> the Datum formed will often be a pointer into the tuple data.
>
> While there are some older systems where it could be a bit slower to
> copy data out from unaligned positions into the datum array, this is
> more than bought back by the next point:
>
> The fact that these types are aligned has substantial costs:
>
> For one, we often waste substantial amounts of space inside tables
> with alignment padding. It's not uncommon to see about 30% or more of
> space wasted (especially when taking alignment of the first column
> into account).
>
> For another, and this I think is less obvious, we actually waste
> substantial amounts of CPU maintaining the alignment. This is
> primarily the case because we have to perform extra work to align the
> pointer to the next field during tuple [de]forming. Those
> instructions [1] have to be executed, taking time, but what's worse,
> they also reduce the ability of out-of-order execution to hide
> latencies. There's a hard dependency on knowing the offset to the next
> column to be able to continue with the next column.

Right. Reducing this overhead was one of the goals to allow "logical
ordering" of columns in a table (while arbitrarily reordering the
physical ones), but that patch got out of hand pretty quickly. Also,
it'd still keep some of the overhead, because it was not removing the
alignment/padding entirely.

> There's two reasons why we can't just set the alignment for these
> types to 'c':
> 1) pg_upgrade, for fairly obvious reasons
> 2) We map catalog table rows to structs, in a *lot* of places.
>
> It seems to me that, despite the above, it's still worth trying to
> improve upon the current state, to benefit from reduced space and CPU
> usage.
>
> As it turns out we already separate out the alignment for a type and a
> column, between pg_type.typalign and pg_attribute.attalign. One way to
> tackle this would be to allow specifying, for byval types only, at
> column creation time whether a column uses a 'struct-mappable'
> alignment or not. When set, set attalign to pg_type.typalign for
> alignment; when not, to 'c'. By changing pg_dump in binary upgrade
> mode to emit the necessary options, and by adding such options during
> bki processing, we'd solve 1) and 2), but otherwise gain the benefits.
>
> Alternatively we could declare such a property on the table level, but
> that seems more restrictive, without a corresponding upside.

I don't know, but it seems entirely sufficient to specify this at the
table level, no? What would be the use case for removing padding for
only some of the columns? I don't see the use case for that.

> It's possible that we should do something related with a few varlena
> datatypes. We currently use intalign for types like text, json, and as
> far as I can tell that does not make all that much sense. They're not
> struct-mappable *anyway* (and if they were, they'd need to be 8 byte
> aligned on common platforms, __alignof__(void*) is 8). We'd have to
> take a bit of care to treat the varlena header as unaligned - but we
> need to do so anyway, because of 1-byte varlenas. Short varlenas seem
> to make it less crucial to pursue this, as the average datum that'd
> benefit is long enough to make padding a non-issue. So I don't think
> it'd make sense to tackle this project at the same time.

Not sure, but how come it's not failing on the picky platforms, then? On
x86 it's probably OK because it's pretty permissive, but I'd expect some
platforms (s390, parisc, itanium, powerpc, ...) to be much pickier.

> To fully benefit from the increased tuple deforming speed, it might be
> beneficial to branch very early between two different versions within
> slot_deform_heap_tuple, having determined whether there are any byval
> columns with alignment requirements at slot creation /
> ExecSetSlotDescriptor() time (or even set a different getsomeattrs
> callback, but that's a bit more complicated).
>
> Thoughts?

Seems reasonable. I certainly agree this padding is pretty annoying, so
if we can get rid of it without causing issues, that'd be nice.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Proposal: Global Index
On Thu, Oct 31, 2019 at 03:02:40PM -0400, Isaac Morland wrote:
> On Thu, 31 Oct 2019 at 14:50, Stephen Frost wrote:
>> Greetings,
>>
>> * Peter Geoghegan (p...@bowt.ie) wrote:
>> []
>>
>> Absolutely- our lack of such is a common point of issue when folks
>> are considering using or migrating to PostgreSQL.
>
> Not sure how similar my situation really is, but I find myself wanting
> to have indices that cross non-partition members of an inheritance
> hierarchy:
>
> create table t (
>     id int,
>     primary key (id)
> );
>
> create table t1 (
>     a text
> ) inherits (t);
>
> create table t2 (
>     b int,
>     c int
> ) inherits (t);
>
> So "t"s are identified by an integer; and one kind of "t" has a single
> text attribute while a different kind of "t" has 2 int attributes. The
> idea is that there is a single primary key constraint on the whole
> hierarchy that ensures only one record with a particular id can exist
> in all the tables together. I can imagine wanting to do this with
> other unique constraints also.

IMO the chances of us supporting global indexes with generic inheritance
hierarchies are about zero. We don't even support creating "partition"
indexes on those hierarchies ...

> At present I don't actually use inheritance; instead I put triggers on
> the child tables that do an insert on the parent table, which has the
> effect of enforcing the uniqueness I want.

Does it? Are you sure it actually works in READ COMMITTED? What exactly
does the trigger do?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Removing alignment padding for byval types
Hi, On 2019-10-31 20:15:12 +0100, Tomas Vondra wrote: > On Thu, Oct 31, 2019 at 11:48:21AM -0700, Andres Freund wrote: > > We currently align byval types such as int4/8, float4/8, timestamp *, > > date etc, even though we mostly don't need to. When tuples are deformed, > > all byval types are copied out from the tuple data into the > > corresponding Datum array, therefore the original alignment in the tuple > > data doesn't matter. This is different from byref types, where the > > Datum formed will often be a pointer into the tuple data. > > > > While there are some older systems where it could be a bit slower to > > copy data out from unaligned positions into the datum array, this is > > more than bought back by the next point: > > > > > > The fact that these types are aligned has substantial costs: > > > > For one, we often waste substantial amounts of space inside tables with > > alignment padding. It's not uncommon to see about 30% or more of space > > wasted (especially when taking alignment of the first column into > > account). > > > > For another, and this I think is less obvious, we actually waste > > substantial amounts of CPU maintaining the alignment. This is primarily > > the case because we have to perform work to align the pointer to the next > > field during tuple [de]forming. Those instructions [1] have to be > > executed, taking time, but what's worse, they also reduce the ability of > > out-of-order execution to hide latencies. There's a hard dependency on > > knowing the offset to the next column to be able to continue with the > > next column. > > > > Right. Reducing this overhead was one of the goals to allow "logical > ordering" of columns in a table (while arbitrarily reordering the > physical ones), but that patch got out of hand pretty quickly. Also, > it'd still keep some of the overhead, because it was not removing the > alignment/padding entirely. Yea. It'd keep just about all the CPU overhead, because we'd still need to align as soon as there is a preceding nulled or varlena column. There's still some benefit for logical column order, as grouping NOT NULL fixed-length columns at the start is beneficial. And it's also beneficial to have frequently accessed columns at the start. But I think this proposal gains a lot of the space related benefits, at a much lower complexity, together with a lot of other benefits. > > There's two reasons why we can't just set the alignment for these types > > to 'c'. > > 1) pg_upgrade, for fairly obvious reasons > > 2) We map catalog table rows to structs, in a *lot* of places. > > > > > > It seems to me that, despite the above, it's still worth trying to > > improve upon the current state, to benefit from reduced space and CPU > > usage. > > > > As it turns out we already separate out the alignment for a type, and a > > column, between pg_type.typalign and pg_attribute.attalign. One way to > > tackle this would be to allow specifying, for byval types only, at > > column creation time whether a column uses a 'struct-mappable' alignment > > or not. When set, set attalign to pg_type.typalign for alignment, when > > not, to 'c'. By changing pg_dump in binary upgrade mode to emit the > > necessary options, and by adding such options during bki processing, > > we'd solve 1) and 2), but otherwise gain the benefits. > > > > Alternatively we could declare such a property on the table level, but > > that seems more restrictive, without a corresponding upside. 
> I don't know, but it seems entirely sufficient to specify this at the > table level, no? What would be the use case for removing padding for > only some of the columns? I don't see the use case for that. Well, if we had it on a per-table level, we'd also align a) catalog table columns that follow the first varlena column - which we don't need to align, as they can't be accessed via mapping b) columns in pg_upgraded tables that have been added after the upgrade > > It's possible that we should do something related with a few varlena > > datatypes. We currently use intalign for types like text, json, and as > > far as I can tell that does not make all that much sense. They're not > > struct mappable *anyway* (and if they were, they'd need to be 8 byte > > aligned on common platforms, __alignof__(void*) is 8). We'd have to take > > a bit of care to treat the varlena header as unaligned - but we need to > > do so anyway, because of 1byte varlenas. Short varlenas seem to make it > > less crucial to pursue this, as the average datum that'd benefit is long > > enough to make padding a non-issue. So I don't think it'd make sense to > > tackle this project at the same time. > > > > Not sure, but how come it's not failing on the picky platforms, then? On > x86 it's probably OK because it's pretty permissive, but I'd expect some > platforms (s390, parisc, itanium, powerpc, ...) to be much pickier. I'm not quite following? As I said, we already need to use alignment-aware code due to caring
Re: [PATCH] Implement INSERT SET syntax
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested The patch looks good to me and works on my machine at commit 73025140885c889410b9bfc4a30a3866396fc5db (HEAD). I have not reviewed the documentation changes. The new status of this patch is: Ready for Committer
Re: Adding percentile metrics to pg_stat_statements module
On Thu, Oct 31, 2019 at 12:51:17PM -0300, Igor Calabria wrote: Hi everyone, I was taking a look at pg_stat_statements module and noticed that it does not collect any percentile metrics. I believe that it would be really handy to have those available and I'd love to contribute with this feature. The basic idea is to accumulate the query execution times using an approximation structure like q-digest or t-digest and add those results to the pg_stat_statements table as fixed columns. Something like this p90_time: p95_time: p99_time: p70_time: ... Another solution is to persist the digest structure in a binary column and use a function to extract the desired quantile like this SELECT approx_quantile(digest_times, 0.99) FROM pg_stat_statements IMO having some sort of CDF approximation (being a q-digest or t-digest) would be useful, although it'd probably need to be optional (mostly because of memory consumption). I don't see why we would not store the digests themselves. Storing just some selected percentiles would be pretty problematic due to losing a lot of information on restart. Also, pg_stat_statements is not a table but a view on an in-memory hash table. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
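To make the digest-based interface concrete, usage might look like the following; approx_quantile and the digest_times column are hypothetical here, sketching the proposal rather than any existing pg_stat_statements API:

    -- hypothetical interface: one serialized digest per statement entry
    SELECT queryid,
           approx_quantile(digest_times, 0.50) AS p50_time,
           approx_quantile(digest_times, 0.99) AS p99_time
      FROM pg_stat_statements
     ORDER BY p99_time DESC
     LIMIT 10;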
Re: Removing alignment padding for byval types
On Thu, Oct 31, 2019 at 12:24:33PM -0700, Andres Freund wrote: Hi, On 2019-10-31 20:15:12 +0100, Tomas Vondra wrote: On Thu, Oct 31, 2019 at 11:48:21AM -0700, Andres Freund wrote: > We currently align byval types such as int4/8, float4/8, timestamp *, > date etc, even though we mostly don't need to. When tuples are deformed, > all byval types are copied out from the tuple data into the > corresponding Datum array, therefore the original alignment in the tuple > data doesn't matter. This is different from byref types, where the > Datum formed will often be a pointer into the tuple data. > > While there are some older systems where it could be a bit slower to > copy data out from unaligned positions into the datum array, this is > more than bought back by the next point: > > > The fact that these types are aligned has substantial costs: > > For one, we often waste substantial amounts of space inside tables with > alignment padding. It's not uncommon to see about 30% or more of space > wasted (especially when taking alignment of the first column into > account). > > For another, and this I think is less obvious, we actually waste > substantial amounts of CPU maintaining the alignment. This is primarily > the case because we have to perform work to align the pointer to the next > field during tuple [de]forming. Those instructions [1] have to be > executed, taking time, but what's worse, they also reduce the ability of > out-of-order execution to hide latencies. There's a hard dependency on > knowing the offset to the next column to be able to continue with the > next column. > Right. Reducing this overhead was one of the goals to allow "logical ordering" of columns in a table (while arbitrarily reordering the physical ones), but that patch got out of hand pretty quickly. Also, it'd still keep some of the overhead, because it was not removing the alignment/padding entirely. Yea. It'd keep just about all the CPU overhead, because we'd still need to align as soon as there is a preceding nulled or varlena column. There's still some benefit for logical column order, as grouping NOT NULL fixed-length columns at the start is beneficial. And it's also beneficial to have frequently accessed columns at the start. But I think this proposal gains a lot of the space related benefits, at a much lower complexity, together with a lot of other benefits. +1 > There's two reasons why we can't just set the alignment for these types > to 'c'. > 1) pg_upgrade, for fairly obvious reasons > 2) We map catalog table rows to structs, in a *lot* of places. > > > It seems to me that, despite the above, it's still worth trying to > improve upon the current state, to benefit from reduced space and CPU > usage. > > As it turns out we already separate out the alignment for a type, and a > column, between pg_type.typalign and pg_attribute.attalign. One way to > tackle this would be to allow specifying, for byval types only, at > column creation time whether a column uses a 'struct-mappable' alignment > or not. When set, set attalign to pg_type.typalign for alignment, when > not, to 'c'. By changing pg_dump in binary upgrade mode to emit the > necessary options, and by adding such options during bki processing, > we'd solve 1) and 2), but otherwise gain the benefits. > > Alternatively we could declare such a property on the table level, but > that seems more restrictive, without a corresponding upside. I don't know, but it seems entirely sufficient to specify this at the table level, no? 
What would be the use case for removing padding for only some of the columns? I don't see the use case for that. Well, if we had it on a per-table level, we'd also align a) catalog table columns that follow the first varlena column - which we don't need to align, as they can't be accessed via mapping b) columns in pg_upgraded tables that have been added after the upgrade Hmm, OK. I think the question is whether it's worth the extra complexity. I'd say it's not, but perhaps I'm wrong. > It's possible that we should do something related with a few varlena > datatypes. We currently use intalign for types like text, json, and as > far as I can tell that does not make all that much sense. They're not > struct mappable *anyway* (and if they were, they'd need to be 8 byte > aligned on common platforms, __alignof__(void*) is 8). We'd have to take > a bit of care to treat the varlena header as unaligned - but we need to > do so anyway, because of 1byte varlenas. Short varlenas seem to make it > less crucial to pursue this, as the average datum that'd benefit is long > enough to make padding a non-issue. So I don't think it'd make sense to > tackle this project at the same time. > Not sure, but how come it's not failing on the picky platforms, then? On x86 it's probably OK because it's pretty permissive, but I'd expect some platforms (s390, parisc, itanium, powerpc, ...) to be much pickier. I'm not quite following? As I said, we already need to
Re: fe-utils - share query cancellation code
On Thu, Oct 31, 2019 at 11:43 PM Fabien COELHO wrote: > > Hello Devs, > > This patch moves duplicated query cancellation code from psql & > scripts to fe-utils, so that it is shared and may be used by other > commands. > > This is because Masao-san suggested to add a query cancellation feature to > pgbench for long queries (server-side data generation being discussed, but > possibly pk and fk could use that as well). > > -- > Fabien. I gave it a quick look and I think we can change

void
psql_setup_cancel_handler(void)
{
#ifndef WIN32
	setup_cancel_handler(psql_sigint_callback);
#else
	setup_cancel_handler();
#endif							/* WIN32 */
}

to

void
psql_setup_cancel_handler(void)
{
	setup_cancel_handler(psql_sigint_callback);
}

because it does not matter what we pass to setup_cancel_handler: it ignores the argument on Windows anyway. Hmm, we would also need to remove the assert in the function "setup_cancel_handler". -- Ibrar Ahmed
Re: Creating foreign key on partitioned table is too slow
On Thu, 31 Oct 2019 at 17:56, Tom Lane wrote: > > David Rowley writes: > > In Ottawa this year, Andres and I briefly talked about the possibility > > of making a series of changes to how equalfuncs.c works. The idea was > > to make it easy by using some pre-processor magic to allow us to > > create another version of equalfuncs which would let us have an equal > > comparison function that returns -1 / 0 / +1 rather than just true or > > false. This would allow us to build Binary Search Trees of objects. I > > identified that EquivalenceClass.ec_members would be better written as > > a BST to allow much faster lookups in get_eclass_for_sort_expr(). > > This seems like a good idea, but why would we want to maintain two > versions? Just change equalfuncs.c into comparefuncs.c, full stop. > equal() would be a trivial wrapper for (compare_objects(a,b) == 0). If we can do that without slowing down the comparison, then sure. Checking which node sorts earlier is a bit more expensive than just checking for equality. But if that's not going to be noticeable in real-world test cases, then I agree. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
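A minimal sketch of the pattern Tom proposes, with plain ints standing in for Node trees (compare_objects here is illustrative, not the actual proposed API):

    #include <stdbool.h>

    /* three-way comparison: the -1/0/+1 result doubles as a BST ordering */
    static int
    compare_objects(int a, int b)
    {
        return (a > b) - (a < b);
    }

    /* equal() becomes a trivial wrapper, as suggested */
    static inline bool
    equal(int a, int b)
    {
        return compare_objects(a, b) == 0;
    }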
Allow superuser to grant passwordless connection rights on postgres_fdw
This patch allows the superuser to grant passwordless connection rights in postgres_fdw user mappings. The patch is authored by my colleague Craig Ringer, with slight bitrot fixed by me. One use case for this is with passphrase-protected client certificates, a patch for which will follow shortly. Here are Craig's remarks on the patch: postgres_fdw denies a non-superuser the ability to establish a connection that doesn't have a password in the connection string, or one that fails to actually use the password in authentication. This is to stop the unprivileged user using OS-level authentication as the postgres server (peer, ident, trust). It also stops unauthorized use of local credentials like .pgpass, a service file, client certificate files, etc. Add the ability for a superuser to create user mappings that override this behaviour by setting the password_required option to false in a user mapping for a non-superuser. The non-superuser gains the ability to use the FDW the mapping applies to even if there's no password in their mapping or in the connection string. This is only safe if the superuser has established that the local server is configured safely. It must be configured not to allow trust/peer/ident/sspi/gssapi auth to allow the OS user the postgres server runs as to log in to postgres as a superuser. Client certificate keys can be used too, if accessible. But the superuser can already GRANT superrole TO normalrole, so it's not any sort of new power. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 57ed5f4b90..4739ab0630 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -17,6 +17,7 @@ #include "access/htup_details.h" #include "access/xact.h" #include "catalog/pg_user_mapping.h" +#include "commands/defrem.h" #include "mb/pg_wchar.h" #include "miscadmin.h" #include "pgstat.h" @@ -91,7 +92,7 @@ static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query, bool ignore_errors); static bool pgfdw_get_cleanup_result(PGconn *conn, TimestampTz endtime, PGresult **result); - +static bool UserMappingPasswordRequired(UserMapping *user); /* * Get a PGconn which can be used to execute queries on the remote PostgreSQL @@ -274,14 +275,16 @@ connect_pg_server(ForeignServer *server, UserMapping *user) /* * Check that non-superuser has used password to establish connection; * otherwise, he's piggybacking on the postgres server's user - * identity. See also dblink_security_check() in contrib/dblink. + * identity. See also dblink_security_check() in contrib/dblink + * and check_conn_params. 
*/ - if (!superuser_arg(user->userid) && !PQconnectionUsedPassword(conn)) + if (!superuser_arg(user->userid) && UserMappingPasswordRequired(user) && + !PQconnectionUsedPassword(conn)) ereport(ERROR, (errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED), errmsg("password is required"), errdetail("Non-superuser cannot connect if the server does not request a password."), - errhint("Target server's authentication method must be changed."))); + errhint("Target server's authentication method must be changed or password_required=false set in the user mapping attributes."))); /* Prepare new session for use */ configure_remote_session(conn); @@ -314,12 +317,31 @@ disconnect_pg_server(ConnCacheEntry *entry) } } +/* + * Return the value of the password_required option if it is defined for this + * user mapping, otherwise true. The mapping has been pre-validated. + */ +static bool +UserMappingPasswordRequired(UserMapping *user) +{ + ListCell *cell; + + foreach(cell, user->options) + { + DefElem*def = (DefElem *) lfirst(cell); + if (strcmp(def->defname, "password_required") == 0) + return defGetBoolean(def); + } + + return true; +} + /* * For non-superusers, insist that the connstr specify a password. This - * prevents a password from being picked up from .pgpass, a service file, - * the environment, etc. We don't want the postgres user's passwords - * to be accessible to non-superusers. (See also dblink_connstr_check in - * contrib/dblink.) + * prevents a password from being picked up from .pgpass, a service file, the + * environment, etc. We don't want the postgres user's passwords, + * certificates, etc to be accessible to non-superusers. (See also + * dblink_connstr_check in contrib/dblink.) */ static void check_conn_params(const char **keywords, const char **values, UserMapping *user) @@ -337,6 +359,10 @@ check_conn_params(const char **keywords, const char **values, UserMapping *user) return; } + /* ok if the superuser explicitly said so at user mapping creation time */ + if (!UserMa
Re: Postgres cache
On Thu, Oct 31, 2019 at 03:19:23PM +0530, Natarajan R wrote: Hi, I want to know how postgres stores catalog relations in cache in-depth. Is there any documentation for that? Not sure what exactly you mean by "cache" - whether shared buffers (as a shared general database cache) or syscache/catcache, i.e. the special cache of catalog records each backend maintains privately. I'm not aware of exhaustive developer docs explaining these parts, but for shared buffers you might want to look at src/backend/storage/buffer/README while for catcache/syscache you probably need to look at the code and comments in the related files, particularly in src/backend/utils/cache/syscache.c src/backend/utils/cache/relcache.c Not sure if there's a better source of information :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
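As a small self-contained illustration of how backend code consults the syscache (standard PostgreSQL API; the function example_lookup is made up):

    #include "postgres.h"
    #include "access/htup_details.h"
    #include "catalog/pg_class.h"
    #include "utils/syscache.h"

    /* look up a relation's pg_class row via the catalog cache */
    static void
    example_lookup(Oid relid)
    {
        HeapTuple   tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));

        if (HeapTupleIsValid(tup))
        {
            Form_pg_class relform = (Form_pg_class) GETSTRUCT(tup);

            elog(LOG, "relation \"%s\" has %d attributes",
                 NameStr(relform->relname), relform->relnatts);
            ReleaseSysCache(tup);   /* release the cache pin */
        }
    }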
Re: The command tag of "ALTER MATERIALIZED VIEW RENAME COLUMN"
On Thu, Oct 31, 2019 at 6:56 PM Tom Lane wrote: > Fujii Masao writes: > > ... I found that the command tag of > > ALTER MATERIALIZED VIEW RENAME COLUMN is "ALTER TABLE", not "ALTER VIEW". > > > =# ALTER MATERIALIZED VIEW hoge RENAME COLUMN j TO x; > > ALTER TABLE > > > Is this intentional? Or bug? > > Seems like an oversight. > > regards, tom lane > > > The same issue exists with ALTER FOREIGN TABLE # ALTER FOREIGN TABLE ft RENAME COLUMN a to t; ALTER TABLE # ALTER MATERIALIZED VIEW mv RENAME COLUMN a to r; ALTER TABLE Attached patch fixes that for ALTER VIEW , ALTER MATERIALIZED VIEW and ALTER FOREIGN TABLE # ALTER MATERIALIZED VIEW mv RENAME COLUMN a to r; ALTER MATERIALIZED VIEW # ALTER FOREIGN TABLE ft RENAME COLUMN a to t; ALTER FOREIGN TABLE -- Ibrar Ahmed 001_alter_tag_ibrar_v1.patch Description: Binary data
libpq sslpassword parameter and callback function
This patch provides for an sslpassword parameter for libpq, and a hook that a client can fill in for a callback function to set the password. This provides similar facilities to those already available in the JDBC driver. There is also a function to fetch the sslpassword from the connection parameters, in the same way that other settings can be fetched. This is mostly the excellent work of my colleague Craig Ringer, with a few embellishments from me. Here are his notes: Allow libpq to non-interactively decrypt client certificates that are stored encrypted by adding a new "sslpassword" connection option. The sslpassword option offers a middle ground between a cleartext key and setting up advanced key management via openssl engines, PKCS#11, USB crypto offload and key escrow, etc. Previously use of encrypted client certificate keys only worked if the user could enter the key's password interactively on stdin, in response to openssl's default prompt callback: Enter PEM pass phrase: That's infeasible in many situations, especially things like use from postgres_fdw. This change also allows admins to prevent libpq from ever prompting for a password by calling: PQsetSSLKeyPassHook(PQdefaultSSLKeyPassHook); which is useful since OpenSSL likes to open /dev/tty to prompt for a password, so even closing stdin won't stop it blocking if there's no user input available. Applications may also override or extend SSL password fetching with their own callback. There is deliberately no environment variable equivalent for the sslpassword option. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: libpq sslpassword parameter and callback function
This time with attachment. On 10/31/19 6:33 PM, Andrew Dunstan wrote: > This patch provides for an sslpassword parameter for libpq, and a hook > that a client can fill in for a callback function to set the password. > > > This provides similar facilities to those already available in the JDBC > driver. > > > There is also a function to fetch the sslpassword from the connection > parameters, in the same way that other settings can be fetched. > > > This is mostly the excellent work of my colleague Craig Ringer, with a > few embellishments from me. > > > Here are his notes: > > > Allow libpq to non-interactively decrypt client certificates that > are stored > encrypted by adding a new "sslpassword" connection option. > > The sslpassword option offers a middle ground between a cleartext > key and > setting up advanced key management via openssl engines, PKCS#11, USB > crypto > offload and key escrow, etc. > > Previously use of encrypted client certificate keys only worked if > the user > could enter the key's password interactively on stdin, in response > to openssl's > default prompt callback: > > Enter PEM pass phrase: > > That's infeasible in many situations, especially things like use from > postgres_fdw. > > This change also allows admins to prevent libpq from ever prompting > for a > password by calling: > > PQsetSSLKeyPassHook(PQdefaultSSLKeyPassHook); > > which is useful since OpenSSL likes to open /dev/tty to prompt for a > password, > so even closing stdin won't stop it blocking if there's no user > input available. > Applications may also override or extend SSL password fetching with > their own > callback. > > There is deliberately no environment variable equivalent for the > sslpassword > option. > > > cheers > > > andrew > > -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/dblink/expected/dblink.out b/contrib/dblink/expected/dblink.out index 6ceabb453c..6516d4f131 100644 --- a/contrib/dblink/expected/dblink.out +++ b/contrib/dblink/expected/dblink.out @@ -879,7 +879,7 @@ $d$; CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (server 'localhost'); -- fail, can't specify server here ERROR: invalid option "server" -HINT: Valid options in this context are: user, password +HINT: Valid options in this context are: user, password, sslpassword CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER'); GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user; GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user; diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index c58527b0c3..a606680182 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -776,6 +776,72 @@ PGPing PQping(const char *conninfo); + + PQsetSSLKeyPassHookPQsetSSLKeyPassHook + + + PQsetSSLKeyPassHook lets an application override + libpq's default + handling of encrypted client certificate key files using sslpassword +or interactive prompting. + + +void PQsetSSLKeyPassHook(PQsslKeyPassHook_type hook); + + + The application passes a pointer to a callback function with signature: + + int callback_fn(char *buf, int size, PGconn *conn); + + which libpq will then call instead of + its default PQdefaultSSLKeyPassHook handler. The callback + should determine the password for the key and copy it to result-buffer + buf of size size. The string in + buf must be null-terminated. The callback must return the length of + the password stored in buf excluding the null terminator. 
+ On failure, the callback should set buf[0] = '\0' and return 0. + See PQdefaultSSLKeyPassHook in libpq's + source code for an example. + + + + If the user specified an explicit key location, + its path will be in conn->pgsslkey when the callback + is invoked. This will be empty if the default key path is being used. + For keys that are engine specifiers, it is up to engine implementations + whether they use the OpenSSL password callback or define their own handling. + + + + The app callback may choose to delegate unhandled cases to + PQdefaultSSLKeyPassHook, + or call it first and try something else if it returns 0, or completely override it. + + + + The callback must not escape normal flow control with exceptions, + longjmp(...), etc. It must return normally. + + + + + + + PQgetSSLKeyPassHookPQgetSSLKeyPassHook + + + PQgetSSLKeyPassHook returns the current + client certificate key password hook, or NULL + if none has been set. + + +PQsslKeyPassHook_type PQgetSSLKeyPassHook(void); + + + +
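Based on the callback signature documented above, an application hook might look like the following sketch; the function name and environment variable are made up, and the failure path follows the documented contract of setting buf[0] = '\0' and returning 0:

    #include <stdlib.h>
    #include <string.h>
    #include "libpq-fe.h"

    /* fetch the client certificate key password from the application's
     * own configuration instead of prompting */
    static int
    my_ssl_key_pass_cb(char *buf, int size, PGconn *conn)
    {
        const char *pw = getenv("MYAPP_SSLKEY_PASSWORD");   /* hypothetical */

        (void) conn;                /* unused in this sketch */
        if (pw == NULL || (int) strlen(pw) >= size)
        {
            buf[0] = '\0';          /* failure: empty buffer, return 0 */
            return 0;
        }
        strcpy(buf, pw);
        return (int) strlen(buf);
    }

    /* install it once at startup, before connecting */
    static void
    install_ssl_pass_hook(void)
    {
        PQsetSSLKeyPassHook(my_ssl_key_pass_cb);
    }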
Re: merging HashJoin and Hash nodes
On Tue, Oct 29, 2019 at 02:00:00PM +1300, Thomas Munro wrote: On Tue, Oct 29, 2019 at 12:15 PM Andres Freund wrote: I've groused about this a few times, but to me it seems wrong that HashJoin and Hash are separate nodes. They're so tightly bound together that keeping them separate just doesn't architecturally make sense, imo. So I wrote a prototype. Evidence of being tightly bound together: - functions in nodeHash.h that take a HashJoinState etc - how many functions in nodeHash.h and nodeHashjoin.h are purely exposed so the other side can call them - there's basically no meaningful separation of concerns between code in nodeHash.c and nodeHashjoin.c - the Hash node doesn't really exist during most of the planning, it's kind of faked up in create_hashjoin_plan(). - HashJoin knows that the inner node is always going to be a Hash node. - HashJoinState and HashState both have pointers to HashJoinTable, etc Besides violating some aesthetic concerns, I think it also causes practical issues: - code being in different translation units prevents the compiler from inlining etc. There's a lot of HOT calls going between both. For each new outer tuple we e.g. call, from nodeHashjoin.c separately into nodeHash.c for ExecHashGetHashValue(), ExecHashGetBucketAndBatch(), ExecHashGetSkewBucket(), ExecScanHashBucket(). They each have to do memory loads from HashJoinState/HashJoinTable, even though previous code *just* has done so. I wonder how much we can gain by this. I don't expect any definitive figures from a patch at this stage, but maybe you have some guesses? - a separate executor node, and all the ancillary data (slots, expression context, target lists etc) is far from free - instead of just applying an "empty outer" style optimization to both sides of the HashJoin, we have to choose. Once unified it's fairly easy to just use it on both. - generally, a lot of improvements are harder to develop because of the odd separation. I agree with all of that. Does anybody have good arguments for keeping them separate? The only real one I can see is that it's not a small change, and will make bugfixes etc a bit harder. Personally I think that's outweighed by the disadvantages. Yeah, the ~260KB of churn you came up with is probably the reason I didn't even think of suggesting something along these lines while working on PHJ, though it did occur to me that the division was entirely artificial as I carefully smashed more holes in both directions through that wall. Trying to think of a reason to keep Hash, I remembered Kohei KaiGai's speculation about Hash nodes that are shared by different Hash Join nodes (in the context of a partition-wise join where each partition is joined against one table). But even if we were to try to do that, a Hash node isn't necessary to share the hash table, so that's not an argument. Attached is a quick prototype that unifies them. It's not actually that hard, I think? Obviously this is far from ready, but I thought it'd be a good basis to get a discussion started? I haven't looked at the patch yet, but it sounds like a good start from the description. Comments on the prototype: - I've hacked EXPLAIN to still show the Hash node, to reduce the size of the diffs. 
I'm doubtful that that's the right approach (and I'm sure it's not the right approach to do so with the code I injected) - I think the Hash node in the explain doesn't really help users, and just makes the explain bigger (except for making it clearer which side is hashed) Yeah, I'm not sure why you'd want to show a Hash node to users if there is no way to use it in any other context than a Hash Join. FWIW, Oracle, DB2 and SQL Server don't show an intermediate Hash node in their plans, and generally you just have to know which way around the input relations are shown (from a quick glance at some examples found on the web, Oracle and SQL Server show hash relation above probe relation, while DB2, PostgreSQL and MySQL show probe relation above hash relation). Curiously, MySQL 8 just added hash joins, and they do show a Hash node (at least in FORMAT=TREE mode, which looks a bit like our EXPLAIN). Not sure. Maybe we don't need to show an explicit Hash node, because that might seem to imply there's a separate executor step. And that would be misleading. But IMO we should make it obvious which side of the join is hashed, instead of relying on users to "know which way around the relations are shown". The explain is often used by users who're learning stuff, or maybe investigating it for the first time, and we should not make it unnecessarily hard to understand. I don't think we have any other "explain node" that would not represent an actual executor node, so not sure what's the right approach. So maybe that's not the right way to do that ... OTOH we have tools visualizing execution plans, so maybe backwards compatibility of the output is a concern too (I know we don
A wiki page to track hash join projects and ideas
Hello hackers, Please feel free to edit this new page, which I'd like to use to keep track of observations, ideas and threads relating to hash joins. https://wiki.postgresql.org/wiki/Hash_Join
Re: libpq sslpassword parameter and callback function
On 10/31/19 6:34 PM, Andrew Dunstan wrote: > This time with attachment. > > > On 10/31/19 6:33 PM, Andrew Dunstan wrote: >> This patch provides for an sslpassword parameter for libpq, and a hook >> that a client can fill in for a callback function to set the password. >> >> >> This provides similar facilities to those already available in the JDBC >> driver. >> >> >> There is also a function to fetch the sslpassword from the connection >> parameters, in the same way that other settings can be fetched. >> >> >> This is mostly the excellent work of my colleague Craig Ringer, with a >> few embellishments from me. >> >> >> Here are his notes: >> >> >> Allow libpq to non-interactively decrypt client certificates that >> are stored >> encrypted by adding a new "sslpassword" connection option. >> >> The sslpassword option offers a middle ground between a cleartext >> key and >> setting up advanced key management via openssl engines, PKCS#11, USB >> crypto >> offload and key escrow, etc. >> >> Previously use of encrypted client certificate keys only worked if >> the user >> could enter the key's password interactively on stdin, in response >> to openssl's >> default prompt callback: >> >> Enter PEM pass phrase: >> >> That's infeasible in many situations, especially things like use from >> postgres_fdw. >> >> This change also allows admins to prevent libpq from ever prompting >> for a >> password by calling: >> >> PQsetSSLKeyPassHook(PQdefaultSSLKeyPassHook); >> >> which is useful since OpenSSL likes to open /dev/tty to prompt for a >> password, >> so even closing stdin won't stop it blocking if there's no user >> input available. >> Applications may also override or extend SSL password fetching with >> their own >> callback. >> >> There is deliberately no environment variable equivalent for the >> sslpassword >> option. >> >> I should also mention that this patch provides for support for DER format certificates and keys. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: merging HashJoin and Hash nodes
Hi, On 2019-10-31 23:59:19 +0100, Tomas Vondra wrote: > On Tue, Oct 29, 2019 at 02:00:00PM +1300, Thomas Munro wrote: > > On Tue, Oct 29, 2019 at 12:15 PM Andres Freund wrote: > > > I've groused about this a few times, but to me it seems wrong that > > > HashJoin and Hash are separate nodes. They're so tightly bound together > > > that keeping them separate just doesn't architecturally make sense, > > > imo. So I wrote a prototype. > > > > > > Evidence of being tightly bound together: > > > - functions in nodeHash.h that take a HashJoinState etc > > > - how many functions in nodeHash.h and nodeHashjoin.h are purely exposed > > > so the other side can call them > > > - there's basically no meaningful separation of concerns between code in > > > nodeHash.c and nodeHashjoin.c > > > - the Hash node doesn't really exist during most of the planning, it's > > > kind of faked up in create_hashjoin_plan(). > > > - HashJoin knows that the inner node is always going to be a Hash node. > > > - HashJoinState and HashState both have pointers to HashJoinTable, etc > > > > > > Besides violating some aesthetic concerns, I think it also causes > > > practical issues: > > > > > > - code being in different translation units prevents the compiler from > > > inlining etc. There's a lot of HOT calls going between both. For each > > > new outer tuple we e.g. call, from nodeHashjoin.c separately into > > > nodeHash.c for ExecHashGetHashValue(), ExecHashGetBucketAndBatch(), > > > ExecHashGetSkewBucket(), ExecScanHashBucket(). They each have to > > > do memory loads from HashJoinState/HashJoinTable, even though previous > > > code *just* has done so. > > I wonder how much we can gain by this. I don't expect any definitive > figures from a patch at this stage, but maybe you have some guesses? It's measurable, but not a world-changing difference. Some of the gains are limited by the compiler not realizing that it does not have to reload values across some external function calls. I saw somewhere around ~3% for a case that was bottlenecked by HJ lookups (not build). I think part of the effect size is also limited by other unnecessary inefficiencies being a larger bottleneck. E.g. 1) the fact that ExecScanHashBucket() contains branches that have roughly 50% likelihoods, making them unpredictable ( 1. on a successful lookup we oscillate between the first hashTuple != NULL test succeeding and failing except in case of bucket conflict; 2. the while (hashTuple != NULL) oscillates similarly, because it tests for I. initial lookup succeeding, II. further tuple in previous bucket lookup III. further tuples in case of hashvalue mismatch. Quite observable by profiling for br_misp_retired.conditional. 2) The fact that there's *two* indirections for a successful lookup that are very likely to be cache misses. First we need to look up the relevant bucket, second we need to actually fetch hashvalue from the pointer stored in the bucket. But even independent of these larger inefficiencies, I suspect we could also gain more from inlining by changing nodeHashjoin a bit. E.g. moving the HJ_SCAN_BUCKET code into an always_inline function, and also referencing it from the tail end of the HJ_NEED_NEW_OUTER code, instead of falling through, would allow optimizing away a number of loads (and I think also stores), and improve branch predictor efficiency. E.g. 
optimizing away store/load combinations for node->hj_CurHashValue, node->hj_CurBucketNo, node->hj_CurSkewBucketNo; loads of hj_HashTable, ...; stores of node->hj_JoinState, node->hj_MatchedOuter. And probably make the code easier to read, to boot. > But IMO we should make it obvious which side of the join is hashed, > instead of relying on users to "know which way around the relations are > shown". The explain is often used by users who're learning stuff, or > maybe investigating it for the first time, and we should not make it > unnecessarily hard to understand. I agree. I wonder if just outputting something like 'Hashed Side: second' (or "right", or ...) could work. Not perfect, but I don't really have a better idea. We somewhat rely on users understanding inner/outer for explain output (I doubt this is good, to be clear), e.g. "Inner Unique: true ". Made worse by the fact that "inner"/"outer" is also used to describe different kinds of joins, with a mostly independent meaning. > OTOH we have tools visualizing execution plans, so maybe backwards > compatibility of the output is a concern too (I know we don't promise > anything, though). Well, execution *would* work a bit differently, so I don't feel too bad about tools having to adapt to that. E.g. a graphical explain tool really shouldn't display a separate Hash node anymore. > > The fact that EXPLAIN doesn't label relations seems to be a separate > > concern that applies equally to nestloop joins, and could perhaps be > > addressed with some more optional verbosity, not a fake
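To illustrate the restructuring suggested above, here is a generic sketch (not actual PostgreSQL code; the types and names are stand-ins for the HJ_* state machine): the hot bucket-scan step becomes an always-inline helper that the need-new-outer path calls directly, letting the compiler keep state in registers instead of storing and reloading it across the switch dispatch:

    #include <stdbool.h>

    typedef enum { NEED_NEW_OUTER, SCAN_BUCKET } JoinPhase;

    typedef struct JoinState
    {
        JoinPhase   phase;
        int         cur_bucket;     /* stand-in for hj_CurBucketNo etc. */
        int         cur_hash;       /* stand-in for hj_CurHashValue */
    } JoinState;

    static inline __attribute__((always_inline)) bool
    scan_bucket(JoinState *node)
    {
        /* ... probe the bucket using node->cur_bucket / cur_hash ... */
        return false;
    }

    static bool
    exec_join_step(JoinState *node)
    {
        switch (node->phase)
        {
            case NEED_NEW_OUTER:
                /* ... fetch the next outer tuple, compute cur_hash ... */
                /* continue into the scan via a direct call instead of
                 * storing phase and re-entering the switch */
                return scan_bucket(node);
            case SCAN_BUCKET:
                return scan_bucket(node);
        }
        return false;
    }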
Improve checking for pg_index.xmin
Hi! Our customer faced an issue where an index is invisible after creation. The reproducible case is the following. $ psql db2 # begin; # select txid_current(); $ psql db1 # select i as id, 0 as v into t from generate_series(1, 10) i; # create unique index idx on t (id); # update t set v = v + 1 where id = 1; # update t set v = v + 1 where id = 1; # update t set v = v + 1 where id = 1; # update t set v = v + 1 where id = 1; # update t set v = v + 1 where id = 1; # drop index idx; # create unique index idx on t (id); # explain analyze select v from t where id = 1; There is no issue if there is no parallel session in database db2. The fact that index visibility depends on an open transaction in a different database is ridiculous for users. This happens because we're checking that there are no broken HOT chains after index creation by comparing pg_index.xmin and TransactionXmin. So, we check that pg_index.xmin is in the past for the current transaction in a lossy way, by comparing just the xmins. The attached patch changes this check to XidInMVCCSnapshot(). With the patch the issue is gone. My doubt about this patch is that it changes a check against TransactionXmin into a check against GetActiveSnapshot(), which might be more recent. However, a query shouldn't be executed with an older snapshot than the one it was planned with. Any thoughts? -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company 0001-improve-check-for-pg_index-xmin.patch Description: Binary data
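For orientation, the check in question lives in get_relation_info() in src/backend/optimizer/util/plancat.c; a rough sketch of the change described (the actual patch may differ in details):

    /* current, lossy check: compares bare xmins */
    if (index->indcheckxmin &&
        !TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data),
                               TransactionXmin))
        continue;           /* index treated as unusable for this plan */

    /* proposed: ask whether the pg_index row's xmin is still invisible
     * to the snapshot the query is planned (and executed) with */
    if (index->indcheckxmin &&
        XidInMVCCSnapshot(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data),
                          GetActiveSnapshot()))
        continue;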
Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
This patch achieves $SUBJECT and also provides some testing of the sslpassword setting. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 298f652afc..c2661320bc 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -174,6 +174,18 @@ WARNING: extension "bar" is not installed ALTER SERVER testserver1 OPTIONS (DROP extensions); ALTER USER MAPPING FOR public SERVER testserver1 OPTIONS (DROP user, DROP password); +-- Attempt to add a valid option that's not allowed in a user mapping +ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslmode 'require'); +ERROR: invalid option "sslmode" +HINT: Valid options in this context are: user, password, sslpassword, password_required, sslcert, sslkey +-- But we can add valid ones fine +ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslpassword 'dummy'); +-- Ensure valid options we haven't used in a user mapping yet are +-- permitted to check validation. +ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslkey 'value', ADD sslcert 'value'); ALTER FOREIGN TABLE ft1 OPTIONS (schema_name 'S 1', table_name 'T 1'); ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1'); ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1'); diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 07fc7fa00a..0f35dfe54f 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -196,6 +196,16 @@ InitPgFdwOptions(void) {"fetch_size", ForeignServerRelationId, false}, {"fetch_size", ForeignTableRelationId, false}, {"password_required", UserMappingRelationId, false}, + /* + * Extra room for the user mapping copies of sslcert and sslkey. These + * are really libpq options but we repeat them here to allow them to + * appear in both foreign server context (when we generate libpq + * options) and user mapping context (from here). Bit of a hack + * putting this in "non_libpq_options". + */ + {"sslcert", UserMappingRelationId, true}, + {"sslkey", UserMappingRelationId, true}, + {NULL, InvalidOid, false} }; diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 5f50c65566..bfdd5dd4d2 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -188,6 +188,19 @@ ALTER SERVER testserver1 OPTIONS (DROP extensions); ALTER USER MAPPING FOR public SERVER testserver1 OPTIONS (DROP user, DROP password); +-- Attempt to add a valid option that's not allowed in a user mapping +ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslmode 'require'); + +-- But we can add valid ones fine +ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslpassword 'dummy'); + +-- Ensure valid options we haven't used in a user mapping yet are +-- permitted to check validation. 
+ALTER USER MAPPING FOR public SERVER testserver1 + OPTIONS (ADD sslkey 'value', ADD sslcert 'value'); + ALTER FOREIGN TABLE ft1 OPTIONS (schema_name 'S 1', table_name 'T 1'); ALTER FOREIGN TABLE ft2 OPTIONS (schema_name 'S 1', table_name 'T 1'); ALTER FOREIGN TABLE ft1 ALTER COLUMN c1 OPTIONS (column_name 'C 1'); diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index bdd6ff8ae4..ef203d225c 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -107,13 +107,13 @@ A foreign server using the postgres_fdw foreign data wrapper can have the same options that libpq accepts in connection strings, as described in , -except that these options are not allowed: +except that these options are not allowed or have special handling: user, password and sslpassword (specify these - in a user mapping, instead) + in a user mapping, instead, or use a service file) @@ -128,6 +128,14 @@ postgres_fdw) + + + sslkey and sslpassword - these may + appear in either or both a connection and a user + mapping. If both are present, the user mapping setting overrides the + connection setting. + +
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Hello. On Thu, Oct 31, 2019 at 11:25 PM Masahiko Sawada wrote: > > On Fri, Sep 6, 2019 at 3:34 PM Smith, Peter > wrote: > > > > -Original Message- > > From: Masahiko Sawada Sent: Thursday, 15 August > > 2019 7:10 PM > > > > > BTW I've created PoC patch for cluster encryption feature. Attached patch > > > set has done some items of TODO list and some of them can be used even > > > for finer granularity encryption. Anyway, the implemented components are > > > followings: > > > > Hello Sawada-san, > > > > I guess your original patch code may be getting a bit out-dated by the > > ongoing TDE discussions, but I have done some code review of it anyway. > > > > Hopefully a few comments below can still be of use going forward: > > > > --- > > > > REVIEW COMMENTS > > > > * src/backend/storage/encryption/enc_cipher.c – For functions > > EncryptionCipherValue/String maybe should log warnings for unexpected > > values instead of silently assigning to default 0/”off”. > > > > * src/backend/storage/encryption/enc_cipher.c – For function > > EncryptionCipherString, purpose of returning ”unknown” is unclear because > > that will map back to “off” again anyway via EncryptionCipherValue. Why not > > just return "off" (with warning logged). > > > > * src/include/storage/enc_common.h – Typo in comment: "Encrypton". > > > > * src/include/storage/encryption.h - The macro DataEncryptionEnabled may be > > better to be using enum TDE_ENCRYPTION_OFF instead of magic number 0 > > > > * src/backend/storage/encryption/kmgr.c - Function BootStrapKmgr will > > report error if USE_OPENSSL is not defined. The check seems premature > > because it would fail even if the user is not using encryption. Shouldn't > > the lack of openssl be OK when user is not using TDE at all (i.e. when > > encryption is "none")? > > > > * src/backend/storage/encryption/kmgr.c - In function BootStrapMgr suggest > > better to check if (bootstrap_data_encryption_cipher == TDE_ENCRYPTION_OFF) > > using enum instead of the magic number 0. > > > > * src/backend/storage/encryption/kmgr.c - The > > run_cluster_passphrase_command function seems mostly a clone of an existing > > run_ssl_passphrase_command function. Is it possible to refactor to share > > the common code? > > > > * src/backend/storage/encryption/kmgr.c - The function > > derive_encryption_key declares a char key_len. Why char? It seems int > > everywhere else. > > > > * src/backend/bootstrap/bootstrap.c - Suggest better if variable > > declaration bootstrap_data_encryption_cipher = 0 uses enum > > TDE_ENCRYPTION_OFF instead of magic number 0 > > > > * src/backend/utils/misc/guc.c - It looks like the default value for GUC > > variable data_encryption_cipher is AES128. Wouldn't "off" be the more > > appropriate default value? Otherwise it seems inconsistent with the logic > > of initdb (which insists that the -e option is mandatory if you wanted any > > encryption). > > > > * src/backend/utils/misc/guc.c - There is a missing entry in the > > config_group_names[]. The patch changed the config_group[] in guc_tables.h, > > so I think there needs to be a matching item in the config_group_names. > > > > * src/bin/initdb/initdb.c - The function check_encryption_cipher would > > disallow an encryption value of "none". Although maybe it is not very > > useful to say -e none, it does seem inconsistent to reject it, given that > > "none" was a valid value for the GUC variable data_encryption_cipher. 
> > > > * contrib/bloom/blinsert.c - In function btbuildempty the arguments for > > PageEncryptionInPlace seem in the wrong order (forknum should be 2nd). > > > > * src/backend/access/hash/hashpage.c - In function _hash_alloc_buckets the > > arguments for PageEncryptionInPlace seem in the wrong order (forknum should > > be 2nd). > > > > * src/backend/access/spgist/spginsert.c - In function spgbuildempty the > > arguments for PageEncryptionInPlace seem in the wrong order (forknum should > > be 2nd). This error looks repeated 3X. > > > > * in multiple files - The encryption enums have equivalent strings ("off", > > "aes-128", "aes-256") but they are scattered as string literals in many > > places (e.g. pg_controldata.c, initdb.c, guc.c, enc_cipher.c). Suggest it > > would be better to declare those as string constants in one place only. > > > > Thank you for reviewing this patch. > > I've updated TDE patches. These patches implement the key system, buffer > encryption and WAL encryption. Please refer to ToDo of cluster-wide > encryption for more details of design and components. It lacks > temporary file encryption and front end tools encryption. For > temporary file encryption, we are discussing which files should be > encrypted on another thread and I thought that temporary file encryption > might be related to that. So I'm currently studying the temporary > encryption patch that Antonin already submitted[1] but some changes > might be needed based on that discu
Re: [BUG] Partition creation fails after dropping a column and adding a partial index
On Thu, Oct 31, 2019 at 1:45 PM Michael Paquier wrote: > On Tue, Oct 29, 2019 at 01:16:58PM +0900, Michael Paquier wrote: > > Yes, something looks wrong with that. I have not looked at it in > > details yet though. I'll see about that tomorrow. > > So.. When building the attribute map for a cloned index (with either > LIKE during the transformation or for partition indexes), we store > each attribute number with 0 used for dropped columns. Unfortunately, > if you look at the way the attribute map is built in this case the > code correctly generates the mapping with convert_tuples_by_name_map. > But, the length of the mapping used is incorrect as this makes use of > the number of attributes for the newly-created child relation, and not > the parent which includes the dropped column in its count. So the > answer is simply to use the parent as reference for the mapping > length. > > The patch is rather simple as per the attached, with extended > regression tests included. I have not checked on back-branches yet, > but that's visibly wrong since 8b08f7d down to v11 (will do that when > back-patching). The patch looks correct and applies to both v12 and v11. > There could be a point in changing convert_tuples_by_name_map & co so > as they return the length of the map on top of the map to avoid such > mistakes in the future. That's a more invasive patch not really > adapted for a back-patch, but we could do that on HEAD once this bug > is fixed. I have also checked other calls of this API and the > handling is done correctly. I've been bitten by this logical error when deciding what length to use for the map, so seems like a good idea. Thanks, Amit
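For reference, a sketch of the shape of the fix; variable names here are illustrative, and the point is only where the map length comes from:

    /* the map between child and parent descriptors is built as before ... */
    attmap = convert_tuples_by_name_map(RelationGetDescr(childrel),
                                        RelationGetDescr(parent),
                                        gettext_noop("could not convert row type"));
    /* ... but it must be sized by the parent, whose descriptor still
     * counts dropped columns */
    maplen = RelationGetDescr(parent)->natts;   /* not childrel's natts */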
Re: Restore replication settings when modifying a field type
On 2019/10/28 12:39, Kyotaro Horiguchi wrote: Hello. # The patch no longer applies on the current master. Needs rebasing. At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang wrote in In fact, the replication property of the table has not been modified, and it is still 'i' (REPLICA_IDENTITY_INDEX). But the previously specified index property 'indisreplident' is set to false because of the rebuild. I suppose that the behavior is intended. A change of column types on the publisher side can break the agreement on replica identity with subscribers. Thus the replica identity setting cannot be restored unconditionally. For a (somewhat artificial :p) example: P=# create table t (c1 integer, c2 text unique not null); P=# alter table t replica identity using index t_c2_key; P=# create publication p1 for table t; P=# insert into t values (0, '00'), (1, '01'); S=# create table t (c1 integer, c2 text unique not null); S=# alter table t replica identity using index t_c2_key; S=# create subscription s1 connection '...' publication p1; Your patch allows changing the type of c2 to integer. P=# alter table t alter column c2 type integer using c2::integer; P=# update t set c1 = c1 + 1 where c2 = '01'; This change doesn't reach the subscriber, perhaps not as expected:

S=# select * from t;
 c1 | c2
----+----
  0 | 00
  1 | 01
(2 rows)

So I developed a patch: if the user modifies the field type and the associated index is the REPLICA IDENTITY, rebuild it and restore the replication settings. Explicit setting of replica identity presumes that the user is sure the setting works correctly. Implicit rebuilding after a type change can silently break it. At least we need to guarantee that the restored replica identity setting is truly compatible with all existing subscribers. I'm not sure about potential subscribers. Anyway I think it is a problem that the replica identity setting is dropped silently. Perhaps a message something like "REPLICA IDENTITY setting is lost, please redefine after confirmation of compatibility with subscribers." is needed. In fact, the scenario we encountered is like this. A field of a user's table is of type "smallint", and it turns out that this range is not sufficient, so they change it to "int". At this point, the REPLICA IDENTITY is lost and the user does not realize it. By the time they found out, logical replication had not been producing correct output for that period, and they had to find other ways to get the data back. This user's logical replication issues standard SQL statements to other relational databases using a plugin they developed themselves. And they have thousands of tables to replicate. So I think this patch is appropriate in this scenario. As for the matching problem between publishers and subscribers, I'm afraid it's hard to solve here. If this is not a suitable modification, I can withdraw it and see if there's a better way. If necessary, I'll modify it again. Rebased onto the master branch. regards.
abs function for interval
Hi, Sometimes you want to answer whether the difference between two timestamps is less than x minutes, but you are not sure which timestamp is greater than the other (so you cannot always arrange for a positive result). If you cannot obtain the absolute value of the subtraction, you have to add two conditions. The attached patch implements an abs function and an @ operator for intervals. The following example illustrates the use case: postgres=# create table xpto (a timestamp, b timestamp); CREATE TABLE postgres=# insert into xpto (a, b) values(now(), now() - interval '1 day'),(now() - interval '5 hour', now()),(now() + '3 hour', now()); INSERT 0 3

postgres=# select *, a - b as t from xpto;
             a              |             b              |     t
----------------------------+----------------------------+-----------
 2019-10-31 22:43:30.601861 | 2019-10-30 22:43:30.601861 | 1 day
 2019-10-31 17:43:30.601861 | 2019-10-31 22:43:30.601861 | -05:00:00
 2019-11-01 01:43:30.601861 | 2019-10-31 22:43:30.601861 | 03:00:00
(3 rows)

postgres=# select *, a - b as i from xpto where abs(a - b) < interval '12 hour';
             a              |             b              |     i
----------------------------+----------------------------+-----------
 2019-10-31 17:43:30.601861 | 2019-10-31 22:43:30.601861 | -05:00:00
 2019-11-01 01:43:30.601861 | 2019-10-31 22:43:30.601861 | 03:00:00
(2 rows)

postgres=# select @ interval '1 years -2 months 3 days 4 hours -5 minutes 6.789 seconds' as t;
              t
-----------------------------
 10 mons 3 days 03:55:06.789
(1 row)

-- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento From b11a05e3304250803c7aa2ac811e0d49b0adfc00 Mon Sep 17 00:00:00 2001 From: Euler Taveira Date: Thu, 31 Oct 2019 23:07:00 -0300 Subject: [PATCH] Add abs function for interval. Sometimes you want to answer if a difference between two timestamps is less than x minutes. However, if you cannot obtain the absolute value of the subtraction, you have to add two conditions. Let's make it simple and add abs function and @ operator for intervals. --- doc/src/sgml/func.sgml | 19 +++ src/backend/utils/adt/timestamp.c | 17 + src/include/catalog/pg_operator.dat| 3 +++ src/include/catalog/pg_proc.dat| 6 ++ src/test/regress/expected/interval.out | 8 src/test/regress/sql/interval.sql | 4 6 files changed, 57 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 28eb322f3f..9882742aba 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -7370,6 +7370,12 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}'); interval '1 hour' / double precision '1.5' interval '00:40:00' + + + @ +@ interval '-2 hour' +interval '02:00:00' + @@ -7391,6 +7397,19 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}'); + abs + + abs(interval) + +interval +Absolute value +abs(interval '6 days -08:16:27') +6 days 08:16:27 + + + + + age age(timestamp, timestamp) diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c index 1dc4c820de..a6b8b8c221 100644 --- a/src/backend/utils/adt/timestamp.c +++ b/src/backend/utils/adt/timestamp.c @@ -2435,6 +2435,23 @@ interval_cmp(PG_FUNCTION_ARGS) PG_RETURN_INT32(interval_cmp_internal(interval1, interval2)); } +Datum +interval_abs(PG_FUNCTION_ARGS) +{ + Interval *interval = PG_GETARG_INTERVAL_P(0); + Interval *result; + + result = palloc(sizeof(Interval)); + *result = *interval; + + /* convert all struct Interval members to absolute values */ + result->month = (interval->month < 0) ? (-1 * interval->month) : interval->month; + result->day = (interval->day < 0) ? (-1 * interval->day) : interval->day; + result->time = (interval->time < 0) ? 
(-1 * interval->time) : interval->time; + + PG_RETURN_INTERVAL_P(result); +} + /* * Hashing for intervals * diff --git a/src/include/catalog/pg_operator.dat b/src/include/catalog/pg_operator.dat index fa7dc96ece..09ce9f2765 100644 --- a/src/include/catalog/pg_operator.dat +++ b/src/include/catalog/pg_operator.dat @@ -2164,6 +2164,9 @@ { oid => '1803', descr => 'subtract', oprname => '-', oprleft => 'timetz', oprright => 'interval', oprresult => 'timetz', oprcode => 'timetz_mi_interval' }, +{ oid => '8302', descr => 'absolute value', + oprname => '@', oprkind => 'l', oprleft => '0', oprright => 'interval', + oprresult => 'interval', oprcode => 'interval_abs' }, { oid => '1804', descr => 'equal', oprname => '=', oprcanmerge => 't', oprleft => 'varbit', oprright => 'varbit', diff --git a/src/include/c
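For comparison, a rough sketch of the two-condition workaround mentioned at the top, against the same xpto table; it needs no abs() but states the bound twice:

select *, a - b as i
  from xpto
 where a - b < interval '12 hour'
   and b - a < interval '12 hour';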
Re: abs function for interval
Hi,

On 2019-10-31 23:20:07 -0300, Euler Taveira wrote:
> diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
> index 1dc4c820de..a6b8b8c221 100644
> --- a/src/backend/utils/adt/timestamp.c
> +++ b/src/backend/utils/adt/timestamp.c
> @@ -2435,6 +2435,23 @@ interval_cmp(PG_FUNCTION_ARGS)
>  	PG_RETURN_INT32(interval_cmp_internal(interval1, interval2));
>  }
>
> +Datum
> +interval_abs(PG_FUNCTION_ARGS)
> +{
> +	Interval   *interval = PG_GETARG_INTERVAL_P(0);
> +	Interval   *result;
> +
> +	result = palloc(sizeof(Interval));
> +	*result = *interval;
> +
> +	/* convert all struct Interval members to absolute values */
> +	result->month = (interval->month < 0) ? (-1 * interval->month) : interval->month;
> +	result->day = (interval->day < 0) ? (-1 * interval->day) : interval->day;
> +	result->time = (interval->time < 0) ? (-1 * interval->time) : interval->time;
> +
> +	PG_RETURN_INTERVAL_P(result);
> +}
> +

Several points:

1) I don't think you can do the < 0 check on an elementwise basis. Your
   code would e.g. make a hash out of abs('1 day -1 second'), by
   inverting the second, but not the day (whereas nothing should be
   done).

   It'd probably be easiest to implement this by comparing with a 0
   interval using interval_lt() or interval_cmp_internal().

2) This will not correctly handle overflows, I believe. What happens if
   you do SELECT abs('-2147483648 days'::interval)? You probably should
   reuse interval_um() for this.

> --- a/src/test/regress/expected/interval.out
> +++ b/src/test/regress/expected/interval.out
> @@ -927,3 +927,11 @@ select make_interval(secs := 7e12);
>  @ 19 hours 26 mins 40 secs
> (1 row)
>
> +-- test absolute operator
> +set IntervalStyle to postgres;
> +select @ interval '1 years -2 months 3 days 4 hours -5 minutes 6.789 seconds' as t;
> +              t
> +-----------------------------
> + 10 mons 3 days 03:55:06.789
> +(1 row)
> +
> diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
> index bc5537d1b9..8f9a2bda29 100644
> --- a/src/test/regress/sql/interval.sql
> +++ b/src/test/regress/sql/interval.sql
> @@ -308,3 +308,7 @@ select make_interval(months := 'NaN'::float::int);
>  select make_interval(secs := 'inf');
>  select make_interval(secs := 'NaN');
>  select make_interval(secs := 7e12);
> +
> +-- test absolute operator
> +set IntervalStyle to postgres;
> +select @ interval '1 years -2 months 3 days 4 hours -5 minutes 6.789 seconds' as t;
> --
> 2.11.0

This is not even remotely close to enough tests. In your only test,
abs() does not change the value, as there's no negative component (the
1 year -2 months fold into a positive 10 months, and the hours, minutes
and seconds get folded together too). At the very least a few boundary
conditions need to be tested (see point 2 above), a few more
complicated cases with different components being of different signs,
and you need to show the values with and without applying abs().

Greetings,

Andres Freund
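For illustration, the kind of additional coverage being requested might look like this (a sketch of candidate test cases, not the author's actual tests):

-- mixed-sign components: already positive overall, abs() must not touch it
select interval '1 day -1 second' as raw, abs(interval '1 day -1 second');
-- genuinely negative interval, shown with and without abs()
select interval '-10 mons -3 days' as raw, abs(interval '-10 mons -3 days');
-- boundary condition from point 2: should error out rather than wrap around
select abs('-2147483648 days'::interval);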
Re: The command tag of "ALTER MATERIALIZED VIEW RENAME COLUMN"
On Fri, Nov 1, 2019 at 6:34 AM Ibrar Ahmed wrote:
>
> On Thu, Oct 31, 2019 at 6:56 PM Tom Lane wrote:
>>
>> Fujii Masao writes:
>>> ... I found that the command tag of
>>> ALTER MATERIALIZED VIEW RENAME COLUMN is "ALTER TABLE", not "ALTER VIEW".
>>>
>>> =# ALTER MATERIALIZED VIEW hoge RENAME COLUMN j TO x;
>>> ALTER TABLE
>>>
>>> Is this intentional? Or a bug?
>>
>> Seems like an oversight.

Thanks for the check!

> The same issue exists with ALTER FOREIGN TABLE

Yes.

> Attached patch fixes that for ALTER VIEW, ALTER MATERIALIZED VIEW and
> ALTER FOREIGN TABLE

You introduced a subtype in your patch, but I think it's better and simpler to just give relationType to AlterObjectTypeCommandTag() when the columns are being renamed (i.e., renameType = OBJECT_COLUMN).

To avoid this kind of command-tag oversight in the future, I'd like to add regression tests that verify SQL commands return valid and correct command tags. But currently there seems to be no mechanism for such a test in the regression suite, right? Maybe we will need that mechanism.

Regards,

--
Fujii Masao

cmdtag_of_alter_mv_v1.patch
Description: Binary data
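A sketch of how that simpler approach could look in CreateCommandTag() in src/backend/tcop/utility.c, illustrating the suggestion rather than the committed patch (RenameStmt already carries a relationType field for column renames):

case T_RenameStmt:
	{
		RenameStmt *stmt = (RenameStmt *) parsetree;

		/*
		 * When a column is renamed, take the tag from the type of the
		 * relation it belongs to (table, view, materialized view,
		 * foreign table), not from OBJECT_COLUMN itself.
		 */
		tag = AlterObjectTypeCommandTag(stmt->renameType == OBJECT_COLUMN ?
										stmt->relationType :
										stmt->renameType);
	}
	break;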
Re: Restore replication settings when modifying a field type
On Mon, Oct 28, 2019 at 1:41 AM, Kyotaro Horiguchi wrote:
>
> At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang wrote in
>> In fact, the replication property of the table has not been modified,
>> and it is still 'i' (REPLICA_IDENTITY_INDEX). But the previously
>> specified index property 'indisreplident' is set to false because of
>> the rebuild.
>
> I suppose that the behavior is intended. Change of column types on the
> publisher side can break the agreement on replica identity with
> subscribers. Thus replica identity setting cannot be restored
> unconditionally. For a (somewhat artificial :p) example:
>
I don't think so. The actual logical replication behavior is that DDL always breaks replication. If you add or drop a column, replication for that table stops until you execute the same DDL on the subscriber. What happens in the OP's case is that a DDL *silently* breaks logical replication. IMHO it is a bug. If the behavior were intended, it should clear pg_class.relreplident, but it does not.

> Explicit setting of replica identity presumes that users are sure the
> setting works correctly. Implicit rebuilding after a type change can
> silently break it.
>
The current behavior forces the OP to execute 2 DDLs in the same transaction to ensure that it won't "lose" transactions for logical replication.

> At least we need to guarantee that the restored replica identity
> setting is truly compatible with all existing subscribers. I'm not
> sure about potential subscribers..
>
Why? Replication will break, and to fix it you apply the same DDL you applied on the publisher. That is the right thing to do.

[poking the code...]

ATExecAlterColumnType records everything that depends on the column, and for indexes it saves the definition (via pg_get_indexdef_string). The definition is not sufficient for reconstructing the replica identity information, because there is no such keyword for replica identity in CREATE INDEX. The new index should call relation_mark_replica_identity to fix pg_index.indisreplident.

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
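If the rebuild itself is to restore the setting, the fix outlined above might look roughly like this inside tablecmds.c, where relation_mark_replica_identity() is visible (a sketch; wasReplicaIdentity and newIndexId are hypothetical names for state the rebuild code would need to carry):

/*
 * Sketch: after the index dropped by ATExecAlterColumnType has been
 * recreated, re-mark it as the replica identity instead of silently
 * losing the setting.
 */
if (wasReplicaIdentity)
	relation_mark_replica_identity(rel, REPLICA_IDENTITY_INDEX,
								   newIndexId, true);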
Re: Allow CREATE OR REPLACE VIEW to rename the columns
On Thu, Oct 31, 2019 at 10:54 PM Tom Lane wrote:
>
> Fujii Masao writes:
>> On Thu, Oct 31, 2019 at 1:42 PM Tom Lane wrote:
>>> Fujii Masao writes:
>>>> Currently CREATE OR REPLACE VIEW command fails if the column names
>>>> are changed.
>>
>>> That is, I believe, intentional. It's an effective aid to catching
>>> mistakes in view redefinitions, such as misaligning the new set of
>>> columns relative to the old. [example]
>>
>> This example makes me wonder if the addition of a column by
>> CREATE OR REPLACE VIEW also has the same (or even worse) issue.
>> That is, it may increase the opportunity for users' mistakes.
>
> The idea in CREATE OR REPLACE VIEW is to allow addition of new
> columns at the end, the same as you can do with tables. Checking
> the column name matchups is a way to ensure that you actually do
> add at the end, rather than insert, which wouldn't act as you
> expect. Admittedly it's only heuristic.
>
> We could, perhaps, have insisted that adding a column also requires
> special syntax, but we didn't. Consider for example a case where
> the new column needs to come from an additionally joined table;
> then you have to be able to edit the underlying view definition along
> with adding the column. So that seems like kind of a pain in the
> neck to insist on.
>
>> But what's worse is that currently there is no way to drop a column
>> from a view, except recreation of the view.
>
> I think this has been discussed, as well. It's not necessarily
> simple to drop a view output column. For example, if the view
> uses SELECT DISTINCT, removing an output column would have
> semantic effects on the set of rows that can be returned, since
> distinct-ness would now mean something else than it did before.
>
> It's conceivable that we could enumerate all such effects and
> then allow DROP COLUMN (probably replacing the output column
> with a null constant) if none of them apply, but I can't get
> terribly excited about it. The field demand for such a feature
> has not been high. I'd be a bit worried about bugs arising
> from failures to check attisdropped for views, too; so that
> the cost of getting this working might be greater than it seems
> on the surface.

Thanks for the explanation! Understood.

Regards,

--
Fujii Masao
Re: abs function for interval
On Thu, Oct 31, 2019 at 11:45 PM, Andres Freund wrote:
>
> 1) I don't think you can do the < 0 check on an elementwise basis. Your
>    code would e.g. make a hash out of abs('1 day -1 second'), by
>    inverting the second, but not the day (whereas nothing should be
>    done).
>
>    It'd probably be easiest to implement this by comparing with a 0
>    interval using interval_lt() or interval_cmp_internal().
>
Hmm. Good idea. Let me try it.

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
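One possible shape of that fix, also folding in the overflow point from the review: compare against a zero interval, and let the existing unary-minus function handle negation, since it already raises an error on overflow. A sketch, not the final patch:

Datum
interval_abs(PG_FUNCTION_ARGS)
{
	Interval   *interval = PG_GETARG_INTERVAL_P(0);
	Interval	zero = {0, 0, 0};

	/*
	 * Negate only when the interval as a whole sorts below zero;
	 * interval_um() errors out on overflow (e.g. for INT_MIN days)
	 * instead of wrapping around.
	 */
	if (interval_cmp_internal(interval, &zero) < 0)
		PG_RETURN_DATUM(DirectFunctionCall1(interval_um,
											IntervalPGetDatum(interval)));

	PG_RETURN_INTERVAL_P(interval);
}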
Re: Problem with synchronous replication
On Oct 31, 2019, at 10:11, Michael Paquier wrote:
>
> On Wed, Oct 30, 2019 at 05:43:04PM +0900, Kyotaro Horiguchi wrote:
>> At Wed, 30 Oct 2019 17:21:17 +0900, Fujii Masao wrote in
>>> This change causes every ending backend to always take the exclusive
>>> lock even when it's not in the SyncRep queue. This may be problematic,
>>> for example, when terminating multiple backends at the same time? If
>>> yes, it might be better to check SHMQueueIsDetached() again after
>>> taking the lock. That is,
>>
>> I'm not sure how much that harms, but double-checked locking
>> (releasing) is simple enough for reducing possible congestion here, I
>> think.
>
> FWIW, I could not measure any actual difference with pgbench -C, up to
> 500 sessions with an input file of just one meta-command, and -c 20.
>
> I have added some comments in SyncRepCleanupAtProcExit(), and adjusted
> the patch with the suggestion from Fujii-san. Any comments?

Thanks for the patch. Looks good to me, +1.

Regards,

--
Dongming Liu
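For readers following along, a minimal sketch of the double-checked pattern under discussion, as it could appear in SyncRepCleanupAtProcExit():

void
SyncRepCleanupAtProcExit(void)
{
	/*
	 * First check without the lock: most exiting backends were never in
	 * the sync-rep queue, so they can skip taking SyncRepLock entirely.
	 */
	if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
	{
		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);

		/* Recheck under the lock: a walsender may have removed us. */
		if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
			SHMQueueDelete(&(MyProc->syncRepLinks));

		LWLockRelease(SyncRepLock);
	}
}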
Re: progress report for ANALYZE
Hi vignesh!

On 2019/09/17 20:51, vignesh C wrote:
> On Thu, Sep 5, 2019 at 2:31 AM Alvaro Herrera wrote:
>> There were some minor problems in v5 -- bogus Docbook as well as
>> outdated rules.out, small "git diff --check" complaint about
>> whitespace. This v6 (on today's master) fixes those, no other changes.
>
> +
> + The command is preparing to begin scanning the heap. This phase is
> + expected to be very brief.
> +
>
> In the above, after "." there is an extra space. Is this intentional?
> I noticed that in a lot of places there are a couple of spaces and
> sometimes a single space across this file. Like in below, there is a
> single space after ".":
>
> Time when this process' current transaction was started, or null if no
> transaction is active. If the current query is the first of its
> transaction, this column is equal to the query_start column.

Sorry for the late reply.
Probably it is intentional, because there are many extra spaces in other documents. See below:

# Results of grep
=======================
$ grep '\.  ' doc/src/sgml/monitoring.sgml | wc -l
114
$ grep '\.  ' doc/src/sgml/information_schema.sgml | wc -l
184
$ grep '\.  ' doc/src/sgml/func.sgml | wc -l
577
=======================

Therefore, I'm going to leave it as is. :)

Thanks,
Tatsuro Yamada
On disable_cost
Hi,

Postgres has a global variable `disable_cost`, set to 1.0e10. This value is added to the cost of a path when the related GUC is set off. For example, if enable_nestloop is set off, when the planner tries to add a nestloop join path, it still adds the path, but with the huge `disable_cost` folded in.

But 1.0e10 may not be large enough. I encountered this issue in Greenplum (which is based on Postgres). Heikki told me that someone also hit the same issue on Postgres, so I'm raising it here for discussion.

My issue: I ran some spikes and tests on TPC-DS at the 1 TB scale. For query 104, the planner generates a nestloop join even with enable_nestloop set off, and the final plan's total cost is very large (about 1e24). But if I enlarge disable_cost to 1e30, the planner generates a hash join instead. So I guess that disable_cost is not large enough for huge amounts of data.

It is tricky to set disable_cost to a huge number. Can we come up with a better solution?

The following thoughts are from Heikki:

> Aside from not having a large enough disable cost, there's also the
> fact that the high cost might affect the rest of the plan, if we have
> to use a plan type that's disabled. For example, if a table doesn't
> have any indexes, but enable_seqscan is off, we might put the
> unavoidable Seq Scan on a different side of a join than we would with
> enable_seqscan=on, because of the high cost estimate.
>
> I think a more robust way to disable forbidden plan types would be to
> handle the disabling in add_path(). Instead of having a high disable
> cost on the Path itself, the comparison in add_path() would always
> consider disabled paths as more expensive than others, regardless of
> the cost.

Any thoughts or ideas on the problem? Thanks!

Best Regards,
Zhenghua Lyu
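To make the add_path() idea concrete, one sketch is to carry a disabled flag on each path and compare it before cost, so that a disabled path loses every comparison without distorting the cost arithmetic of the rest of the plan. Here is_disabled is a hypothetical field; today's Path struct has no such member:

/*
 * Sketch: rank paths by (is_disabled, cost) lexicographically, so a
 * disabled path is worse than any enabled one no matter how cheap.
 */
static int
compare_paths_with_disable(Path *path1, Path *path2)
{
	if (path1->is_disabled != path2->is_disabled)
		return path1->is_disabled ? +1 : -1;

	/* same disabled status: fall back to the usual cost comparison */
	return compare_path_costs(path1, path2, TOTAL_COST);
}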
Re: On disable_cost
On Fri, Nov 1, 2019 at 7:42 PM Zhenghua Lyu wrote:
> It is tricky to set disable_cost to a huge number. Can we come up with
> a better solution?

What happens if you use DBL_MAX?
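Presumably nothing good, one would guess: costs are combined by addition, so with disable_cost = DBL_MAX a plan containing two disabled nodes already overflows to infinity, after which all such plans compare equal. A tiny standalone C illustration of that hazard:

#include <float.h>
#include <stdio.h>

int
main(void)
{
	double		disable_cost = DBL_MAX;

	/* two disabled nodes in one plan: the sum overflows to +inf */
	double		plan_a = disable_cost + disable_cost + 100.0;
	double		plan_b = disable_cost + disable_cost + 1e9;

	/* both print as inf, and equal = 1: the planner can't tell them apart */
	printf("plan_a = %g, plan_b = %g, equal = %d\n",
		   plan_a, plan_b, plan_a == plan_b);
	return 0;
}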