Re: patch to ensure logical decoding errors early
On 1 August 2018 at 23:11, Alvaro Herrera wrote:
> On 2018-Aug-01, Dave Cramer wrote:
>
>> See attached patch which fixes it, and adds a test for it.
>
> Pushed, thanks.

Thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
FW: [Todo item] Add entry creation timestamp column to pg_stat_replication
I changed the field name from 'reply_time' to 'last_msg_send_time', because 'last_msg_send_time' is used in the pg_stat_wal_receiver and pg_stat_subscription views. I think that field has the same meaning.

test example>
postgres=# select pid, last_msg_send_time from pg_stat_replication;
-[ RECORD 1 ]------+------------------------------
pid                | 12015
last_msg_send_time | 2018-08-02 18:02:49.233049+09
-[ RECORD 2 ]------+------------------------------
pid                | 12084
last_msg_send_time | 2018-08-02 18:02:48.583256+09

I attached a new patch file: 0001-Implement-following-TODO-list-item-v2.patch

Feedback and suggestions are very welcome. Thanks!

Best regards,
Myungkyu, Lim

----- Original Message -----
Date: 2018-07-31 17:56 (GMT+9)
Title: [Todo item] Add entry creation timestamp column to pg_stat_replication

Hello hackers,

I have worked on the following todo list item:
- Add entry creation timestamp column to pg_stat_replication
  http://archives.postgresql.org/pgsql-hackers/2011-08/msg00694.php

This item looks simple because the necessary data already exists, so I wrote a prototype patch.

test example>
postgres=# select pid, reply_time from pg_stat_replication;
-[ RECORD 1 ]----------------------------
pid        | 4817
reply_time | 2018-07-31 12:00:53.911198+09
-[ RECORD 2 ]----------------------------
pid        | 4819
reply_time | 2018-07-31 12:00:53.911154+09

Several candidates exist for the field name:
- reply_timestamp
- info_gen_timestamp
- stats_reset
- last_msg_send_time

Feedback and suggestions are very welcome. Thanks!

Best regards,
Myungkyu, Lim

Attachments: 0001-Implement-following-TODO-list-item-v1.patch, 0001-Implement-following-TODO-list-item-v2.patch
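As a rough illustration only: assuming the patched view exposes last_msg_send_time as proposed above, a monitoring query could use it to gauge how stale each walsender's last message is (the column alias below is made up for the example).

select pid, application_name, now() - last_msg_send_time as time_since_last_msg
  from pg_stat_replication;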
Re: [HACKERS] Parallel Append implementation
On 08/01/2018 03:14 PM, Robert Haas wrote:
> Committed to master and v11. Thanks for the review.

Thanks!
Re: Explain buffers wrong counter with parallel plans
On Thu, Aug 2, 2018 at 10:26 AM, Amit Kapila wrote:
> On Thu, Aug 2, 2018 at 8:38 AM, Andres Freund wrote:
>> Hi,
>>
>> On 2018-08-02 08:21:58 +0530, Amit Kapila wrote:
>>> I think something along the lines of what Tom and you are suggesting can be
>>> done with the help of EXEC_FLAG_BACKWARD, but I don't see the need to
>>> do anything for this patch. The change in nodeLimit.c is anyway for
>>> forward scans, so there shouldn't be any need for any other check.
>>
>> I think this is almost a guarantee to introduce bugs in the future. And
>> besides that, as Robert points out, it's essentially an existing bug for
>> custom scans. Given that EXEC_FLAG_BACKWARD already exists, why not do
>> the right thing here?
>
> Sure, if we want we can do it in this patch, but this is not the
> problem of this patch. It is also related to the existing usage of
> shutdown callbacks. I think we can use es_top_eflags in EState to
> detect it and then call shutdown only if EXEC_FLAG_BACKWARD is not
> set. I think we should do that as a separate patch either before or
> after this patch rather than conflating that change into this patch.
> IIUC, then Robert also said that we should fix that separately. I
> will do as per whatever consensus we reach here.

I have created three patches (a) move InstrStartParallelQuery from its original location so that we perform it just before ExecutorRun (b) patch to fix the gather stats by calling shutdown at the appropriate place and allow stats collection in ExecShutdownNode (c) Prohibit calling ExecShutdownNode if there is a possibility of backward scans (I have done some basic tests with this patch; if we decide to proceed with it, then some more verification and testing would be required).

I think we should commit the first two patches as that fixes the problem being discussed in this thread and then do some additional verification for the third patch (mentioned in option c). I can understand if people want to commit the third patch before the second patch, so let me know what you guys think.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments: postpone_buffer_usage_tracking_v1.patch, fix_gather_stats_v2.patch, prohibit_shutdown_backward_scans_v1.patch
Re: Have an encrypted pgpass file
On Tue, 24 Jul 2018 at 11:25, Marco van Eck wrote:
> Indeed having unencrypted password lying (.pgpass or PGPASSWORD or -W)
> around is making my auditors unhappy,

With the greatest of respect, perhaps you need to get auditors who understand crypto better. Having a user that has the minimal permissions to perform the required tasks, with a stored password that only the automation user can read, is perfectly valid. Encrypting it with a key that must (perforce) be accessible using the same permissions that the user would need in order to read the unencrypted password file is no more valid (look up "security through obscurity").

Perhaps you could make your auditors happier by restricting that user's permissions to only run a defined function, and make that function do the work that the automation script wants? So even if the attacker can access the password he will still only be able to run that function? (You could even add DOS protection into the function to ensure it's only run so often, if you were worried about that.)

Geoff
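A minimal sketch of that suggestion, with entirely hypothetical names: the automation role can log in with the stored password, but the only thing it is allowed to do is execute one predefined function.

create role automation login password 'stored-in-pgpass';

create function run_nightly_task() returns void
language plpgsql security definer as $$
begin
  -- the real maintenance work would go here
  perform 1;
end;
$$;

-- functions are executable by PUBLIC by default, so lock that down
revoke all on function run_nightly_task() from public;
grant execute on function run_nightly_task() to automation;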
Re: New Defects reported by Coverity Scan for PostgreSQL
On 08/01/2018 04:07 PM, Emre Hasegeli wrote:
>> I think there are three different things that need to be addressed:
>>
>> * Underspecified comments.
>
> I agree. My patch added comments, the next one with actual fixes adds
> more. I just didn't want to invest even more time on them while the
> code is in its current shape.
>
>> * The function names and argument names are badly chosen IMO, because
>> even granted a convention such as the above, it's not very obvious what
>> roles "l1" and "l2" play. I'm not exactly sure what would be better, but
>> if you used names like "ofseg" and "otherseg" you'd at least be trying.
>> I'd go with an asymmetrical function name too, to make miswriting of
>> calls less likely.
>
> Good idea. I haven't thought about that, but now such names make more
> sense to me. I will prepare another patch to improve the naming and
> comments to be applied on top of my current patches.
>
>> * And lastly, are we sure there aren't actual *bugs* here? I'd initially
>> supposed that lseg_closept_lseg acted as Emre says above, but reading
>> the code makes me think it's the other way around. Its first two
>> potential assignments to *result are definitely assigning points on l2
>> not l1.
>
> Yes, it is wrong beyond understanding. The committed patch intended to
> keep answers as they were. The next one actually fixes those.

OK, so I think we should not do anything about the issues reported by Coverity until we commit all the patches, which may resolve some of those (pre-existing) issues, for example.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Have an encrypted pgpass file
On Thu, 2 Aug 2018 at 10:41, I wrote:
> Perhaps you could make your auditors happier by restricting that user's
> permissions to only run a defined function, and make that function do the
> work that the automation script wants? So even if the attacker can access
> the password he will still only be able to run that function? (You could
> even add DOS protection into the function to ensure it's only run so often,
> if you were worried about that.)

I realise (of course, after I sent this) that I misunderstood the thrust of your requirement, and that you want the ability to log in as *your own user* without entering your own password. Apologies. Ignore me.

Geoff
Re: [PATCH] Improve geometric types
On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
>> Ah, so there's an assumption that NaNs are handled earlier and never
>> reach this place? That's probably a safe assumption. I haven't thought
>> about that, it simply seemed suspicious that the code mixes direct
>> comparisons and float8_mi() calls.
>
> The comparison functions handle NaNs. The arithmetic functions handle
> returning error on underflow, overflow and division by zero. I assumed
> we want to return error on those in any case, but we don't want to
> handle NaNs at every place.
>
>> Not sure, I'll leave that up to you. I don't mind doing it in a
>> separate patch (I'd probably prefer that over mixing it into an
>> unrelated patch).
>
> It is attached separately.

OK, thanks. So, have we reached a conclusion about all the bits I mentioned on 7/31? The delta and float8/double cast are fixed, and for computeDistance (i.e. doing comparisons directly or using float8_lt), the code may seem a bit inconsistent, but it is in fact correct as the NaNs are handled elsewhere. That seems reasonable, but perhaps a comment pointing that out would be nice.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Ideas for a relcache test mode about missing invalidations
Hello.

At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund wrote in <20180801162518.jnb2ql5dfmgwp...@alap3.anarazel.de>
> Hi,
>
> The issue at [1] is caused by missing invalidations, and [2] seems like
> a likely candidate too. I wonder if it'd be good to have a relcache test
> mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries
> to ensure that we've done sufficiently to ensure the right invalidations
> are sent.
>
> I think what we'd kind of want is to ensure that relcache entries are
> rebuilt at the earliest possible time, but *not* later. That'd mean
> they're out of date if there's missing invalidations. Unfortunately I'm
> not clear on how that'd be achievable? Ideas?
>
> The best I can come up with is to code some additional dependencies into
> CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the
> right messages. But that seems somewhat painful and filled with holes.
>
> [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us

As for [1], it is not an issue of invalidation. It happens also if the relation has any index, and even the drop is not needed. The following steps are sufficient:

create table t( pk serial, t text );
insert into t( t ) values( 'hello' ), ('world');
create index idx_fake0 on t (pk);
create index idx_fake on t ( f_fake( pk ) ); -- ERROR

index_create() creates a new pg_index entry with indislive = true before building it. So the planner (RelationGetIndexList) sees the not-yet-populated index while planning the query in f_fake().

The attached patch fixes the issue, but I'm not sure this story is applicable to [2].

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

>From eaa5de68ff27bd43a643089f8c5963e5cc3d20cc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Thu, 2 Aug 2018 18:52:24 +0900
Subject: [PATCH] Don't use currently-building index to populate it

index_create() creates a pg_index entry with indislive = true before populating it. If it is a function index where the function runs a query on the parent relation, the planner sees the just-created entry and tries to access the heap page that is not created yet. This patch makes index_create() not set indislive = true until after population.
---
 src/backend/catalog/index.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8b276bc430..b561c8696d 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -124,6 +124,7 @@ static void UpdateIndexRelation(Oid indexoid, Oid heapoid,
 			bool immediate,
 			bool isvalid,
 			bool isready);
+static void ActivateIndexRelation(Oid indexoid);
 static void index_update_stats(Relation rel,
 			bool hasindex,
 			double reltuples);
@@ -666,7 +667,7 @@ UpdateIndexRelation(Oid indexoid,
 	values[Anum_pg_index_indisvalid - 1] = BoolGetDatum(isvalid);
 	values[Anum_pg_index_indcheckxmin - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indisready - 1] = BoolGetDatum(isready);
-	values[Anum_pg_index_indislive - 1] = BoolGetDatum(true);
+	values[Anum_pg_index_indislive - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indisreplident - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indkey - 1] = PointerGetDatum(indkey);
 	values[Anum_pg_index_indcollation - 1] = PointerGetDatum(indcollation);
@@ -693,6 +694,55 @@ UpdateIndexRelation(Oid indexoid,
 	heap_freetuple(tuple);
 }
 
+/*
+ * ActivateIndexRelation
+ *
+ * Publish index by marking it "relislive"
+ *
+ * UpdateIndexRelation builds an index relation with relislive = false so as
+ * not to be used by the queries used to build the index. This function
+ * marks the index as "live" so that it can be used hereafter.
+ *
+ */
+static void
+ActivateIndexRelation(Oid indexoid)
+{
+	Relation	indrel;
+	SysScanDesc	sdesc;
+	ScanKeyData	skey;
+	HeapTuple	tup;
+
+	ScanKeyInit(&skey,
+				Anum_pg_index_indexrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexoid));
+	indrel = heap_open(IndexRelationId, RowExclusiveLock);
+	sdesc = systable_beginscan(indrel, InvalidOid, true, NULL, 1, &skey);
+
+	/*
+	 * We must see one and only one entry for the key. But don't bother
+	 * checking that.
+	 */
+	while (HeapTupleIsValid(tup = systable_getnext(sdesc)))
+	{
+		Datum	values[Natts_pg_index];
+		bool	nulls[Natts_pg_index];
+		bool	replaces[Natts_pg_index];
+
+		MemSet(values, 0, sizeof(values));
+		MemSet(nulls, 0, sizeof(nulls));
+		MemSet(replaces, 0, sizeof(replaces));
+		values[Anum_pg_index_indislive] = BoolGetDatum(true);
+		replaces[Anum_pg_index_indislive] = true;
+		tup = heap_modify_tuple(tup, RelationGetDescr(indrel),
+								values, nulls, repla
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
(2018/08/02 2:44), Robert Haas wrote:
> Tom, Ashutosh, and I all seem to agree that we shouldn't try to
> re-jigger things at create-plan time. I think that a 3-1 consensus
> against your proposal is sufficient to say we shouldn't go that way.

Agreed.

> Now, here you've got a new approach which avoids that, which I have not
> yet reviewed. I'll try to spend some time on it this afternoon, but
> really, I think it's too late for this. This bug was reported in
> February, and we're supposed to be pushing 11 final out the door in not
> much more than a month. Proposing a new approach in August is not good.

Agreed.

> Can't we just do what Ashutosh proposed for now and revisit this for v12?

I think that may be possible.

Best regards,
Etsuro Fujita
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
(2018/08/02 4:30), Robert Haas wrote:
> On Wed, Aug 1, 2018 at 7:44 AM, Etsuro Fujita wrote:
>> I updated the patch that way. Updated patch attached. I fixed a bug and
>> did a bit of cleanups as well.
>
> Looking this over from a technical point of view, I think it's better
> than what you proposed before but I still don't agree with the approach.
> I don't think there is any getting around the fact that converted
> whole-row vars are going to result in some warts. Ashutosh inserted most
> of the necessary warts in the original partition-wise join patch, but
> missed some cases in postgres_fdw; his approach is, basically, to add
> the matching warts there. Your approach is to rip out all of those warts
> and insert different ones. You've simplified build_tlist_index_other_vars()
> and add_placeholders_to_joinrel() and build_joinrel_tlist() to basically
> what they were before partition-wise join went in. On the other hand,
> RelOptInfo grows three new fields,

Sorry, I forgot to remove one of the fields that isn't needed anymore. RelOptInfo needs two new fields, in the new approach.

> set_append_rel_size() ends up more complicated than it was before when
> you include the node code added in the build_childrel_tlist function it
> calls, build_child_joinrel() has a different set of complications, and
> most importantly create_append_path() and create_merge_append_path() now
> do surgery on their input paths that knows specifically about the
> converted-whole-row var case. I would be glad to be rid of the stuff
> that you're ripping out, but in terms of complexity, I don't think we
> really come out ahead with the stuff you're adding.

The new approach seems to me more localized than what Ashutosh had proposed. One obvious advantage of the new approach is that we no longer need changes to postgres_fdw for it to work for partitionwise joins, which would apply for any other FDWs, making the FDW authors' life easy.

> I'm also still concerned about the design. In your old approach, you
> were creating the paths with the wrong target list and then fixing it
> when you turned the paths into a plan. In your new approach, you're
> still creating the paths with the wrong target list, but now you're
> fixing it when you put an Append or MergeAppend path on top of them. I
> still think it's a bad idea to have anything other than the correct
> paths in the target list from the beginning.

I don't think so; actually, the child's targetlist would not have to be the same as the parent's until the child gets in under the parent's Append or MergeAppend. So, I modified the child's target list so as to make it easy to join the child relations.

> For one thing, what happens if no Append or MergeAppend is ever put on
> top of them? One way that can happen today is partition-wise aggregate,
> although I haven't been able to demonstrate a bug, so maybe that's
> covered somehow.

Right. In the new approach, the targetlist for the topmost child scan/join is guaranteed to be the same as that for the topmost parent scan/join by the planner work that I added.

> But in general, with your approach, any code that looks at the tlist for
> a child-join has to know that the tlist is to be used as-is *except
> that* ConvertRowtypeExpr may need to be inserted. I think that special
> case could be easy to overlook, and it seems pretty arbitrary.

I think we can check to see whether the conversion is needed, from the flags added to RelOptInfo.

> A tlist is a list of arbitrary expressions to be produced; with this
> patch, we'd end up with the tlist being the list of expressions to be
> produced in every case *except for* child-joins containing
> whole-row-vars. I just can't get behind a strange exception like that.

I have to admit that it's weird to adjust the child's targetlist in that case when putting the child under the parent's Append/MergeAppend, but IMHO I think that would be much more localized.

Thanks for the comments!

Best regards,
Etsuro Fujita
Problem during Windows service start
Hi,

This is my first posting to the mailing list.

Currently our customer uses PostgreSQL 9.5 and hits a problem during Windows service start: the Windows service status of the instance is different from the actual status.

We got the following situation:
1. Register the service with 'pg_ctl register -N "PostgreSQL" -U postgres -P -D D:\data\inst1 -w'
2. Start the instance from the Windows service screen.
3. After 60 seconds, the startup process fails with a timeout, because crash recovery takes a long time.

Then the service status of the instance becomes "STOPPED", but the instance is still running. It cannot be stopped from the Windows service screen (it can be stopped only with pg_ctl).

PostgreSQL version: 9.5.12
Operating system: Windows Server 2012 R2

I think this is a bug. I think it has not been fixed in the latest version; is my understanding correct? If it is correct, I will fix it.

Regards,
SAKAI Teppei
Re: Allow COPY's 'text' format to output a header
Simon Muller wrote: > I changed the error type and message for consistency with other similar > errors in that file. Whenever options are combined that are incompatible, > it looks like the convention is for a ERRCODE_SYNTAX_ERROR to be thrown. That makes sense, thanks for elaborating, although there are also a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c that are raised on forbidden/nonsensical combination of features, so the consistency argument could work both ways. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Re: [HACKERS] Restricting maximum keep segments by repslots
On Tue, Jul 31, 2018 at 9:52 PM, Kyotaro HORIGUCHI wrote:
> I thought it's to be deprecated for some reason so I'm leaving
> wal_keep_segments in '# of segments' even though the new GUC is
> in MB. I'm a bit uneasy that the two similar settings are in
> different units. Couldn't we turn it into MB taking this
> opportunity if we will keep wal_keep_segments, changing its name
> to min_wal_keep_size? max_slot_wal_keep_size could be changed to
> just max_wal_keep_size along with it.

This seems like it's a little bit of a separate topic from what this thread is about, but FWIW, +1 for standardizing on MB.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Should contrib modules install .h files?
On Tue, Jul 31, 2018 at 5:53 PM, Tom Lane wrote:
> By my count of people expressing opinions, we had Michael -1, Stephen -1,
> me -0.1 or so, Alvaro +1, Peter -1, presumably +1 from Andrew; and Andres
> made a comment about not waiting, which perhaps Andrew read as a +1 for
> backpatching. So it's not unreasonable for Andrew to have decided that
> it was about tied. Nonetheless, it does seem like a feature and we're
> past feature freeze, so the default assumption ought to be "no backpatch"
> IMO.

Yeah, I would have voted -1 if I'd realized that it was close. Now we're in a situation where we have a patch not everyone likes not only in master (which is OK, because we've got a year to decide whether to change anything) but also in v11 (where we have a lot less time). That's not so great.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Stored procedures and out parameters
Apologies for disappearing from this conversation for a week. First off, on the .NET side I have the exact same issue that Vladimir Sitnikov described for the Java side. The .NET database API (ADO.NET) has a standard, portable way for calling "server-side code". Since stored procedures are in PostgreSQL, this portable API was implemented years ago to invoke functions, which were the only thing in existence (so Npgsql issues SELECT * FROM function()). Now that stored procedures have been introduced, it's impossible to change what the portable API means without massive backwards compatibility issues for all programs which already rely on the API calling *functions*. In other words, I really do hope that on the PostgreSQL side you consider allowing both functions and procedures to be invoked via CALL. Npgsql (and likely pgjdbc) would then be able to change the portable API to send CALL instead of SELECT, avoiding all backwards compatibility issues (they would do that only for PostgreSQL 11 and above). For now I'm telling users on the beta version to avoid the API altogether (writing CALL SQL manually), which as Vladimir wrote above is bad for portability. > If you "DESCRIBE CALL my_func()" you get back a NoData response; it doesn't try to inspect the RETURNS clause of the function even though in theory it could. The client is using CALL so that is it should expect to receive. That said I'm not entirely clear whether the NoData response is a fixed bug or not... Uh, this sounds like something we really need to understand... How is a driver supposed to know what data types are being returned if it can't use Describe? DataRow messages contain only field lengths and values, so having a type OID is critical for knowing how to interpret the data, and that currently is only available by sending a Describe on a statement... Npgsql currently always sends a describe as part of statement execution (for server-prepared messages the describe is done only once, at preparation-time). Vladimir, are you doing things differently here? On Tue, Jul 24, 2018 at 7:57 PM, David G. Johnston < david.g.johns...@gmail.com> wrote: > On Tue, Jul 24, 2018 at 11:31 AM, Daniel Verite > wrote: > >> David G. Johnston wrote: >> >> > > 2. A stored procedure can decide dynamically of >> > > the structure of the resultset(s) it returns, >> > > and the caller will discover it as they're returned, not >> > > before. >> > > >> > >> > The function itself doesn't care - this concern is about SELECT vs CALL >> > invocation only, not the script definition. >> >> It does care, because CREATE FUNCTION has a RETURNS clause and >> matching RETURN statements in the body, whereas CREATE PROCEDURE >> doesn't and (will?) have a different syntax for producing resultsets. >> > > But why does whatever code that implements CALL have to care? > > In Object Oriented terms why can not both procedures and functions > implement a "EXECUTE_VIA_CALL" interface; while functions additionally > implement a "EXECUTE_VIA_SELECT" interface - the one that they do today. > > ISTM that any (most) function could be trivially rewritten into a > procedure (or wrapped by one) in a mechanical fashion which could then be > executed via CALL. I'm proposing that instead of having people write their > own wrappers we figure out what the mechanical wrapper looks like, ideally > based upon the public API of the function, and create it on-the-fly > whenever said function is executed via a CALL statement. > > As for the invocation, that's just the starting point. 
At this point >> the driver doesn't know from '{call x}' whether x is a procedure or a >> function in Postgres, hence the request for a syntax that would work >> for both. Okay, but aside from that, if there are results, the driver >> needs to get them in a way that works without knowing wether it's a >> function or procedure. How would that happen? >> > > I'm saying that the driver needs to rewrite {call x} as "CALL x()" and > expect optional resultsets and optional output arguments. For functions > invoked as procedures this would be a single resultset with zero output > arguments. Which is exactly the same end-user result that is received > today when "SELECT * FROM x()" is used. > > >> Back to the first message of the thread, Shay Rojansky was saying: >> >> "However, connecting via Npgsql, which uses the extended protocol, I >> see something quite different. As a response to a Describe PostgreSQL >> message, I get back a NoData response rather than a RowDescription >> message" >> >> Why would a Describe on a "CALL myproc()" even work if we >> accept the premise that myproc() does not advertise what it may return, >> contrary to a "SELECT * FROM function()"? >> This issue goes beyond a "syntactic bridge" between CALL and SELECT. > > > If you "DESCRIBE CALL my_func()" you get back a NoData response; it > doesn't try to inspect the RETURNS clause of the function even though in > theory it could. The
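To make the wrapper idea discussed above concrete, here is a minimal sketch (the function and procedure names are invented for the example): an existing function is wrapped by a procedure with an INOUT parameter, so a driver could invoke it through CALL and still get the value back as a result row.

create function get_answer() returns integer
language sql as 'select 42';

create procedure call_get_answer(inout answer integer)
language plpgsql as $$
begin
  answer := get_answer();
end;
$$;

call call_get_answer(null);  -- returns one row with column "answer" = 42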
Re: [Patch] Create a new session in postmaster by calling setsid()
Paul Guo writes: > [ make the postmaster execute setsid() too ] I'm a bit skeptical of this proposal. Forcing the postmaster to dissociate from its controlling terminal is a good thing in some scenarios, but probably less good in others, and manual postmaster starts are probably mostly in the latter class. I wonder whether having "pg_ctl start" do a setsid() would accomplish the same result with less chance of breaking debugging usages. regards, tom lane
Re: Problems with plan estimates in postgres_fdw
Andrew Gierth writes: > [ postgres_fdw is not smart about exploiting fast-start plans ] Yeah, that's basically not accounted for at all in the current design. > One possibility: would it be worth adding an option to EXPLAIN that > makes it assume cursor_tuple_fraction? [ handwaving ahead ] I wonder whether it could be done without destroying postgres_fdw's support for old servers, by instead including a LIMIT in the query sent for explaining. The trick would be to know what value to put as the limit, though. It'd be easy to do if we were willing to explain the query twice (the second time with a limit chosen as a fraction of the rowcount seen the first time), but man that's an expensive solution. Another component of any real fix here would be to issue "SET cursor_tuple_fraction" before opening the execution cursor, so as to ensure that we actually get an appropriate plan on the remote side. If we could tell whether there's going to be any use in fast-start plans, it might make sense to build two scan paths for a foreign table, one based on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still means two explain requests, which is why I'm not thrilled about doing it unless there's a high probability of the extra explain being useful. regards, tom lane
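For illustration of the idea only (remote_tab is a made-up table, and this is not what postgres_fdw actually sends today): the remote side could be probed twice at plan time, and then told to prefer a fast-start plan before the cursor is opened.

EXPLAIN SELECT * FROM remote_tab;               -- first pass: estimate the total row count
EXPLAIN SELECT * FROM remote_tab LIMIT 1000;    -- second pass: cost with a limit sized from the first estimate
SET cursor_tuple_fraction = 0.1;                -- favor fast-start plans for the execution cursor
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM remote_tab;  -- the cursor postgres_fdw uses to fetch rows
FETCH 100 FROM c;
COMMIT;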
Re: Should contrib modules install .h files?
Robert Haas writes: > Yeah, I would have voted -1 if I'd realized that it was close. Now > we're in a situation where we have patch not everyone likes not only > in master (which is OK, because we've got a year to decide whether to > change anything) but also in v11 (where we have a lot less time). > That's not so great. It seems like there were two separate areas of disagreement/questioning, one being the file layout (whether to add per-extension subdirectories) and then one about how one would actually use this, ie what would the -I switch(es) look like and where would they get injected. My impression is that there was consensus for per-extension subdirectories, but the usage scenario isn't totally designed yet. In principle, only the layout question has to be resolved to make it OK to ship this in v11. On the other hand, if there's no very practical way to use the per-extension subdirectory layout, we might have to rethink that layout. So I'd be happier about this if that question were answered promptly. Failing a satisfactory answer in the near future, I think we need to revert in v11. Also, "near future" means "before Monday". I don't want to ship beta3 with this in place if we end up reverting later, because it'd mean thrashing packagers' file manifests, which they won't appreciate. It might be best to revert in v11 for now, and then we can put it back after beta3 if there's agreement that the questions are satisfactorily resolved. regards, tom lane
Re: Allow COPY's 'text' format to output a header
> On Aug 2, 2018, at 8:11 AM, Daniel Verite wrote: > > That makes sense, thanks for elaborating, although there are also > a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c > that are raised on forbidden/nonsensical combination of features, > so the consistency argument could work both ways. > If there is not a strong reason to change the error code, then I believe we should not. The error is the same as it was before, just narrower in scope. Best, -Cynthia
Re: [HACKERS] logical decoding of two-phase transactions
>> They can be, but currently they might not be. So this requires at least >> big fat warning in docs and description on how to access user catalogs >> from plugins correctly (ie to always use systable_* API on them). It >> would be nice if we could check for it in Assert builds at least. > Ok, modified the sgml documentation for the above. > Yea, I agree. I think we should just consider putting similar checks in > the general scan APIs. With an unlikely() and the easy predictability of > these checks, I think we should be fine, overhead-wise. > Ok, added unlikely() checks in the heap_* scan APIs. Revised patchset attached. Regards, Nikhils > Greetings, > > Andres Freund -- Nikhil Sontakke http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services 0001-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.patch Description: Binary data 0002-Support-decoding-of-two-phase-transactions-at-PREPAR.patch Description: Binary data 0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.patch Description: Binary data 0004-Teach-test_decoding-plugin-to-work-with-2PC.patch Description: Binary data
Re: [HACKERS] Bug in to_timestamp().
On Wed, Aug 1, 2018 at 3:17 PM Arthur Zakirov wrote:
> On Mon, Jul 23, 2018 at 05:21:43PM +0300, Alexander Korotkov wrote:
>> Thank you, Arthur. These examples show a downside of this patch, where
>> users may be faced with incompatibility. But it's good that this
>> situation can be handled by altering the format string. I think these
>> examples should be added to the documentation and highlighted in the
>> release notes.
>
> I updated the documentation. I added a tip text which explains
> how to_timestamp() and to_date() handled ordinary text prior to
> PostgreSQL 11 and how they should handle ordinary text now.
>
> There are no changes in the code and tests.

Thank you, Arthur! I made the following observations on Oracle behavior.

1) Oracle also supports parsing TZH in the to_timestamp_tz() function. Previously, in order to handle TZH (which could be negative) we decided to skip spaces before fields, but not after fields [1][2]. That leads to behavior which is both incompatible with Oracle and illogical (at least in my opinion). But after exploration of to_timestamp_tz() behavior I found that Oracle can distinguish a minus sign used as a separator from a minus sign in the TZH field.

# select to_char(to_timestamp_tz('2018-01-01 -10', 'YYYY MM DD TZH'), 'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 +10:00

# select to_char(to_timestamp_tz('2018-01-01--10', 'YYYY MM DD TZH'), 'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 +10:00

# select to_char(to_timestamp_tz('2018-01-01 -10', 'YYYY MM DD TZH'), 'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 -10:00

# select to_char(to_timestamp_tz('2018-01-01---10', 'YYYY MM DD TZH'), 'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 -10:00

So, if the number of spaces/separators between fields in the input string is more than in the format string, and the last space/separator skipped is a minus sign, then Oracle decides to glue that minus sign to TZH. I think this behavior is nice to follow, because it allows us to skip spaces after fields.

2) It appears that Oracle skips spaces not only before fields, but also at the beginning of the input string.

# SELECT to_timestamp(' -2018', ' YYYY') FROM dual
01.08.2018 00:00:00

In the example given it seems that the first space is skipped, while the minus sign is matched to the space.

3) When multiple separators are specified in the format string, Oracle doesn't allow skipping an arbitrary number of spaces before each separator as we did.

# SELECT to_timestamp('2018- -01 02', 'YYYY--MM-DD') FROM dual
ORA-01843: not a valid month

4) Spaces and separators in the format string seem to be handled in the same way in non-FX mode. But strange things happen when you mix spaces and separators there.

# SELECT to_timestamp('2018- -01 02', 'YYYY---MM-DD') FROM dual
02.01.2018 00:00:00

# SELECT to_timestamp('2018- -01 02', 'YYYY MM-DD') FROM dual
02.01.2018 00:00:00

# SELECT to_timestamp('2018- -01 02', 'YYYY- -MM-DD') FROM dual
ORA-01843: not a valid month

After some experiments I found that when you mix spaces and separators between two fields, Oracle takes into account only the length of the last group of spaces/separators.

# SELECT to_timestamp('2018- -01 02', 'YYYY --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', 'YYYY --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 3)

# SELECT to_timestamp('2018- -01 02', 'YYYY -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 2)

Points 1, 2 and 3 are implemented in the attached patch, but not 4. I think it's only worth following Oracle when it does reasonable things. I also plan to revise the documentation and regression tests in the patch. But I wanted to share my results so that everybody can give feedback if any.

1. https://www.postgresql.org/message-id/CAEepm%3D1Y7z1VSisBKxa6x3E-jP-%2B%3DrWfw25q_PH2nGjqVcX-rw%40mail.gmail.com
2. https://www.postgresql.org/message-id/20180112125848.GA32559%40zakirov.localdomain

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: 0001-to-timestamp-format-checking-v14.patch
Re: Ideas for a relcache test mode about missing invalidations
Hi, On 2018-08-02 19:18:11 +0900, Kyotaro HORIGUCHI wrote: > At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund wrote > in <20180801162518.jnb2ql5dfmgwp...@alap3.anarazel.de> > > Hi, > > > > The issue at [1] is caused by missing invalidations, and [2] seems like > > a likely candidate too. I wonder if it'd be good to have a relcache test > > mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries > > to ensure that we've done sufficiently to ensure the right invalidations > > are sent. > > > > I think what we'd kind of want is to ensure that relcache entries are > > rebuilt at the earliest possible time, but *not* later. That'd mean > > they're out of date if there's missing invalidations. Unfortunately I'm > > not clear on how that'd be achievable? Ideas? > > > > The best I can come up with is to code some additional dependencies into > > CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the > > right messages. But that seems somewhat painful and filled with holes. > > > > [1] > > http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com > > [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us > > As for [1], it is not a issue on invalidation. It happens also if > the relation has any index and even drop is not needed. The > following steps are sufficient. Huh? I don't think this is a proper fix. But please let's argue over in the other that in the other thread. Greetings, Andres Freund
Re: Problem during Windows service start
Sakai, Teppei wrote: > This is my first posting to the mailing list. > > Currently our customer uses PostgreSQL 9.5 and hits problem during Windows > service start. > The Windows service status of the instance is different from actual status. > > We got the following situation. > 1. Register service with 'pg_ctl register -N "PostgreSQL" -U postgres -P > -D D:\data\inst1 -w' > 2. Start the instance from the Windows service screen. > 3. After 60 seconds, the startup process fails with a timeout. >Because crash recovery takes a lot of times. > > Then, the service status of the instance become "STOPPED", > but the instance was running. > It cannot be stopped from the Windows service screen (it can be stopped only > with pg_ctl). > > PostgreSQL version : 9.5.12 > Operating system : Windows Server 2012 R2 > > I think this is a bug. > I think it has not been fixed in the latest version, is my understanding > correct? > If it is correct, I will fix it. I agree that this is not nice. How do you propose to fix it? Yours, Laurenz Albe
Re: Stored procedures and out parameters
Shay>Npgsql currently always sends a describe as part of statement execution (for server-prepared messages the describe is done only once, at preparation-time). Vladimir, are you doing things differently here? The same thing is for pgjdbc. It does use describe to identify result row format. However, "CALL my_proc()" works just fine with current git master for both simple and extended protocol. The missing part is "invoke functions via CALL statement". Vladimir
Fallout from PQhost() semantics changes
Traditionally (prior to v10), PQhost() returned the "host" connection parameter if that was nonempty, otherwise the default host name (DEFAULT_PGSOCKET_DIR or "localhost" depending on platform). That got whacked around to a state of brokenness in v10 (which I'll return to in a bit), and then commit 1944cdc98 fixed it to return the active host's connhost[].host string if nonempty, else the connhost[].hostaddr string if nonempty, else an empty string. Together with the fact that the default host name gets inserted into connhost[].host if neither option is supplied, that's compatible with the traditional behavior when host is supplied or when both options are omitted. It's not the same when only hostaddr is supplied. This change is generally a good thing: returning the default host name is pretty misleading if hostaddr actually points at some remote server. However, it seems that insufficient attention was paid to whether *every* call site is OK with it. In particular, libpq has several internal calls to PQhost() to get the host name to be compared to a server SSL certificate, or for comparable usages in GSS and SSPI authentication. These changes mean that sometimes we will be comparing the server's numeric address, not its hostname, to the server auth information. I do not think that was the intention; it's certainly in direct contradiction to our documentation, which clearly says that the host name parameter and nothing else is used for this purpose. It's not clear to me if this could amount to a security problem, but at the least it's wrongly documented. What I think we should do about it is change those internal calls to fetch connhost[].host directly instead of going through PQhost(), as in the attached libpq-internal-PQhost-usage-1.patch. This will restore the semantics to what they were pre-v10, including erroring out when hostaddr is supplied without host. I also noted that psql's \conninfo code takes it upon itself to substitute the value of the hostaddr parameter, if used, for the result of PQhost(). This is entirely wrong/unhelpful if multiple host targets were specified; moreover, that patch failed to account for the very similar connection info printout in do_connect(). Given the change in PQhost's behavior I think it'd be fine to just drop that complexity and print PQhost's result without any editorialization, as in the attached psql-conninfo-PQhost-usage-1.patch. I would also like to make the case for back-patching 1944cdc98 into v10. I'm not sure why that wasn't done to begin with, because v10's PQhost() is just completely broken for cases involving a hostaddr specification: if (!conn) return NULL; if (conn->connhost != NULL && conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS) return conn->connhost[conn->whichhost].host; else if (conn->pghost != NULL && conn->pghost[0] != '\0') return conn->pghost; else { #ifdef HAVE_UNIX_SOCKETS return DEFAULT_PGSOCKET_DIR; #else return DefaultHost; #endif } In the CHT_HOST_ADDRESS case, it will either give back the raw host parameter (again, wrong if multiple hosts are targeted) or give back DEFAULT_PGSOCKET_DIR/DefaultHost if the host wasn't specified. Ignoring the brokenness for multiple target hosts, you could argue that that's compatible with pre-v10 behavior ... but it's still pretty misleading to give back DefaultHost, much less DEFAULT_PGSOCKET_DIR, for a remote connection. (There's at least some chance that the hostaddr is actually 127.0.0.1 or ::1. There is no chance that DEFAULT_PGSOCKET_DIR is an appropriate description.) 
Given that we whacked around v10 libpq's behavior for some related corner cases earlier this week, I think it'd be OK to change this in v10. If we do, it'd make sense to back-patch psql-conninfo-PQhost-usage-1.patch into v10 as well. I think that libpq-internal-PQhost-usage-1.patch should be back-patched to v10 in any case, since whether or not you want to live with the existing behavior of PQhost() in v10, it's surely not appropriate for comparing to server SSL certificates. In fact, I think there's probably a good case for doing something comparable to libpq-internal-PQhost-usage-1.patch all the way back. In exactly what scenario is it sane to be comparing "/tmp" or "localhost" to a server's SSL certificate? regards, tom lane diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c index 3b2073a..09012c5 100644 --- a/src/interfaces/libpq/fe-auth.c +++ b/src/interfaces/libpq/fe-auth.c @@ -199,7 +199,7 @@ pg_GSS_startup(PGconn *conn, int payloadlen) min_stat; int maxlen; gss_buffer_desc temp_gbuf; - char *host = PQhost(conn); + char *host = conn->connhost[conn->whichhost].host; if (!(host && host[0] != '\0')) { @@ -414,7 +414,7 @@ pg_SSPI_startup(PGconn *conn, int use_negotiate, int payloadlen) { SECURITY_STATUS r; TimeStamp expire; - char *host = PQhost(
Re: Should contrib modules install .h files?
On 2018-08-02 10:54:00 -0400, Tom Lane wrote: > Also, "near future" means "before Monday". I don't want to ship beta3 > with this in place if we end up reverting later, because it'd mean > thrashing packagers' file manifests, which they won't appreciate. > It might be best to revert in v11 for now, and then we can put it back > after beta3 if there's agreement that the questions are satisfactorily > resolved. +1
Re: Making "COPY partitioned_table FROM" faster
On 02/08/2018 01:36, David Rowley wrote:
> On 31 July 2018 at 11:51, David Rowley wrote:
>> The attached v6 delta replaces the v5 delta and should be applied on
>> top of the full v4 patch.
>
> (now committed)
>
> Many thanks for committing this Peter and many thanks to Melanie and
> Karen for reviewing it!

I had problems with my email as I was committing this, which is why I didn't let you know. ;-)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should contrib modules install .h files?
> "Tom" == Tom Lane writes: Tom> It seems like there were two separate areas of Tom> disagreement/questioning, one being the file layout (whether to Tom> add per-extension subdirectories) and then one about how one would Tom> actually use this, ie what would the -I switch(es) look like and Tom> where would they get injected. Tom> My impression is that there was consensus for per-extension Tom> subdirectories, but the usage scenario isn't totally designed yet. Tom> In principle, only the layout question has to be resolved to make Tom> it OK to ship this in v11. Currently, everything is agnostic about the usage scenario - the existing extension include files will work with either -I$(includedir_server)/extension and #include "hstore/hstore.h", or with -I$(includedir_server)/extension/hstore and #include "hstore.h". Tom> On the other hand, if there's no very practical way to use the Tom> per-extension subdirectory layout, What constitutes "practical"? Right now it seems unlikely that there's much of a use case for referring to more than two different extensions at a time (a pl and a data type, for building a transform module outside either the pl's or the type's source tree). Referring to one is more likely (in my case, hstore_pllua is written to build inside pllua-ng's source tree, so all it needs is to get at hstore.h). -- Andrew (irc:RhodiumToad)
Re: Should contrib modules install .h files?
> "Andres" == Andres Freund writes: >> Also, "near future" means "before Monday". I don't want to ship >> beta3 with this in place if we end up reverting later, because it'd >> mean thrashing packagers' file manifests, which they won't >> appreciate. It might be best to revert in v11 for now, and then we >> can put it back after beta3 if there's agreement that the questions >> are satisfactorily resolved. Andres> +1 On the other hand, _I'm_ getting pressure from at least one packager to nail down a final release of pllua-ng so they can build it along with beta3 (in place of the old broken pllua), which obviously I can't do if I keep having to fiddle with workarounds for hstore.h. -- Andrew (irc:RhodiumToad)
Re: Should contrib modules install .h files?
On 2018-08-02 17:53:17 +0100, Andrew Gierth wrote: > > "Andres" == Andres Freund writes: > > >> Also, "near future" means "before Monday". I don't want to ship > >> beta3 with this in place if we end up reverting later, because it'd > >> mean thrashing packagers' file manifests, which they won't > >> appreciate. It might be best to revert in v11 for now, and then we > >> can put it back after beta3 if there's agreement that the questions > >> are satisfactorily resolved. > > Andres> +1 > > On the other hand, _I'm_ getting pressure from at least one packager to > nail down a final release of pllua-ng so they can build it along with > beta3 (in place of the old broken pllua), which obviously I can't do if > I keep having to fiddle with workarounds for hstore.h. I just don't have a lot of sympathy for that, given the months-after-freeze-window timing. It's not like *any* of this just started to be problematic in v11. Greetings, Andres Freund
Re: doc - add missing documentation for "acldefault"
Hello Pavel,

>>> I couldn't find the documentation. Attached patch adds one.
>>> Probably this function should have been named pg_*. Too late.
>
> I think this is an intentionally hidden function, like others starting
> with acl*.

Yep, I thought of that. However, the point of having hidden and/or undocumented functions escapes me: they are hard/impossible to find if you do not know they exist from the start, and if you ever find one you do not know what they do without reading the source code in detail, e.g. to know what arguments to give to get an answer.

So I assumed that it was more laziness and could be remedied with a doc patch for the one function I read. Maybe it would make sense to document the others as well, which are more straightforward.

Also, that does not explain why they do not use a "pg_" prefix, which is already a way of saying "this is an internal-purpose function". So I would be in favor of prefixing them with pg_: as it is an undocumented feature, there would not be a compatibility break, somehow :-)

postgres=# \df acl*
                              List of functions
   Schema   |    Name     | Result data type |       Argument data types
------------+-------------+------------------+---------------------------------
 pg_catalog | aclcontains | boolean          | aclitem[], aclitem
 pg_catalog | acldefault  | aclitem[]        | "char", oid
 pg_catalog | aclexplode  | SETOF record     | acl aclitem[], OUT grantor oid, OUT grantee oid, OUT privilege_type text, OUT is_grantable boolean
 pg_catalog | aclinsert   | aclitem[]        | aclitem[], aclitem
 pg_catalog | aclitemeq   | boolean          | aclitem, aclitem
 pg_catalog | aclitemin   | aclitem          | cstring
 pg_catalog | aclitemout  | cstring          | aclitem
 pg_catalog | aclremove   | aclitem[]        | aclitem[], aclitem

--
Fabien.
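For illustration only (this is not part of the patch text): acldefault() takes an object-type code and an owner OID and returns the hard-wired default ACL for that object type, e.g. 'r' for a relation; OID 10 is the bootstrap superuser.

SELECT acldefault('r', 10);
-- typically returns {postgres=arwdDxt/postgres}: the owner gets all relation privileges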
Re: Should contrib modules install .h files?
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> My impression is that there was consensus for per-extension > Tom> subdirectories, but the usage scenario isn't totally designed yet. > Tom> In principle, only the layout question has to be resolved to make > Tom> it OK to ship this in v11. > Currently, everything is agnostic about the usage scenario - the > existing extension include files will work with either > -I$(includedir_server)/extension and #include "hstore/hstore.h", or with > -I$(includedir_server)/extension/hstore and #include "hstore.h". Well, the point is we don't want agnosticism --- we want a clear usage model for people to follow. > Tom> On the other hand, if there's no very practical way to use the > Tom> per-extension subdirectory layout, > What constitutes "practical"? Something that copes with selecting the right headers if you're rebuilding a module whose headers are already installed (as you mentioned upthread). Something that copes with different modules installing headers with the same base name. Allowing for that was the driving force for going with subdirectory-per-extension, but if we really want that to work, there seems to be no alternative but for extensions to write qualified header names (#include "hstore/hstore.h" not #include "hstore.h"). Andres, for one, seemed to think that wouldn't play nicely with PGXS, though I'm not sure why not. Maybe he only meant that an extension's *own* headers shouldn't be referenced that way. Maybe this all just works without much thought, but given that smart people like Peter E. seem to be unsure of that, I'd sure like to see a concrete set of rules that extensions should follow for this. There's also a question of whether we need to change anything in contrib/ so that it plays by whatever rules we set. There's an expectation that contrib modules should be buildable with PGXS, so they need to follow the rules. regards, tom lane
Re: doc - add missing documentation for "acldefault"
Hello Fabien,

> However, the point of having hidden and/or undocumented functions
> escapes me: they are hard/impossible to find if you do not know they
> exist from the start, and if you ever find one you do not know what
> they do without reading the source code in detail, e.g. to know what
> arguments to give to get an answer.
>
> So I assumed that it was more laziness and could be remedied with a doc
> patch for the one function I read.

First, we must decide in which cases users would use them, and I don't see such cases. I need to know how to grant privileges, how to revoke them, and how to check existing privileges. All these tasks are documented in the GRANT and REVOKE commands and in the system catalog descriptions. Your patch from another thread closes the last hole - describing default privileges in various psql commands.

-
Pavel Luzanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Explain buffers wrong counter with parallel plans
On Thu, Aug 2, 2018 at 5:41 AM, Amit Kapila wrote:
> I have created three patches (a) move InstrStartParallelQuery from its
> original location so that we perform it just before ExecutorRun (b)
> patch to fix the gather stats by calling shutdown at the appropriate
> place and allow stats collection in ExecShutdownNode (c) Prohibit
> calling ExecShutdownNode if there is a possibility of backward scans
> (I have done some basic tests with this patch; if we decide to proceed
> with it, then some more verification and testing would be required).
>
> I think we should commit the first two patches as that fixes the problem
> being discussed in this thread and then do some additional
> verification for the third patch (mentioned in option c). I can
> understand if people want to commit the third patch before the second
> patch, so let me know what you guys think.

I'm happy with the first two patches. In the third one, I don't think "See ExecLimit" is a good thing to put in a comment like this, because it's too hard to find the comment to which it refers, and because future commits are likely to edit or remove that comment without noticing the references to it from elsewhere. Instead I would just write, in all three places, /* If we know we won't need to back up, we can release resources at this point. */ or something like that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Should contrib modules install .h files?
On Thu, Aug 2, 2018 at 12:56 PM, Andres Freund wrote: >> On the other hand, _I'm_ getting pressure from at least one packager to >> nail down a final release of pllua-ng so they can build it along with >> beta3 (in place of the old broken pllua), which obviously I can't do if >> I keep having to fiddle with workarounds for hstore.h. > > I just don't have a lot of sympathy for that, given the > months-after-freeze-window timing. It's not like *any* of this just > started to be problematic in v11. Yeah, I agree. Andrew, it seems to me that you brought this problem on yourself. You rammed a patch through four months after feature freeze with a marginal consensus and are now complaining about having to spend more time on it. Well, that's a self-inflicted injury. If you don't commit features with a marginal consensus and/or don't do it four months after feature freeze, you won't get nearly as much pushback. Don't get me wrong -- I think you've done a lot of great work over the years and I'm glad to have you involved both as a contributor and as a committer of your own patches and those of others. It's just that your email here reads as if you think that your commit privileges are for your benefit rather than that of the project, and that's not how it works. Everybody here is free to pursue the things they think are interesting and the diversity of such things is part of the strength of the project, but nobody is free to ram through changes that make life better for them and worse for other people, even if they don't agree with the complaints those other people make. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: FailedAssertion on partprune
[ bringing this discussion back to the original thread ] David Rowley writes: > The attached patch removes the Assert, but I think it should be > probably be done as part of [1]'s patch since that's also adding the > code to handle subplans for tables that don't belong in the partition > hierarchy. Over in that thread, we had David Rowley writes: > On 2 August 2018 at 11:48, Tom Lane wrote: >> I didn't include this change because (a) it's late, (b) no test >> case was included, and (c) I don't entirely believe it anyway. >> How would a rel be both leaf and nonleaf? Isn't this indicative >> of a problem somewhere upstream in the planner? > It's probably best discussed on the other thread, but it seems to be > by design in accumulate_append_subpath(). Parallel Append nodes don't > get flattened if they contain a mix of parallel aware and non-parallel > aware subplans. Yeah, looking at the explain posted upthread, the issue is specifically that some of the child paths might be for Gathers over subsets of the partitioning hierarchy. It's not real clear to me whether such a subset would necessarily correspond to a subtree of the hierarchy, but if it does, you'd want to be able to apply run-time pruning at the parent Append, in hopes of not having to execute the Gather at all. Separately from that, if you do execute the Gather, applying runtime pruning within its child Append is also a good thing. So what's unclear to me is whether removing that Assert is sufficient to make all of that work nicely. I'm concerned in particular that the planner might get confused about which pruning tests to apply at each level. (EXPLAIN isn't a very adequate tool for testing this, since it fails to show anything about what pruning tests are attached to an Append. I wonder if we need to create something that does show that.) I think it'd be a real good idea to have a test case to exercise this. Jaime's presented test case is too expensive to consider as a regression test, but I bet that a skinnied-down version could be made to work once you'd set all the parallelization parameters to liberal values, like in select_parallel.sql. regards, tom lane
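A rough sketch of the shape such a pared-down test might take (table names and sizes are invented, and whether a Parallel Append actually shows up still depends on costing): liberal parallel settings plus an uncorrelated sub-select so that pruning cannot happen until executor startup.

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
SET max_parallel_workers_per_gather = 2;

CREATE TABLE prt (a int, b text) PARTITION BY RANGE (a);
CREATE TABLE prt1 PARTITION OF prt FOR VALUES FROM (0) TO (1000);
CREATE TABLE prt2 PARTITION OF prt FOR VALUES FROM (1000) TO (2000);
INSERT INTO prt SELECT g, g::text FROM generate_series(0, 1999) g;
ANALYZE prt;

-- the sub-select keeps the comparison value unknown at plan time,
-- so pruning has to happen at executor startup
EXPLAIN (COSTS OFF)
SELECT * FROM prt WHERE a < (SELECT 1000);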
Re: FailedAssertion on partprune
On Thu, Aug 2, 2018 at 1:54 PM, Tom Lane wrote: > (EXPLAIN isn't a very adequate tool for testing this, since > it fails to show anything about what pruning tests are attached to an > Append. I wonder if we need to create something that does show that.) I've asked for that more than once and still think it's a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should contrib modules install .h files?
> "Tom" == Tom Lane writes: Tom> Maybe this all just works without much thought, but given that Tom> smart people like Peter E. seem to be unsure of that, I'd sure Tom> like to see a concrete set of rules that extensions should follow Tom> for this. I'll comment on the more substantive stuff later since I just noticed a few relevant points that I need to investigate. But while investigating, I found... Tom> There's also a question of whether we need to change anything in Tom> contrib/ so that it plays by whatever rules we set. There's an Tom> expectation that contrib modules should be buildable with PGXS, Tom> so they need to follow the rules. ... that at least all of the *_plperl transform modules in contrib/ fail to build with USE_PGXS already (i.e. for as long as they have ever existed), because they rely on plperl_helpers.h which is never installed anywhere, and trying to get it via $(top_srcdir) obviously can't work in PGXS. Haven't tried the python ones yet. -- Andrew.
Re: [HACKERS] Bug in to_timestamp().
On Thu, Aug 2, 2018 at 6:17 PM Alexander Korotkov wrote:
> After some experiments I found that when you mix spaces and separators
> between two fields, then Oracle takes into account only length of last
> group of spaces/separators.
>
> # SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
> 2018-01-01 00:00:00 -10:00
> (length of last spaces/separators group is 2)
>
> # SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
> 2018-01-01 00:00:00 -10:00
> (length of last spaces/separators group is 3)
>
> # SELECT to_timestamp('2018- -01 02', ' -- ---MM-DD') FROM dual
> 02.01.2018 00:00:00
> (length of last spaces/separators group is 2)

Ooops... I'm sorry, but I've posted wrong results here. The correct version is here:

# SELECT to_timestamp('2018- -01 02', ' --- --MM-DD') FROM dual
ORA-01843: not a valid month
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', ' -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 3)

So the length of the last group of spaces/separators in the pattern should be greater than or equal to the length of spaces/separators in the input string. Other previous groups are ignored in Oracle. And that seems ridiculous to me.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Should contrib modules install .h files?
> "Andrew" == Andrew Gierth writes: Tom> There's also a question of whether we need to change anything in Tom> contrib/ so that it plays by whatever rules we set. There's an Tom> expectation that contrib modules should be buildable with PGXS, Tom> so they need to follow the rules. Andrew> ... that at least all of the *_plperl transform modules in Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long Andrew> as they have ever existed), because they rely on Andrew> plperl_helpers.h which is never installed anywhere, and trying Andrew> to get it via $(top_srcdir) obviously can't work in PGXS. Andrew> Haven't tried the python ones yet. And none of the plpython transforms can even parse their makefiles with USE_PGXS, let alone build, because they have an "include" directive pointing into src/pl/plpython. -- Andrew (irc:RhodiumToad)
Re: Should contrib modules install .h files?
On 2018-08-02 19:13:05 +0100, Andrew Gierth wrote: > > "Andrew" == Andrew Gierth writes: > > Tom> There's also a question of whether we need to change anything in > Tom> contrib/ so that it plays by whatever rules we set. There's an > Tom> expectation that contrib modules should be buildable with PGXS, > Tom> so they need to follow the rules. > > Andrew> ... that at least all of the *_plperl transform modules in > Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long > Andrew> as they have ever existed), because they rely on > Andrew> plperl_helpers.h which is never installed anywhere, and trying > Andrew> to get it via $(top_srcdir) obviously can't work in PGXS. > > Andrew> Haven't tried the python ones yet. > > And none of the plpython transforms can even parse their makefiles with > USE_PGXS, let alone build, because they have an "include" directive > pointing into src/pl/plpython. FWIW, I'd be perfectly on board with just burying this policy. Designating one contrib module (or something in src/test/examples or such) as a PGXS example, and always building it with pgxs seems like it'd do a much better job than the current policy. Greetings, Andres Freund
[PATCH] Add regress test for pg_read_all_stats role
Hello, I have noticed that there is no regression tests for default monitoring roles such as pg_read_all_stats. In this patch the test called defroles is added. It tests permissions of pg_read_all_stats role to query database size without CONNECT permission on it, to query tablespace size without CREATE permission on it, and to read "query" column of pg_stat_activity table without SUPERUSER privilege. Best regards, Alexandra Ryzhevich From 3cf3d1b9e252902e0e8f05436c703d1ec2d90125 Mon Sep 17 00:00:00 2001 From: Alexandra Ryzhevich Date: Thu, 2 Aug 2018 18:06:13 +0100 Subject: [PATCH 1/1] Add regress test for pg_read_all_stats role --- src/test/regress/expected/defroles.out | 68 ++ src/test/regress/parallel_schedule | 2 +- src/test/regress/serial_schedule | 1 + src/test/regress/sql/defroles.sql | 46 + 4 files changed, 116 insertions(+), 1 deletion(-) create mode 100644 src/test/regress/expected/defroles.out create mode 100644 src/test/regress/sql/defroles.sql diff --git a/src/test/regress/expected/defroles.out b/src/test/regress/expected/defroles.out new file mode 100644 index 00..f3bec0910a --- /dev/null +++ b/src/test/regress/expected/defroles.out @@ -0,0 +1,68 @@ +-- DEFAULT MONITORING ROLES +DROP ROLE IF EXISTS tmp_role_stats; +NOTICE: role "tmp_role_stats" does not exist, skipping +DROP ROLE IF EXISTS tmp_role_no_stats; +NOTICE: role "tmp_role_no_stats" does not exist, skipping +DROP DATABASE IF EXISTS tmp_db; +NOTICE: database "tmp_db" does not exist, skipping +CREATE ROLE tmp_role_stats; +CREATE ROLE tmp_role_no_stats; +-- check pg_read_all_stats default role permissions +GRANT pg_read_all_stats TO tmp_role_stats; +CREATE DATABASE tmp_db; +-- CONNECT is granted by default +REVOKE CONNECT ON DATABASE tmp_db FROM public; +SET SESSION AUTHORIZATION tmp_role_stats; +-- should not fail because tmp_role_stats is member of pg_read_all_stats +SELECT pg_database_size('tmp_db') > 0 AS canread; + canread +- + t +(1 row) + +-- should not fail because tmp_role_stats is member of pg_read_all_stats +SELECT pg_tablespace_size('pg_global') > 0 AS canread; + canread +- + t +(1 row) + +-- should not fail because it is a default tablespace +SELECT pg_tablespace_size('pg_default') > 0 AS canread; + canread +- + t +(1 row) + +-- should be true because tmp_role_stats is member of pg_read_all_stats +SELECT COUNT(*) = 0 AS haspriv FROM pg_stat_activity WHERE "query" LIKE ''; + haspriv +- + t +(1 row) + +SET SESSION AUTHORIZATION tmp_role_no_stats; +-- should fail because tmp_role_no_stats has not CONNECT permission on this db +SELECT pg_database_size('tmp_db') > 0 AS canread; +ERROR: permission denied for database tmp_db +-- should fail because tmp_role_no_stats has not CREATE permission on this tablespace +SELECT pg_tablespace_size('pg_global') > 0 AS canread; +ERROR: permission denied for tablespace pg_global +-- should not fail because it is a default tablespace +SELECT pg_tablespace_size('pg_default') > 0 AS canread; + canread +- + t +(1 row) + +-- should be false because current session belongs to superuser +SELECT COUNT(*) = 0 AS haspriv FROM pg_stat_activity WHERE "query" LIKE ''; + haspriv +- + f +(1 row) + +RESET SESSION AUTHORIZATION; +DROP ROLE tmp_role_stats; +DROP ROLE tmp_role_no_stats; +DROP DATABASE tmp_db; diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 16f979c8d9..d6cf7b8226 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -89,7 +89,7 @@ test: brin gin gist spgist privileges init_privs security_label 
collate matview # -- # Another group of parallel tests # -- -test: alter_generic alter_operator misc psql async dbsize misc_functions sysviews tsrf tidscan stats_ext +test: alter_generic alter_operator misc psql async dbsize misc_functions sysviews tsrf tidscan stats_ext defroles # rules cannot run concurrently with any test that creates a view test: rules psql_crosstab amutils diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule index 42632be675..b12aa09904 100644 --- a/src/test/regress/serial_schedule +++ b/src/test/regress/serial_schedule @@ -135,6 +135,7 @@ test: sysviews test: tsrf test: tidscan test: stats_ext +test: defroles test: rules test: psql_crosstab test: select_parallel diff --git a/src/test/regress/sql/defroles.sql b/src/test/regress/sql/defroles.sql new file mode 100644 index 00..8c91ee9c95 --- /dev/null +++ b/src/test/regress/sql/defroles.sql @@ -0,0 +1,46 @@ +-- DEFAULT MONITORING ROLES + +DROP ROLE IF EXISTS tmp_role_stats; +DROP ROLE IF EXISTS tmp_role_no_stats; + +DROP DATABASE IF EXISTS tmp_db; + +CREATE ROLE tmp_role_stats; +CREATE ROLE tmp_role_no_stats; + + +-- check pg_read_all_stats default role permissions +GRANT pg_read_all_stats TO tmp_role_stats; + +CREATE DATABASE tmp_db; +-- CONNECT is granted by
Re: Should contrib modules install .h files?
Andres Freund writes: > On 2018-08-02 19:13:05 +0100, Andrew Gierth wrote: >> And none of the plpython transforms can even parse their makefiles with >> USE_PGXS, let alone build, because they have an "include" directive >> pointing into src/pl/plpython. > FWIW, I'd be perfectly on board with just burying this policy. Designating > one contrib module (or something in src/test/examples or such) as a PGXS > example, and always building it with pgxs seems like it'd do a much > better job than the current policy. No, I think that'd be pretty wrongheaded. One of the big reasons for contrib to exist is to serve as a testbed proving that our extension features actually work. What you suggest would reduce the scope of that testing, which seems like the wrong direction to be going in. It's particularly bad for these cases, since what they demonstrate is that it's impossible to build transform modules for plperl or plpython out-of-tree at the moment. That doesn't seem to me to be something we should just ignore; it goes against not only our larger commitment to extensibility, but also the entire point of the transforms feature. regards, tom lane
Re: Should contrib modules install .h files?
> "Andres" == Andres Freund writes: Tom> There's also a question of whether we need to change anything in Tom> contrib/ so that it plays by whatever rules we set. There's an Tom> expectation that contrib modules should be buildable with PGXS, Tom> so they need to follow the rules. Andrew> ... that at least all of the *_plperl transform modules in Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long Andrew> as they have ever existed), because they rely on Andrew> plperl_helpers.h which is never installed anywhere, and trying Andrew> to get it via $(top_srcdir) obviously can't work in PGXS. Andrew> Haven't tried the python ones yet. >> And none of the plpython transforms can even parse their makefiles >> with USE_PGXS, let alone build, because they have an "include" >> directive pointing into src/pl/plpython. Andres> FWIW, I'd be perfectly on board with just burying this policy. Andres> Designating one contrib module (or something in Andres> src/test/examples or such) as a PGXS example, and always Andres> building it with pgxs seems like it'd do a much better job than Andres> the current policy. I suggest that modules that actually _can't_ build with PGXS should have the PGXS logic removed from their makefiles (perhaps replaced with an error). But for the rest, it's a convenience to be able to build single modules using USE_PGXS without having to configure the whole source tree. -- Andrew (irc:RhodiumToad)
Re: BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack
On Wed, Aug 01, 2018 at 10:55:11AM +, Ahsan Hadi wrote: > Can you rebase the reindex-priv-93.patch, it is getting some hunk failures. Patch reindex-priv-93.patch is for REL9_3_STABLE, where it applies cleanly for me. reindex-priv-94.patch is for REL9_4_STABLE and reindex-priv-95-master.patch is for 9.5~master. -- Michael signature.asc Description: PGP signature
Re: Documentaion fix.
On Wed, Aug 01, 2018 at 01:04:37PM +0900, Kyotaro HORIGUCHI wrote: > The query and the result with four columns fit the current width. Just wondering, what is your reason behind the addition of restart_lsn? This part of the documentation focuses on slot creation, so slot_name, slot_type and active would be representative enough, no? -- Michael signature.asc Description: PGP signature
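For context, the two shapes of the documentation example being weighed here would be roughly the following (the exact column list is assumed here, not taken from the patch):

-- Identification columns only, as suggested above
SELECT slot_name, slot_type, active FROM pg_replication_slots;

-- With the proposed extra column
SELECT slot_name, slot_type, active, restart_lsn FROM pg_replication_slots;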
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
On Thu, Aug 2, 2018 at 7:20 AM, Etsuro Fujita wrote: > The new approach seems to me more localized than what Ashutosh had proposed. > One obvious advantage of the new approach is that we no longer need changes > to postgres_fdw for it to work for partitionwise joins, which would apply > for any other FDWs, making the FDW authors' life easy. Well, I don't know what to say here except that I don't agree. I think Ashutosh's approach is a pretty natural extension of the system that we have now. It involves needing to handle converted whole row vars in some new places, most of which were handled in the original patch, but postgres_fdw was missed. Your approach involves changing the meaning of the target list, but only in narrow corner cases. I completely disagree that we can say it's less invasive. It may indeed be less work for extension authors, though, although perhaps at the expense of moving the conversion from the remote server to the local one. >> But in general, with your approach, any code that >> looks at the tlist for a child-join has to know that the tlist is to >> be used as-is *except that* ConvertRowtypeExpr may need to be >> inserted. I think that special case could be easy to overlook, and it >> seems pretty arbitrary. > I think we can check to see whether the conversion is needed, from the flags > added to RelOptInfo. Sure, I don't dispute that it can be made to work. I just think it's an ugly kludge. > I have to admit that it's weird to adjust the child's targetlist in that > case when putting the child under the parent's Append/MergeAppend, but IMHO > I think that would be much localized. Yeah, I just don't agree at all. Does anyone else want to weigh in on this? It seems to me that there are several people who are quite willing to complain about the fact that this hasn't been fixed, but not willing to express an opinion about the shape of the fix. Either the RMT needs to take executive action, or we need some more votes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
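For readers trying to follow the dispute, the query shape at issue is roughly the following (a sketch, not a verified reproducer of the reported errors): once a whole-row reference to a partitioned table appears in the target list, every child join produced by partitionwise join needs a ConvertRowtypeExpr to turn the child's row type back into the parent's, and the disagreement is about which part of the planner (or FDW) is responsible for adding it.

SET enable_partitionwise_join = on;

-- Hypothetical partitioned tables joined on their partition key; selecting
-- the whole-row reference "t1" is what drags ConvertRowtypeExpr into the
-- target list of each child join.
SELECT t1, t2.b
  FROM prt1 t1
  JOIN prt2 t2 ON t1.a = t2.a;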
Re: Allow COPY's 'text' format to output a header
On 2 August 2018 at 17:07, Cynthia Shang wrote: > > > On Aug 2, 2018, at 8:11 AM, Daniel Verite > wrote: > > > > That makes sense, thanks for elaborating, although there are also > > a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c > > that are raised on forbidden/nonsensical combination of features, > > so the consistency argument could work both ways. > > > > If there is not a strong reason to change the error code, then I believe > we should not. The error is the same as it was before, just narrower in > scope. > > Best, > -Cynthia Sure, thanks both for the feedback. Attached is a patch with the error kept as ERRCODE_FEATURE_NOT_SUPPORTED. -- Simon Muller text_header_v5.patch Description: Binary data
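For anyone skimming the thread, the intended behaviour is roughly the following (a sketch, assuming v5 simply reuses the existing HEADER option for the text format and keeps rejecting it for binary):

-- With the patch, text format can emit a header line
COPY (SELECT 1 AS a, 'two' AS b) TO STDOUT WITH (FORMAT text, HEADER);

-- HEADER should presumably still be refused for binary, raising the same
-- ERRCODE_FEATURE_NOT_SUPPORTED error as before, just with a narrower scope
COPY (SELECT 1 AS a, 'two' AS b) TO STDOUT WITH (FORMAT 'binary', HEADER);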
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
Hi, On 2018-08-02 15:19:49 -0400, Robert Haas wrote: > Does anyone else want to weigh in on this? It seems to me that there > are several people who are quite willing to complain about the fact > that this hasn't been fixed, but not willing to express an opinion > about the shape of the fix. Either the RMT needs to take executive > action, or we need some more votes. (taking my RMT hat off, as it's not been coordinated) Well, I think it's enough in the weeds that not too many people are going to have an informed opinion on this, and I'm not sure uninformed opinions help. But I do think the fact that one of the authors and the committer are on one side, and Tom seems to tentatively agree, weighs quite a bit here. As far as I can tell, it's "just" Etsuro on the other side - obviously his opinion counts, but it doesn't outweigh others. I think if you feel confident, you should be ok just commit the approach you favor, and I read you as being confident that that's the better approach. Greetings, Andres Freund
Re: [PATCH] Add regress test for pg_read_all_stats role
On Thu, Aug 02, 2018 at 06:25:14PM +0100, Alexandra Ryzhevich wrote:
> I have noticed that there is no regression tests for default monitoring
> roles such as pg_read_all_stats.

A bug has been recently fixed for that, see 0c8910a0, so having more coverage would be welcome. 25fff40 is the original commit which introduced pg_read_all_stats. Your patch has a couple of problems, though:
- Database creation is costly; it should not be part of the main regression test suite.
- Any roles created should use "regress_" as a prefix.
- You should also look at negative tests which trigger "must be superuser or a member of pg_read_all_settings", for instance a "SHOW stats_temp_directory" with a non-privileged user.
--
Michael

signature.asc Description: PGP signature
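To make those points concrete, a reworked test could look something like this (a sketch only; the role names and the choice of negative test are illustrative, following the review comments above):

-- Roles use the regress_ prefix, and no database gets created
CREATE ROLE regress_stats_reader;
CREATE ROLE regress_stats_nonreader;
GRANT pg_read_all_stats TO regress_stats_reader;

-- Negative test: a plain role is refused access to a superuser-only GUC
SET SESSION AUTHORIZATION regress_stats_nonreader;
SHOW stats_temp_directory;  -- expected to fail: must be superuser or a member
                            -- of pg_read_all_settings
RESET SESSION AUTHORIZATION;

DROP ROLE regress_stats_reader;
DROP ROLE regress_stats_nonreader;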
Re: [HACKERS] Bug in to_timestamp().
On Thu, Aug 02, 2018 at 06:17:01PM +0300, Alexander Korotkov wrote: > So, if number of spaces/separators between fields in input string is > more than in format string and the last space/separator skipped is a minus > sign, then it decides to glue that minus sign to TZH. I think this > behavior is nice to follow, because it allows us to skip spaces after > fields. Good approach! With this change the following queries work now: =# SELECT to_timestamp('2000 + JUN', ' MON'); =# SELECT to_timestamp('2000 +JUN', ' MON'); I think it is worth adding them to the regression tests. > 4) Spaces and separators in format string seem to be handled in the > same way in non-FX mode. But strange things happen when you mix > spaces and separators there. > ... > 1+2+3 are implemented in the attached patch, but not 4. I think that > it's only worth to follow Oracle when it does reasonable things. I agree with you. I think it isn't worth copying this behaviour. -- Arthur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: FailedAssertion on partprune
David Rowley writes: >> It's probably best discussed on the other thread, but it seems to be >> by design in accumulate_append_subpath(). Parallel Append nodes don't >> get flattened if they contain a mix of parallel aware and non-parallel >> aware subplans. I don't really understand the issue at hand, but let me just make a comment about accumulate_append_subpath(). If we have a regular Append on top of another regular Append or on top of a MergeAppend, we can flatten the lower Merge(Append) into the upper one. No problem. If, however, the lower path is a Parallel Append, we can't. Consider this plan: Gather -> Append -> Partial Subpath #1 -> Parallel Append -> Partial Subpath #2 -> Non-partial Subpath Making Partial Subpath #2 a child of the Append rather than the Parallel Append would be correct; the only downside is that we'd lose the nice worker-balancing stuff that the Parallel Append does. However, making Non-partial Subpath a child of the Append rather than the Parallel Append would be dead wrong, because the Parallel Append will make sure that it only gets executed by one of the cooperating processes, and the regular Append won't. If the upper Append were changed to a Parallel Append, then we could flatten the whole thing just fine, as long as we're careful to keep partial paths and non-partial paths separated. Also, if the situation is reversed, with an upper Parallel Append and a regular Append beneath it, that's always flatten-able. > Yeah, looking at the explain posted upthread, the issue is specifically > that some of the child paths might be for Gathers over subsets of the > partitioning hierarchy. It's not real clear to me whether such a subset > would necessarily correspond to a subtree of the hierarchy, but if it > does, you'd want to be able to apply run-time pruning at the parent > Append, in hopes of not having to execute the Gather at all. Separately > from that, if you do execute the Gather, applying runtime pruning within > its child Append is also a good thing. That sounds right to me. If I'm not mistaken, in that EXPLAIN, the radicado2012 and radicado2017 partitions are ineligible for parallel query for some reason, maybe a reloption setting, but the other partitions are all eligible, or rather, their subpartitions are eligible. If they were all OK for parallel query, the plan would probably just end up with Gather over Parallel Append over individual sample scans. In general, I think the only ways to get an Append are inheritance, partitioning, and set operations. If an upper Append is due to partitioning or inheritance, then I think that any (Merge)Append nodes beneath it must be a subtree of the partitioning or inheritance hierarchy, respectively. Partitioning and inheritance can't be mixed, and set operations can't occur below partition expansion. However, if the upper Append was created by a set operation, then its child Append nodes could be from any of those causes. For instance, it seems possible to imagine an upper Append that is implementing EXCEPT and a lower Append that is implementing both partition expansion and UNION ALL. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
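As a concrete way of getting into this territory, one of the partitions can be made ineligible for partial paths via a reloption, so that partial and non-partial subpaths end up mixed under the same parent (a sketch; which of the plan shapes above this particular setup actually produces has not been verified):

CREATE TABLE pt (a int, b int) PARTITION BY RANGE (a);
CREATE TABLE pt_1 PARTITION OF pt FOR VALUES FROM (0) TO (100)
    PARTITION BY RANGE (b);
CREATE TABLE pt_1_1 PARTITION OF pt_1 FOR VALUES FROM (0) TO (100);
CREATE TABLE pt_2 PARTITION OF pt FOR VALUES FROM (100) TO (200);

-- No workers for pt_2: its scans can only appear as non-partial subpaths
ALTER TABLE pt_2 SET (parallel_workers = 0);

SET enable_parallel_append = on;
EXPLAIN (COSTS OFF) SELECT count(*) FROM pt;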
Re: Alter index rename concurrently to
On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund wrote: >> Right. If nobody sees a reason not to change that, I think we should. >> It would make the behavior more predictable with, I hope, no real >> loss. > > What precisely are you proposing? Inserting AcceptInvalidationMessages() in some location that guarantees it will be executed at least once per SQL statement. I tentatively propose the beginning of parse_analyze(), but I am open to suggestions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Alter index rename concurrently to
Hi, On 2018-08-02 15:57:13 -0400, Robert Haas wrote: > On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund wrote: > >> Right. If nobody sees a reason not to change that, I think we should. > >> It would make the behavior more predictable with, I hope, no real > >> loss. > > > > What precisely are you proposing? > > Inserting AcceptInvalidationMessages() in some location that > guarantees it will be executed at least once per SQL statement. I > tentatively propose the beginning of parse_analyze(), but I am open to > suggestions. I'm inclined to think that that doesn't really actually solve anything, but makes locking issues harder to find, because the window is smaller, but decidedly non-zero. Can you describe why this'd make things more "predictable" precisely? Greetings, Andres Freund
Re: doc - add missing documentation for "acldefault"
On Wed, Aug 01, 2018 at 04:41:44PM +0300, Pavel Luzanov wrote:
> postgres=# \df acl*
> List of fun
> Schema | Name | Result data type |
> +-+--+--
> pg_catalog | aclcontains | boolean | aclitem[], aclitem
> pg_catalog | acldefault | aclitem[] | "char", oid
> pg_catalog | aclexplode | SETOF record | acl aclitem[], OUT grantor oid, O
> pg_catalog | aclinsert | aclitem[] | aclitem[], aclitem
> pg_catalog | aclitemeq | boolean | aclitem, aclitem
> pg_catalog | aclitemin | aclitem | cstring
> pg_catalog | aclitemout | cstring | aclitem
> pg_catalog | aclremove | aclitem[] | aclitem[], aclitem

Some of them are definitely internal. For example, aclitemin and aclitemout are the input and output functions, respectively, of the aclitem data type. I think they don't need to be documented and shouldn't have a "pg_" prefix, as they were created to support the aclitem data type.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
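That split is also visible directly in the catalogs: the functions in question are wired up as the aclitem type's I/O functions rather than being user-facing API, for example:

-- aclitemin/aclitemout are registered as the type's input/output functions
SELECT typname, typinput::regproc AS input_fn, typoutput::regproc AS output_fn
  FROM pg_type
 WHERE typname = 'aclitem';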
Re: Standby trying "restore_command" before local WAL
On Wed, Aug 1, 2018 at 7:14 AM, Emre Hasegeli wrote: >> There's still a question here, at least from my perspective, as to which >> is actually going to be faster to perform recovery based off of. A good >> restore command, which pre-fetches the WAL in parallel and gets it local >> and on the same filesystem, meaning that the restore_command only has to >> execute essentially a 'mv' and return back to PG for the next WAL file, >> is really rather fast, compared to streaming that same data over the >> network with a single TCP connection to the primary. Of course, there's >> a lot of variables there and it depends on the network speed between the >> various pieces, but I've certainly had cases where a replica catches up >> much faster using restore command than streaming from the primary. > > Trying "restore_command" before streaming replication is totally fine. > It is not likely that the same WAL would be on both places anyway. > > My problem is trying "restore_command" before the local WAL. I > understand the historic reason of this design, but I don't think it is > expected behavior to anybody who is using "restore_command" together > with streaming replication. Right. I don't really understand the argument that this should be controlled by a GUC. I could see having a GUC to choose between archiving-first and streaming-first, but if it's safe to use the WAL we've already got in pg_xlog, it seems like that should take priority over every other approach. The comments lend themselves to a certain amount of doubt over whether we can actually trust the contents of pg_xlog, but if we can't, it seems like we just shouldn't use it - at all - ever. It doesn't make much sense to me to say "hey, pg_xlog might have evil bad WAL in it that we shouldn't replay, so let's look for the same WAL elsewhere first, but then if we don't find it, we'll just replay the bad stuff." I might be missing something, but that sounds a lot like "hey, this mysterious gloop I found might be rat poison, so let me go check if there's some fruit in the fruit bowl, but if I don't find any, I'm just going to eat the mysterious gloop." -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
Robert Haas writes: > Does anyone else want to weigh in on this? It seems to me that there > are several people who are quite willing to complain about the fact > that this hasn't been fixed, but not willing to express an opinion > about the shape of the fix. Either the RMT needs to take executive > action, or we need some more votes. [ reads thread ... ] Well, my vote is that I've got zero faith in either of these patches. I do not trust Ashutosh's patch because of the point you noted that it will kick in on ConvertRowtypeExprs that are not related to partitioning. It's also got a rather fundamental conceptual (or at least documentation) error in that it says it's making pull_var_clause do certain things to all ConvertRowtypeExprs, but that's not what the code actually does. I think the tension around that has a lot to do with the fact that the patch went through so many revisions, and I don't have any faith that it's right even yet. As you mentioned upthread, this might be insoluble without some representational change for converted whole-row Vars. As for Etsuro-san's patch, I don't like that for the same reasons you gave. Also, I note that back towards the beginning of July it was said that that patch depends on the assumption of no Gather below an Append. That's broken *now*, not in some hypothetical future. (See the nearby "FailedAssertion on partprune" thread, wherein we're trying to fix the planner's handling of exactly such plans.) What I'm thinking might be a more appropriate thing, at least for getting v11 out the door, is to refuse to generate partitionwise joins when any whole-row vars are involved. (Perhaps that's not enough to dodge all the problems, though?) Now, that's a bit of a problem for postgres_fdw, because it seems to insist on injecting WRVs even when the query text does not require any. Why is that, and can't we get rid of it? It's horrid for performance even without any of these other considerations. But if we fail to get rid of that in time for v11, it would mean that postgres_fdw can't participate in PWJ, which isn't great but I think we could live with it for now. BTW, what exactly do we think the production status of PWJ is, anyway? I notice that upthread it was stated that enable_partitionwise_join defaults to off, as indeed it still does, and we'd turn it on later when we'd gotten rid of some memory-hogging problems. If that hasn't happened yet (and I don't see any open item about considering enabling PWJ by default for v11), then I have exactly no hesitation about lobotomizing PWJ as hard as we need to to mask this problem for v11. I'd certainly prefer a solution along those lines to either of these patches. regards, tom lane
Re: Alter index rename concurrently to
Robert Haas writes: > On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund wrote: >> What precisely are you proposing? > Inserting AcceptInvalidationMessages() in some location that > guarantees it will be executed at least once per SQL statement. I > tentatively propose the beginning of parse_analyze(), but I am open to > suggestions. What will that accomplish that the existing call in transaction start doesn't? It might make the window for concurrency issues a bit narrower, but it certainly doesn't close it. regards, tom lane
Re: Alter index rename concurrently to
On Thu, Aug 2, 2018 at 4:02 PM, Andres Freund wrote: >> Inserting AcceptInvalidationMessages() in some location that >> guarantees it will be executed at least once per SQL statement. I >> tentatively propose the beginning of parse_analyze(), but I am open to >> suggestions. > > I'm inclined to think that that doesn't really actually solve anything, > but makes locking issues harder to find, because the window is smaller, > but decidedly non-zero. Can you describe why this'd make things more > "predictable" precisely? Sure. I'd like to be able to explain in the documentation in simple words when a given DDL change is likely to take effect. For changes made under AccessExclusiveLock, there's really nothing to say: the change will definitely take effect immediately. If you're able to do anything at all with the relevant table, you must have got some kind of lock on it, and that means you must have done AcceptInvalidationMessages(), and so you will definitely see the change. With DDL changes made under less than AccessExclusiveLock, the situation is more complicated. Right now, we can say that a new transaction will definitely see the changes, because it will have to acquire a lock on that relation which it doesn't already have and will thus have to do AcceptInvalidationMessages(). A new statement within the same transaction may see the changes, or it may not. If it mentions any relations not previously mentioned or if it does something like UPDATE a relation where we previously only did a SELECT, thus triggering a new lock acquisition, it will see the changes. If a catchup interrupt gets sent to it, it will see the changes. Otherwise, it won't. It's even possible that we'll start to see the changes in the middle of a statement, because of a sinval reset or something. To summarize, at present, all we can really say is that changes made by concurrent DDL which doesn't take AEL may or may not affect already-running queries, may or may not affect new queries, and will affect new transactions. With this change, we'd be able to say that new statements will definitely see the results of concurrent DDL. That's a clear, easy to understand rule which I think users will like. It would be even better if we could say something stronger, e.g. "Concurrent DDL will affect new SQL statements, but not those already in progress." Or "Concurrent DDL will affect new SQL statements, but SQL statements that are already running may take up to 10 seconds to react to the changes". Or whatever. I'm not sure there's really a way to get to a really solid guarantee, but being able to tell users that, at a minimum, the next SQL statement will notice that things have changed would be good. Otherwise, we'll have conversations like this: User: I have a usage pattern where I run a DDL command to rename an object, and then in another session I try to refer to it by the new name, and sometimes it works and sometimes it doesn't. Why does that happen? Postgres Expert: Well, there are several possible reasons. If you already had a transaction in progress in the second window and it already had a lock on the object strong enough for the operation you attempted to perform and no sinval resets occurred and nothing else triggered invalidation processing, then it would still know that object under the old name. Otherwise it would know about it under the new name. User: Well, I did have a transaction open and it may have had some kind of lock on that object already, but how do I know whether invalidation processing happened?
Postgres Expert: There's really no way to know. If something else on the system were doing a lot of DDL operations, then it might fill up the invalidation queue enough to trigger catchup interrupts or sinval resets, but if not, it could be deferred for an arbitrarily long period of time. User: So you're saying that if I have two PostgreSQL sessions, and I execute the same commands in those sessions in the same order, just like the isolationtester does, I can get different answers depending on whether some third session creates a lot of unrelated temporary tables in a different database while that's happening? Postgres Expert: Yes. I leave it to your imagination to fill in what this imaginary user will say next, but I bet it will be snarky. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
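To put the same scenario in session form (a sketch; the object names are made up, and the outcome of the last step is exactly the part that is not deterministic today):

-- Session A: the rename, hypothetically done under a weaker-than-AEL lock
ALTER INDEX idx_old RENAME TO idx_new;

-- Session B: a transaction that was already in progress
BEGIN;
SELECT count(*) FROM t;     -- locks acquired here, invalidations processed
-- ... session A's rename commits at this point ...
SELECT count(*) FROM t;     -- no new lock acquisition, so the rename's
                            -- invalidations may or may not be processed yet
REINDEX INDEX idx_new;      -- may or may not find the index under its new name
COMMIT;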
Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)
On Tue, Jul 17, 2018 at 10:42 PM, Simon Riggs wrote: > If we knew that we were never going to do deletes/non-HOT updates from > the table we could continue to use the existing mechanism, otherwise > we will be better off to use sorted index entries. However, it does > appear as if keeping entries in sorted order would be a win on > concurrency from reduced block contention on the first few blocks of > the index key, so it may also be a win in cases where there are heavy > concurrent inserts but no deletes. I think so too. I also expect a big reduction in the number of FPIs in the event of many duplicates. > I hope we can see a patch that just adds the sorting-by-TID property > so we can evaluate that aspect before we try to add other more > advanced index ideas. I can certainly see why that's desirable. Unfortunately, it isn't that simple. If I want to sort on heap TID as a tie-breaker, I cannot cut any corners. That is, it has to be just another column, at least as far as the implementation is concerned (heap TID will have a different representation in internal pages and leaf high keys, but nonetheless it's essentially just another column in the keyspace). This means that if I don't have suffix truncation, I'll regress performance in many important cases that have no compensating benefit (e.g. pgbench). There is no point in trying to assess that. It is true that I could opt to only "logically truncate" the heap TID attribute during a leaf page split (i.e. there'd only be "logical truncation", which is to say there'd only be the avoidance of adding a heap TID to the new high key, and never true physical truncation of user attributes). But doing only that much saves very little code, since the logic for assessing whether or not it's safe to avoid adding a new heap attribute (whether or not we logically truncate) still has to involve an insertion scankey. It seems much more natural to do everything at once. Again, the heap TID attribute is more or less just another attribute. Also, the code for doing physical suffix truncation already exists from the covering/INCLUDE index commit. I'm currently improving the logic for picking a page split in light of suffix truncation, which I've been working on for weeks now. I had something that did quite well with the initial index sizes for TPC-C and TPC-H, but then realized I'd totally regressed the motivating example with many duplicates that I started this thread with. I now have something that does both things well, which I'm trying to simplify. Another thing to bear in mind is that page split logic for suffix truncation also helps space utilization on the leaf level. I can get the main TPC-C order_line pkey about 7% smaller with true suffix truncation, even though the internal page index tuples can never be any smaller due to alignment, and even though there are no duplicates that would otherwise make the implementation "get tired". Can I really fix space utilization in a piecemeal fashion? I strongly doubt it. -- Peter Geoghegan
Re: Alter index rename concurrently to
Hi, On 2018-08-02 16:30:42 -0400, Robert Haas wrote: > With this change, we'd be able to say that new statements will > definitely see the results of concurrent DDL. I don't think it really gets you that far. Because you're suggesting to do it at the parse stage, you suddenly have issues with prepared statements. Some client libraries, and a lot more frameworks, automatically use prepared statements when they see the same query text multiple times. And this'll not improve anything on that. ISTM, if you want to increase consistency in this area, you've to go further. E.g. processing invalidations in StartTransactionCommand() in all states, which'd give you a lot more consistency. Greetings, Andres Freund
Re: Alter index rename concurrently to
Robert Haas writes: > [ reasons why DDL under less than AEL sucks ] Unfortunately, none of these problems are made to go away with an AcceptInvalidationMessages at statement start. That just makes the window smaller. But DDL effects could still be seen - or not - partway through a statement, with just as much ensuing hilarity as in your example. Maybe more. The real issue here, and the reason why I'm very unhappy with the mad rush in certain quarters to try to reduce locking levels for DDL, is exactly that it generally results in uncertainty about when the effects will be seen. I do not think your proposal does much more than put a fig leaf over that problem. Barring somebody having a great idea about resolving that, I think we just need to be very clear that any DDL done with less than AEL has exactly this issue. In the case at hand, couldn't we just document that "the effects of RENAME CONCURRENTLY may not be seen in other sessions right away; at worst, not till they begin a new transaction"? If you don't like that, don't use CONCURRENTLY. regards, tom lane
Re: Alter index rename concurrently to
On 2018-08-02 16:51:10 -0400, Tom Lane wrote: > Robert Haas writes: > > [ reasons why DDL under less than AEL sucks ] > > Unfortunately, none of these problems are made to go away with an > AcceptInvalidationMessages at statement start. That just makes the > window smaller. But DDL effects could still be seen - or not - > partway through a statement, with just as much ensuing hilarity > as in your example. Maybe more. I think it's a lot easier to explain that every newly issued statement sees the effect of DDL, and already running statements may see it, than something else. I don't agree that parse analysis is a good hook to force that, as I've written. > The real issue here, and the reason why I'm very unhappy with the mad rush > in certain quarters to try to reduce locking levels for DDL, is exactly > that it generally results in uncertainty about when the effects will be > seen. I do not think your proposal does much more than put a fig leaf > over that problem. I think it's a significant issue operationally, which is why that pressure exists. I don't know what makes it a "mad rush", given these discussions have been going on for *years*? Greetings, Andres Freund
Re: Ideas for a relcache test mode about missing invalidations
On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI wrote: >> [1] >> http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com >> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us > > As for [1], it is not a issue on invalidation. It happens also if > the relation has any index and even drop is not needed. The > following steps are sufficient. > > create table t( pk serial, t text ); > insert into t( t ) values( 'hello' ), ('world'); > create index idx_fake0 on t (pk); > create index idx_fake on t ( f_fake( pk ) ); -- ERROR The fact that there was a weird error wasn't really what we cared about there. If the user is doing something that's clearly unreasonable or nonsensical, then they cannot expect much from the error message (maybe we could do better, but it's not a priority). What we really cared about was the fact that it was possible to make a backend's relcache irrecoverably corrupt. That should never be allowed to happen, even when the user is determined to do something unreasonable. -- Peter Geoghegan
Re: [HACKERS] Cached plans and statement generalization
On 02.08.2018 08:25, Yamaji, Ryo wrote: -Original Message- From: Konstantin Knizhnik [mailto:k.knizh...@postgrespro.ru] Sent: Wednesday, August 1, 2018 4:53 PM To: Yamaji, Ryo/山地 亮 Cc: PostgreSQL mailing lists Subject: Re: [HACKERS] Cached plans and statement generalization I failed to reproduce the problem. I used the following non-default configuration parameters: autoprepare_limit=1 autoprepare_threshold=1 create dummy database: create table foo(x integer primary key, y integer); insert into foo values (generate_series(1,1), 0); and run different queries, like: postgres=# select * from foo where x=1; postgres=# select * from foo where x+x=1; postgres=# select * from foo where x+x+x=1; postgres=# select * from foo where x+x+x+x=1; ... and check size of CacheMemoryContext using gdb - it is not increased. Can you please send me your test? I checked not CacheMemoryContext but "plan cache context". Because I think that the memory context that autoprepare mainly uses is "plan cache context". Non-default configuration parameter was used as well as Konstantin. autoprepare_limit=1 autoprepare_threshold=1 The procedure of the test that I did is shown below. create dummy database create table test (key1 integer, key2 integer, ... , key100 integer); insert into test values (1, 2, ... , 100); And, different queries are executed. select key1 from test where key1=1 and key2=2 and ... and key100=100; select key2 from test where key1=1 and key2=2 and ... and key100=100; select key3 from test where key1=1 and key2=2 and ... and key100=100;... And, "plan cache context" was confirmed by using gdb. Thank you. Unfortunately expression_tree_walker is not consistent with copyObject: I tried to use this walker to destroy raw parse tree of autoprepared statement, but looks like some nodes are not visited by expression_tree_walker. I have to create separate memory context for each autoprepared statement. New version of the patch is attached. diff --git a/doc/src/sgml/autoprepare.sgml b/doc/src/sgml/autoprepare.sgml new file mode 100644 index 000..b3309bd --- /dev/null +++ b/doc/src/sgml/autoprepare.sgml @@ -0,0 +1,62 @@ + + + + Autoprepared statements + + + autoprepared statements + + + +PostgreSQL makes it possible prepare +frequently used statements to eliminate cost of their compilation +and optimization on each execution of the query. On simple queries +(like ones in pgbench -S) using prepared statements +increase performance more than two times. + + + +Unfortunately not all database applications are using prepared statements +and, moreover, it is not always possible. For example, in case of using +pgbouncer or any other session pooler, +there is no session state (transactions of one client may be executed at different +backends) and so prepared statements can not be used. + + + +Autoprepare mode allows to overcome this limitation. +In this mode Postgres tries to generalize executed statements +and build parameterized plan for them. Speed of execution of +autoprepared statements is almost the same as of explicitly +prepared statements. + + + +By default autoprepare mode is switched off. To enable it, assign non-zero +value to GUC variable autoprepare_tthreshold. +This variable specified minimal number of times the statement should be +executed before it is autoprepared. Please notice that, despite to the +value of this parameter, Postgres makes a decision about using +generalized plan vs. 
customized execution plans based on the results +of comparison of average time of five customized plans with +time of generalized plan. + + + +If number of different statements issued by application is large enough, +then autopreparing all of them can cause memory overflow +(especially if there are many active clients, because prepared statements cache +is local to the backend). To prevent growth of backend's memory because of +autoprepared cache, it is possible to limit number of autoprepared statements +by setting autoprepare_limit GUC variable. LRU strategy will be used +to keep in memory most frequently used queries. + + + +It is possible to inspect autoprepared queries in the backend using +pg_autoprepared_statements view. It shows original text of the +query, types of the extracted parameters (replacing literals) and +query execution counter. + + + diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index fffb79f..26f561d 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -8230,6 +8230,11 @@ SCRAM-SHA-256$:&l + pg_autoprepared_statements + autoprepared statements + + + pg_prepared_xacts prepared transactions @@ -9541,6 +9546,68 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx + + +
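Going by the documentation text added in the patch and the settings used in the tests upthread, usage would presumably look something like this (a sketch; the GUC and view names are taken from the thread, and the view's output columns are not shown because they have not been checked here):

-- Requires the patch: autoprepare statements after their first execution,
-- keeping at most 100 cached entries per backend
SET autoprepare_threshold = 1;
SET autoprepare_limit = 100;

SELECT * FROM foo WHERE x = 1;   -- planned normally, then generalized
SELECT * FROM foo WHERE x = 2;   -- can reuse the autoprepared, parameterized plan

SELECT * FROM pg_autoprepared_statements;   -- inspect what has been cached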
Re: Should contrib modules install .h files?
> "Tom" == Tom Lane writes: Tom> It's particularly bad for these cases, since what they demonstrate Tom> is that it's impossible to build transform modules for plperl or Tom> plpython out-of-tree at the moment. Right. Both plperl and plpython install _some_ header files (which behavior was added in the commit for the transforms feature), but they don't install all of the header files that the existing transform modules actually use. Is this supposed to be a principled choice, with an out-of-tree transform expected to provide their own code instead of using those headers, or is it just an oversight? Tom> That doesn't seem to me to be something we should just ignore; it Tom> goes against not only our larger commitment to extensibility, but Tom> also the entire point of the transforms feature. Yeah, if an out-of-tree data type can't provide a plperl or plpython transform for itself, something's broken. And that does seem to be the case at present. -- Andrew (irc:RhodiumToad)
Re: Should contrib modules install .h files?
> "Tom" == Tom Lane writes: Tom> Something that copes with different modules installing headers Tom> with the same base name. Allowing for that was the driving force Tom> for going with subdirectory-per-extension, but if we really want Tom> that to work, there seems to be no alternative but for extensions Tom> to write qualified header names (#include "hstore/hstore.h" not Tom> #include "hstore.h"). Andres, for one, seemed to think that Tom> wouldn't play nicely with PGXS, I think that was me, not Andres? But I think I was partially wrong and that it's possible that this can be made to work at least in most cases, as long as we can rely on the same-directory rule for #include "foo.h". (i.e. the first place to look is always the same directory as the file containing the #include statement). I'm going to test this now, trying to do an out-of-both-trees build of a transform function for an out-of-tree PL that uses multiple .h files. -- Andrew (irc:RhodiumToad)
Re: FailedAssertion on partprune
On 3 August 2018 at 05:54, Tom Lane wrote: > Yeah, looking at the explain posted upthread, the issue is specifically > that some of the child paths might be for Gathers over subsets of the > partitioning hierarchy. It's not real clear to me whether such a subset > would necessarily correspond to a subtree of the hierarchy, but if it > does, you'd want to be able to apply run-time pruning at the parent > Append, in hopes of not having to execute the Gather at all. Separately > from that, if you do execute the Gather, applying runtime pruning within > its child Append is also a good thing.

Yeah, that "good thing" was the reason I changed my mind from aborting run-time pruning in my first patch idea to allowing it: if pruning says so, it's fine to remove an entire sub-partitioned hierarchy.

If we look at the Assert in question:

/* Same rel cannot be both leaf and non-leaf */
Assert(relid_subplan_map[rti] == 0);

If this fails then the subplan path that was found must have a parent that's a partitioned table. If we're worried about those containing subplans which are not partitions then we can probably add some Asserts somewhere else. One place that we could do that with a bit of cover would be inside the bms_num_members(allmatchedsubplans) < list_length(subpaths) condition, where we could do:

Assert(root->simple_rte_array[parentrel->relid]->relkind != RELKIND_PARTITIONED_TABLE);

There should never be any other_subplans when the parent is a partitioned table. I've attached a patch to demonstrate what I mean here.

> So what's unclear to me is whether removing that Assert is sufficient > to make all of that work nicely. I'm concerned in particular that the > planner might get confused about which pruning tests to apply at each > level. (EXPLAIN isn't a very adequate tool for testing this, since > it fails to show anything about what pruning tests are attached to an > Append. I wonder if we need to create something that does show that.)

Pruning is only ever applied at the top-level when the RelOptInfo is a RELOPT_BASEREL. It might well be that we want to reconsider that now that we've seen some plans where the subpaths are not always pulled into the top-level [Merge]Append path, but as of now I don't quite see how this could get broken. We're just pruning with a coarser granularity.

> I think it'd be a real good idea to have a test case to exercise this. > Jaime's presented test case is too expensive to consider as a regression > test, but I bet that a skinnied-down version could be made to work > once you'd set all the parallelization parameters to liberal values, > like in select_parallel.sql.

Yeah, I did spend a bit of time trying to cut down that test and failed at it. The fact that it seemed so hard to do that didn't give me much confidence that the plan produced by the test would be very stable, but perhaps we can just deal with that if we see it's unstable. I'll try again with the test.

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

run-time_prune_asserts_for_discussion.patch Description: Binary data
Re: FailedAssertion on partprune
On 3 August 2018 at 07:53, Robert Haas wrote: > I don't really understand the issue at hand, but let me just make a > comment about accumulate_append_subpath(). If we have a regular > Append on top of another regular Append or on top of a MergeAppend, we > can flatten the lower Merge(Append) into the upper one. No problem. > If, however, the lower path is a Parallel Append, we can't. Consider > this plan: > > Gather > -> Append > -> Partial Subpath #1 > -> Parallel Append > -> Partial Subpath #2 > -> Non-partial Subpath > > Making Partial Subpath #2 a child of the Append rather than the > Parallel Append would be correct; the only downside is that we'd lose > the nice worker-balancing stuff that the Parallel Append does. > However, making Non-partial Subpath a child of the Append rather than > the Parallel Append would be dead wrong, because the Parallel Append > will make sure that it only gets executed by one of the cooperating > processes, and the regular Append won't. If the upper Append were > changed to a Parallel Append, then we could flatten the whole thing > just fine, as long as we're careful to keep partial paths and > non-partial paths separated. Also, if the situation is reversed, with > an upper Parallel Append and a regular Append beneath it, that's > always flatten-able. Wouldn't that code have more flexibility to flatten the Append if it were to, instead of looking at the Append's subpaths, look at the subpath's Parent RelOptInfo paths and just use the Append and MergeAppend as a container to identify the relations that must be included, rather than the paths that should be? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Page freezing, FSM, and WAL replay
We recently had a customer report a very strange problem, involving a very large insert-only table: without explanation, insertions would stall for several seconds, causing application timeout and process accumulation and other nastiness. After some investigation, we narrowed this down to happening immediately after the first VACUUM on the table right after a standby got promoted. It wasn't at first obvious what the connection between these factors was, but eventually we realized that VACUUM must have been skipping a bunch of pages because they had been marked all-frozen previously, so the FSM was not updated with the correct freespace figures for those pages. The FSM pages had been transmitted as full-page images on WAL before the promotion (because wal_log_hints), so they contained optimistic numbers on amount of free space coming from the previous master. (Because this only happens on the first change to that FSM page after a checkpoint, it's quite likely that one page every few thousand or so contains optimistic figures while the others remain all zeroes, or something like that.) Before VACUUM, nothing too bad would happen, because the upper layers of the FSM would not know about those optimistic numbers. But when VACUUM does FreeSpaceMapVacuum, it propagates those numbers upwards; as soon as that happens, inserters looking for pages would be told about those pages (wrongly catalogued to contain sufficient free space), go to insert there, and fail because there isn't actually any freespace; ask FSM for another page, lather, rinse, repeat until all those pages are all catalogued correctly by FSM, at which point things continue normally. (There are many processes doing this chase-up concurrently and it seems a pretty contentious process, about which see last paragraph; it can be seen in pg_xlogdump that it takes several seconds for things to settle). After considering several possible solutions, I propose to have heap_xlog_visible compute free space for any page being marked frozen; Pavan adds to that to have heap_xlog_clean compute free space for all pages also. This means that if we later promote this standby and VACUUM skips all-frozen pages, their FSM numbers are going to be up-to-date anyway. Patch attached. Now, it's possible that the problem occurs for all-visible pages not just all-frozen. I haven't seen that one, maybe there's some reason why it cannot. But fixing both things together is an easy change in the proposed patch: just do it on xlrec->flags != 0 rather than checking for the specific all-frozen flag. (This problem seems to be made worse by the fact that RecordAndGetPageWithFreeSpace (or rather fsm_set_and_search) holds exclusive lock on the FSM page for the whole duration of update plus search. So when there are many inserters, they all race to the update process. Maybe it'd be less terrible if we would release exclusive after the update and grab shared lock for the search in fsm_set_and_search, but we still have to have the exclusive for the update, so the contention point remains. Maybe there's not sufficient improvement to make a practical difference, so I'm not proposing changing this.) 
-- Álvaro Herrera

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 5016181fd7..d024b4fa59 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8056,6 +8056,7 @@ heap_xlog_clean(XLogReaderState *record)
 	xl_heap_clean *xlrec = (xl_heap_clean *) XLogRecGetData(record);
 	Buffer		buffer;
 	Size		freespace = 0;
+	bool		know_freespace = false;
 	RelFileNode rnode;
 	BlockNumber blkno;
 	XLogRedoAction action;
@@ -8107,8 +8108,6 @@ heap_xlog_clean(XLogReaderState *record)
 						nowdead, ndead, nowunused, nunused);
 
-		freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */
-
 		/*
 		 * Note: we don't worry about updating the page's prunability hints.
 		 * At worst this will cause an extra prune cycle to occur soon.
 		 */
@@ -8118,16 +8117,16 @@ heap_xlog_clean(XLogReaderState *record)
 		MarkBufferDirty(buffer);
 	}
 	if (BufferIsValid(buffer))
+	{
+		freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */
+		know_freespace = true;
 		UnlockReleaseBuffer(buffer);
+	}
 
 	/*
-	 * Update the FSM as well.
-	 *
-	 * XXX: Don't do this if the page was restored from full page image. We
-	 * don't bother to update the FSM in that case, it doesn't need to be
-	 * totally accurate anyway.
+	 * Update the FSM as well, if we can.
 	 */
-	if (action == BLK_NEEDS_REDO)
+	if (know_freespace)
 		XLogRecordPa
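(To make the reported stall concrete, here is a tiny standalone simulation -- not PostgreSQL code; the page count and byte figures are arbitrary -- of what an inserter goes through once VACUUM has propagated the optimistic per-page figures upward: each attempt lands on a page whose advertised free space is stale, fails, corrects that single entry, and has to ask again.)

#include <stdio.h>

#define NPAGES 8

int
main(void)
{
    /*
     * Toy model: "advertised" plays the role of the FSM figures restored
     * from the old master's full-page images, "actual" is what the heap
     * pages really contain after being filled and marked all-frozen.
     */
    int         advertised[NPAGES];
    int         actual[NPAGES];
    int         attempts = 0;
    int         need = 100;     /* bytes needed by one insertion */
    int         i;

    for (i = 0; i < NPAGES; i++)
    {
        advertised[i] = 4000;   /* looks like plenty of room */
        actual[i] = 0;          /* ...but the page is actually full */
    }

    for (;;)
    {
        int         target = -1;

        for (i = 0; i < NPAGES; i++)
        {
            if (advertised[i] >= need)
            {
                target = i;
                break;
            }
        }

        attempts++;
        if (target < 0)
        {
            printf("extended the relation after %d attempts\n", attempts);
            break;
        }
        if (actual[target] >= need)
        {
            printf("inserted on page %d after %d attempts\n", target, attempts);
            break;
        }
        /* page was full: record the real figure and retry, as the FSM does */
        advertised[target] = actual[target];
    }
    return 0;
}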
Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
On 2018-Aug-01, Tom Lane wrote: > David Rowley writes: > > On 20 July 2018 at 01:03, David Rowley wrote: > >> I've attached a patch intended for master which is just v2 based on > >> post 5220bb7533. > > I've pushed the v3 patch with a lot of editorial work (e.g. cleaning > up comments you hadn't). Thanks Tom, much appreciated. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Ideas for a relcache test mode about missing invalidations
At Thu, 2 Aug 2018 14:40:50 -0700, Peter Geoghegan wrote in pg> On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI pg> wrote: pg> >> [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com pg> >> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us pg> > pg> > As for [1], it is not a issue on invalidation. It happens also if pg> > the relation has any index and even drop is not needed. The pg> > following steps are sufficient. pg> > pg> > create table t( pk serial, t text ); pg> > insert into t( t ) values( 'hello' ), ('world'); pg> > create index idx_fake0 on t (pk); pg> > create index idx_fake on t ( f_fake( pk ) ); -- ERROR pg> pg> The fact that there was a weird error wasn't really what we cared pg> about there. If the user is doing something that's clearly pg> unreasonable or nonsensical, then they cannot expect much from the pg> error message (maybe we could do better, but it's not a priority). Hmm. I don't think it's unreasonable or nonsensical, but I won't argue it here. pg> What we really cared about was the fact that it was possible to make a pg> backend's relcache irrecoverably corrupt. That should never be allowed pg> to happen, even when the user is determined to do something pg> unreasonable. I reread the thread and IIUC, drop_index() sends an inval on the owning relation and invalidation happens (that is, RelationCacheInvalidateEntry is called for the owner relation). The relcache entry for the owner is actually rebuilt during the following create index. At least the problem mentioned in [1] happens using a fresh relcache entry created after invoking the index creation. The cause is that RelationGetIndexList collects all "indislive" indexes, including idx_fake before it is populated, which can be fixed by the attached patch. Am I still misunderstanding something? # Anyway, I'll start another thread for the other issue. [1]: https://www.postgresql.org/message-id/20180628150209.n2qch5jtn3vt2xaa%40alap3.anarazel.de regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [report] memory leaks in COPY FROM on partitioned table
2018-08-02 5:38 GMT+09:00 Alvaro Herrera : > On 2018-Aug-01, Alvaro Herrera wrote: > >> Right, makes sense. Pushed that way. > > KaiGai, if you can please confirm that the pushed change fixes your test > case, I'd appreciate it. > Can you wait for a few days? I can drop the test dataset and reuse the storage once the benchmark test is over. -- HeteroDB, Inc / The PG-Strom Project KaiGai Kohei
Re: Pluggable Storage - Andres's take
On Tue, Jul 24, 2018 at 11:31 PM Haribabu Kommi wrote: > On Tue, Jul 17, 2018 at 11:01 PM Haribabu Kommi > wrote: > >> >> I added new API in the tableam.h to get all the page visible tuples to > abstract the bitgetpage() function. > > >>- Merge tableam.h and tableamapi.h and make most tableam.c functions > >> small inline functions. Having one-line tableam.c wrappers makes this > >> more expensive than necessary. We'll have a big enough trouble not > >> regressing performancewise. > > I merged tableam.h and tableamapi.h into tableam.h and changed all the > functions as inline. This change may have added some additional headers, > will check them if I can remove their need. > > Attached are the updated patches on top your github tree. > > Currently I am working on the following. > - I observed that there is a crash when running isolation tests. > While investigating the crash, I observed that it is due to the many FIXMEs in the code. So I made just minimal fixes and am looking into correcting the FIXMEs first. One thing I observed is that a missing relation pointer leads to a crash in the flow of the EvalPlan* functions, because not all ROW_MARK types contain a relation pointer. I will continue to check all the FIXME fixes. > - COPY's multi_insert path should probably deal with a bunch of slots, > rather than forming HeapTuples > Implemented support for slots in the COPY multi-insert path. Regards, Haribabu Kommi Fujitsu Australia 0002-Isolation-test-fixes-1.patch Description: Binary data 0001-COPY-s-multi_insert-path-deal-with-bunch-of-slots.patch Description: Binary data
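(As a purely hypothetical illustration of the "bunch of slots" idea above -- every name, type and signature below is invented for this sketch and is not the actual table-AM interface being developed -- a multi-insert callback could accept an array of slots instead of pre-formed HeapTuples:)

#include <stdio.h>

/* Opaque stand-ins; the real executor and table-AM structs differ. */
typedef struct Relation_stub { const char *relname; } Relation_stub;
typedef struct TupleTableSlot_stub { int unused; } TupleTableSlot_stub;

/*
 * Hypothetical callback shape: the access method receives a batch of tuples
 * as slots and decides itself how (or whether) to materialize them.
 */
typedef void (*multi_insert_hook)(Relation_stub *rel,
                                  TupleTableSlot_stub **slots, int nslots);

/* Toy "AM" implementation standing in for a real storage method. */
static void
toy_multi_insert(Relation_stub *rel, TupleTableSlot_stub **slots, int nslots)
{
    (void) slots;
    printf("inserting %d buffered slots into %s\n", nslots, rel->relname);
}

/* A COPY-like caller would buffer slots and flush them in one call. */
int
main(void)
{
    Relation_stub rel = {"example_table"};
    TupleTableSlot_stub a, b, c;
    TupleTableSlot_stub *buffered[] = {&a, &b, &c};
    multi_insert_hook multi_insert = toy_multi_insert;

    multi_insert(&rel, buffered, 3);
    return 0;
}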
RE: Recovery performance of standby for multiple concurrent truncates on large tables
Hi, I appreciate the feedback and suggestions. On Tue, Jul 31, 2018 at 8:01 AM, Robert Haas wrote: >> How would this work if a relfilenode number that belonged to an old >> relation got recycled for a new relation? >> .. >> I think something like this could be made to work -- both on the >> master and the standby, and not just while waiting for a failover -- >> ... >> It's not clear to me whether it would be worth the overhead of doing >> something like this. Making relation drops faster at the cost of >> making buffer cleaning slower could be a loser.

The deferred list holds information such as the relfilenode and the head/top page number of the invalid pages. The standby in promote mode only registers info in the deferred list when redoing the COMMIT record. However, it scans the shared buffer cache only once before the end of the promote, and discards the pages related to the tables included in the deferred list. After removing them, it increments the head page number of the above-mentioned invalid pages. In this way, if an INSERT progresses while promoting (I'm not sure if that's possible), the discard progresses too, and the app can refer to it. As mentioned by Tsunakawa-san above, the purpose of the design is to make the failover process as short as possible, not really to make the drop of relations faster. My initial thought was not to touch the regular buffer invalidation process, but I am also not sure of the level of complexity that the proposed design would add, nor how crucial the requirement (shorter failover) is, so I asked for the community's feedback.

On Tue, July 31, 2018 8:27 AM, Thomas Munro wrote: > Anyway, it's a lot of complexity, and it falls back to a worst cases > like today, and can also transfer work to innocent foreground processes. > I see why Andres says we should just get a better data structure so we > can make the guy doing the dropping pay for it up front, but more > efficiently. I suspect we may also want an ordered data structure for > write clustering purposes one day.

I also understand the value of solving the root cause of the problem; that's why I asked Andres whether we could expect development from the community for V12 regarding the radix tree approach for buffer management, or even just a WIP patch from anyone, regardless of whether it's a radix tree or not. And perhaps we'd like to get involved as well, as this is also our customer's problem. So I am just curious about the radix tree approach's design. Maybe we should start discussing what kind of data structures, processing, etc. are involved?

I also read other design solutions from another thread [1]. a. Fujii-san's proposal: Add $SUBJECT for performance improvement; reloption to prevent vacuum from truncating empty pages b. Pavan's comment: > What if we remember the buffers as seen by count_nondeletable_pages() and > then just discard those specific buffers instead of scanning the entire > shared_buffers again? Surely we revisit all to-be-truncated blocks before > actual truncation. So we already know which buffers to discard. And we're > holding exclusive lock at that point, so nothing can change underneath. Of > course, we can't really remember a large number of buffers, so we can do > this in small chunks. Scan last K blocks, remember those K buffers, discard > those K buffers, truncate the relation and then try for next K blocks. If > another backend requests lock on the table, we give up or retry after a > while.
[1] https://www.postgresql.org/message-id/flat/20180419203802.hqrb4o2wexjnb2ft%40alvherre.pgsql#7d4a8c56a01392a3909b2150371e6495 Now, how do we move forward? Thank you everyone. Regards, Kirk Jamison
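(A rough standalone sketch of the chunk-wise idea quoted in the message above -- purely illustrative: the helper functions and constants below are made up, and the real locking and retry details are glossed over.)

#include <stdio.h>

typedef unsigned int BlockNumber;

/* Made-up stand-in for dropping cached buffers of a block range. */
static void
discard_buffers_for_range(BlockNumber first, BlockNumber last)
{
    printf("drop cached buffers for blocks [%u, %u]\n", first, last);
}

/* Made-up stand-in for physically truncating the relation. */
static void
truncate_relation_to(BlockNumber nblocks)
{
    printf("truncate relation to %u blocks\n", nblocks);
}

/*
 * Truncate away the empty tail of a relation in chunks of "chunk" blocks:
 * drop the buffers for the last chunk, truncate, and move on to the next
 * chunk, instead of scanning all of shared_buffers once for the whole tail.
 */
static void
truncate_tail_in_chunks(BlockNumber nblocks, BlockNumber new_nblocks,
                        BlockNumber chunk)
{
    while (nblocks > new_nblocks)
    {
        BlockNumber first = (nblocks > chunk) ? nblocks - chunk : 0;

        if (first < new_nblocks)
            first = new_nblocks;
        discard_buffers_for_range(first, nblocks - 1);
        truncate_relation_to(first);
        nblocks = first;
        /* in the real thing: give up or retry here if someone wants a lock */
    }
}

int
main(void)
{
    truncate_tail_in_chunks(1000, 840, 64);
    return 0;
}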
Re: [report] memory leaks in COPY FROM on partitioned table
On 2018-Aug-03, Kohei KaiGai wrote: > 2018-08-02 5:38 GMT+09:00 Alvaro Herrera : > > On 2018-Aug-01, Alvaro Herrera wrote: > > > >> Right, makes sense. Pushed that way. > > > > KaiGai, if you can please confirm that the pushed change fixes your test > > case, I'd appreciate it. > > Can you wait for a few days? I can drop the test dataset and reuse the storage > once the benchmark test is over Of course -- take your time. Thanks, -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello, thank you for the comment. At Wed, 01 Aug 2018 21:21:57 +0900, Etsuro Fujita wrote in <5b61a5e5.6010...@lab.ntt.co.jp> > (2018/06/12 12:19), Kyotaro HORIGUCHI wrote: > > I have demonstrated and actually shown a problem of the > > PARAM_EXEC case. > > > A. Just detecting and reporting/erroring the problematic case. > > > > B. Giving to Sort-like nodes an ability to convert PARAMS into > > junk columns. > > > > C. Adding a space for 64bit tuple identifier in a tuple header. > > > > D. Somehow inhibiting tuple-storing node like Sort between. (This > >should break something working.) > > > > > > B seems to have possibility to fix this but I haven't have a > > concrete design of it. > > I'm just wondering whether we could modify the planner (or executor) > so that Params can propagate up to the ModifyTable node through all > joins like Vars/PHVs. Yeah, it's mentioned somewhere upthread. The largest obstacle in my view is the fact that the tuple descriptor for an RTE_RELATION baserel is tied to the relation definition. So we need to separate the two in order to use "(junk) Vars/PHVs" for that purpose. The four options above are based on the premise of PARAM_EXEC. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Tue, 26 Jun 2018 10:19:45 +0530, Ashutosh Bapat wrote in > On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI > wrote: > > > >> But good thing is because of join and aggregate > >> push-down we already have ability to push arbitrary kinds of nodes > >> down to the foreign server through the targetlist. We should be able > >> to leverage that capability. It looks like a lot of change, which > >> again doesn't seem to be back-portable. > > > > After some struggles as you know, I agree to the opinion. As my > > first impression, giving (physical) base relations (*1) an > > ability to have junk attribute is rather straightforward. > > By giving base relations an ability to have junk attribute you mean to > add junk attribute in the targetlist of DML, something like > postgresAddForeignUpdateTargets(). You seem to be fine with the new Maybe. > node approach described above. Just confirm. A something-like-but-other-than-Var node? I'm not sure it is needed, because whatever node we add to the relation-tlist, we must add the correspondence to the relation descriptor. And if we do that, a Var works to point to it. (Am I understanding correctly?) > > > > Well, is our conclusion here like this? > > > > - For existing versions, check the errorneous situation and ERROR out. > > (documentaion will be needed.) > > > > - For 12, we try the above thing. > > I think we have to see how invasive the fix is, and whether it's > back-portable. If it's back-portable, we back-port it and the problem > is fixed in previous versions as well. If not, we fix previous > versions to ERROR out instead of corrupting the database. Mmm. Ok, I'll try to make a patch. Please wait for a while. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [PATCH] Improve geometric types
At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra wrote in > > > On 08/01/2018 01:40 PM, Emre Hasegeli wrote: > >> Ah, so there's an assumption that NaNs are handled earlier and never > >> reach > >> this place? That's probably a safe assumption. I haven't thought about > >> that, > >> it simply seemed suspicious that the code mixes direct comparisons and > >> float8_mi() calls. > > The comparison functions handle NaNs. The arithmetic functions handle > > returning error on underflow, overflow and division by zero. I > > assumed we want to return error on those in any case, but we don't > > want to handle NaNs at every place. > > > >> Not sure, I'll leave that up to you. I don't mind doing it in a > >> separate > >> patch (I'd probably prefer that over mixing it into unrelated patch). > > It is attached separately. > > > > OK, thanks. > > So, have we reached conclusion about all the bits I mentioned on 7/31? > The delta and float8/double cast are fixed, and for computeDistance > (i.e. doing comparisons directly or using float8_lt), the code may > seem a bit inconsistent, but it is in fact correct as the NaNs are > handled elsewhere. That seems reasonable, but perhaps a comment > pointing that out would be nice. I'm not confident about replacing double with float8 only partially in the gist code. After the 0002 patch is applied, I still see most of the problematic usage of double or bare arithmetic on dimensional values in gistproc.c. > static inline float > non_negative(float val) > { > if (val >= 0.0f) > return val; > else > return 0.0f; > } It is used as "non_negative(overlap)", where overlap is float4, which is calculated using float8_mi. Float4 makes sense only if we need to store a large number of them somewhere, but these are just working variables. Couldn't we eliminate float4 where there is no requirement to use it? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [PATCH] Improve geometric types
At Fri, 03 Aug 2018 13:38:40 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20180803.133840.180843182.horiguchi.kyot...@lab.ntt.co.jp> > At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra > wrote in > > > > > > > On 08/01/2018 01:40 PM, Emre Hasegeli wrote: > > >> Ah, so there's an assumption that NaNs are handled earlier and never > > >> reach > > >> this place? That's probably a safe assumption. I haven't thought about > > >> that, > > >> it simply seemed suspicious that the code mixes direct comparisons and > > >> float8_mi() calls. > > > The comparison functions handle NaNs. The arithmetic functions handle > > > returning error on underflow, overflow and division by zero. I > > > assumed we want to return error on those in any case, but we don't > > > want to handle NaNs at every place. > > > > > >> Not sure, I'll leave that up to you. I don't mind doing it in a > > >> separate > > >> patch (I'd probably prefer that over mixing it into unrelated patch). > > > It is attached separately. > > > > > > > OK, thanks. > > > > So, have we reached conclusion about all the bits I mentioned on 7/31? > > The delta and float8/double cast are fixed, and for computeDistance > > (i.e. doing comparisons directly or using float8_lt), the code may > > seem a bit inconsistent, but it is in fact correct as the NaNs are > > handled elsewhere. That seems reasonable, but perhaps a comment > > pointing that out would be nice. > > I'm not confident about replacing double with float8 only partially in the gist > code. After the 0002 patch is applied, I still see most of the problematic > usage of double or bare arithmetic on dimensional values in > gistproc.c. > > > static inline float > > non_negative(float val) > > { > > if (val >= 0.0f) > > return val; > > else > > return 0.0f; > > } > > This takes float(4) and it is used as "non_negative(overlap)", > where overlap is float4, which is calculated using float8_mi. > Float4 makes sense if we store a large number of values somewhere > but they are just working variables. Couldn't we eliminate > float(4) without any specific requirement? I'm fine with 0002-geo-float-v14 except for the above. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
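(For what it's worth, a trivial sketch -- not a patch against gistproc.c -- of what the clamp could look like with the working variable kept in float8, spelled here as plain C double:)

#include <stdio.h>

/*
 * Same clamp as the quoted non_negative(), but on double, so a value
 * computed with float8 arithmetic is not narrowed to float4 merely to be
 * used as a scratch variable.
 */
static inline double
non_negative_f8(double val)
{
    return (val >= 0.0) ? val : 0.0;
}

int
main(void)
{
    printf("%g %g\n", non_negative_f8(0.5), non_negative_f8(-0.5));
    return 0;
}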
Re: [HACKERS] Restricting maximum keep segments by repslots
At Thu, 2 Aug 2018 09:05:33 -0400, Robert Haas wrote in > On Tue, Jul 31, 2018 at 9:52 PM, Kyotaro HORIGUCHI > wrote: > > I thought it's to be deprecated for some reason so I'm leaving > > wal_keep_segments in '# of segments' even though the new GUC is > > in MB. I'm a bit uneasy that the two similar settings are in > > different units. Couldn't we turn it into MB taking this > > opportunity if we will keep wal_keep_segments, changing its name > > to min_wal_keep_size? max_slot_wal_keep_size could be changed to > > just max_wal_keep_size along with it. > > This seems like it's a little bit of a separate topic from what this > thread about, but FWIW, +1 for standardizing on MB. Thanks. Ok, I'll raise this separately after this thread. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Speeding up INSERTs and UPDATEs to partitioned tables
(looking at the v5 patch but replying to an older email) On 2018/07/31 16:03, David Rowley wrote: > I've attached a complete v4 patch. > >> By the way, when going over the updated code, I noticed that the code >> around child_parent_tupconv_maps could use some refactoring too. >> Especially, I noticed that ExecSetupChildParentMapForLeaf() allocates >> child-to-parent map array needed for transition tuple capture even if not >> needed by any of the leaf partitions. I'm attaching here a patch that >> applies on top of your v3 to show what I'm thinking we could do. > > Maybe we can do that as a follow-on patch. We probably could, but I think it would be a good idea to get rid of *all* redundant allocations due to tuple routing in one patch, if that's the mission of this thread and the patch anyway. > I think what we have so far > is already ended up quite complex to review. What do you think? Yeah, it's kind of complex, but at least it seems that we're clear on the point that what we're trying to do here is to get rid of redundant allocations. The parts of the patch that appear complex seem to be around the allocation of various maps. Especially the child-to-parent maps, which, as things stand today, come from two arrays -- a per-update-subplan array that's needed by update tuple routing proper and a per-leaf partition array (one in PartitionTupleRouting) that's needed by the transition capture machinery. The original coding was such that the update tuple routing handling code would try to avoid allocating the per-update-subplan array if it saw that the per-leaf partition array was already set up in PartitionTupleRouting, because transition capture is active in the query. For the update-tuple-routing code to be able to use maps from the per-leaf array, it would have to know which update-subplans mapped to which tuple-routing-initialized partitions. That was maintained in the subplan_partition_offset array that's now gone with this patch, because we no longer want to fix the tuple-routing-initialized-partition offsets in advance. So, it's better to dissociate the per-subplan maps, which are initialized during ExecInitModifyTable, from the per-leaf maps, which are initialized lazily when tuple routing initializes a partition, which is what my portion of the patch did.

As mentioned in my last email, I still think it would be a good idea to simplify the handling of child-to-parent maps in PartitionTupleRouting even further, while we're improving the code in this area. I revised the patch such that it makes the handling of maps in PartitionTupleRouting even more uniform. With that patch, we no longer have two completely unrelated places in the code managing parent-to-child and child-to-parent maps, even though both arrays are in the same PartitionTupleRouting. Please find the updated patch attached with this email. Thanks, Amit

From 1a814f5a40774a51bf702757ec91e02f418a5aba Mon Sep 17 00:00:00 2001
From: amit
Date: Fri, 3 Aug 2018 14:09:51 +0900
Subject: [PATCH v2 2/2] Refactor handling of child_parent_tupconv_maps

They're currently created and handed out by TupConvMapForLeaf, which makes them look somewhat different from parent_to_child_tupconv_maps. In fact, both contain conversion maps possibly needed between a partition initialized by tuple routing and the root parent in one or the other direction, so it seems odd that parent-to-child ones are created in ExecInitRoutingInfo, whereas child-to-parent ones in TupConvMapForLeaf.
The child-to-parent ones are only needed if transition capture is active, but we can already check that in ExecInitRoutingInfo via the incoming ModifyTableState (sure, we need to teach CopyFrom to add the necessary information into its dummy ModifyTableState, but that doesn't seem too bad). This way, we can manage both parent-to-child and child-to-parent maps in similar ways, and more importantly, use the same criterion of checking whether a partition's slot in the respective array is NULL or not to conclude if tuple conversion is necessary or not.
---
 src/backend/commands/copy.c            |  37 +---
 src/backend/executor/execPartition.c   | 102 +
 src/backend/executor/nodeModifyTable.c |  11 ++--
 src/include/executor/execPartition.h   |  33 ++-
 4 files changed, 64 insertions(+), 119 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 752ba3d767..6f4069d321 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2510,8 +2510,12 @@ CopyFrom(CopyState cstate)
 	/*
 	 * If there are any triggers with transition tables on the named relation,
 	 * we need to be prepared to capture transition tuples.
+	 *
+	 * Because partition tuple routing would like to know about whether
+	 * transition capture is active, we also set it in mtstate, which is
+	 * passed to ExecFindPartition below.
 	 */
-	cstate->transition_capture =
+	cstate->t
Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full
On Mon, Jul 30, 2018 at 02:20:43PM +0200, Julian Markwort wrote: > On 07/19/2018 03:00 AM, Thomas Munro wrote: > >Some more comments: > > > > if (parsedline->auth_method == uaCert) > > { > >- parsedline->clientcert = true; > >+ parsedline->clientcert = clientCertOn; > > } > > > >The "cert" method is technically redundant with this patch, because > >it's equivalent to "trust clientcert=verify-full", right? That gave > >me an idea: wouldn't the control flow be a bit nicer if you just > >treated uaCert the same as uaTrust in the big switch statement, but > >also enforced that clientcert = clientCertFull in the hunk above? > >Then you could just have one common code path to reach CheckCertAuth() > >for all auth methods after that switch statement, instead of the more > >complicated conditional coding you have now with two ways to reach it. > >Would that be better? > That would result in a couple less LOC and a bit clearer conditionals, I > agree. > If there are no objections to make uaCert a quasi-alias of uaTrust with > clientcert=verify-full, I'll go ahead and change the code accordingly. > I'll make sure that the error messages are still handled based on the auth > method and not only depending on the clientcert flag. > > >In the "auth-cert" section there are already a couple of examples > >using lower case "cn": > > > > will be sent to the client. The cn (Common Name) > > > > is a check that the cn attribute matches the database > > > >I guess you should do the same, or if there is a good reason to use > >upper case "CN" instead then you should change those to match. > I see that there are currently no places that use CN > right now. > However, we refer to Certification Authority as CA in most cases (without > the literal tag surrounding it). > The different fields of certificates seem to be abbreviated with capital > letters in most literature; That was my reasoning for capitalizing it in the > first place. > I'm open for suggestions, but in absence of objections I might just > capitalize all occurrences of CN. > > >There is still a place in the documentation where we refer to "1": > > > > In a pg_hba.conf record specifying certificate > > authentication, the authentication option clientcert > > is > > assumed to be 1, and it cannot be turned off > >since a client > > certificate is necessary for this method. What the > > cert > > method adds to the basic clientcert certificate > >validity test > > is a check that the cn attribute matches the database > > user name. > > > >Maybe we should use "verify-ca" here, though as I noted above I think > >it should really "verify-full" and we should simplify the flow. > Yes, we should adopt the new style in all places. > I'll rewrite that passage to indicate that cert authentication essentially > results in the same behaviour as clientcert=verify-full. > The existing text is somewhat ambiguous anyway, in case somebody decides to > skip over the restriction described in the second sentence. > > >I think we should also document that "1" and "0" are older spellings > >still accepted, where the setting names are introduced. > > > >+typedef enum ClientCertMode > >+{ > >+ clientCertOff, > >+ clientCertOn, > >+ clientCertFull > >+} ClientCertMode; > >+ > +1 > >What do you think about using clientCertCA for the enumerator name > >instead of clientCertOn? That would correspond better to the names > >"verify-ca" and "verify-full". > > > +1 > I'm not sure if Magnus had any other cases in mind when he named it > clientCertOn? 
> > Should I mark this entry as "Needs review" in the commitfest? It seems some > discussion is still needed to get this commited... I've rebased the patch atop master so it applies and passes 'make check-world'. I didn't make any other changes. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate

>From 4dc30c3af69ced946b2ceab8c6b8e003d3583a00 Mon Sep 17 00:00:00 2001
From: David Fetter
Date: Thu, 2 Aug 2018 23:07:29 -0700
Subject: [PATCH] clientcert_verify_full v06
To: pgsql-hack...@postgresql.org

---
 doc/src/sgml/client-auth.sgml  | 15 +++---
 doc/src/sgml/runtime.sgml      | 51 +++---
 src/backend/libpq/auth.c       | 39 --
 src/backend/libpq/hba.c        | 28 ++-
 src/include/libpq/hba.h        |  9 +-
 src/test/ssl/ServerSetup.pm    | 10 ++-
 src/test/ssl/t/001_ssltests.pl | 21 ++
 7 files changed, 147 insertions(+), 26 deletions(-)
 mode change 100644 => 100755 src/test/ssl/t/001_ssltests.pl

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index c2114021c3..6b261aea92 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -563,10 +563,17 @@ hostnossl database user
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Fri, Aug 3, 2018 at 9:43 AM, Kyotaro HORIGUCHI wrote: > > A something-like-but-other-than-Var node? I'm not sure it is needed, > because whatever node we add to the relation-tlist, we must add > the correspondence to the relation descriptor. And if we do that, > a Var works to point to it. (Am I understanding correctly?) > The purpose of the non-Var node is to avoid adding the attribute to the relation descriptor. The idea is to create a new node which will act as a placeholder for the table oid or row id (whatever) to be fetched from the foreign server. I don't understand why you think it needs to be added to the relation descriptor. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
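(To make the placeholder idea concrete, a purely hypothetical sketch -- none of these names exist in PostgreSQL -- of a node that carries a remote row identifier through the plan without adding a matching attribute to the relation descriptor:)

/* Hypothetical node; the fields are stand-ins for NodeTag, Index and Oid. */
typedef struct ForeignRowIdExpr_stub
{
    int             type;       /* node tag stand-in */
    unsigned int    rtindex;    /* which foreign-table RTE the row id is for */
    unsigned int    idtypid;    /* data type OID of the row id, as a stand-in */
    int             idtypmod;   /* and its typmod */
} ForeignRowIdExpr_stub;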