Re: partition tree inspection functions

2018-10-19 Thread Michael Paquier
On Fri, Oct 19, 2018 at 01:05:52PM +0900, Amit Langote wrote:
> As I said above, the price of removing relhassubclass might be a bit
> steep.  So, the other alternative I mentioned before is to set
> relhassubclass correctly even for indexes if only for pg_partition_tree to
> be able to use find_inheritance_children unchanged.

Playing devil's advocate a bit more...  Another alternative here would
be to remove the fast path using relhassubclass from
find_inheritance_children and instead have its callers check for it :)

Anyway, it seems that you are right here.  Just setting relhassubclass
for partitioned indexes feels more natural with what's on HEAD now.
Even if I'd like to see all those hypothetical columns in pg_class go
away, that cannot happen without a close look at the performance
impact.
--
Michael


signature.asc
Description: PGP signature


Re: Protect syscache from bloating with negative cache entries

2018-10-19 Thread Kyotaro HORIGUCHI
Hello. Thank you for the comment.

At Thu, 4 Oct 2018 04:27:04 +, "Ideriha, Takeshi" 
 wrote in 
<4E72940DA2BF16479384A86D54D0988A6F1BCB6F@G01JPEXMBKW04>
> >As a *PoC*, in the attached patch (which applies to current master), size of 
> >CTups are
> >counted as the catcache size.
> >
> >It also provides pg_catcache_size system view just to give a rough idea of 
> >how such
> >view looks. I'll consider more on that but do you have any opinion on this?
> >
...
> Great! I like this view.
> One extreme idea would be adding all the members printed by
> CatCachePrintStats(), which is only enabled with -DCATCACHE_STATS at
> the moment.
> All of those members seem too much for customers who try to change the
> cache limit size, but some of them may be useful; for example, cc_hits
> would indicate that the current cache limit size is too small.

The attached patches introduce the four features below. (The features
on relcache and plancache are omitted.)

1. syscache stats collector (in 0002)

Records syscache status, consisting of the same columns as above plus
"ageclass" information. We could somehow trigger a stats report with a
signal, but we don't want to take/send/write the statistics in a
signal handler. Instead, tracking is turned on by setting
track_syscache_usage_interval to a positive number of milliseconds.

2. pg_stat_syscache view.  (in 0002)

This view shows catcache statistics. Statistics are collected only on
backends where syscache tracking is active.

>  pid  | application_name |relname |cache_name 
> |   size   |ageclass | nentries  
> --+--++---+--+-+---
>  9984 | psql | pg_statistic   | pg_statistic_relid_att_inh_index  
> | 12676096 | {30,60,600,1200,1800,0} | {17660,17310,55870,0,0,0}

The age class is the basis of the catcache truncation mechanism and shows
the distribution based on elapsed time since last access. As I
didn't come up with a more appropriate way, it is represented as two
arrays.  Ageclass stores the maximum age for each class in
seconds. Nentries holds the number of entries corresponding to the same
element in ageclass. In the above example,

 age class  : # of entries in the cache
   up to   30s  : 17660
   up to   60s  : 17310
   up to  600s  : 55870
   up to 1200s  : 0
   up to 1800s  : 0
   longer   : 0

 The ageclass boundaries are the {0, 0.05, 0.1, 1, 2, 3} multiples of
 cache_prune_min_age on the backend.

3. non-transactional GUC setting (in 0003)

This allows a GUC variable set with the action
GUC_ACTION_NONXACT (the name needs consideration) to survive beyond a
rollback. It is required for remote GUC setting to work
sanely. Without this feature, a value set remotely within a transaction
would disappear on rollback. The only local interface for
the NONXACT action is set_config(name, value, is_local=false,
is_nonxact=true). pg_set_backend_guc() below works on top of this
feature.

4. pg_set_backend_guc() function.

Of course, recording syscache statistics consumes a significant
amount of time, so it cannot usually be left turned on. On the other
hand, since the feature is controlled by a GUC, we would need access to
the active client connection to turn it on or off (which we don't
have). Instead, I provide a means to change GUC variables in
another backend.

pg_set_backend_guc(pid, name, value) sets the GUC variable "name"
on the backend "pid" to "value".



With the above tools, we can inspect the catcache statistics of a
seemingly bloated process.

A. Find a bloated process pid using ps or something.

B. Turn on syscache stats on the process.
  =# select pg_set_backend_guc(9984, 'track_syscache_usage_interval', '1');

C. Examine the statistics.

=# select pid, relname, cache_name, size from pg_stat_syscache order by size 
desc limit 3;
 pid  |   relname|cache_name|   size   
--+--+--+--
 9984 | pg_statistic | pg_statistic_relid_att_inh_index | 32154112
 9984 | pg_cast  | pg_cast_source_target_index  | 4096
 9984 | pg_operator  | pg_operator_oprname_l_r_n_index  | 4096


=# select * from pg_stat_syscache where cache_name = 
'pg_statistic_relid_att_inh_index'::regclass;
-[ RECORD 1 ]-
pid | 9984
relname | pg_statistic
cache_name  | pg_statistic_relid_att_inh_index
size| 11026176
ntuples | 77950
searches| 77950
hits| 0
neg_hits| 0
ageclass| {30,60,600,1200,1800,0}
nentries| {17630,16950,43370,0,0,0}
last_update | 2018-10-17 15:58:19.738164+09


> >> Another option is that users only specify the total memory target size
> >> and postgres dynamically change each CatCache memory target size according 
> >> to a
> >certain metric.
> >> (, which still seems difficult and expensive to develop per benefit)
> >> What do you think about this?

Problem about partitioned table

2018-10-19 Thread Mehman Jafarov
Hi everyone,

I have a problem with a partitioned table in PostgreSQL.
I am currently using version 10. I created the partitioned table in a test
environment but ran into some problems with partitioned table constraints.
I googled about it all last week, and from the official site I understood
that version 11 was about to be released and that this feature would be
supported.
From the version 11 documentation:
"*Add support for PRIMARY KEY, FOREIGN KEY, indexes, and triggers on
partitioned tables*"
I installed and configured version 11 yesterday, as soon as it was
released, and tested it. Unfortunately I still didn't succeed.
Either I don't understand the new feature, or this case is actually not
supported.
Please help me with this problem.

In my test environment the *CASE* is as follows (I used declarative
partitioning).

I had an *er_doc_to_user_relation* table. I partitioned this table by
list on the column *state*, and created two partitions as follows:

   CREATE TABLE xx.er_doc_to_user_state_1_3
     PARTITION OF xx.er_doc_to_user_relation
     (oid, created_date, state, status, updated_date, branch_oid,
      state_update_date, user_position, fk_action_owner,
      fk_action_owner_org, fk_document, fk_flow, fk_org,
      fk_prew_doc_user_rel, fk_user, fk_workflow, fk_action_login_type)
     FOR VALUES IN (1, 3);

   CREATE TABLE xx.er_doc_to_user_state_2_4_9
     PARTITION OF xx.er_doc_to_user_relation
     (oid, created_date, state, status, updated_date, branch_oid,
      state_update_date, user_position, fk_action_owner,
      fk_action_owner_org, fk_document, fk_flow, fk_org,
      fk_prew_doc_user_rel, fk_user, fk_workflow, fk_action_login_type)
     FOR VALUES IN (2, 4, 9);

After that I created constraints and indexes for each partition
manually. Everything is OK up to here. The problem appears when I try
to create a constraint on another table that references the
*er_doc_to_user_relation* table.
Case 1: Try to create a foreign key constraint referencing the parent
table *er_doc_to_user_relation*:

   ALTER TABLE xx.er_doc_workflow_action
     ADD CONSTRAINT fk_doc_work_act FOREIGN KEY (fk_to_user_doc_rel)
     REFERENCES xx.er_doc_to_user_relation (oid) MATCH SIMPLE
     ON UPDATE NO ACTION
     ON DELETE NO ACTION;

The following error occurred:

   ERROR: cannot reference partitioned table "er_doc_to_user_relation"
   SQL state: 42809

Because that is not supported, I tried the second case as follows.

Case 2: Try to create a foreign key constraint referencing each
partition separately (*er_doc_to_user_state_1_3*,
*er_doc_to_user_state_2_4_9*):

   ALTER TABLE xx.er_doc_workflow_action
     ADD CONSTRAINT fk_doc_work_act_1_3 FOREIGN KEY (fk_to_user_doc_rel)
     REFERENCES xx.er_doc_to_user_state_1_3 (oid) MATCH SIMPLE
     ON UPDATE NO ACTION
     ON DELETE NO ACTION;

The following error occurred:

   ERROR: insert or update on table "er_doc_workflow_action" violates
   foreign key constraint "fk_doc_work_act_1_3"
   DETAIL: Key (fk_to_user_doc_rel)=(3hjbzok1mn100g) is not present in
   table "er_doc_to_user_state_1_3".
   SQL state: 23503

I think this error is expected, because the referenced oid is in the
other partition, so the constraint can't validate all the data.
If I try to create the foreign key constraint on the second partition
instead, the same error occurs for the same reason.

  Note: I want to create the constraint on only a single column pair
(*fk_to_user_doc_rel - oid*).

BIG QUESTION IS THAT

How can I solve this problem?  What are your recommendations?

*PLEASE HELP ME !!!*

-- 
*Best Regards,*
*Mehman Jafarov*
*DBA Administrator at CyberNet LLC*


Re: Optimze usage of immutable functions as relation

2018-10-19 Thread Anthony Bykov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hello, 
I was trying to review your patch, but I couldn't build it:

prepjointree.c: In function ‘pull_up_simple_function’:
prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this 
function); did you mean ‘PGFunction’?
  Assert(!contain_vars_of_level((Node *) functions, 0));

Was it a typo?

The new status of this patch is: Waiting on Author


Re: relhassubclass and partitioned indexes

2018-10-19 Thread Amit Langote
Thanks for commenting.

On 2018/10/19 15:17, Michael Paquier wrote:
> On Fri, Oct 19, 2018 at 01:45:03AM -0400, Tom Lane wrote:
>> Amit Langote  writes:
>>> Should relhassubclass be set/reset for partitioned indexes?
>>
>> Seems like a reasonable idea to me, at least the "set" end of it.
>> We don't ever clear relhassubclass for tables, so maybe that's
>> not necessary for indexes either.

We *do* reset it opportunistically during analyze of an inheritance
parent; the following code in acquire_inherited_sample_rows() can reset it:

 /*
  * Check that there's at least one descendant, else fail.  This could
  * happen despite analyze_rel's relhassubclass check, if table once had a
  * child but no longer does.  In that case, we can clear the
  * relhassubclass field so as not to make the same mistake again later.
  * (This is safe because we hold ShareUpdateExclusiveLock.)
  */
 if (list_length(tableOIDs) < 2)
 {
 /* CCI because we already updated the pg_class row in this command */
 CommandCounterIncrement();
 SetRelationHasSubclass(RelationGetRelid(onerel), false);
 ereport(elevel,
 (errmsg("skipping analyze of \"%s.%s\" inheritance tree ---
this inheritance tree contains no child tables",
 get_namespace_name(RelationGetNamespace(onerel)),
RelationGetRelationName(onerel))));
 return 0;
 }


Similarly, perhaps we can make REINDEX on a partitioned index reset the
flag if it doesn't find any children.  We won't be able to do that today
though, because:

static void
ReindexPartitionedIndex(Relation parentIdx)
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("REINDEX is not yet implemented for partitioned
indexes")));
}

> No objections to the proposal.  Allowing find_inheritance_children to
> find index trees for partitioned indexes could be actually useful for
> extensions like pg_partman.

Thanks.  Attached is a patch to set relhassubclass when an index partition is
added to a partitioned index.

Regards,
Amit
>From f0c01ab41b35a5f21a90b0294d8216da78eb8882 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 19 Oct 2018 17:05:00 +0900
Subject: [PATCH 1/2] Set relhassubclass on index parents

---
 src/backend/catalog/index.c|  5 
 src/backend/commands/indexcmds.c   |  4 +++
 src/test/regress/expected/indexing.out | 46 +-
 src/test/regress/sql/indexing.sql  | 11 ++--
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f1ef4c319a..62cc6a5bb9 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -996,8 +996,13 @@ index_create(Relation heapRelation,
 
/* update pg_inherits, if needed */
if (OidIsValid(parentIndexRelid))
+   {
StoreSingleInheritance(indexRelationId, parentIndexRelid, 1);
 
+   /* Also, set the parent's relhassubclass. */
+   SetRelationHasSubclass(parentIndexRelid, true);
+   }
+
/*
 * Register constraint and dependencies for the index.
 *
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3975f62c00..c392352871 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2608,6 +2608,10 @@ IndexSetParentIndex(Relation partitionIdx, Oid parentOid)
systable_endscan(scan);
relation_close(pg_inherits, RowExclusiveLock);
 
+   /* If we added an index partition to parent, set its relhassubclass. */
+   if (OidIsValid(parentOid))
+   SetRelationHasSubclass(parentOid, true);
+
if (fix_dependencies)
{
ObjectAddress partIdx;
diff --git a/src/test/regress/expected/indexing.out 
b/src/test/regress/expected/indexing.out
index 225f4e9527..710b32192f 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1,25 +1,35 @@
 -- Creating an index on a partitioned table makes the partitions
 -- automatically get the index
 create table idxpart (a int, b int, c text) partition by range (a);
+-- relhassubclass of a partitioned index is false before creating its partition
+-- it will be set below once partitions get created
+create index check_relhassubclass_of_this on idxpart (a);
+select relhassubclass from pg_class where relname = 
'check_relhassubclass_of_this';
+ relhassubclass 
+
+ f
+(1 row)
+
+drop index check_relhassubclass_of_this;
 create table idxpart1 partition of idxpart for values from (0) to (10);
 create table idxpart2 partition of idxpart for values from (10) to (100)
partition by range (b);
 create table idxpart21 partition of idxpart2 for values from (0) to (100);
 create index on idxpart (a);
-select relname, relkind, inhparent::regclass
+select relname, relkind, relhassubclass, inhparent::regclass
 from pg_class left join pg_index ix on (indexrelid = 

Re: partition tree inspection functions

2018-10-19 Thread Amit Langote
On 2018/10/19 16:47, Michael Paquier wrote:
> On Fri, Oct 19, 2018 at 01:05:52PM +0900, Amit Langote wrote:
>> As I said above, the price of removing relhassubclass might be a bit
>> steep.  So, the other alternative I mentioned before is to set
>> relhassubclass correctly even for indexes if only for pg_partition_tree to
>> be able to use find_inheritance_children unchanged.
> 
> Playing devil's advocate a bit more...  Another alternative here would
> be to remove the fast path using relhassubclass from
> find_inheritance_children and instead have its callers check for it :)

Yeah, we could make it the responsibility of the callers of
find_all_inheritors and find_inheritance_children to check relhassubclass
as an optimization and remove any reference to relhassubclass from
pg_inherits.c.  Although we can write such a patch, it seems like it'd be
bigger than the patch to ensure the correct value of relhassubclass for
indexes, which I just posted on the other thread [1].

> Anyway, it seems that you are right here.  Just setting relhassubclass
> for partitioned indexes feels more natural with what's on HEAD now.
> Even if I'd like to see all those hypothetical columns in pg_class go
> away, that cannot happen without a close look at the performance
> impact.

Okay, I updated the patch on this thread.

Since the updated patch depends on the correct value of relhassubclass
being set for indexes, this patch should be applied on top of the other
patch.  I've attached here both.

Another change I made is something Robert and Alvaro seem to agree about
-- to use regclass instead of oid type as input/output columns.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/85d50b48-1b59-ae6c-e984-dd0b2926be3c%40lab.ntt.co.jp
>From f0c01ab41b35a5f21a90b0294d8216da78eb8882 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 19 Oct 2018 17:05:00 +0900
Subject: [PATCH 1/2] Set relhassubclass on index parents

---
 src/backend/catalog/index.c|  5 
 src/backend/commands/indexcmds.c   |  4 +++
 src/test/regress/expected/indexing.out | 46 +-
 src/test/regress/sql/indexing.sql  | 11 ++--
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f1ef4c319a..62cc6a5bb9 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -996,8 +996,13 @@ index_create(Relation heapRelation,
 
/* update pg_inherits, if needed */
if (OidIsValid(parentIndexRelid))
+   {
StoreSingleInheritance(indexRelationId, parentIndexRelid, 1);
 
+   /* Also, set the parent's relhassubclass. */
+   SetRelationHasSubclass(parentIndexRelid, true);
+   }
+
/*
 * Register constraint and dependencies for the index.
 *
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3975f62c00..c392352871 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2608,6 +2608,10 @@ IndexSetParentIndex(Relation partitionIdx, Oid parentOid)
systable_endscan(scan);
relation_close(pg_inherits, RowExclusiveLock);
 
+   /* If we added an index partition to parent, set its relhassubclass. */
+   if (OidIsValid(parentOid))
+   SetRelationHasSubclass(parentOid, true);
+
if (fix_dependencies)
{
ObjectAddress partIdx;
diff --git a/src/test/regress/expected/indexing.out 
b/src/test/regress/expected/indexing.out
index 225f4e9527..710b32192f 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1,25 +1,35 @@
 -- Creating an index on a partitioned table makes the partitions
 -- automatically get the index
 create table idxpart (a int, b int, c text) partition by range (a);
+-- relhassubclass of a partitioned index is false before creating its partition
+-- it will be set below once partitions get created
+create index check_relhassubclass_of_this on idxpart (a);
+select relhassubclass from pg_class where relname = 
'check_relhassubclass_of_this';
+ relhassubclass 
+
+ f
+(1 row)
+
+drop index check_relhassubclass_of_this;
 create table idxpart1 partition of idxpart for values from (0) to (10);
 create table idxpart2 partition of idxpart for values from (10) to (100)
partition by range (b);
 create table idxpart21 partition of idxpart2 for values from (0) to (100);
 create index on idxpart (a);
-select relname, relkind, inhparent::regclass
+select relname, relkind, relhassubclass, inhparent::regclass
 from pg_class left join pg_index ix on (indexrelid = oid)
left join pg_inherits on (ix.indexrelid = inhrelid)
where relname like 'idxpart%' order by relname;
- relname | relkind |   inhparent
--+-+
- idxpart | p   | 
- idxpart1| r   | 
- idxpart1_a_idx  | i   

Re: Function to promote standby servers

2018-10-19 Thread Laurenz Albe
Michael Paquier wrote:
> +   /* wait for up to a minute for promotion */
> +   for (i = 0; i < WAITS_PER_SECOND * WAIT_SECONDS; ++i)
> +   {
> +   if (!RecoveryInProgress())
> +   PG_RETURN_BOOL(true);
> +
> +   pg_usleep(1000000L / WAITS_PER_SECOND);
> +   }
> I would recommend to avoid pg_usleep and instead use a WaitLatch() or
> similar to generate a wait event.  The wait can then also be seen in
> pg_stat_activity, which is useful for monitoring purposes.  Using
> RecoveryInProgress is indeed doable, and that's more simple than what I
> thought first.

Agreed, done.

I have introduced a new wait event, because I couldn't find one that fit.

> Something I missed to mention in the previous review: the timeout should
> be manually enforceable, with a default at 60s.

Ok, added as a new parameter "wait_seconds".

> Is the function marked as strict?  Per the code it should be, I am not
> able to test now though.

Yes, it is.

> You are missing REVOKE EXECUTE ON FUNCTION pg_promote() in
> system_views.sql, or any users could trigger a promotion, no?

You are right *blush*.
Fixed.

Yours,
Laurenz Albe
From 08951fea7c526450d9a632ef0e6e246dd9dba307 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 19 Oct 2018 13:24:29 +0200
Subject: [PATCH] Add pg_promote() to promote standby servers

---
 doc/src/sgml/func.sgml | 21 ++
 doc/src/sgml/high-availability.sgml|  2 +-
 doc/src/sgml/recovery-config.sgml  |  3 +-
 src/backend/access/transam/xlog.c  |  6 --
 src/backend/access/transam/xlogfuncs.c | 82 ++
 src/backend/catalog/system_views.sql   |  8 +++
 src/backend/postmaster/pgstat.c|  3 +
 src/include/access/xlog.h  |  6 ++
 src/include/catalog/pg_proc.dat|  4 ++
 src/include/pgstat.h   |  3 +-
 src/test/recovery/t/004_timeline_switch.pl |  6 +-
 src/test/recovery/t/009_twophase.pl|  6 +-
 12 files changed, 138 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5193df3366..88121cdc66 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18731,6 +18731,9 @@ SELECT set_config('log_statement_stats', 'off', false);

 pg_terminate_backend

+   
+pg_promote
+   
 

 signal
@@ -18790,6 +18793,16 @@ SELECT set_config('log_statement_stats', 'off', false);
 however only superusers can terminate superuser backends.

   
+  
+   
+pg_promote(wait boolean DEFAULT true, wait_seconds integer DEFAULT 60)
+
+   boolean
+   Promote a physical standby server.  This function is restricted to
+superusers by default, but other users can be granted EXECUTE to run
+the function.
+   
+  
  
 

@@ -18827,6 +18840,14 @@ SELECT set_config('log_statement_stats', 'off', false);
 subprocess.

 
+   
+pg_promote can only be called on standby servers.
+If the argument wait is true,
+the function waits until promotion is complete or wait_seconds
+seconds have passed, otherwise the function returns immediately after sending
+the promotion signal to the postmaster.
+   
+
   
 
   
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index ebcb3daaed..f8e036965c 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1472,7 +1472,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 

 To trigger failover of a log-shipping standby server,
-run pg_ctl promote or create a trigger
+run pg_ctl promote, call pg_promote(), or create a trigger
 file with the file name and path specified by the trigger_file
 setting in recovery.conf. If you're planning to use
 pg_ctl promote to fail over, trigger_file is
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 92825fdf19..d06cd0b08e 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -439,7 +439,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
  
   Specifies a trigger file whose presence ends recovery in the
   standby.  Even if this value is not set, you can still promote
-  the standby using pg_ctl promote.
+  the standby using pg_ctl promote or calling
+  pg_promote().
   This setting has no effect if standby_mode is off.
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7375a78ffc..62fc418893 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -78,12 +78,6 @@
 
 extern uint32 bootstrap_data_checksum_version;
 
-/* File path names (all relative to $PGDATA) */
-#define RECOVERY_COMMAND_FILE	"recovery.conf"
-#define RECOVERY_COMMAND_DONE	"recovery.done"
-#d

Speeding up text_position_next with multibyte encodings

2018-10-19 Thread Heikki Linnakangas
Attached is a patch to speed up text_position_setup/next(), in some 
common cases with multibyte encodings.


text_position_next() uses the Boyer-Moore-Horspool search algorithm, 
with a skip table. Currently, with a multi-byte encoding, we first 
convert both input strings to arrays of wchars, and run the algorithm on 
those arrays.


This patch removes the mb->wchar conversion, and instead runs the 
single-byte version of the algorithm directly on the inputs, even with 
multi-byte encodings. That avoids the mb->wchar conversion altogether, 
when there is no match. Even when there is a match, we don't need to 
convert the whole input string to wchars. It's enough to count the 
characters up to the match, using pg_mblen(). And when 
text_position_setup/next() are used as part of split_part() or replace() 
functions, we're not actually even interested in the character position, 
so we can skip that too.


Avoiding the large allocation is nice, too. That was actually why I 
started to look into this: a customer complained that text_position fails 
with strings larger than 256 MB.


Using the byte-oriented search on multibyte strings might return false 
positives, though. To make that work, after finding a match, we verify 
that the match doesn't fall in the middle of a multi-byte character. 
However, as an important special case, that cannot happen with UTF-8, 
because it has the property that the byte sequence of one character 
never appears as part of another character. I think some other encodings 
might have the same property, but I wasn't sure.


This is a win in most cases. One case is slower: calling position() with 
a large haystack input, where the match is near the end of the string. 
Counting the characters up to that point is slower than the mb->wchar 
conversion. I think we could avoid that regression too, if we had a more 
optimized function to count characters. Currently, the code calls 
pg_mblen() in a loop, whereas in pg_mb2wchar_with_len(), the similar 
loop through all the characters is a tight loop within the function.


Thoughts?

- Heikki
>From 1efb5ace7cf9da63f300942f15f9da2fddfb4de5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Fri, 19 Oct 2018 15:12:39 +0300
Subject: [PATCH 1/1] Use single-byte Boyer-Moore-Horspool search even with
 multibyte encodings.

The old implementation first converted the input strings to arrays of
wchars, and performed the search on those. However, the conversion is expensive,
and for a large input string, consumes a lot of memory.

Avoid the conversion, and instead use the single-byte algorithm even with
multibyte encodings. That has a couple of problems, though. Firstly, we
might get fooled if the needle string's byte sequence appears embedded
inside a different string. We might incorrectly return a match in the
middle of a multi-byte character. The second problem is that the
text_position function needs to return the position of the match, counted
in characters, rather than bytes. We can work around both of those problems
by an extra step after finding a match. Walk the string one character at a
time, starting from the beginning, until we reach the position of the match.
We can then check if the match was found at a valid character boundary,
which solves the first problem, and we can count the characters, so that we
can return the character offset. Walking the characters is faster than the
wchar conversion, especially in the case that there are no matches and we
can avoid it altogether. Also, avoiding the large allocation allows these
functions to work with inputs larger than 256 MB, even with multibyte
encodings.

For UTF-8, we can even skip the step to verify that the match lands on a
character boundary, because in UTF-8, the byte sequence of one character
cannot appear in the middle of a different character. I think many other
encodings have the same property, but I wasn't sure, so I only enabled
that optimization for UTF-8.

Most of the callers of the text_position_setup/next functions were actually
not interested in the position of the match, counted in characters. To
cater for them, refactor the text_position_next() interface into two parts:
searching for the next match (text_position_next()), and returning the
current match's position as a pointer (text_position_get_match_ptr()) or
as a character offset (text_position_get_match_pos()). Most callers of
text_position_setup/next are not interested in the character offsets, so
getting the pointer to the match within the original string is a more
convenient API for them. It also allows skipping the character counting
with UTF-8 encoding altogether.
---
 src/backend/utils/adt/varlena.c | 475 +---
 1 file changed, 253 insertions(+), 222 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a5e812d026..0c8c8b9a4f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -43,18 +43,33 @@ int			byt

ERROR's turning FATAL in BRIN regression tests

2018-10-19 Thread John Naylor
Hi all,

I ran into a surprising behavior while hacking on the FSM delay patch.

I changed the signature of a freespace.c function that the BRIN code
calls, and this change by itself doesn't cause a crash. With the full
FSM patch, causing BRIN errors in manual queries in psql doesn't cause
a crash. However, during the BRIN regression tests, the queries that
purposely cause errors result in FATAL instead, causing a crash.

For example, brin_summarize_new_values() eventually leads to calling
index_open():

ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 errmsg("\"%s\" is not an index",
RelationGetRelationName(r))));


But the elevel is 21 (FATAL) in this partial stack trace:

#0  0x5623aff8aed6 in proc_exit_prepare (code=1) at
/home/john/pgdev/postgresql/src/backend/storage/ipc/ipc.c:209
__func__ = "proc_exit_prepare"
#1  0x5623aff8adbb in proc_exit (code=1) at
/home/john/pgdev/postgresql/src/backend/storage/ipc/ipc.c:107
__func__ = "proc_exit"
#2  0x5623b01435c3 in errfinish (dummy=0) at
/home/john/pgdev/postgresql/src/backend/utils/error/elog.c:540
edata = 0x5623b06f63c0 
elevel = 21
oldcontext = 0x5623b2121b10
econtext = 0x0
__func__ = "errfinish"
#3  0x5623afbb9ab1 in index_open (relationId=52389, lockmode=4) at
/home/john/pgdev/postgresql/src/backend/access/index/indexam.c:159
__errno_location = 
r = 0x7f03773e1750
__func__ = "index_open"


If I comment out the tests that purposely cause errors, the BRIN tests
fail normally, but then another test in the same parallel group
crashes, causing all later tests to fail.

--- 1 
! psql: FATAL:  the database system is in recovery mode

and

*** 65,67 
--- 65,74 
  -- Modify fillfactor in existing index
  alter index spgist_point_idx set (fillfactor = 90);
  reindex index spgist_point_idx;
+ WARNING:  terminating connection because of crash of another server process
+ DETAIL:  The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
+ HINT:  In a moment you should be able to reconnect to the database
and repeat your command.
+ server closed the connection unexpectedly
+   This probably means the server terminated abnormally
+   before or while processing the request.
+ connection to server was lost


I'm thoroughly stumped -- anyone have an idea where to look next?

Thanks,
-John Naylor



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> Fine by me.

Thanks.  This is now committed after some tweaks to the comments, a bit
earlier than I thought first.
--
Michael


signature.asc
Description: PGP signature


Re: lowering pg_regress privileges on Windows

2018-10-19 Thread Andrew Dunstan



On 10/18/2018 08:25 PM, Thomas Munro wrote:

On Fri, Oct 19, 2018 at 1:13 PM Michael Paquier  wrote:

On Thu, Oct 18, 2018 at 08:31:11AM -0400, Andrew Dunstan wrote:

The attached ridiculously tiny patch solves the problem whereby while we can
run Postgres on Windows safely from an Administrator account, we can't
run the regression tests from the same account, since it fails on the
tablespace test, the tablespace directory having been set up without first
having lowered privileges. The solution is to lower pg_regress' privileges
in the same way that we do with other binaries. This is useful in setups
like Appveyor where running under any other account is ... difficult. For
the cfbot Thomas has had to make the script hack the schedule file to omit
the tablespace test. This would make that redundant.

I propose to backpatch this. It's close enough to a bug and the risk is
almost infinitely small.

+1.  get_restricted_token() refactoring has been done down to
REL9_5_STABLE.  With 9.4 and older you would need to copy again this
full routine into pg_regress.c, which is in my opinion not worth
worrying about.

FWIW here is a successful Appveyor build including the full test
schedule (CI patch attached in case anyone is interested).  Woohoo!
Thanks for figuring that out Andrew.  I will be very happy to remove
that wart from my workflows.

https://ci.appveyor.com/project/macdice/postgres/builds/19626669



Excellent. I'll apply back to 9.5 as Michael suggests.

Having got past that hurdle I encountered another one in the same area. 
pg_upgrade gives up its privileges and is then unable to write things 
like log files and analyze scripts.


The attached patch cures the problem, but it doesn't seem like the best 
cure. Maybe there is a more secure way to do it. Essentially it saves 
out the ACLs for the current directory and its subdirectories and then 
allows everyone to write to them, right before running pg_upgrade. When 
pg_upgrade is done it restores the saved ACLs.


Maybe someone who understands more about how this all works can suggest 
a less blunt force approach.


cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index ce5c976c16..0f25b44d0a 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -569,11 +569,14 @@ sub upgradecheck
 	$ENV{PGDATA} = "$data";
 	print "\nSetting up new cluster\n\n";
 	standard_initdb() or exit 1;
+	system('icacls . /save savedacls /t');
+	system('icacls . /grant "*S-1-1-0":(OI)(CI)M');
 	print "\nRunning pg_upgrade\n\n";
 	@args = (
 		'pg_upgrade', '-d', "$data.old", '-D', $data, '-b',
 		$bindir,  '-B', $bindir);
 	system(@args) == 0 or exit 1;
+	system('icacls . /restore savedacls');
 	print "\nStarting new cluster\n\n";
 	@args = ('pg_ctl', '-l', "$logdir/postmaster2.log", 'start');
 	system(@args) == 0 or exit 1;


WAL archive (archive_mode = always) ?

2018-10-19 Thread Adelino Silva
Hi,

What is the advantage of using archive_mode = always on a slave server
compared to archive_mode = on (shared WAL archive)?
I only see duplication of WAL files; what is the purpose of this feature?

Many thanks in advance,
Adelino.


Re: removing unnecessary get_att*() lsyscache functions

2018-10-19 Thread Michael Paquier
On Thu, Oct 18, 2018 at 09:57:00PM +0200, Peter Eisentraut wrote:
> I noticed that get_attidentity() isn't really necessary because the
> information can be obtained from an existing tuple descriptor in each
> case.

This one is also recent, so it looks fine to remove it.

> Also, get_atttypmod() hasn't been used since 2004.

GitHub is not actually reporting any areas where this is used either.

> I propose the attached patches to remove these two functions.

> - if (get_attidentity(RelationGetRelid(rel), attnum))
> + if (TupleDescAttr(RelationGetDescr(rel), attnum - 1)->attidentity)

I find this style heavy; saving the Form_pg_attribute into a separate
variable would be more readable in my opinion.
--
Michael


signature.asc
Description: PGP signature


Re: WAL archive (archive_mode = always) ?

2018-10-19 Thread Michael Paquier
On Fri, Oct 19, 2018 at 03:00:15PM +0100, Adelino Silva wrote:
> What is the advantage of using archive_mode = always on a slave server
> compared to archive_mode = on (shared WAL archive)?
> I only see duplication of WAL files; what is the purpose of this
> feature?

Some users like having redundancy in their backups and archives, so that
everything is stored in multiple locations.  archive_mode = always helps
in meeting these needs.
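
For illustration, with archive_mode = always a standby runs its own
archive_command on every WAL segment it receives, so it can keep an
independent archive.  A minimal sketch of such a configuration (the
archive path is illustrative):

```
# postgresql.conf on the standby -- /mnt/standby_archive is an example path
archive_mode = always
archive_command = 'test ! -f /mnt/standby_archive/%f && cp %p /mnt/standby_archive/%f'
```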
--
Michael


signature.asc
Description: PGP signature


Re: ERROR's turning FATAL in BRIN regression tests

2018-10-19 Thread Tom Lane
John Naylor  writes:
> I changed the signature of a freespace.c function that the BRIN code
> calls, and this change by itself doesn't cause a crash. With the full
> FSM patch, causing BRIN errors in manual queries in psql doesn't cause
> a crash. However, during the BRIN regression tests, the queries that
> purposely cause errors result in FATAL instead, causing a crash.

Sounds like something's messing up the backend's exception stack,
so that elog.c has noplace to throw the error to.  See the
promote-ERROR-to-FATAL logic therein.

regards, tom lane



Re: Problem about partitioned table

2018-10-19 Thread Adrian Klaver

On 10/19/18 2:03 AM, Mehman Jafarov wrote:

Hi everyone,

I have a problem with a partitioned table in PostgreSQL.
Actually I use version 10. I created the partitioned table in a test 
environment but faced some problems with a partitioned table constraint.
I googled it last week, and from the official site I learned that 
version 11 would be released and that this feature would be supported, as 
I understood it.

 From the version 11 documentation:
"Add support for PRIMARY KEY, FOREIGN KEY, indexes, and triggers on 
partitioned tables"
I installed and configured it yesterday, as the new version 11 was just 
released, and tested it. Unfortunately I did not succeed.
Either I don't understand the new feature, or this case is actually not 
supported.

Please help me about the problem.


As you found out:

https://www.postgresql.org/docs/11/static/ddl-partitioning.html

5.10.2.3. Limitations

"While primary keys are supported on partitioned tables, foreign keys 
referencing partitioned tables are not supported. (Foreign key 
references from a partitioned table to some other table are supported.)"





   Note: I want to create the constraint on only a one-to-one column 
(fk_to_user_doc_rel - oid)


BIG QUESTION IS THAT

How can I solve this problem?  What are your recommendations?


Well a FK is a form of a trigger, so maybe create your own trigger on 
the child table(s).
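
For example, a rough sketch of such a trigger (all table and column names
here are hypothetical, and this only covers the referencing side -- deletes
on the partitioned table would need their own trigger, and it lacks the
locking guarantees of a real FK):

```sql
-- Emulate the FK check with a constraint trigger, since foreign keys
-- referencing partitioned tables are not supported.
CREATE FUNCTION check_user_doc_fk() RETURNS trigger AS $$
BEGIN
    PERFORM 1 FROM user_doc WHERE id = NEW.fk_to_user_doc_rel;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'no row with id % in user_doc',
            NEW.fk_to_user_doc_rel
            USING ERRCODE = 'foreign_key_violation';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER user_doc_fk
    AFTER INSERT OR UPDATE ON child_table
    FOR EACH ROW EXECUTE PROCEDURE check_user_doc_fk();
```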




PLEASE HELP ME!!!

--
Best Regards,
Mehman Jafarov
DBA Administrator at CyberNet LLC



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> > Fine by me.
> 
> Thanks.  This is now committed after some tweaks to the comments, a bit
> earlier than I thought first.

I just saw this committed and I'm trying to figure out why we are
creating yet-another-list when it comes to deciding what should be
checksum'd and what shouldn't be.

Specifically, pg_basebackup (or, really,
src/backend/replication/basebackup.c) has 'is_checksummed_file' which
operates differently from pg_verify_checksum with this change, and that
seems rather wrong.

Maybe we need to fix both places but I *really* don't like this approach
of "well, we'll just guess if the file should have a checksum or not"
and it certainly seems likely that we'll end up forgetting to update
this when we introduce things in the future which have checksums (which
seems pretty darn likely to happen).

I also categorically disagree with the notion that it's ok for
extensions to dump files into our tablespace directories or that we
should try to work around random code dumping extra files in the
directories which we maintain- it's not like this additional code will
actually protect us from that, after all, and it's foolish to think we
really could protect against that.

Basically, I think this commit should be reverted; we should go back to
having a blacklist, update it in both places (and add comments to both
places to make it clear that this list exists in two different places),
and add code to handle the temp tablespace case explicitly.

Even better would be to find a way to expose the list and the code for
walking the directories and identifying the files which contain
checksums, so that we're only doing that in one place instead of
multiple different places.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Banck
Hi,

Am Freitag, den 19.10.2018, 10:36 -0400 schrieb Stephen Frost:
> Greetings,
> 
> * Michael Paquier (mich...@paquier.xyz) wrote:
> > On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> > > Fine by me.
> > 
> > Thanks.  This is now committed after some tweaks to the comments, a bit
> > earlier than I thought first.
> 
> I just saw this committed and I'm trying to figure out why we are
> creating yet-another-list when it comes to deciding what should be
> checksum'd and what shouldn't be.
> 
> Specifically, pg_basebackup (or, really,
> src/backend/replication/basebackup.c) has 'is_checksummed_file' which
> operates differently from pg_verify_checksum with this change, and that
> seems rather wrong.

To be fair, the list in src/backend/replication/basebackup.c was a copy-
paste from the one in pg_verify_checksums (or from other parts of the 
online activation patch).

I agree it makes sense to have both in sync or, better yet, factored out
in a central place, but I don't currently have further opinions on
whether it should be a black- or whitelist.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Tom Lane
Michael Banck  writes:
> Am Freitag, den 19.10.2018, 10:36 -0400 schrieb Stephen Frost:
>> I just saw this committed and I'm trying to figure out why we are
>> creating yet-another-list when it comes to deciding what should be
>> checksum'd and what shouldn't be.

> To be fair, the list in src/backend/replication/basebackup.c was a copy-
> paste from the one in pg_verify_checksums (or from other parts of the 
> online activation patch).

Seems like a job for a new(?) module in src/common/.

regards, tom lane



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Banck  writes:
> > Am Freitag, den 19.10.2018, 10:36 -0400 schrieb Stephen Frost:
> >> I just saw this committed and I'm trying to figure out why we are
> >> creating yet-another-list when it comes to deciding what should be
> >> checksum'd and what shouldn't be.
> 
> > To be fair, the list in src/backend/replication/basebackup.c was a copy-
> > paste from the one in pg_verify_checksums (or from other parts of the 
> > online activation patch).
> 
> Seems like a job for a new(?) module in src/common/.

Yeah, that would be nice...  Even nicer would be a way for non-core PG
tools to be able to use it (we have basically the same thing in
pgbackrest, unsurprisingly), though I realize that's a much larger step.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Seems like a job for a new(?) module in src/common/.

> Yeah, that would be nice...  Even nicer would be a way for non-core PG
> tools to be able to use it (we have basically the same thing in
> pgbackrest, unsurprisingly), though I realize that's a much larger step.

We install libpgcommon.a (and, as of HEAD, libpgcommon_shlib.a) for
precisely that sort of reason.

regards, tom lane



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> Seems like a job for a new(?) module in src/common/.
> 
> > Yeah, that would be nice...  Even nicer would be a way for non-core PG
> > tools to be able to use it (we have basically the same thing in
> > pgbackrest, unsurprisingly), though I realize that's a much larger step.
> 
> We install libpgcommon.a (and, as of HEAD, libpgcommon_shlib.a) for
> precisely that sort of reason.

Hmm, yeah, we might be able to use that for this..  One challenge we've
been thinking about has been dealing with multiple major versions of PG
with one build of pgbackrest since we'd really like to do things like
inspect the control file, but that changes between versions.  We do have
multi-version code in some places (quite a bit in pg_dump..) so perhaps
we could have libpgcommon support also.

Probably a topic for another thread.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-19 10:36:59 -0400, Stephen Frost wrote:
> * Michael Paquier (mich...@paquier.xyz) wrote:
> > On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> > > Fine by me.
> > 
> > Thanks.  This is now committed after some tweaks to the comments, a bit
> > earlier than I thought first.
> 
> I just saw this committed and I'm trying to figure out why we are
> creating yet-another-list when it comes to deciding what should be
> checksum'd and what shouldn't be.

I'm not sure it's fair to blame Michael here. He's picking up slack,
because the original authors of the tool didn't even bother to reply to
the issue for quite a while. pg_verify_checksum was obviously never
tested on Windows; it just didn't have tests showing that.


> I also categorically disagree with the notion that it's ok for
> extensions to dump files into our tablespace directories or that we
> should try to work around random code dumping extra files in the
> directories which we maintain- it's not like this additional code will
> actually protect us from that, after all, and it's foolish to think we
> really could protect against that.

I'm unconvinced. There already are extensions doing so, and it's not
like we've given them any sort of reasonable alternative. You can't just
create relations in a "registered" / "supported" kinda way right now, so
uh, yea, extension gotta make do.  And that has worked for many years.

Also, as made obvious here, it's pretty clear that there's platform
differences about which files exists and which don't, so it's not that a
blacklist approach automatically is more maintainable.

And I fail to see why a blacklist is architecturally better. There's
plenty of cases where we might want to create temporary files,
non-checksummed data or such that we'd need to teach a blacklist about,
but there's far fewer cases where we add new checksummed files. Basically
never since checksums have been introduced. And if checksums were
introduced for something new, say SLRUs, we'd certainly use
pg_verify_checksum during development.

Greetings,

Andres Freund



[Patch] pg_rewind: options to use restore_command from recovery.conf or command line

2018-10-19 Thread Alexey Kondratov

Hi hackers,

Currently Postgres has options for continuous WAL file archiving, which 
is quite often used along with a master-replica setup. OK, then the worst 
happens and it's time to get your old master back and synchronize it 
with the new master (ex-replica) using pg_rewind. However, the required 
WAL files may already be archived, and pg_rewind will fail. You can copy 
these files manually, but it is difficult to work out which ones you need. 
Anyway, it complicates building a failover system with automatic failure 
recovery.


I expect that it would be a good idea to allow pg_rewind to look for a 
restore_command in the target data directory's recovery.conf, or to pass 
it as a command line argument. Then pg_rewind can use it to get missing 
WAL files from the archive. I had a few talks with DBAs and came to the 
conclusion that this is a highly requested feature.


I prepared a proof of concept patch (please, find attached), which does 
exactly what I described above. I played with it a little and it seems 
to be working, tests were accordingly updated to verify this archive 
retrieval functionality too.


Patch is relatively simple except for one part: if we want to parse 
recovery.conf (with all possible includes, etc.) and get the 
restore_command, then we should use the guc-file.l parser, which is 
heavily linked to the backend, e.g. in the error reporting part. So I 
copied it and made a frontend-safe version, guc-file-fe.l. Personally, I 
don't think it's a good idea, but nothing else came to mind. It is also 
possible to leave only one option -- passing restore_command as a command 
line argument.


What do you think?


--

Alexey Kondratov

Postgres Professional: https://www.postgrespro.com

Russian Postgres Company

diff --combined src/bin/pg_rewind/Makefile
index a22fef1352,2bcfcc61af..00
--- a/src/bin/pg_rewind/Makefile
+++ b/src/bin/pg_rewind/Makefile
@@@ -20,7 -20,6 +20,7 @@@ LDFLAGS_INTERNAL += $(libpq_pgport
  
  OBJS	= pg_rewind.o parsexlog.o xlogreader.o datapagemap.o timeline.o \
  	fetch.o file_ops.o copy_fetch.o libpq_fetch.o filemap.o logging.o \
 +	guc-file-fe.o \
  	$(WIN32RES)
  
  EXTRA_CLEAN = xlogreader.c
diff --combined src/bin/pg_rewind/RewindTest.pm
index 8dc39dbc05,1dce56d035..00
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@@ -40,7 -40,6 +40,7 @@@ use Config
  use Exporter 'import';
  use File::Copy;
  use File::Path qw(rmtree);
 +use File::Glob;
  use IPC::Run qw(run);
  use PostgresNode;
  use TestLib;
@@@ -249,41 -248,6 +249,41 @@@ sub run_pg_rewin
  "--no-sync"
  			],
  			'pg_rewind remote');
 +	}
 +	elsif ($test_mode eq "archive")
 +	{
 +
 +		# Do rewind using a local pgdata as source and
 +		# specified directory with target WALs archive.
 +		my $wals_archive_dir = "${TestLib::tmp_check}/master_wals_archive";
 +		my $test_master_datadir = $node_master->data_dir;
 +		my @wal_files = glob "$test_master_datadir/pg_wal/000*";
 +		my $restore_command;
 +
 +		rmtree($wals_archive_dir);
 +		mkdir($wals_archive_dir) or die;
 +
 +		# Move all old master WAL files to the archive.
 +		# Old master should be stopped at this point.
 +		foreach my $wal_file (@wal_files)
 +		{
 +			move($wal_file, "$wals_archive_dir/") or die;
 +		}
 +
 +		$restore_command = "cp $wals_archive_dir/\%f \%p";
 +
 +		# Stop the new master and be ready to perform the rewind.
 +		$node_standby->stop;
 +		command_ok(
 +			[
 +'pg_rewind',
 +"--debug",
 +"--source-pgdata=$standby_pgdata",
 +"--target-pgdata=$master_pgdata",
 +"--no-sync",
 +"-R", $restore_command
 +			],
 +			'pg_rewind archive');
  	}
  	else
  	{
diff --combined src/bin/pg_rewind/parsexlog.c
index 11a9c26cd2,40028471bf..00
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@@ -12,7 -12,6 +12,7 @@@
  #include "postgres_fe.h"
  
  #include 
 +#include 
  
  #include "pg_rewind.h"
  #include "filemap.h"
@@@ -46,10 -45,7 +46,10 @@@ static char xlogfpath[MAXPGPATH]
  typedef struct XLogPageReadPrivate
  {
  	const char *datadir;
 +	const char *restoreCommand;
  	int			tliIndex;
 +	XLogRecPtr  oldrecptr;
 +	TimeLineID  oldtli;
  } XLogPageReadPrivate;
  
  static int SimpleXLogPageRead(XLogReaderState *xlogreader,
@@@ -57,10 -53,6 +57,10 @@@
     int reqLen, XLogRecPtr targetRecPtr, char *readBuf,
     TimeLineID *pageTLI);
  
 +static bool RestoreArchivedWAL(const char *path, const char *xlogfname, 
 +	off_t expectedSize, const char *restoreCommand,
 +	const char *lastRestartPointFname);
 +
  /*
   * Read WAL from the datadir/pg_wal, starting from 'startpoint' on timeline
   * index 'tliIndex' in target timeline history, until 'endpoint'. Make note of
@@@ -68,19 -60,15 +68,19 @@@
   */
  void
  extractPageMap(const char *datadir, XLogRecPtr startpoint, int tliIndex,
 -			   XLogRecPtr endpoint)
 +			   ControlFileData *targetCF, const char *restore_command)
  {
  	XLogRecord *record;
 +	XLogRecPtr endpoint = targetCF->checkPoint;
  	XLog

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 10:36:59 -0400, Stephen Frost wrote:
> > * Michael Paquier (mich...@paquier.xyz) wrote:
> > > On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> > > > Fine by me.
> > > 
> > > Thanks.  This is now committed after some tweaks to the comments, a bit
> > > earlier than I thought first.
> > 
> > I just saw this committed and I'm trying to figure out why we are
> > creating yet-another-list when it comes to deciding what should be
> > checksum'd and what shouldn't be.
> 
> I'm not sure it's fair to blame Michael here. He's picking up slack,
> because the original authors of the tool didn't even bother to reply to
> the issue for quite a while. pg_verify_checksum was obviously never
> tested on windows, it just didn't have tests showing that.

I wasn't really going for 'blame' here, just that we've now got two
different methods for doing the same thing, and those will give
different answers; presumably this issue impacts pg_basebackup too,
since the logic there wasn't changed.

> > I also categorically disagree with the notion that it's ok for
> > extensions to dump files into our tablespace diretories or that we
> > should try to work around random code dumping extra files in the
> > directories which we maintain- it's not like this additional code will
> > actually protect us from that, after all, and it's foolish to think we
> > really could protect against that.
> 
> I'm unconvinced. There already are extensions doing so, and it's not
> like we've given them any sort of reasonable alternative. You can't just
> create relations in a "registered" / "supported" kinda way right now, so
> uh, yea, extension gotta make do.  And that has worked for many years.

I don't agree that any of this is an argument for accepting random code
writing into arbitrary places in the data directory or tablespace
directories as being something which we support or which we write code
to work-around.

If this is a capability that extension authors need then I'm all for
having a way for them to have that capability in a clearly defined and
documented way- and one which we could actually write code to handle and
work with.

> Also, as made obvious here, it's pretty clear that there's platform
> differences about which files exists and which don't, so it's not that a
> blacklist approach automatically is more maintainable.

Platform differences are certainly something we can manage and we have
code in a few different places for doing exactly that.  A platform
difference is a heck of a lot more reasonable than trying to guess at
what random code that we've never seen before has done and try to work
around that.

> And I fail to see why a blacklist is architecturally better. There's
> plenty cases where we might want to create temporary files,
> non-checksummed data or such that we'd need a teach a blacklist about,
> but there's far fewer cases where add new checksummed files. Basically
> never since checksums have been introduced. And if checksums were
> introduced for something new, say slrus, we'd ceertainly use
> pg_verify_checksum during development.

In cases where we need something temporary, we're going to need to be
prepared to clean those up and we had better make it very plain that
they're temporary and easy to write code for.  Anything we aren't
prepared to blow away on a crash-restart should be checksum'd and in an
ideal world, we'd be looking to reduce the blacklist to only those
things which are temporary.  Of course, we may need different code for
calculating the checksum of different types of files, which moves it
from really being a 'blacklist' to a list of file-types and their
associated checksum'ing mechanism.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-19 13:52:28 -0400, Stephen Frost wrote:
> > > I also categorically disagree with the notion that it's ok for
> > > extensions to dump files into our tablespace diretories or that we
> > > should try to work around random code dumping extra files in the
> > > directories which we maintain- it's not like this additional code will
> > > actually protect us from that, after all, and it's foolish to think we
> > > really could protect against that.
> > 
> > I'm unconvinced. There already are extensions doing so, and it's not
> > like we've given them any sort of reasonable alternative. You can't just
> > create relations in a "registered" / "supported" kinda way right now, so
> > uh, yea, extension gotta make do.  And that has worked for many years.
> 
> I don't agree that any of this is an argument for accepting random code
> writing into arbitrary places in the data directory or tablespace
> directories as being something which we support or which we write code
> to work-around.
> 
> If this is a capability that extension authors need then I'm all for
> having a way for them to have that capability in a clearly defined and
> documented way- and one which we could actually write code to handle and
> work with.

cstore e.g. does this, and it's already out there. So yes, we should
provide better infrastructure. But for now, we gotta deal with reality,
unless we just want to gratuitously break things.


> > And I fail to see why a blacklist is architecturally better. There's
> > plenty of cases where we might want to create temporary files,
> > non-checksummed data or such that we'd need to teach a blacklist about,
> > but there's far fewer cases where we add new checksummed files. Basically
> > never since checksums have been introduced. And if checksums were
> > introduced for something new, say SLRUs, we'd certainly use
> > pg_verify_checksum during development.
> 
> In cases where we need something temporary, we're going to need to be
> prepared to clean those up and we had better make it very plain that
> they're temporary and easy to write code for.  Anything we aren't
> prepared to blow away on a crash-restart should be checksum'd and in an
> ideal world, we'd be looking to reduce the blacklist to only those
> things which are temporary.

There are pending patches that add support for pg_verify_checksums running
against a running cluster. We'll not just need to recognize files that
are there after a graceful shutdown, but anything that a cluster
can have there while running.


> Of course, we may need different code for
> calculating the checksum of different types of files, which moves it
> from really being a 'blacklist' to a list of file-types and their
> associated checksum'ing mechanism.

Which pretty much is a whitelist.  Given that we need a
filetype->checksumtype mapping or something, how is a blacklist a
reasonable approach for this?

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 13:52:28 -0400, Stephen Frost wrote:
> > > > I also categorically disagree with the notion that it's ok for
> > > > extensions to dump files into our tablespace directories or that we
> > > > should try to work around random code dumping extra files in the
> > > > directories which we maintain- it's not like this additional code will
> > > > actually protect us from that, after all, and it's foolish to think we
> > > > really could protect against that.
> > > 
> > > I'm unconvinced. There already are extensions doing so, and it's not
> > > like we've given them any sort of reasonable alternative. You can't just
> > > create relations in a "registered" / "supported" kinda way right now, so
> > > uh, yea, extension gotta make do.  And that has worked for many years.
> > 
> > I don't agree that any of this is an argument for accepting random code
> > writing into arbitrary places in the data directory or tablespace
> > directories as being something which we support or which we write code
> > to work-around.
> > 
> > If this is a capability that extension authors need then I'm all for
> > having a way for them to have that capability in a clearly defined and
> > documented way- and one which we could actually write code to handle and
> > work with.
> 
> cstore e.g. does this, and it's already out there. So yes, we should
> provide better infrastructure. But for now, we gotta deal with reality,
> unless we just want to gratuitously break things.

No, I don't agree that we are beholden to an external extension that
decided to start writing into directories that clearly belong to PG.

How do we even know that what cstore does in the tablespace directory
wouldn't get caught up in the checksum file pattern-matching that this
commit put in?  What if there was an extension which did create files
that would match, what would we do then?  I'm happy to go create one, if
that'd help make the point that this "pattern whitelist" isn't actually
a solution but is really rather just a hack that'll break in the future.

> > > And I fail to see why a blacklist is architecturally better. There's
> > > plenty of cases where we might want to create temporary files,
> > > non-checksummed data or such that we'd need to teach a blacklist about,
> > > but there's far fewer cases where we add new checksummed files. Basically
> > > never since checksums have been introduced. And if checksums were
> > > introduced for something new, say SLRUs, we'd certainly use
> > > pg_verify_checksum during development.
> > 
> > In cases where we need something temporary, we're going to need to be
> > prepared to clean those up and we had better make it very plain that
> > they're temporary and easy to write code for.  Anything we aren't
> > prepared to blow away on a crash-restart should be checksum'd and in an
> > ideal world, we'd be looking to reduce the blacklist to only those
> > things which are temporary.
> 
> There are pending patches that add support for pg_verify_checksums running
> against a running cluster. We'll not just need to recognize files that
> are there after a graceful shutdown, but anything that a cluster
> can have there while running.

Of course- the same is true with a crash/restart case, so I'm not sure
what you're getting at here.  I wasn't ever thinking of only the
graceful shutdown case- certainly pg_basebackup operates when the
cluster is running also and that's more what I was thinking about from
the start and which is part of what started the discussion about this
commit as that was entirely ignored in this change.

> > Of course, we may need different code for
> > calculating the checksum of different types of files, which moves it
> > from really being a 'blacklist' to a list of file-types and their
> > associated checksum'ing mechansim.
> 
> Which pretty much is a whitelist.  Given that we need a
> filetype->checksumtype mapping or something, how is a blacklist a
> reasonable approach for this?

We only need the mapping for special files- call that list of special
files a whitelist or a blacklist or a bluelist, it doesn't matter,
what's relevant here is that we don't skip over anything -- we account
for everything and make sure that everything that would survive a
crash/restart is checked or at least accounted for if we can't, yet,
check it.

Thanks!

Stephen




Re: Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread J Chapman Flack
On 10/19/2018 01:17 PM, Andres Freund wrote:
> On 2018-10-19 10:36:59 -0400, Stephen Frost wrote:
>
>> I also categorically disagree with the notion that it's ok for
>> extensions to dump files into our tablespace directories 
> 
> I'm unconvinced. There already are extensions doing so, and it's not
> like we've given them any sort of reasonable alternative. You can't just
> create relations in a "registered" / "supported" kinda way right now, so
> uh, yea, extension gotta make do.  And that has worked for many years.

For some of us following along at home, this got interesting right here.
Could someone elaborate on what extensions do this, for what purpose?
And what would it mean to create relations in a "registered" /
"supported" kinda way? Has that been the topic of a past discussion
I could catch up with?

-Chap



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
On 2018-10-19 14:08:19 -0400, J Chapman Flack wrote:
> On 10/19/2018 01:17 PM, Andres Freund wrote:
> > On 2018-10-19 10:36:59 -0400, Stephen Frost wrote:
> >
> >> I also categorically disagree with the notion that it's ok for
> >> extensions to dump files into our tablespace directories 
> > 
> > I'm unconvinced. There already are extensions doing so, and it's not
> > like we've given them any sort of reasonable alternative. You can't just
> > create relations in a "registered" / "supported" kinda way right now, so
> > uh, yea, extension gotta make do.  And that has worked for many years.
> 
> For some of us following along at home, this got interesting right here.
> Could someone elaborate on what extensions do this, for what purpose?
> And what would it mean to create relations in a "registered" /
> "supported" kinda way? Has that been the topic of a past discussion
> I could catch up with?

An FDW, like cstore, might be doing it, to store data, for example. The
registered / supported thing would be to have a catalog representation
of on-disk files that is represented in the catalogs somehow. Right now
you can't really do that because you need a proper relation (table,
index, ...) to get a relfilenode.

Greetings,

Andres Freund



Re: Optimze usage of immutable functions as relation

2018-10-19 Thread Aleksandr Parfenov
Hi,

Thank you for the review.
I fixed a typo and some comments. Please find the new version attached.

---
Best regards,
Parfenov Aleksandr

On Fri, 19 Oct 2018 at 16:40, Anthony Bykov  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> Hello,
> I was trying to review your patch, but I couldn't install it:
>
> prepjointree.c: In function ‘pull_up_simple_function’:
> prepjointree.c:1793:41: error: ‘functions’ undeclared (first use in this
> function); did you mean ‘PGFunction’?
>   Assert(!contain_vars_of_level((Node *) functions, 0));
>
> Was it a typo?
>
> The new status of this patch is: Waiting on Author
>
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index cd6e119..4a9d74a 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -23,7 +23,9 @@
  */
 #include "postgres.h"
 
+#include "access/htup_details.h"
 #include "catalog/pg_type.h"
+#include "catalog/pg_proc.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
@@ -35,6 +37,8 @@
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/syscache.h"
+#include "utils/lsyscache.h"
 
 
 typedef struct pullup_replace_vars_context
@@ -86,6 +90,8 @@ static bool is_simple_subquery(Query *subquery, RangeTblEntry *rte,
    bool deletion_ok);
 static Node *pull_up_simple_values(PlannerInfo *root, Node *jtnode,
 	  RangeTblEntry *rte);
+static Node *pull_up_simple_function(PlannerInfo *root, Node *jtnode,
+	  RangeTblEntry *rte);
 static bool is_simple_values(PlannerInfo *root, RangeTblEntry *rte,
  bool deletion_ok);
 static bool is_simple_union_all(Query *subquery);
@@ -598,6 +604,54 @@ inline_set_returning_functions(PlannerInfo *root)
 	}
 }
 
+static bool
+is_simple_stable_function(RangeTblEntry *rte)
+{
+	Form_pg_type type_form;
+	RangeTblFunction *tblFunction = linitial_node(RangeTblFunction, rte->functions);
+	FuncExpr   *expr = (FuncExpr *) tblFunction->funcexpr;
+	HeapTuple tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(expr->funcresulttype));
+	bool		result = false;
+
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for type %u", expr->funcresulttype);
+
+	type_form = (Form_pg_type) GETSTRUCT(tuple);
+
+	if (type_form->typtype == TYPTYPE_BASE &&
+		!type_is_array(expr->funcresulttype))
+	{
+		Form_pg_proc func_form;
+		ListCell   *arg;
+		bool		has_nonconst_input = false;
+		bool		has_null_input = false;
+
+		ReleaseSysCache(tuple);
+		tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(expr->funcid));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for function %u", expr->funcid);
+		func_form = (Form_pg_proc) GETSTRUCT(tuple);
+
+		foreach(arg, expr->args)
+		{
+			if (IsA(lfirst(arg), Const))
+				has_null_input |= ((Const *) lfirst(arg))->constisnull;
+			else
+				has_nonconst_input = true;
+		}
+
+		result = func_form->prorettype != RECORDOID &&
+			func_form->prokind == PROKIND_FUNCTION &&
+			!func_form->proretset &&
+			func_form->provolatile == PROVOLATILE_IMMUTABLE &&
+			!has_null_input &&
+			!has_nonconst_input;
+	}
+
+	ReleaseSysCache(tuple);
+	return result;
+}
+
 /*
  * pull_up_subqueries
  *		Look for subqueries in the rangetable that can be pulled up into
@@ -728,6 +782,11 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 			is_simple_values(root, rte, deletion_ok))
 			return pull_up_simple_values(root, jtnode, rte);
 
+		if (rte->rtekind == RTE_FUNCTION &&
+			list_length(rte->functions) == 1 &&
+			is_simple_stable_function(rte))
+			return pull_up_simple_function(root, jtnode, rte);
+
 		/* Otherwise, do nothing at this node. */
 	}
 	else if (IsA(jtnode, FromExpr))
@@ -1710,6 +1769,98 @@ pull_up_simple_values(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
 	return NULL;
 }
 
+static Node *
+pull_up_simple_function(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
+{
+	Query	   *parse = root->parse;
+	int			varno = ((RangeTblRef *) jtnode)->rtindex;
+	List	   *functions_list;
+	List	   *tlist;
+	AttrNumber	attrno;
+	pullup_replace_vars_context rvcontext;
+	ListCell   *lc;
+
+	Assert(rte->rtekind == RTE_FUNCTION);
+	Assert(list_length(rte->functions) == 1);
+
+	/*
+	 * Need a modifiable copy of the functions list to hack on, just in case it's
+	 * multiply referenced.
+	 */
+	functions_list = copyObject(rte->functions);
+
+	/*
+	 * The FUNCTION RTE can't contain any Vars of level zero, let alone any that
+	 * are join aliases, so no need to flatten join alias Vars.
+	 */
+	Assert(!contain_vars_of_level((Node *) functions_list, 0));
+
+	/*
+	 * Set up required context data for pullup_replace_vars.  In particular,
+	 * we have to make the FUNCTION lis

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
On 2018-10-19 14:11:03 -0400, Stephen Frost wrote:
> Greetings,
> 
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2018-10-19 13:52:28 -0400, Stephen Frost wrote:
> > > > > I also categorically disagree with the notion that it's ok for
> > > > > extensions to dump files into our tablespace directories or that we
> > > > > should try to work around random code dumping extra files in the
> > > > > directories which we maintain- it's not like this additional code will
> > > > > actually protect us from that, after all, and it's foolish to think we
> > > > > really could protect against that.
> > > > 
> > > > I'm unconvinced. There already are extensions doing so, and it's not
> > > > like we've given them any sort of reasonable alternative. You can't just
> > > > create relations in a "registered" / "supported" kinda way right now, so
> > > > uh, yea, extension gotta make do.  And that has worked for many years.
> > > 
> > > I don't agree that any of this is an argument for accepting random code
> > > writing into arbitrary places in the data directory or tablespace
> > > directories as being something which we support or which we write code
> > > to work-around.
> > > 
> > > If this is a capability that extension authors need then I'm all for
> > > having a way for them to have that capability in a clearly defined and
> > > documented way- and one which we could actually write code to handle and
> > > work with.
> > 
> > cstore e.g. does this, and it's already out there. So yes, we should
> > provide better infrastructure. But for now, we gotta deal with reality,
> > unless we just want to gratuitously break things.
> 
> No, I don't agree that we are beholden to an external extension that
> decided to start writing into directories that clearly belong to PG.

Did it have an alternative?


> How do we even know that what cstore does in the tablespace directory
> wouldn't get caught up in the checksum file pattern-matching that this
> commit put in?

You listen to people?


> What if there was an extension which did create files that would
> match, what would we do then?  I'm happy to go create one, if that'd
> help make the point that this "pattern whitelist" isn't actually a
> solution but is really rather just a hack that'll break in the future.

Oh, for crying out loud. Yes, obviously you can create conflicts, nobody
ever doubted that. How on earth is that a useful point for anything? The
point isn't that it's a good idea for extensions to do so, but that it
has happened because we didn't provide a better alternative.


> > > > And I fail to see why a blacklist is architecturally better. There's
> > > > plenty cases where we might want to create temporary files,
> > > > non-checksummed data or such that we'd need to teach a blacklist about,
> > > > but there's far fewer cases where we add new checksummed files. Basically
> > > > never since checksums have been introduced. And if checksums were
> > > > introduced for something new, say slrus, we'd certainly use
> > > > pg_verify_checksum during development.
> > > 
> > > In cases where we need something temporary, we're going to need to be
> > > prepared to clean those up and we had better make it very plain that
> > > they're temporary and easy to write code for.  Anything we aren't
> > > prepared to blow away on a crash-restart should be checksum'd and in an
> > > ideal world, we'd be looking to reduce the blacklist to only those
> > > things which are temporary.
> > 
> > There's pending patches that add support for pg_verify_checksums running
> > against a running cluster. We'll not just need to recognize files that
> > are there after a graceful shutdown, but with anything that a cluster
> > can have there while running.
> 
> Of course- the same is true with a crash/restart case, so I'm not sure
> what you're getting at here.

pg_verify_checksum doesn't support running on a crashed cluster, and I'm
not sure it'd make sense to teach it to do so (there's not really much it
could do to discern whether a block is a torn page that replay will fix,
or corrupted).


> I wasn't ever thinking of only the graceful shutdown case- certainly
> pg_basebackup operates when the cluster is running also and that's
> more what I was thinking about from the start and which is part of
> what started the discussion about this commit as that was entirely
> ignored in this change.

It was mainly ignored in the original pg_verify_checksums change, not in
the commit discussed here. That was broken. That built separate logic.


Both basebackup and verify checksums already only scan certain
directories. They *already* are encoding directory structure
information.



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 14:11:03 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > cstore e.g. does this, and it's already out there. So yes, we should
> > > provide better infrastructure. But for now, we gotta deal with reality,
> > > unless we just want to gratuituously break things.
> > 
> > No, I don't agree that we are beholden to an external extension that
> > decided to start writing into directories that clearly belong to PG.
> 
> Did it have an alternative?

Yes, work with us to come up with a way to accomplish what they wanted
without creating unknown/untracked files in our tablespaces.

Or, have their own mechanism for storing files on other filesystems.

And likely other options.  Saying that we should account for their
putting arbitrary files into the PG tablespace directories because they
"didn't have any other choice" seems pretty ridiculous, imv.

> > How do we even know that what cstore does in the tablespace directory
> > wouldn't get caught up in the checksum file pattern-matching that this
> > commit put in?
> 
> You listen to people?
> 
> > What if there was an extension which did create files that would
> > match, what would we do then?  I'm happy to go create one, if that'd
> > help make the point that this "pattern whitelist" isn't actually a
> > solution but is really rather just a hack that'll break in the future.
> 
> Oh, for crying out loud. Yes, obviously you can create conflicts, nobody
> ever doubted that. How on earth is that a useful point for anything? The
> point isn't that it's a good idea for extensions to do so, but that it
> has happened because we didn't provide a better alternative.

The point is that you're making a case because you happen to know of an
extension which does this, but neither of us know about every extension
out there and I, at least, don't believe we should be coming up with
hacks to try and avoid looking at files that some extension has dumped
into the PG tablespace directories because that's never been a
documented or supported thing to do.

> > > > > And I fail to see why a blacklist is architecturally better. There's
> > > > > plenty cases where we might want to create temporary files,
> > > > > non-checksummed data or such that we'd need to teach a blacklist about,
> > > > > but there's far fewer cases where we add new checksummed files. Basically
> > > > > never since checksums have been introduced. And if checksums were
> > > > > introduced for something new, say slrus, we'd certainly use
> > > > > pg_verify_checksum during development.
> > > > 
> > > > In cases where we need something temporary, we're going to need to be
> > > > prepared to clean those up and we had better make it very plain that
> > > > they're temporary and easy to write code for.  Anything we aren't
> > > > prepared to blow away on a crash-restart should be checksum'd and in an
> > > > ideal world, we'd be looking to reduce the blacklist to only those
> > > > things which are temporary.
> > > 
> > > There's pending patches that add support for pg_verify_checksums running
> > > against a running cluster. We'll not just need to recognize files that
> > > are there after a graceful shutdown, but with anything that a cluster
> > > can have there while running.
> > 
> > Of course- the same is true with a crash/restart case, so I'm not sure
> > what you're getting at here.
> 
> pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> not sure it'd make sense to teach it to so (there's not really much it
> could do to discern whether a block is a torn page that replay will fix,
> or corrupted).

... and this isn't at all relevant, because pg_basebackup does run on a
running cluster.

> > I wasn't ever thinking of only the graceful shutdown case- certainly
> > pg_basebackup operates when the cluster is running also and that's
> > more what I was thinking about from the start and which is part of
> > what started the discussion about this commit as that was entirely
> > ignored in this change.
> 
> It was mainly ignored in the original pg_verify_checksums change, not in
> the commit discussed here. That was broken. That built separate logic.

As pointed out elsewhere on this thread, the logic was the same between
the two before this commit...  The code in pg_basebackup came from the
prior pg_verify_checksums code.  Certainly, some mention of the code
existing in two places, at least, should have been in the comments.

Also, just to be clear, as I tried to say up-thread, I'm not trying to
lay blame here or suggest that Michael shouldn't have been trying to fix
the issue that came up, but I don't agree that this is the way to fix
it, and, in any case, we need to make sure that pg_basebackup behaves
correctly as well.  As mentioned elsewhere, that'd be best done by
adding something to libpgcommon and then changing both pg_basebackup and
pg_verify_checksums to use that.

> Both basebackup an 

Google Code In Mentorship

2018-10-19 Thread Apurv Shah
Hello Sir/ Madam,

I participated last year in the Google Code-in Competition and I kept track
of the participating organizations. I feel that your organization fits my
interests and skill sets perfectly and is the best addition to the
participating organizations. I wish to mentor in the GCI this year. am
Apurv Shah, a freshman at UMass Amherst pursuing a Bachelor's Degree in
Computer Science. I have experience with SQL Server and C language along
with Python and Django. I hope I can mentor for your organization. I have
attached my resume for more information.


-- 
Best Regards,
Apurv Shah.




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-19 14:47:51 -0400, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2018-10-19 14:11:03 -0400, Stephen Frost wrote:
> > > * Andres Freund (and...@anarazel.de) wrote:
> > > > cstore e.g. does this, and it's already out there. So yes, we should
> > > > provide better infrastructure. But for now, we gotta deal with reality,
> > > > unless we just want to gratuitously break things.
> > > 
> > > No, I don't agree that we are beholden to an external extension that
> > > decided to start writing into directories that clearly belong to PG.
> > 
> > Did it have an alternative?
> 
> Yes, work with us to come up with a way to accomplish what they wanted
> without creating unknown/untracked files in our tablespaces.
> 
> Or, have their own mechanism for storing files on other filesystems.
> 
> And likely other options.  Saying that we should account for their
> putting arbitrary files into the PG tablespace directories because they
> "didn't have any other choice" seems pretty ridiculous, imv.
> 
> > > How do we even know that what cstore does in the tablespace directory
> > > wouldn't get caught up in the checksum file pattern-matching that this
> > > commit put in?
> > 
> > You listen to people?
> > 
> > > What if there was an extension which did create files that would
> > > match, what would we do then?  I'm happy to go create one, if that'd
> > > help make the point that this "pattern whitelist" isn't actually a
> > > solution but is really rather just a hack that'll break in the future.
> > 
> > Oh, for crying out loud. Yes, obviously you can create conflicts, nobody
> > ever doubted that. How on earth is that a useful point for anything? The
> > point isn't that it's a good idea for extensions to do so, but that it
> > has happened because we didn't provide a better alternative.
> 
> The point is that you're making a case because you happen to know of an
> extension which does this, but neither of us know about every extension
> out there and I, at least, don't believe we should be coming up with
> hacks to try and avoid looking at files that some extension has dumped
> into the PG tablespace directories because that's never been a
> documented or supported thing to do.

It's not a hack if it's a quite defensible choice on its own, and the
presence of such extensions is just one further argument in one
direction (for explicit whitelisting here).


> > > > > > And I fail to see why a blacklist is architecturally better. There's
> > > > > > plenty cases where we might want to create temporary files,
> > > > > > non-checksummed data or such that we'd need to teach a blacklist 
> > > > > > about,
> > > > > > but there's far fewer cases where we add new checksummed files. 
> > > > > > Basically
> > > > > > never since checksums have been introduced. And if checksums were
> > > > > > introduced for something new, say slrus, we'd certainly use
> > > > > > pg_verify_checksum during development.
> > > > > 
> > > > > In cases where we need something temporary, we're going to need to be
> > > > > prepared to clean those up and we had better make it very plain that
> > > > > they're temporary and easy to write code for.  Anything we aren't
> > > > > prepared to blow away on a crash-restart should be checksum'd and in 
> > > > > an
> > > > > ideal world, we'd be looking to reduce the blacklist to only those
> > > > > things which are temporary.
> > > > 
> > > > There's pending patches that add support for pg_verify_checksums running
> > > > against a running cluster. We'll not just need to recognize files that
> > > > are there after a graceful shutdown, but with anything that a cluster
> > > > can have there while running.
> > > 
> > > Of course- the same is true with a crash/restart case, so I'm not sure
> > > what you're getting at here.
> > 
> > pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> > not sure it'd make sense to teach it to do so (there's not really much it
> > could do to discern whether a block is a torn page that replay will fix,
> > or corrupted).
> 
> ... and this isn't at all relevant, because pg_basebackup does run on a
> running cluster.

I wasn't ever denying that or anything close to it?  My point is that
pg_verify_checksum needs much more filtering than it has now to deal
with that, because it needs to handle all files that could be present,
not just files that could be present after a graceful shutdown.

pg_basebackup's case is *not* really comparable because basebackup.c
already performed a lot of filtering before noChecksumFiles is applied.


> > > I wasn't ever thinking of only the graceful shutdown case- certainly
> > > pg_basebackup operates when the cluster is running also and that's
> > > more what I was thinking about from the start and which is part of
> > > what started the discussion about this commit as that was entirely
> > > ignored in this change.
> > 
> > It was mainly ignored in the original pg_verify_checksums change, no

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 14:47:51 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > Oh, for crying out loud. Yes, obviously you can create conflicts, nobody
> > > ever doubted that. How on earth is that a useful point for anything? The
> > > point isn't that it's a good idea for extensions to do so, but that it
> > > has happened because we didn't provide a better alternative.
> > 
> > The point is that you're making a case because you happen to know of an
> > extension which does this, but neither of us know about every extension
> > out there and I, at least, don't believe we should be coming up with
> > hacks to try and avoid looking at files that some extension has dumped
> > into the PG tablespace directories because that's never been a
> > documented or supported thing to do.
> 
> It's not a hack if it's a quite defensible choice on its own, and the
> presence of such extensions is just one further argument in one
> direction (for explicit whitelisting here).

You've not convinced me that an extension dropping files into our
tablespace directories is either something we should accept or that we
should code around such cases, nor that we are doing anything but
putting a hack in place to deal with what is pretty clearly a hack in
its own right, imv.

> > > > Of course- the same is true with a crash/restart case, so I'm not sure
> > > > what you're getting at here.
> > > 
> > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> > > not sure it'd make sense to teach it to do so (there's not really much it
> > > could do to discern whether a block is a torn page that replay will fix,
> > > or corrupted).
> > 
> > ... and this isn't at all relevant, because pg_basebackup does run on a
> > running cluster.
> 
> I wasn't ever denying that or anything close to it?  My point is that
> pg_verify_checksum needs much more filtering than it has now to deal
> with that, because it needs to handle all files that could be present,
> not just files that could be present after a graceful shutdown.

Perhaps it doesn't today but surely one goal of pg_verify_checksum is to
be able to run it on a running cluster eventually.

> pg_basebackup's case is *not* really comparable because basebackup.c
> already performed a lot of filtering before noChecksumFiles is applied.

This all really just points out that we should have the code for
handling this somewhere common that both pg_verify_checksum and
pg_basebackup can leverage without duplicating all of it.

The specific case that started all of this certainly looks pretty
clearly like a case that both need to deal with.

> > As pointed out elsewhere on this thread, the logic was the same between
> > the two before this commit...  The code in pg_basebackup came from the
> > prior pg_verify_checksums code.  Certainly, some mention of the code
> > existing in two places, at least, should have been in the comments.
> 
> But the filter for basebackup only comes after the pre-existing
> filtering that the basebackup.c code already does. All of:
> 
> /*
>  * List of files excluded from backups.
>  */
> static const char *excludeFiles[] =
> {
>   /* Skip auto conf temporary file. */
>   PG_AUTOCONF_FILENAME ".tmp",
> 
>   /* Skip current log file temporary file */
>   LOG_METAINFO_DATAFILE_TMP,
> 
>   /* Skip relation cache because it is rebuilt on startup */
>   RELCACHE_INIT_FILENAME,
> 
>   /*
>* If there's a backup_label or tablespace_map file, it belongs to a
>* backup started by the user with pg_start_backup().  It is *not* 
> correct
>* for this backup.  Our backup_label/tablespace_map is injected into 
> the
>* tar separately.
>*/
>   BACKUP_LABEL_FILE,
>   TABLESPACE_MAP,
> 
>   "postmaster.pid",
>   "postmaster.opts",
> 
>   /* end of list */
>   NULL
> };
> 
> is not applied, for example.  Nor is:
> 
>   /* Skip temporary files */
>   if (strncmp(de->d_name,
>   PG_TEMP_FILE_PREFIX,
>   strlen(PG_TEMP_FILE_PREFIX)) == 0)
>   continue;
> 
> Nor is
>   /* Exclude temporary relations */
>   if (isDbDir && looks_like_temp_rel_name(de->d_name))
>   {
>   elog(DEBUG2,
>"temporary relation file \"%s\" excluded from 
> backup",
>de->d_name);
> 
>   continue;
>   }
> 
> So, yea, they had:
> 
> static const char *const skip[] = {
>   "pg_control",
>   "pg_filenode.map",
>   "pg_internal.init",
>   "PG_VERSION",
>   NULL,
> };
> 
> in common, but not more.  All the above exclusions are strictly
> necessary.

... but all of that code doesn't change that pg_basebackup also needs to
ignore the EXEC_BACKEND created config_exec_params/.new files.

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
On 2018-10-19 15:57:28 -0400, Stephen Frost wrote:
> > > > > Of course- the same is true with a crash/restart case, so I'm not sure
> > > > > what you're getting at here.
> > > > 
> > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> > > > not sure it'd make sense to teach it to do so (there's not really much it
> > > > could do to discern whether a block is a torn page that replay will fix,
> > > > or corrupted).
> > > 
> > > ... and this isn't at all relevant, because pg_basebackup does run on a
> > > running cluster.
> > 
> > I wasn't ever denying that or anything close to it?  My point is that
> > pg_verify_checksum needs much more filtering than it has now to deal
> > with that, because it needs to handle all files that could be present,
> > not just files that could be present after a graceful shutdown.
> 
> Perhaps it doesn't today but surely one goal of pg_verify_checksum is to
> be able to run it on a running cluster eventually.

I was saying *precisely* that above.  I give up.


> > pg_basebackup's case is *not* really comparable because basebackup.c
> > already performed a lot of filtering before noChecksumFiles is applied.
> 
> This all really just points out that we should have the code for
> handling this somewhere common that both pg_verify_checksum and
> pg_basebackup can leverage without duplicating all of it.

I never argued against that.  My point is that your argument that they
started out the same isn't true.


> The specific case that started all of this certainly looks pretty
> clearly like a case that both need to deal with.

Yep.


> > > As pointed out elsewhere on this thread, the logic was the same between
> > > the two before this commit...  The code in pg_basebackup came from the
> > > prior pg_verify_checksums code.  Certainly, some mention of the code
> > > existing in two places, at least, should have been in the comments.
> > 
> > But the filter for basebackup only comes after the pre-existing
> > filtering that the basebackup.c code already does. All of:
> > 
> > /*
> >  * List of files excluded from backups.
> >  */
> > static const char *excludeFiles[] =
> > {
> > /* Skip auto conf temporary file. */
> > PG_AUTOCONF_FILENAME ".tmp",
> > 
> > /* Skip current log file temporary file */
> > LOG_METAINFO_DATAFILE_TMP,
> > 
> > /* Skip relation cache because it is rebuilt on startup */
> > RELCACHE_INIT_FILENAME,
> > 
> > /*
> >  * If there's a backup_label or tablespace_map file, it belongs to a
> >  * backup started by the user with pg_start_backup().  It is *not* 
> > correct
> >  * for this backup.  Our backup_label/tablespace_map is injected into 
> > the
> >  * tar separately.
> >  */
> > BACKUP_LABEL_FILE,
> > TABLESPACE_MAP,
> > 
> > "postmaster.pid",
> > "postmaster.opts",
> > 
> > /* end of list */
> > NULL
> > };
> > 
> > is not applied, for example.  Nor is:
> > 
> > /* Skip temporary files */
> > if (strncmp(de->d_name,
> > PG_TEMP_FILE_PREFIX,
> > strlen(PG_TEMP_FILE_PREFIX)) == 0)
> > continue;
> > 
> > Nor is
> > /* Exclude temporary relations */
> > if (isDbDir && looks_like_temp_rel_name(de->d_name))
> > {
> > elog(DEBUG2,
> >  "temporary relation file \"%s\" excluded from 
> > backup",
> >  de->d_name);
> > 
> > continue;
> > }
> > 
> > So, yea, they had:
> > 
> > static const char *const skip[] = {
> > "pg_control",
> > "pg_filenode.map",
> > "pg_internal.init",
> > "PG_VERSION",
> > NULL,
> > };
> > 
> > in common, but not more.  All the above exclusions are strictly
> > necessary.
> 
> ... but all of that code doesn't change that pg_basebackup also needs to
> ignore the EXEC_BACKEND created config_exec_params/.new files.

Right.


> You're right, pg_verify_checksums, with the assumption that it only runs
> on a cleanly shut-down cluster, doesn't need the temp-file-skipping
> logic, today, but it's going to need it tomorrow, isn't it?

No, it needs it today, as explained below in the email you're replying
to.


> > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a
> > lot of (unfortunately) pretty normal scenarios right now. Not just on
> > windows. Besides the narrow window of crashing while a .tmp file is
> > present (and then shutting down normally after a crash restart), it also
> > has the much of wider window of crashing while *any* backend has
> > temporary files/relations.  As crash-restarts don't release temp files,
> > they'll still be present after the crash, and a single graceful shutdown
> > afterwards will leave them in place. No?
> 
> We do go through and do some cleanup at crash/restart

We don't clean up temp files tho.

 * NOTE: we could, but don't, call this during a post-backend-crash restart
 * cycle.  The argument for not doing it is that someone might want to examine
 * the temp files for debugging purposes.  This does however mean that
 * OpenTemporaryFile had better allow for collision with an existing temp
 * file name.

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 15:57:28 -0400, Stephen Frost wrote:
> > Perhaps it doesn't today but surely one goal of pg_verify_checksum is to
> > be able to run it on a running cluster eventually.
> 
> I was saying *precisely* that above.  I give up.

I'm glad we agree on that (I thought we did before, in fact, so it was
odd to see it coming up again).

> > > pg_basebackup's case is *not* really comparable because basebackup.c
> > > already performed a lot of filtering before noChecksumFiles is applied.
> > 
> > This all really just points out that we should have the code for
> > handling this somewhere common that both pg_verify_checksum and
> > pg_basebackup can leverage without duplicating all of it.
> 
> I never argued against that.  My point is that your argument that they
> started out the same isn't true.

I overstated the relationship, clearly, but it hardly matters, as you
say.

> > The specific case that started all of this certainly looks pretty
> > clearly like a case that both need to deal with.
> 
> Yep.

... and it's in the part of the code that was actually copied and was
the same.

> > > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a
> > > lot of (unfortunately) pretty normal scenarios right now. Not just on
> > > windows. Besides the narrow window of crashing while a .tmp file is
> > > present (and then shutting down normally after a crash restart), it also
> > > has the much wider window of crashing while *any* backend has
> > > temporary files/relations.  As crash-restarts don't release temp files,
> > > they'll still be present after the crash, and a single graceful shutdown
> > > afterwards will leave them in place. No?
> > 
> > We do go through and do some cleanup at crash/restart
> 
> We don't clean up temp files tho.
> 
>  * NOTE: we could, but don't, call this during a post-backend-crash restart
>  * cycle.  The argument for not doing it is that someone might want to examine
>  * the temp files for debugging purposes.  This does however mean that
>  * OpenTemporaryFile had better allow for collision with an existing temp
>  * file name.

Then, yes, we should go through and fix pg_verify_checksums to work
correctly in this case, likely using more-or-less the same code that
pg_basebackup has.  After that, we can add the remaining code to check
the last checkpoint and skip pages which have a more recent LSN...
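The "skip pages which have a more recent LSN" idea mentioned here can be sketched as follows.  The page-header layout is simplified to just the LSN halves and the checksum; this is an illustration of the technique, not the actual pg_verify_checksums code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified page header: the first 8 bytes of a PostgreSQL page hold
 * the LSN of the last WAL record that touched the page, stored as two
 * 32-bit halves.  (Layout reduced for illustration.)
 */
typedef struct PageHeaderSketch
{
    uint32_t    lsn_hi;         /* high half of the page LSN */
    uint32_t    lsn_lo;         /* low half of the page LSN */
    uint16_t    checksum;       /* page checksum */
} PageHeaderSketch;

/*
 * A page whose LSN is newer than the last checkpoint may be mid-write
 * while the cluster is online, so its on-disk checksum cannot be
 * trusted to match; skip verifying it instead of reporting a false
 * failure.
 */
bool
should_verify_page(const PageHeaderSketch *page, uint64_t checkpoint_lsn)
{
    uint64_t    page_lsn = ((uint64_t) page->lsn_hi << 32) | page->lsn_lo;

    return page_lsn <= checkpoint_lsn;
}
```

Hooked into the per-page verification loop, this would let online verification tolerate concurrent writes, at the price of silently skipping recently modified pages.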

> > As you say, some of those are likely to have checksums, which should be
> > handled by pg_basebackup and pg_verify_checksums, and that goes back to
> > the point I was making up-thread that we want to make sure an account
> > for everything.  With this pattern-based approach, we could easily end
> > up forgetting to add the correct new pattern into pg_verify_checksums.
> 
> If you're adding checksums for something, you better test it.  I don't buy
> this.  In contrast it's much more likely that there's a file that's
> short-lived that you won't easily test against pg_verify_checksum
> running in that moment.

The only justification for *not* doing this is that some extension
author might have dumped files into our tablespace directory, something
we've never claimed to support nor generally worried about in all the
time that I can recall before this.

As for saying that someone is obviously going to use pg_verify_checksums
to check their checksum code- I simply don't agree that we should be
happy to rely on that assumption when the only reason for any of this is
that some extension decided to do something that wasn't supported and
likely has issues in other parts of the code anyway (pg_basebackup would
happily copy those files too, even though there's obviously no code for
making sure it's a consistent copy or any such in pg_basebackup or
core).

> > Playing this further and assuming that extensions dropping files into
> > tablespace directories is acceptable, what are we supposed to do when
> > there's some direct conflict between a file that we want to create in a
> > PG tablespace directory and a file that some extension dropped there?
> > Create yet another subdirectory which we call
> > "THIS_IS_REALLY_ONLY_FOR_PG"?
> 
> Then it's a buggy extension. And we error out.

Extensions writing into directories they shouldn't be makes them buggy
even if the core code isn't likely to write to a particular filename,
imv.

> > What about two different extensions wanting to create files with the
> > same names in the tablespace directories..?
> > 
> > Experimentation is fine, of course, this is open source code and people
> > should feel free to play with it, but we are not obligated to avoid
> > breaking things when an extension author, through their experimentation,
> > does something which is clearly not supported, like dropping files into
> > PG's tablespace directories.  Further, when it comes to our user's data,
> > we should be taking a strict approach and accounting for everything,
something that this whitelist-of-patterns-based approach to finding
files to verify the checksums on doesn't do.

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-19 16:35:46 -0400, Stephen Frost wrote:
> The only justification for *not* doing this is that some extension
> author might have dumped files into our tablespace directory, something
> we've never claimed to support nor generally worried about in all the
> time that I can recall before this.

No, it's really not the only reason. As I said, testing is much less
likely to find cases where we're checksumming a short-lived file even
though we shouldn't, than making sure that checksummed files are
actually checksummed.


> > > Playing this further and assuming that extensions dropping files into
> > > tablespace directories is acceptable, what are we supposed to do when
> > > there's some direct conflict between a file that we want to create in a
> > > PG tablespace directory and a file that some extension dropped there?
> > > Create yet another subdirectory which we call
> > > "THIS_IS_REALLY_ONLY_FOR_PG"?
> > 
> > Then it's a buggy extension. And we error out.
> 
> Extensions writing into directories they shouldn't be makes them buggy
> even if the core code isn't likely to write to a particular filename,
> imv.

I'll just stop talking to you about this for now.


> > > What about two different extensions wanting to create files with the
> > > same names in the tablespace directories..?
> > > 
> > > Experimentation is fine, of course, this is open source code and people
> > > should feel free to play with it, but we are not obligated to avoid
> > > breaking things when an extension author, through their experimentation,
> > > does something which is clearly not supported, like dropping files into
> > > PG's tablespace directories.  Further, when it comes to our user's data,
> > > we should be taking a strict approach and accounting for everything,
> > > something that this whitelist-of-patterns-based approach to finding
> > > files to verify the checksums on doesn't do.
> > 
> > It's not economically feasible to work on extensions that will only be
> > usable a year down the road.
> 
> I listed out multiple other solutions to this problem which you
> summarily ignored.

Except that none of them really achieves the goals you can achieve by
having the files in the database (like DROP DATABASE working, for
starters).


> It's unfortunate that those other solutions weren't used and that,
> instead, this extension decided to drop files into PG's tablespace
> directories, but that doesn't mean we should condone or implicitly
> support that action.

This just seems pointlessly rigid. Our extension related APIs aren't
generally very good or well documented. Most non-trivial extensions that
add interesting things to the postgres ecosystem are going to need a few
not so pretty things to get going.


> I stand by my position that this patch should be reverted (with no
> offense or ill-will towards Michael, of course, I certainly appreciate
> his efforts to address the issues with pg_verify_checksums) and that we
> should move more of this code to identify files to verify the checksums
> on into libpgcommon and then use it from both pg_basebackup and
> pg_verify_checksums, to the extent possible, but that we make sure to
> account for all of the files in our tablespace and database directories.

What does that do, except break things that currently work? Sure, work
on a patch that fixes the architectural concerns, but what's the point
in reverting it until that's done?


> To the extent that this is an issue for extension authors, perhaps it
> would encourage them to work with us to have supported mechanisms
> instead of using hacks like dropping files into our tablespace
> directories and such instead of using another approach to manage files
> across multiple filesystems.  I'd be kind of surprised if they really
> had an issue doing that and hopefully everyone would feel better about
> what these extensions are doing once they start using a mechanism that
> we actually support.

Haribabu has a patch, but it's on-top of pluggable storage, so not
exactly a small prerequisite.

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 16:35:46 -0400, Stephen Frost wrote:
> > The only justification for *not* doing this is that some extension
> > author might have dumped files into our tablespace directory, something
> > we've never claimed to support nor generally worried about in all the
> > time that I can recall before this.
> 
> No, it's really not the only reason. As I said, testing is much less
> likely to find cases where we're checksumming a short-lived file even
> though we shouldn't, than making sure that checksummed files are
> actually checksummed.

While I agree that we might possibly end up trying to checksum a
short-lived and poorly-named file, I'd rather that than risk missing the
checking of a file which does have checksums that should have been
checked and claiming that we checked all of the checksums.

> > > > What about two different extensions wanting to create files with the
> > > > same names in the tablespace directories..?
> > > > 
> > > > Experimentation is fine, of course, this is open source code and people
> > > > should feel free to play with it, but we are not obligated to avoid
> > > > breaking things when an extension author, through their experimentation,
> > > > does something which is clearly not supported, like dropping files into
> > > > PG's tablespace directories.  Further, when it comes to our user's data,
> > > > we should be taking a strict approach and accounting for everything,
> > > > something that this whitelist-of-patterns-based approach to finding
> > > > files to verify the checksums on doesn't do.
> > > 
> > > It's not economically feasible to work on extensions that will only be
> > > usable a year down the road.
> > 
> > I listed out multiple other solutions to this problem which you
> > summarily ignored.
> 
> Except that none of them really achieves the goals you can achieve by
> having the files in the database (like DROP DATABASE working, for
> starters).

The only reason things like DROP DATABASE "work" in this example case is
exactly that it assumes that all of the files which exist are ones which
we put there.  Or, to put it another way, if we're only going to
checksum files based on some whitelist of files we expect to be there,
shouldn't we go back and make things like DROP DATABASE only remove
those files that we expect to be there and not random other files that
we have no knowledge of..?

> > It's unfortunate that those other solutions weren't used and that,
> > instead, this extension decided to drop files into PG's tablespace
> > directories, but that doesn't mean we should condone or implicitly
> > support that action.
> 
> This just seems pointlessly rigid. Our extension related APIs aren't
> generally very good or well documented. Most non-trivial extensions that
> add interesting things to the postgres ecosystem are going to need a few
> not so pretty things to get going.

PostGIS is a fantastic example of an extension that is far from trivial,
extends PG in ways which are clearly supported, and has worked with PG
to improve things in core to be able to then use in the extension.

> > I stand by my position that this patch should be reverted (with no
> > offense or ill-will towards Michael, of course, I certainly appreciate
> > his efforts to address the issues with pg_verify_checksums) and that we
> > should move more of this code to identify files to verify the checksums
> > on into libpgcommon and then use it from both pg_basebackup and
> > pg_verify_checksums, to the extent possible, but that we make sure to
> > account for all of the files in our tablespace and database directories.
> 
> What does that do, except break things that currently work? Sure, work
> on a patch that fixes the architectural concerns, but what's the point
> in reverting it until that's done?

As you pointed out previously, the current code *doesn't* work, before
or after this change, and we clearly need to rework this to move things
into libpgcommon and also fix pg_basebackup.  Reverting this would at
least get us back to having similar code between this and pg_basebackup,
and then it'll be cleaner and clearer to have one patch which moves that
similar logic into libpgcommon and fixes the missing exceptions for the
EXEC_BACKEND case.

Keeping the patch doesn't do anything for the pg_basebackup case, and
confuses the issue by having these filename-pattern-whitelists which
weren't there before and that should be removed, imv.

Thanks!

Stephen




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-19 17:32:58 -0400, Stephen Frost wrote:
> As you pointed out previously, the current code *doesn't* work, before
> or after this change, and we clearly need to rework this to move things
> into libpgcommon and also fix pg_basebackup.  Reverting this would at
> least get us back to having similar code between this and pg_basebackup,
> and then it'll be cleaner and clearer to have one patch which moves that
> similar logic into libpgcommon and fixes the missing exceptions for the
> EXEC_BACKEND case.
> 
> Keeping the patch doesn't do anything for the pg_basebackup case, and
> confuses the issue by having these filename-pattern-whitelists which
> weren't there before and that should be removed, imv.

The current state has pg_verify_checksum work on windows for casual
usage, which includes the regression tests that previously were broken.
Whereas reverting it has the advantage that a fixup diff would be a few
lines smaller.  If you want to push a revert together with a fix, be my
guest, but until that time it seems unhelpful to revert.

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-10-19 17:32:58 -0400, Stephen Frost wrote:
> > As you pointed out previously, the current code *doesn't* work, before
> > or after this change, and we clearly need to rework this to move things
> > into libpgcommon and also fix pg_basebackup.  Reverting this would at
> > least get us back to having similar code between this and pg_basebackup,
> > and then it'll be cleaner and clearer to have one patch which moves that
> > similar logic into libpgcommon and fixes the missing exceptions for the
> > EXEC_BACKEND case.
> > 
> > Keeping the patch doesn't do anything for the pg_basebackup case, and
> > confuses the issue by having these filename-pattern-whitelists which
> > weren't there before and that should be removed, imv.
> 
> The current state has pg_verify_checksum work on windows for casual
> usage, which includes the regression tests that previously were broken.
> Whereas reverting it has the advantage that a fixup diff would be a few
> lines smaller.  If you want to push a revert together with a fix, be my
> guest, but until that time it seems unhelpful to revert.

Clearly, pg_basebackup *doesn't* work though, so it's at best only half
a fix and only because our regression tests don't cover the
pg_basebackup case (which is a sad state of affairs, to say the least,
but an independent issue).

I'm not in any specific rush on this and I hope to hear Michael's
thoughts once he's had a chance to review our discussion; it's his
commit, after all.  Michael's initial patch is in the archives also, so
my suggestion would be to revert this one and then take Michael's prior
patch and add to it the fix of pg_basebackup, in the same manner, and
commit those changes together with additional comments to note that the
code exists in both places.

We could then work on moving the logic into libpgcommon, in master.

Thanks!

Stephen




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andrew Dunstan




On 10/19/2018 05:32 PM, Stephen Frost wrote:


As you pointed out previously, the current code *doesn't* work, before
or after this change, and we clearly need to rework this to move things
into libpgcommon and also fix pg_basebackup.  Reverting this would at
least get us back to having similar code between this and pg_basebackup,
and then it'll be cleaner and clearer to have one patch which moves that
similar logic into libpgcommon and fixes the missing exceptions for the
EXEC_BACKEND case.

Keeping the patch doesn't do anything for the pg_basebackup case, and
confuses the issue by having these filename-pattern-whitelists which
weren't there before and that should be removed, imv.




I don't think just reverting it is really acceptable. The patch was a 
response to buildfarm breakage, and moreover was published and discussed 
before it was applied. If you don't like it I think you need to publish 
a better solution that will not involve reintroducing the buildfarm 
error. I don't have a strong opinion about the mechanism. The current 
conversation does seem to me to be generating more heat than light, TBH. 
But I do have a strong opinion about not having to enable/disable the 
TAP test in question constantly.


cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Tom Lane
Andrew Dunstan  writes:
> I don't think just reverting it is really acceptable.

+several.  I do not mind somebody writing and installing a better fix.
I do object to turning the buildfarm red again.

regards, tom lane



Contribution to postgresql

2018-10-19 Thread Nicky Larson
Hi,

I am using postgresql at work, and I would like to contribute.

From the todo list, I chose:

Allow log_min_messages to be specified on a per-module basis

Is this feature still wanted?  This would be my first contribution to
PostgreSQL; is it an easy one?


Thanks,
Saïd


Re: WAL archive (archive_mode = always) ?

2018-10-19 Thread Jeff Janes
On Fri, Oct 19, 2018 at 10:00 AM Adelino Silva <
adelino.j.si...@googlemail.com> wrote:

> Hi,
>
> What is the advantage to use archive_mode = always in a slave server
> compared to archive_mode = on (shared WAL archive) ?
> I only see duplication of Wal files, what is the purpose of this feature ?
>

You might not have a shared wal archive in the first place.  For example,
your only access to the master is through the streaming replication
protocol, but you want to maintain a local WAL archive anyway so you can
PITR locally for testing or debugging purposes.
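Concretely, such a standby could keep a local archive with settings along these lines (the archive path and copy command are illustrative):

```
# postgresql.conf on the standby -- archive even while in recovery
archive_mode = always
archive_command = 'test ! -f /mnt/standby_archive/%f && cp %p /mnt/standby_archive/%f'
```

With archive_mode = on, a standby does not run archive_command at all; always makes it archive every WAL segment it receives via streaming or restores from archive.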

Cheers,

Jeff


vacuum fails with "could not open statistics file" "Device or resource busy"

2018-10-19 Thread Andres Freund
Hi,

buildfarm member lorikeet had an interesting buildfarm failure in the
logical decoding test. The failure looks unrelated to logical decoding,
but might be more widely relevant:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2018-10-19%2011%3A22%3A34
  VACUUM FULL pg_class;
+ WARNING:  could not open statistics file "pg_stat_tmp/global.stat": Device or resource busy

Now that animal is cygwin based, so maybe it's just some random
weirdness. But ISTM it could also indicate a bug of some sort.

Were it native windows, I'd assume we'd keep a file open with the wrong
flags or such...

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Fri, Oct 19, 2018 at 06:43:18PM -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
>> I don't think just reverting it is really acceptable.
> 
> +several.  I do not mind somebody writing and installing a better fix.
> I do object to turning the buildfarm red again.

I did not expect this thread to turn into a war zone.  Anyway, there are
a couple of things I agree with on this thread:
- I agree with Andres point here:
https://postgr.es/m/20181019171747.4uithw2sjkt6m...@alap3.anarazel.de
A blacklist is fundamentally more difficult to maintain as there are
way more things added in a data folder which do not have data checksums
than things which have checksums.  So using a blacklist approach looks
unmaintainable in the long term.  Future patches related to enabling
online checksum verification make me worry if we keep the code like
that.  I can also easily imagine that anybody willing to use the
pluggable storage API would like to put new files in tablespace-related
data folders, relying on "base/" being the default system tablespace.
- I agree with Stephen's point that we should decide if a file has
checksums or not in a single place, and that we should use the same
logic for base backups and pg_verify_checksums.
- I agree with not doing a simple revert to not turn the buildfarm red
again.  This is annoying for animal maintainers.  Andrew has done very
nice work in manually disabling those tests temporarily.
- The base backup logic deciding if a file has checksums looks broken to
me: it misses files generated by EXEC_BACKEND, and any instance of
Postgres using an extension with custom files and data checksums has its
backups broken.  cstore_fdw has been mentioned above, and I recall that
Heroku for example enables data checksums.  If you combine both, it
basically means that such an instance cannot take base backups anymore
while it was able to do so with pre-10 with default options.  That's not
cool.

So what I think we ought to do is the following:
- Start a new thread, this one about TAP tests is not adapted.
- Add in src/common/relpath.c the API from d55241af called
isRelFileName(), make use of it in the base backup code, and basically
remove is_checksummed_file() and the checksum skip list.
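For readers following along, the whitelist approach amounts to accepting only names of the shape <digits>[_fsm|_vm|_init][.<digits>].  A sketch of such a check, modeled on (but not identical to) the isRelFileName() from d55241af:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Whitelist check: does "name" look like a relation data file?
 * Accepted shape: <relfilenode digits>, optionally "_fsm", "_vm" or
 * "_init" for the extra forks, optionally ".<segment digits>".
 * (A sketch of the API discussed above, not the committed code.)
 */
bool
looks_like_rel_filename(const char *name)
{
    const char *p = name;

    /* the relfilenode digits are mandatory */
    if (!isdigit((unsigned char) *p))
        return false;
    while (isdigit((unsigned char) *p))
        p++;

    /* optional fork suffix */
    if (strncmp(p, "_fsm", 4) == 0)
        p += 4;
    else if (strncmp(p, "_vm", 3) == 0)
        p += 3;
    else if (strncmp(p, "_init", 5) == 0)
        p += 5;

    /* optional segment number */
    if (*p == '.')
    {
        p++;
        if (!isdigit((unsigned char) *p))
            return false;
        while (isdigit((unsigned char) *p))
            p++;
    }

    return *p == '\0';
}
```

Anything failing this test -- EXEC_BACKEND config files, extension-created files, stray editor droppings -- simply isn't a checksummed relation file and gets skipped, which is why the whitelist stays maintainable where a blacklist would not.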
--
Michael




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
> - I agree with Stephen's point that we should decide if a file has
> checksums or not in a single place, and that we should use the same
> logic for base backups and pg_verify_checksums.

To be clear, I wholeheartedly agree on that.  We probably only want to
tackle that for "is checksummable file" in a backpatchable fashion, but
we probably should move to having more common infrastructure like that.


> So what I think we ought to do is the following:
> - Start a new thread, this one about TAP tests is not adapted.
> - Add in src/common/relpath.c the API from d55241af called
> isRelFileName(), make use of it in the base backup code, and basically
> remove is_checksummed_file() and the checksum skip list.

I think it probably shouldn't quite be that as an API.  The code should
not just check whether the file matches a pattern, but also importantly
needs to exclude files that are in a temp tablespace. isRelFileName()
doesn't quite describe an API meaning that.  I assume we should keep
something like isRelFileName() but use an API ontop of that that also
exclude temp files / relations.
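The layering described here could look like this: keep a bare name-shape check, and build the caller-facing predicate on top of it plus a temp-relation exclusion.  The naming is modeled on the backend's looks_like_temp_rel_name(); the details are an assumption, not committed code:

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Temporary relations live in files named "t<backendID>_<relfilenode>"
 * (plus the usual fork/segment suffixes).  This sketch checks only the
 * "t<digits>_<digit>" prefix; modeled on the backend's
 * looks_like_temp_rel_name(), details are an assumption.
 */
bool
is_temp_rel_filename(const char *name)
{
    int         pos = 0;

    if (name[pos] != 't')
        return false;
    pos++;

    if (!isdigit((unsigned char) name[pos]))
        return false;
    while (isdigit((unsigned char) name[pos]))
        pos++;

    if (name[pos] != '_')
        return false;
    pos++;

    return isdigit((unsigned char) name[pos]) != 0;
}
```

A caller-facing "has checksums" predicate would then combine a relation-name check with this exclusion, with temp tablespaces handled separately at the directory-walk level.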

Greetings,

Andres Freund



Re: Function to promote standby servers

2018-10-19 Thread Laurenz Albe
I wrote:
> Fixed.

Here is another version, with a fix in pg_proc.dat, an improved comment
and "wait_seconds" exercised in the regression test.

Yours,
Laurenz Albe
From a2a7f9fd1b23ad102d11992b22158dab8b5451d5 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sat, 20 Oct 2018 06:21:00 +0200
Subject: [PATCH] Add pg_promote() to promote standby servers

---
 doc/src/sgml/func.sgml | 21 ++
 doc/src/sgml/high-availability.sgml|  2 +-
 doc/src/sgml/recovery-config.sgml  |  3 +-
 src/backend/access/transam/xlog.c  |  6 --
 src/backend/access/transam/xlogfuncs.c | 83 ++
 src/backend/catalog/system_views.sql   |  8 +++
 src/backend/postmaster/pgstat.c|  3 +
 src/include/access/xlog.h  |  6 ++
 src/include/catalog/pg_proc.dat|  4 ++
 src/include/pgstat.h   |  3 +-
 src/test/recovery/t/004_timeline_switch.pl |  6 +-
 src/test/recovery/t/009_twophase.pl|  6 +-
 12 files changed, 139 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5193df3366..88121cdc66 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18731,6 +18731,9 @@ SELECT set_config('log_statement_stats', 'off', false);

 pg_terminate_backend

+   
+pg_promote
+   
 

 signal
@@ -18790,6 +18793,16 @@ SELECT set_config('log_statement_stats', 'off', false);
 however only superusers can terminate superuser backends.

   
+  
+   
+pg_promote(wait boolean DEFAULT true, wait_seconds integer DEFAULT 60)
+
+   boolean
+   Promote a physical standby server.  This function is restricted to
+superusers by default, but other users can be granted EXECUTE to run
+the function.
+   
+  
  
 

@@ -18827,6 +18840,14 @@ SELECT set_config('log_statement_stats', 'off', false);
 subprocess.

 
+   
+pg_promote can only be called on standby servers.
+If the argument wait is true,
+the function waits until promotion is complete or wait_seconds
+seconds have passed, otherwise the function returns immediately after sending
+the promotion signal to the postmaster.
+   
+
   
 
   
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index ebcb3daaed..f8e036965c 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1472,7 +1472,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 

 To trigger failover of a log-shipping standby server,
-run pg_ctl promote or create a trigger
+run pg_ctl promote, call pg_promote(), or create a trigger
 file with the file name and path specified by the trigger_file
 setting in recovery.conf. If you're planning to use
 pg_ctl promote to fail over, trigger_file is
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 92825fdf19..d06cd0b08e 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -439,7 +439,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
  
   Specifies a trigger file whose presence ends recovery in the
   standby.  Even if this value is not set, you can still promote
-  the standby using pg_ctl promote.
+  the standby using pg_ctl promote or calling
+  pg_promote().
   This setting has no effect if standby_mode is off.
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7375a78ffc..62fc418893 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -78,12 +78,6 @@
 
 extern uint32 bootstrap_data_checksum_version;
 
-/* File path names (all relative to $PGDATA) */
-#define RECOVERY_COMMAND_FILE	"recovery.conf"
-#define RECOVERY_COMMAND_DONE	"recovery.done"
-#define PROMOTE_SIGNAL_FILE		"promote"
-#define FALLBACK_PROMOTE_SIGNAL_FILE "fallback_promote"
-
 
 /* User-settable parameters */
 int			max_wal_size_mb = 1024; /* 1 GB */
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 9731742978..e97d1b63bc 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -23,6 +23,7 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
@@ -35,6 +36,7 @@
 #include "storage/fd.h"
 #include "storage/ipc.h"
 
+#include 
 
 /*
  * Store label file and tablespace map during non-exclusive backups.
@@ -697,3 +699,84 @@ pg_backup_start_time(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(xtime);
 }
+
+/*
+ * Promote a standby server.
+ *
+ * A result of "true" means that promotion has been completed
+ * (or initiated if "wait" is false).
+ */
+Datum
+pg_prom
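Going by the documentation hunk earlier in the patch, the new function would be used from SQL on a standby like this:

```sql
-- On a physical standby; waits up to 60 seconds for promotion by default
SELECT pg_promote();

-- Fire-and-forget: just send the promotion signal and return immediately
SELECT pg_promote(wait => false);
```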

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Fri, Oct 19, 2018 at 23:40 Michael Paquier  wrote:

> On Fri, Oct 19, 2018 at 06:43:18PM -0400, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> I don't think just reverting it is really acceptable.
> >
> > +several.  I do not mind somebody writing and installing a better fix.
> > I do object to turning the buildfarm red again.
>
> I did not expect this thread to turn into a war zone.  Anyway, there are
> a couple of things I agree with on this thread:
> - I agree with Andres point here:
> https://postgr.es/m/20181019171747.4uithw2sjkt6m...@alap3.anarazel.de
> A blacklist is fundamentally more difficult to maintain as there are
> way more things added in a data folder which do not have data checksums
> than things which have checksums.  So using a blacklist approach looks
> unmaintainable in the long term.  Future patches related to enabling
> online checksum verification make me worry if we keep the code like
> that.  I can also easily imagine that anybody willing to use the
> pluggable storage API would like to put new files in tablespace-related
> data folders, relying on "base/" being the default system tablespace
>

If we are going to decide that we only deal with files in our directories
matching these whitelisted patterns, then shouldn’t we apply similar logic
in things like DROP DATABASE and any other cases where we perform actions
in a recursive manner across our database and table space directories?

Should we really be removing arbitrary files that we know nothing about,
after all?

What about pg_basebackup?  Shall we update it to only stream through files
matching these patterns as those are the only files we consider ourselves
to be aware of?

I don’t buy off on any argument that presents pluggable storage as not
including some way for us to track and be aware of what files are
associated with that pluggable storage mechanism and which of those files
have checksums and how to verify them if they do. In other words, I sure
hope we don’t accept an approach like cstore *fdw* uses for pluggable
storage where the core system has no idea whatsoever about what these
random files dropped into our tablespace directories are.

- I agree with Stephen's point that we should decide if a file has
> checksums or not in a single place, and that we should use the same
> logic for base backups and pg_verify_checksums.


Despite it being a lot of the discussion, I don’t think there was ever
disagreement on this point.

- I agree with not doing a simple revert to not turn the buildfarm red
> again.  This is annoying for animal maintainers.  Andrew has done a very
> nice work in disabling manually those tests temporarily.


This is a red herring, and always was, so I’m rather unimpressed at how it
keeps coming up- no, I’m not advocating that we should just make the build
farm red and just leave it that way.  Yes, we should fix this case, and fix
pg_basebackup, and maybe even try to add some regression tests which test
this exact same case in pg_basebackup, but making the build farm green is
*not* the only thing we should care about.

> - The base backup logic deciding if a file has checksums looks broken to
> me: it misses files generated by EXEC_BACKEND, and any instance of
> Postgres using an extension with custom files and data checksums has its
> backups broken.  cstore_fdw has been mentioned above, and I recall that
> Heroku for example enables data checksums.  If you combine both, it
> basically means that such an instance cannot take base backups anymore
> while it was able to do so with pre-10 with default options.  That's not
> cool.


This is incorrect: pg_basebackup will still back up and keep files which
fail checksum checks (logic which David Steele and I pushed for when the
checksumming logic was added to pg_basebackup, as I recall), but it’ll
complain and warn about these files ...

As it *should*, even if we weren’t doing checksum checks, because these are
random files that have been dropped into a PG directory that PG doesn’t
know anything about, which aren’t handled through WAL and therefore there’s
no way to know if they’ll be at all valid when they’re copied, or that
backing them up is at all useful.

Backups are absolutely important and we wouldn’t want a backup to be
aborted or failed due to checksum failures, in general, but the user should
be made aware of these failures.  This approach of only taking
responsibility for the files we know of through these patterns could also
imply that pg_basebackup shouldn’t back up files that don’t match; but
that strikes me as similarly risky, as any mistake in this whitelist of
known files could then result in an inconsistent backup that’s missing
things that should have been included.

As for which approach is easier to maintain, I don’t see one as being
meaningfully easier to maintain than the other, code-wise, while having
these patterns looks to me to carry a lot more risk, with the only
rather nebulous gain be

Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
>> So what I think we ought to do is the following:
>> - Start a new thread, this one about TAP tests is not adapted.
>> - Add in src/common/relpath.c the API from d55241af called
>> isRelFileName(), make use of it in the base backup code, and basically
>> remove is_checksummed_file() and the checksum skip list.
> 
> I think it probably shouldn't quite be that as an API.  The code should
> not just check whether the file matches a pattern, but also importantly
> needs to exclude files that are in a temp tablespace. isRelFileName()
> doesn't quite describe an API meaning that.  I assume we should keep
> something like isRelFileName() but use an API ontop of that that also
> exclude temp files / relations.

From what I can see we would need to check for a couple of patterns if
we go to this extent:
- Look for PG_TEMP_FILE_PREFIX and exclude those.
- looks_like_temp_rel_name(), which checks for temp file names.  This is
similar to isRelFileName except that it works on temporary files.
Moving it to relpath.c and renaming it IsTempRelFileName is an idea.
But this one would not be necessary as isRelFileName discards temporary
relations, no?
- parse_filename_for_nontemp_relation() is also too similar to
isRelFileName().

At the end, do we really need to do anything more than adding some
checks on PG_TEMP_FILE_PREFIX?  I am not sure that there is much need
for a global API like isChecksummedFile for only those two places.  I
have already a patch doing the work of moving isRelFileName() into
src/common/.  Adding one check on PG_TEMP_FILE_PREFIX is not much work
on top of it.
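For illustration, here is a rough sketch of what such a combined check could look like, with a PG_TEMP_FILE_PREFIX test layered on top of the relation-name whitelist.  The helper names and the exact filename grammar are assumptions modeled on the isRelFileName() idea from d55241af, not the actual committed API:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: a checksummed relation file name is "<digits>",
 * optionally followed by a fork suffix ("_fsm", "_vm" or "_init") and/or
 * a segment number (".<digits>").
 */
static bool
is_rel_file_name(const char *fn)
{
	/* base relfilenode: one or more digits */
	if (!isdigit((unsigned char) *fn))
		return false;
	while (isdigit((unsigned char) *fn))
		fn++;

	/* optional fork suffix */
	if (*fn == '_')
	{
		fn++;
		if (strncmp(fn, "init", 4) == 0)
			fn += 4;
		else if (strncmp(fn, "fsm", 3) == 0)
			fn += 3;
		else if (strncmp(fn, "vm", 2) == 0)
			fn += 2;
		else
			return false;
	}

	/* optional segment number */
	if (*fn == '.')
	{
		fn++;
		if (!isdigit((unsigned char) *fn))
			return false;
		while (isdigit((unsigned char) *fn))
			fn++;
	}

	return *fn == '\0';
}

/* Layered on top: also exclude anything under the temp-file prefix. */
static bool
is_checksummed_file_name(const char *fn)
{
	if (strncmp(fn, "pgsql_tmp", strlen("pgsql_tmp")) == 0)	/* PG_TEMP_FILE_PREFIX */
		return false;
	return is_rel_file_name(fn);
}
```

Note that temporary relation files ("t<digits>_<digits>", what looks_like_temp_rel_name() matches) never start with a digit, so the relation-name check already rejects them, per the point above.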
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-20 13:30:46 +0900, Michael Paquier wrote:
> On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> > On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
> >> So what I think we ought to do is the following:
> >> - Start a new thread, this one about TAP tests is not adapted.
> >> - Add in src/common/relpath.c the API from d55241af called
> >> isRelFileName(), make use of it in the base backup code, and basically
> >> remove is_checksummed_file() and the checksum skip list.
> > 
> > I think it probably shouldn't quite be that as an API.  The code should
> > not just check whether the file matches a pattern, but also importantly
> > needs to exclude files that are in a temp tablespace. isRelFileName()
> > doesn't quite describe an API meaning that.  I assume we should keep
> > something like isRelFileName() but use an API ontop of that that also
> > exclude temp files / relations.
> 
> From what I can see we would need to check for a couple of patterns if
> we go to this extent:
> - Look for PG_TEMP_FILE_PREFIX and exclude those.
> - looks_like_temp_rel_name(), which checks for temp file names.  This is
> similar to isRelFileName except that it works on temporary files.
> Moving it to relpath.c and renaming it IsTempRelFileName is an idea.
> But this one would not be necessary as isRelFileName discards temporary
> relations, no?

I think it's not good to have those necessarily intermingled in an exposed
function.


> At the end, do we really need to do anything more than adding some
> checks on PG_TEMP_FILE_PREFIX?

I think we also need the exclusion list basebackup.c has that generally
skips files. They might be excluded anyway, but I think it'd be safer to
make sure.

> I am not sure that there is much need for a global API like
> isChecksummedFile for only those two places.

Seems likely that other tools would want to have access too.

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Sat, Oct 20, 2018 at 00:31 Michael Paquier  wrote:

> On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> > On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
> >> So what I think we ought to do is the following:
> >> - Start a new thread, this one about TAP tests is not adapted.
> >> - Add in src/common/relpath.c the API from d55241af called
> >> isRelFileName(), make use of it in the base backup code, and basically
> >> remove is_checksummed_file() and the checksum skip list.
> >
> > I think it probably shouldn't quite be that as an API.  The code should
> > not just check whether the file matches a pattern, but also importantly
> > needs to exclude files that are in a temp tablespace. isRelFileName()
> > doesn't quite describe an API meaning that.  I assume we should keep
> > something like isRelFileName() but use an API ontop of that that also
> > exclude temp files / relations.
>
> From what I can see we would need to check for a couple of patterns if
> we go to this extent:
> - Look for PG_TEMP_FILE_PREFIX and exclude those.
> - looks_like_temp_rel_name(), which checks for temp file names.  This is
> similar to isRelFileName except that it works on temporary files.
> Moving it to relpath.c and renaming it IsTempRelFileName is an idea.
> But this one would not be necessary as isRelFileName discards temporary
> relations, no?
> - parse_filename_for_nontemp_relation() is also too similar to
> isRelFileName().
>
> At the end, do we really need to do anything more than adding some
> checks on PG_TEMP_FILE_PREFIX?  I am not sure that there is much need
> for a global API like isChecksummedFile for only those two places.  I
> have already a patch doing the work of moving isRelFileName() into
> src/common/.  Adding one check on PG_TEMP_FILE_PREFIX is not much work
> on top of it.


I’m not at my computer at the moment so I may not be entirely following the
question here, but to be clear: whatever we do here will have downstream
impact on other tools, such as pgbackrest, and therefore we definitely
want to have the code in libpgcommon so that these external tools can
leverage it and know that they’re doing what PG does.

I’d also like to give David Steele a chance to comment on the specific API,
and any other backup tool authors, which I don’t think we should be
rushing into anyway, and I would think we’d only put it into master.

Thanks!

Stephen


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Sat, Oct 20, 2018 at 12:25:19AM -0400, Stephen Frost wrote:
> On Fri, Oct 19, 2018 at 23:40 Michael Paquier  wrote:
>> - I agree with not doing a simple revert to not turn the buildfarm red
>> again.  This is annoying for animal maintainers.  Andrew has done a very
>> nice work in disabling manually those tests temporarily.
> 
> This is a red herring, and always was, so I’m rather unimpressed at how it
> keeps coming up- no, I’m not advocating that we should just make the build
> farm red and just leave it that way.  Yes, we should fix this case, and fix
> pg_basebackup, and maybe even try to add some regression tests which test
> this exact same case in pg_basebackup, but making the build farm green is
> *not* the only thing we should care about.

Well, the root of the problem was that pg_verify_checksums was
committed without any tests of its own.  If we had had those tests from
the start, then we would not be having this discussion post-release,
still trying to figure out whether whitelisting or blacklisting is
appropriate.

The validation of checksums in base backups was added in 4eb77d50, a
couple of days before pg_verify_checksums was introduced.  This added
corruption-related tests in src/bin/pg_basebackup, which is a good
thing.  However, the feature was designed so that checksum mismatches
are ignored after 5 failures, which actually *masked* the fact that
EXEC_BACKEND files like CONFIG_EXEC_PARAMS should have been skipped
instead of getting checksum failures.  And that's a bad thing.  So this
gives, in my opinion, a good argument for using a whitelist.
--
Michael




Re: Function to promote standby servers

2018-10-19 Thread Michael Paquier
On Sat, Oct 20, 2018 at 06:24:28AM +0200, Laurenz Albe wrote:
> Here is another version, with a fix in pg_proc.dat, an improved comment
> and "wait_seconds" exercised in the regression test.

Thanks for the new version.  This looks pretty good to me.  I'll see if
I can review it once and then commit.

> - WAIT_EVENT_SYNC_REP
> + WAIT_EVENT_SYNC_REP,
> + WAIT_EVENT_PROMOTE
>  } WaitEventIPC;

Those are kept in alphabetical order.  Individual wait events are also
documented with a description.
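Concretely, keeping the list alphabetical would mean placing the new value before the existing one, along these lines (a sketch of the quoted hunk only; the real WaitEventIPC enum has many more members):

```c
/* Sketch of the reordered hunk; not the complete enum definition. */
typedef enum
{
	WAIT_EVENT_PROMOTE,			/* new event, placed in alphabetical order */
	WAIT_EVENT_SYNC_REP
} WaitEventIPC;
```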
--
Michael




Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2018-10-19 Thread Peter Geoghegan
On Thu, Oct 18, 2018 at 1:44 PM Andres Freund  wrote:
> I wonder if it'd make sense to hack up a patch that logs when evicting a
> buffer while already holding another lwlock. That shouldn't be too hard.

I tried this. It looks like we're calling FlushBuffer() with more than
a single LWLock held (not just the single buffer lock) somewhat *less*
with the patch. This is a positive sign for the patch, but also means
that I'm no closer to figuring out what's going on.

I tested a case with a 1GB shared_buffers + a TPC-C database sized at
about 10GB. I didn't want the extra LOG instrumentation to influence
the outcome.

-- 
Peter Geoghegan



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> I’d also like to give David Steele a chance to comment on the specific API,
> and any other backup tools authors, which I don’t think we should be
> rushing into anyway and I would think we’d only put into master..

Getting David input would be nice!
--
Michael




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Sat, Oct 20, 2018 at 00:43 Michael Paquier  wrote:

> On Sat, Oct 20, 2018 at 12:25:19AM -0400, Stephen Frost wrote:
> > On Fri, Oct 19, 2018 at 23:40 Michael Paquier 
> wrote:
> >> - I agree with not doing a simple revert to not turn the buildfarm red
> >> again.  This is annoying for animal maintainers.  Andrew has done a very
> >> nice work in disabling manually those tests temporarily.
> >
> > This is a red herring, and always was, so I’m rather unimpressed at how
> it
> > keeps coming up- no, I’m not advocating that we should just make the
> build
> > farm red and just leave it that way.  Yes, we should fix this case, and
> fix
> > pg_basebackup, and maybe even try to add some regression tests which test
> > this exact same case in pg_basebackup, but making the build farm green is
> > *not* the only thing we should care about.
>
> Well, the root of the problem was that pg_verify_checksums was
> committed without any tests of its own.  If we had had those tests from
> the start, then we would not be having this discussion post-release,
> still trying to figure out whether whitelisting or blacklisting is
> appropriate.
>
> The validation of checksums in base backups was added in 4eb77d50, a
> couple of days before pg_verify_checksums was introduced.  This added
> corruption-related tests in src/bin/pg_basebackup, which is a good
> thing.  However, the feature was designed so that checksum mismatches
> are ignored after 5 failures, which actually *masked* the fact that
> EXEC_BACKEND files like CONFIG_EXEC_PARAMS should have been skipped
> instead of getting checksum failures.  And that's a bad thing.  So this
> gives, in my opinion, a good argument for using a whitelist.


That counter of checksum failures should have been getting reset between
files...  that’s certainly what I had understood was intended.  The idea of
the counter is to not flood the log with errors when a file is discovered
that’s full of checksum failures (as can happen if large chunks of the file
got replaced with something else, for example), but it should still be
reporting on each file that does have failures in it.
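The behavior described here could be sketched as follows; the function name, the cap of five warnings, and the message format are illustrative assumptions, not the actual basebackup.c code:

```c
#include <stdio.h>

#define WARNINGS_PER_FILE 5		/* assumed cap, for illustration */

/*
 * Count checksum mismatches for one file.  The counter is a local, so it
 * resets for each file; at most WARNINGS_PER_FILE individual block
 * warnings are logged to avoid flooding, and a per-file summary is always
 * emitted when anything failed.  block_ok[i] is nonzero if block i passed.
 */
static int
verify_file_checksums(const char *filename, const int *block_ok, int nblocks)
{
	int			failures = 0;

	for (int blkno = 0; blkno < nblocks; blkno++)
	{
		if (!block_ok[blkno])
		{
			failures++;
			if (failures <= WARNINGS_PER_FILE)
				fprintf(stderr,
						"WARNING: checksum mismatch in file \"%s\", block %d\n",
						filename, blkno);
		}
	}

	if (failures > 0)
		fprintf(stderr, "WARNING: file \"%s\": %d total checksum failure(s)\n",
				filename, failures);
	return failures;
}
```

Because the counter lives on the stack of the per-file call, a file full of bad blocks is summarized without drowning the log, while every subsequent file is still checked and reported independently.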

I don’t see how having a whitelist changes that or would have impacted that
logic to make it correct initially or not.

I’m also trying to understand how this checksum logging limit is getting
hit in the tests while they’re still passing.  If this was an intended
failure check, then surely there’s a test that’s intended to be successful,
and it should have complained about this file too; or perhaps that’s what’s
missing.  I’m happy to look into this later this weekend.  Certainly seems
like something here isn’t really making sense, though.

Thanks!

Stephen


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> I’d also like to give David Steele a chance to comment on the specific API,
> and any other backup tools authors, which I don’t think we should be
> rushing into anyway and I would think we’d only put into master..

By the way, we need to do something for the checksum verification code
in base backups for v11 as well.  If you enable checksums and take a
base backup of a build with EXEC_BACKEND, then this creates spurious
checksums failures.  That's a bug.  So while I agree that having a
larger robust API is fine for HEAD, I would most likely not back-patch
it.  This is why I would suggest as a first step for HEAD and v11 to use
a whitelist for base backups, to check for temporary tablespaces in
pg_verify_checksums, to move isRelFileName into src/common/ and to keep
the change minimalistic.
--
Michael




Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Sat, Oct 20, 2018 at 00:58 Michael Paquier  wrote:

> On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> > I’d also like to give David Steele a chance to comment on the specific
> API,
> > and any other backup tools authors, which I don’t think we should be
> > rushing into anyway and I would think we’d only put into master..
>
> By the way, we need to do something for the checksum verification code
> in base backups for v11 as well.  If you enable checksums and take a
> base backup of a build with EXEC_BACKEND, then this creates spurious
> checksums failures.  That's a bug.  So while I agree that having a
> larger robust API is fine for HEAD, I would most likely not back-patch
> it.  This is why I would suggest as a first step for HEAD and v11 to use
> a whitelist for base backups, to check for temporary tablespaces in
> pg_verify_checksums, to move isRelFileName into src/common/ and to keep
> the change minimalistic.


I’m all for keeping the back-patched changes minimal, and updating the
blacklist as your original patch on this thread did would certainly be
that.  Even adding the logic to skip temp files as pg_basebackup does
would be simpler and based on existing, well-tested and extensively used
code, unlike this new pattern-based file whitelist approach.

I have to say that I can’t recall hearing much in the way of complaints
about pg_basebackup copying all the random cstore files, or the new
checksum validation logic complaining about them, and such when doing
backups and I wonder if that is because people simply don’t use the two
together much, making me wonder how much of an issue this really is or
would be with the account-for-everything approach I’ve been advocating for.

Thanks!

Stephen



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-20 01:07:43 -0400, Stephen Frost wrote:
> I have to say that I can’t recall hearing much in the way of complaints
> about pg_basebackup copying all the random cstore files

Why would somebody complain about that? It's usually desirable.


> or the new checksum validation logic complaining about them, and such
> when doing backups and I wonder if that is because people simply don’t
> use the two together much, making me wonder how much of an issue this
> really is or would be with the account-for-everything approach I’ve
> been advocating for.

I mean obviously pg_verify_checksum simply hasn't been actually tested
much with plain postgres without extensions, given all the weaknesses
identified in this thread.

Greetings,

Andres Freund



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Andres Freund
Hi,

On 2018-10-20 00:25:19 -0400, Stephen Frost wrote:
> If we are going to decide that we only deal with files in our directories
> matching these whitelisted patterns, then shouldn’t we apply similar logic
> in things like DROP DATABASE and any other cases where we perform actions
> in a recursive manner across our database and table space directories?

I'm honestly not sure if you're just trolling at this point.  Why would
anybody reasonable be concerned about DROP DATABASE dropping the
directory for the database? You're not honestly suggesting that anybody
would write an extension or anything like that that stores data in the
wrong database's directory, right?  Other iterations like fsyncing
files, copying the entire template database directory, etc are similarly
harmless or positive.


Greetings,

Andres Freund



Re: Multi-insert into a partitioned table with before insert row trigger causes server crash on latest HEAD

2018-10-19 Thread David Rowley
(Returns from leave and beyond the reaches of the internet)

On 18 October 2018 at 07:45, Peter Eisentraut
 wrote:
> On 16/10/2018 06:33, Ashutosh Sharma wrote:
>> I think, the root cause of this problem is that CopyFrom() is using
>> the stale value of *has_before_insert_row_trig* to determine if the
>> current partition is okay for multi-insert or not i.e.
>> has_before_insert_row_trig used to determine multi-insert condition
>> for the current partition actually belongs to old partition. I think,
>> *has_before_insert_row_trig* needs to updated before CopyFrom()
>> evaluates if the current partition is good to go for multi insert or
>> not. Attached is the patch based on this. I've also added the relevant
>> test-case for it. Peter, David, Could you please have a look into the
>> attached patch and share your thoughts. Thank you.
>
> I have committed your fix and test, moving some code around a bit.  Thanks.

Thanks for pushing that fix.

Originally, my patch in [1] could only set leafpart_use_multi_insert to
true within the `if (insertMethod == CIM_MULTI_CONDITIONAL)` test, so it
wouldn't have suffered from this problem.

I'm not sure that doubling up the `insertMethod ==
CIM_MULTI_CONDITIONAL` test is the cleanest fix. Personally, I liked
the way it was in the v6 edition of the patch, but I'm used to getting
outvoted.

[1] 
https://www.postgresql.org/message-id/CAKJS1f9f8yuj04X_rffNu2JPbvhy+YP_aVH6iwCTJ1OL=yw...@mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Sat, Oct 20, 2018 at 01:16 Andres Freund  wrote:

> Hi,
>
> On 2018-10-20 00:25:19 -0400, Stephen Frost wrote:
> > If we are going to decide that we only deal with files in our directories
> > matching these whitelisted patterns, then shouldn’t we apply similar
> logic
> > in things like DROP DATABASE and any other cases where we perform actions
> > in a recursive manner across our database and table space directories?
>
> I'm honestly not sure if you're just trolling at this point.  Why would
> anybody reasonable be concerned about DROP DATABASE dropping the
> directory for the database? You're not honestly suggesting that anybody
> would write an extension or anything like that that stores data in the
> wrong database's directory, right?  Other iterations like fsyncing
> files, copying the entire template database directory, etc are similarly
> harmless or positive.


No, I’m not trolling; what I was trying to do is make the point that this
is moving us away from having a very clear idea of what’s PG’s
responsibility and what we feel comfortable operating on, and toward this
new half-and-half stance where we’ll happily nuke files in a directory
that we didn’t create, and back them up even if we have no idea whether
they’ll be consistent at all when restored, but we won’t try to check the
checksums on them or do some other set of operations on them.

I suspect it’s pretty clear already, but just to make it plain: I really
don’t like the half-and-half approach, and it seems we’re being backed into
it because it happens to fit some specific cases, not because there was
any real thought or design put into supporting these use cases or being
able to extend PG in this way.  I do also see real risks with a
whitelisting kind of approach, namely that we end up missing things, and I
get that you don’t see that risk, but just stating that doesn’t change my
opinion on it.  Based on what you said upthread, this whole thing also
seems like it may be going away anyway, because there’s real design being
done to do this properly and allow PG to be extended in a way that we will
know what files are associated with what extensions or storage mechanisms,
and that seems like just another reason why we shouldn’t be moving to
explicitly support random files being dropped into PG directories.

Thanks,

Stephen


Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Stephen Frost
Greetings,

On Sat, Oct 20, 2018 at 01:11 Andres Freund  wrote:

> Hi,
>
> On 2018-10-20 01:07:43 -0400, Stephen Frost wrote:
> > I have to say that I can’t recall hearing much in the way of complaints
> > about pg_basebackup copying all the random cstore files
>
> Why would somebody complain about that? It's usually desirable.


Even though they’re as likely as not to be invalid or corrupted...?  Maybe
things have moved forward here, and I know there’s been discussion about
it, but last I heard those files weren’t WAL-logged and therefore the
result of copying them from a running server was indeterminate.  Yes,
sometimes they’ll be fine, but you could say the same about regular PG
relations too, and yet we certainly wouldn’t be accepting of that.  It
certainly seems reasonable that people would complain about pg_basebackup
misbehaving when a backup it took results in an invalid restore, though it
tends to be a lot rarer to get complaints about partial failures like a
corrupt or partial file being copied during a backup; but then that’s part
of why we stress so much about trying to make sure we don’t do that, as it
can be hard to detect.

People certainly did complain about unlogged tables being backed up, and
that was just because they took up space in the backup, and time during
backup and restore, only to be nuked when the server is started.

> or the new checksum validation logic complaining about them, and such
> > when doing backups and I wonder if that is because people simply don’t
> > use the two together much, making me wonder how much of an issue this
> > really is or would be with the account-for-everything approach I’ve
> > been advocating for.
>
> I mean obviously pg_verify_checksum simply hasn't been actually tested
> much with plain postgres without extensions, given all the weaknesses
> identified in this thread.


No, it hasn’t, but pg_basebackup has been around quite a while and has
always copied everything, as best as I can recall anyway.

Thanks,

Stephen



Re: pgsql: Add TAP tests for pg_verify_checksums

2018-10-19 Thread Michael Paquier
On Sat, Oct 20, 2018 at 02:03:32AM -0400, Stephen Frost wrote:
> On Sat, Oct 20, 2018 at 01:11 Andres Freund  wrote:
>> or the new checksum validation logic complaining about them, and such
>>> when doing backups and I wonder if that is because people simply don’t
>>> use the two together much, making me wonder how much of an issue this
>>> really is or would be with the account-for-everything approach I’ve
>>> been advocating for.
>>
>> I mean obviously pg_verify_checksum simply hasn't been actually tested
>> much with plain postgres without extensions, given all the weaknesses
>> identified in this thread.
> 
> No, it hasn’t, but pg_basebackup has been around quite a while and has
> always copied everything, as best as I can recall anyway.

At this point, let's create a new thread with a description of what has
been discussed and what we'd like to do for HEAD and v11.  I got
something in mind which would result in a minimal patch.  Let's start
from that.
--
Michael

