Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

2020-12-07 Thread Jesper Pedersen

Hi,

On 12/5/20 10:38 PM, Andy Fan wrote:

Currently the UniqueKey is defined as a List of Expr, rather than
EquivalenceClasses. A complete discussion up to now can be found at [1]
(the messages I replied to also matter a lot, and the information there
is complete). This patch has been stalled at this point for a while;
I'm planning to try EquivalenceClasses, but any suggestion would be
welcome.




Unfortunately, I think we need an RfC-style patch of both versions in 
their minimal implementations.


Hopefully this will make it easier for one or more committers to decide 
on the right direction since they can do a side-by-side comparison of 
the two solutions.


Just my $0.02.

Thanks for working on this Andy !

Best regards,
 Jesper





Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

2020-11-30 Thread Jesper Pedersen

Hi,

On 11/30/20 5:04 AM, Heikki Linnakangas wrote:

On 26/11/2020 16:58, Andy Fan wrote:

This patch has stopped moving for a while; any suggestion about
how to move it forward would be appreciated.


The question on whether UniqueKey.exprs should be a list of 
EquivalenceClasses or PathKeys is unresolved. I don't have an opinion 
on that, but I'd suggest that you pick one or the other and just go 
with it. If it turns out to be a bad choice, then we'll change it.


In this case I think it is a matter of deciding whether we are going to 
use EquivalenceClasses or Exprs before going further; there has been 
work ongoing in this area for a while, so a clear direction from a 
committer would be greatly appreciated.


Deciding would also help potential reviewers to give more feedback on 
the features implemented on top of the base.


Should there be a new thread with the minimum requirements in order to 
get closer to a decision ?


Best regards,
 Jesper





Re: [GSoC] Metrics and Monitoring for pgagroal

2021-04-09 Thread Jesper Pedersen

Hi Junduo,

On 4/9/21 11:53 AM, Junduo Dong wrote:

I'm Junduo Dong, a 22-year-old studying at China University of Geosciences.

I would like to participate in the project "pgagroal: Metrics and monitoring"
on page "https://wiki.postgresql.org/wiki/GSoC_2021" for GSoC 2021.

The attachment is my proposal; please review it.

I hope that my proposal will be accepted and that I will join
the PostgreSQL hacker community in the future.



Thanks for your interest in Google Summer of Code, and the pgagroal 
proposal within the PostgreSQL umbrella.



I'll contact you off-list, and we can get the process going to finalize 
your submission to the GSoC program before the April 13 deadline.



Thanks !


Best regards,

 Jesper






Re: MDAM techniques and Index Skip Scan patch

2021-10-25 Thread Jesper Pedersen

Hi Peter,

On 10/21/21 22:16, Peter Geoghegan wrote:

I returned to the 1995 paper "Efficient Search of Multidimensional
B-Trees" [1] as part of the process of reviewing v39 of the skip scan
patch, which was posted back in May. It's a great paper, and anybody
involved in the skip scan effort should read it thoroughly (if they
haven't already). It's easy to see why people get excited about skip
scan [2]. But there is a bigger picture here.



Thanks for starting this thread !

The Index Skip Scan patch could affect a lot of areas, so keeping MDAM 
in mind is definitely important.


However, I think the key part to progress on the "core" functionality 
(B-tree related changes) is to get the planner functionality in place 
first. Hopefully we can make progress on that during the November 
CommitFest based on Andy's patch.


Best regards,
 Jesper





Re: Portability report: ninja

2021-11-01 Thread Jesper Pedersen

Hi,

On 11/1/21 15:25, Tom Lane wrote:

So it's pretty clear that if we go this way, it'll be the end of
the line for support for some very old OS versions.  I can't,
however, argue with the idea that it's reasonable to require
POSIX 2001 support now.  Based on these results, I doubt that
ninja will give us trouble on any platform that isn't old
enough to get its driver's license.



You can also look at it as:


If PostgreSQL chooses a newer build system, then it is up to the 
companies owning the "non-supported" operating systems to add support 
for the build system in question, not the PostgreSQL community.



+1 for POSIX.2001 and meson/ninja.


Best regards,

 Jesper






Re: [HACKERS] path toward faster partition pruning

2018-03-27 Thread Jesper Pedersen

Hi Amit,

On 03/27/2018 06:42 AM, Amit Langote wrote:

I have managed to make the recursion go away in the attached updated
version.  I guess that's the result of employing the idea of an "output
register" for individual pruning steps, as mentioned in Robert's email
upthread where he detailed the "pruning steps" approach [1].

With the new patch, pruning steps for the arguments of, say, an OR clause
are not performed recursively.  Instead, each pruning step is performed
independently and its output is stored in a slot dedicated to it.  Combine
steps are always executed after all of the steps corresponding to their
arguments have been executed.  That's ensured by the way steps are
allocated.
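
To illustrate the kind of query this covers, a minimal, hypothetical
example (table and partition names made up):

-- each arm of the OR becomes its own base pruning step whose result
-- is stored in its own slot; a combine step then unions the slots
CREATE TABLE p (a int) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE p2 PARTITION OF p FOR VALUES IN (2);
CREATE TABLE p3 PARTITION OF p FOR VALUES IN (3);
EXPLAIN (COSTS OFF) SELECT * FROM p WHERE a = 1 OR a = 3;
-- only p1 and p3 should appear in the resulting Append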



Running v41 with "partition_prune" under valgrind gives the attached report.

Best regards,
 Jesper


25970.log.gz
Description: application/gzip


Re: [HACKERS] path toward faster partition pruning

2018-03-27 Thread Jesper Pedersen

Hi,

On 03/27/2018 01:46 PM, Jesper Pedersen wrote:
Running v41 with "partition_prune" under valgrind gives the attached 
report.




The reports mostly involve interaction with catcache.c and dynahash.c, 
so something for a separate thread.


Best regards,
 Jesper




Re: [HACKERS] path toward faster partition pruning

2018-03-28 Thread Jesper Pedersen

Hi,

On 03/28/2018 06:30 AM, Amit Langote wrote:

On 2018/03/28 18:29, Amit Langote wrote:

Attached is the updated set of patches, which contains other miscellaneous
changes such as updated comments, beside the main changes described above.


Sorry, one of those miscellaneous changes was a typo that would cause
compilation to fail... Sigh.   Fixed in the updated version.



Just some trivial changes.

However,

explain (costs off) select * from mc2p where a = 2 and b < 1;

is picking up

   ->  Seq Scan on mc2p2
 Filter: ((b < 1) AND (a = 2))

which doesn't seem right, as its definition is

create table mc2p2 partition of mc2p for values from (1, 1) to (2, 
minvalue);
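
For reference, a self-contained reproduction of the same shape
(assuming the partition_prune regression test definitions):

create table mc2p (a int, b int) partition by range (a, b);
create table mc2p2 partition of mc2p
for values from (1, 1) to (2, minvalue);
-- the upper bound (2, minvalue) excludes every row with a = 2, so
-- mc2p2 should be pruned for the query below
explain (costs off) select * from mc2p where a = 2 and b < 1;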


Best regards,
 Jesper
From 8bb5f25d31471910db2e7907b4c13029edd7bbdf Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Wed, 28 Mar 2018 12:39:42 -0400
Subject: [PATCH] Trivial changes

---
 src/backend/catalog/partition.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 3c26daa098..2ee5ce605c 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1774,7 +1774,7 @@ perform_pruning_base_step(PartitionPruneContext *context,
 	Datum		values[PARTITION_MAX_KEYS];
 	FmgrInfo	partsupfunc[PARTITION_MAX_KEYS];
 
-	/* There better be same number of expressions and compare functions. */
+	/* There better be the same number of expressions and compare functions. */
 	Assert(list_length(opstep->exprs) == list_length(opstep->cmpfns));
 
 	nvalues = 0;
@@ -1783,7 +1783,7 @@ perform_pruning_base_step(PartitionPruneContext *context,
 
 	/*
 	 * Generate the partition look-up key that will be used by one of
-	 * get_partitions_from_keys_* functions called below.
+	 * the get_partitions_from_keys_* functions called below.
 	 */
 	for (keyno = 0; keyno < context->partnatts; keyno++)
 	{
@@ -1969,7 +1969,7 @@ perform_pruning_combine_step(PartitionPruneContext *context,
 	PruneStepResult *result = NULL;
 
 	/*
-	 * In some cases, planner generates a combine step that doesn't contain
+	 * In some cases, the planner generates a combine step that doesn't contain
 	 * any argument steps, to signal us to not prune any partitions.  So,
 	 * return indexes of all datums in that case, including null and/or
 	 * default partition, if any.
@@ -1994,7 +1994,7 @@ perform_pruning_combine_step(PartitionPruneContext *context,
 			PruneStepResult *step_result;
 
 			/*
-			 * step_results[step_id] must contain valid result, which is
+			 * step_results[step_id] must contain a valid result, which is
 			 * confirmed by the fact that cstep's ID is greater than
 			 * step_id.
 			 */
@@ -2147,10 +2147,10 @@ get_partitions_for_keys_hash(PartitionPruneContext *context,
  * If special partitions (null and default) need to be scanned for given
  * values, set scan_null and scan_default in result if present.
  *
- * 'nvalues', if non-zero, should be exactly 1, because list partitioning.
+ * 'nvalues', if non-zero, should be exactly 1, because of list partitioning.
  * 'value' contains the value to use for pruning
  * 'opstrategy' if non-zero must be a btree strategy number
- * 'partsupfunc' contains list partitioning comparison function to be used to
+ * 'partsupfunc' contains the list partitioning comparison function to be used to
  * perform partition_list_bsearch
  * 'nullkeys' is the set of partition keys that are null.
  */
@@ -2161,7 +2161,6 @@ get_partitions_for_keys_list(PartitionPruneContext *context,
 {
 	PruneStepResult *result;
 	PartitionBoundInfo boundinfo = context->boundinfo;
-	int		   *partindices = boundinfo->indexes;
 	int			off,
 				minoff,
 				maxoff;
@@ -2272,7 +2271,7 @@ get_partitions_for_keys_list(PartitionPruneContext *context,
 			{
 /*
  * This case means all partition bounds are greater, which in
- * turn means that all partition satisfy this key.
+ * turn means that all partitions satisfy this key.
  */
 off = 0;
 			}
@@ -2333,7 +2332,7 @@ get_partitions_for_keys_list(PartitionPruneContext *context,
  * 'values' contains values for partition keys (or a prefix) to be used for
  * pruning
  * 'opstrategy' if non-zero must be a btree strategy number
- * 'partsupfunc' contains range partitioning comparison function to be used to
+ * 'partsupfunc' contains the range partitioning comparison function to be used to
  * perform partition_range_datum_bsearch or partition_rbound_datum_cmp
  * 'nullkeys' is the set of partition keys that are null.
  */
@@ -2404,7 +2403,7 @@ get_partitions_for_keys_range(PartitionPruneContext *context,
 			{
 if (nvalues == partnatts)
 {
-	/* There can only be zero or one matching partitions. */
+	/* There can only be zero or one matching partition. */
 	if (partindices[off + 1] >= 0)
 	{
 		result->datum_offsets = bms_make_singleton(off + 1);
@@ -2532,7 +2531,7 @@ get_partitions_for_keys_range(PartitionPruneContext *context,

Re: JIT compiling with LLVM v12

2018-03-29 Thread Jesper Pedersen

Hi Andres,

On 03/28/2018 05:27 PM, Andres Freund wrote:

On 2018-03-27 10:34:26 -0700, Andres Freund wrote:

On 2018-03-27 10:05:47 -0400, Peter Eisentraut wrote:

On 3/13/18 19:40, Andres Freund wrote:

I've pushed a revised and rebased version of my JIT patchset.


What is the status of this item as far as the commitfest is concerned?


7/10 committed. Inlining, Explain, Docs remain.


I've pushed these three.



It seems that clang is being picked up as the main compiler in certain 
situations, for example:


ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O0 -fno-omit-frame-pointer 
-I../../../src/include  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o 
auth-scram.o auth-scram.c -MMD -MP -MF .deps/auth-scram.Po
ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O0 -fno-omit-frame-pointer 
-I../../../src/include  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o 
be-secure-openssl.o be-secure-openssl.c -MMD -MP -MF 
.deps/be-secure-openssl.Po
/usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing 
-fwrapv -O2  -I../../../src/include  -D_GNU_SOURCE 
-I/usr/include/libxml2  -flto=thin -emit-llvm -c -o be-fsstubs.bc 
be-fsstubs.c
/usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing 
-fwrapv -O2  -I../../../src/include  -D_GNU_SOURCE 
-I/usr/include/libxml2  -flto=thin -emit-llvm -c -o namespace.bc namespace.c


I would expect LLVM to be isolated to the jit/ hierarchy.

Using CC="ccache gcc" and --with-llvm.

And congrats on getting the feature in !

Best regards,
 Jesper



Re: JIT compiling with LLVM v12

2018-03-29 Thread Jesper Pedersen

Hi,

On 03/29/2018 11:03 AM, Pierre Ducroquet wrote:

Clang is needed to emit the LLVM bitcode required for inlining; the
"-emit-llvm" flag is used for that. A dual compilation is required for
inlining to work: one compilation with gcc/clang/msvc/… to build the
postgresql binary, and one with clang to generate the .bc files for
inlining. It can be surprising, but there is little way around that (or
we accept only clang to build postgresql, but there would be a riot).



Thanks Pierre.

Best regards,
 Jesper
diff --git a/src/backend/jit/README b/src/backend/jit/README
index bfed319189..dca4aa761c 100644
--- a/src/backend/jit/README
+++ b/src/backend/jit/README
@@ -55,6 +55,9 @@ unlikely to be discontinued, because it has a license compatible with
 PostgreSQL, and because its LLVM IR can be generated from C
 using the clang compiler.
 
+clang will be used to compile files where LLVM bitcode is required to
+inline functions using the -emit-llvm flag. This may cause PostgreSQL
+to be compiled with multiple compilers, such as gcc and clang.
 
 Shared Library Separation
-------------------------


Re: [HACKERS] Runtime Partition Pruning

2018-04-03 Thread Jesper Pedersen

Hi David,

On 03/31/2018 09:52 AM, David Rowley wrote:

I've attached a new version of the patch.  I'm now at v18 after having
some versions of the patch that I didn't release which were based on
various versions of Amit's faster partition pruning patch.



Thank you for the updated patch set !

I have tested this together with Amit's v46 patch.

The attached case doesn't trigger a generic plan, so basically all time 
is spent in GetCachedPlan.


The standard table case (std.sql) gives:

 generic_cost = 8.4525
 avg_custom_cost = 13.4525
 total_custom_cost = 67.2625

whereas the 64 hash partition case (hash.sql) gives:

 generic_cost = 540.32
 avg_custom_cost = 175.9425
 total_custom_cost = 879.7125

I tested with pgbench -M prepared -f select.sql.
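
For reference, the hash partitioned case is along these lines (a
sketch with made-up names; the exact scripts are attached):

-- hash.sql (sketch) --
CREATE TABLE t (a int) PARTITION BY HASH (a);
\o /dev/null
SELECT 'CREATE TABLE t_p' || x::text || ' PARTITION OF t
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
FROM generate_series(0, 63) x;
\gexec
\o

-- select.sql (sketch) --
\set a random(1, 100000)
SELECT * FROM t WHERE a = :a;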

Also, I'm seeing a regression for check-world in 
src/test/regress/results/inherit.out


***
*** 642,648 
  -+---+---+-
   mlparted_tab_part1  | 1 | a |
   mlparted_tab_part2a | 2 | a |
!  mlparted_tab_part2b | 2 | b | xxx
   mlparted_tab_part3  | 3 | a | xxx
  (4 rows)

--- 642,648 
  -+---+---+-
   mlparted_tab_part1  | 1 | a |
   mlparted_tab_part2a | 2 | a |
!  mlparted_tab_part2b | 2 | b |
   mlparted_tab_part3  | 3 | a | xxx
  (4 rows)

I'll spend some more time tomorrow.

Thanks for working on this !

Best regards,
 Jesper


hash.sql
Description: application/sql


select.sql
Description: application/sql


std.sql
Description: application/sql


Re: [HACKERS] MERGE SQL Statement for PG11

2018-04-04 Thread Jesper Pedersen

Hi Simon,

On 03/30/2018 07:10 AM, Simon Riggs wrote:

No problems found, but moving proposed commit to 2 April pm



There is a compiler warning for this; fix attached.

Best regards,
 Jesper
diff --git a/src/backend/executor/nodeMerge.c b/src/backend/executor/nodeMerge.c
index 0e0d0795d4..9aee073a94 100644
--- a/src/backend/executor/nodeMerge.c
+++ b/src/backend/executor/nodeMerge.c
@@ -485,7 +485,6 @@ ExecMerge(ModifyTableState *mtstate, EState *estate, TupleTableSlot *slot,
 	ItemPointer tupleid;
 	ItemPointerData tuple_ctid;
 	bool		matched = false;
-	char		relkind;
 	Datum		datum;
 	bool		isNull;
 


Re: [HACKERS] path toward faster partition pruning

2018-04-04 Thread Jesper Pedersen

Hi,

On 04/04/2018 09:29 AM, David Rowley wrote:

Thanks for updating. I've made a pass over v49 and I didn't find very
much wrong with it.

The only real bug I found was a missing IsA(rinfo->clause, Const) in
the pseudoconstant check inside
generate_partition_pruning_steps_internal.

Most of the changes are comment fixes with a few stylistic changes
thrown in, which are pretty much all there just to try to shrink the
code a line or two or reduce indentation.

I feel pretty familiar with this code now and assuming the attached is
included I'm happy for someone else, hopefully, a committer to take a
look at it.

I'll leave the following notes:

1. Still not sure about RelOptInfo->has_default_part. This flag is
only looked at in generate_partition_pruning_steps. The RelOptInfo and
the boundinfo is available to look at, it's just that the
partition_bound_has_default macro is defined in partition.c rather
than partition.h.

2. Don't really like the new isopne variable name. It's not very
simple to decode, perhaps something like is_not_eq is better?

3. The part of the code I'm least familiar with is
get_steps_using_prefix_recurse(). I admit to not having had time to
fully understand that and consider ways to break it.

Marking as ready for committer.



Passes check-world, and CommitFest app has been updated to reflect the 
current patch set. Trivial changes attached.


Best regards,
 Jesper
From 82f718579dc8e06ab77d76df4ed72f0f03ed4a4e Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Wed, 4 Apr 2018 11:27:59 -0400
Subject: [PATCH] Trivial changes

---
 src/backend/catalog/partition.c        | 10 +-
 src/backend/optimizer/util/partprune.c |  8 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index d6bce9f348..7a268e05dc 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,7 +146,7 @@ typedef struct PruneStepResult
 {
 	/*
 	 * This contains the offsets of the bounds in a table's boundinfo, each of
-	 * which is a bound whose corresponding partition is selected by a a given
+	 * which is a bound whose corresponding partition is selected by a given
 	 * pruning step.
 	 */
 	Bitmapset  *bound_offsets;
@@ -2026,7 +2026,7 @@ get_matching_hash_bounds(PartitionPruneContext *context,
 		 int opstrategy, Datum *values, int nvalues,
 		 FmgrInfo *partsupfunc, Bitmapset *nullkeys)
 {
-	PruneStepResult *result = palloc0(sizeof(PruneStepResult));
+	PruneStepResult *result = (PruneStepResult *) palloc0(sizeof(PruneStepResult));
 	PartitionBoundInfo boundinfo = context->boundinfo;
 	int		   *partindices = boundinfo->indexes;
 	int			partnatts = context->partnatts;
@@ -2093,7 +2093,7 @@ get_matching_list_bounds(PartitionPruneContext *context,
 		 int opstrategy, Datum value, int nvalues,
 		 FmgrInfo *partsupfunc, Bitmapset *nullkeys)
 {
-	PruneStepResult *result = palloc0(sizeof(PruneStepResult));
+	PruneStepResult *result = (PruneStepResult *) palloc0(sizeof(PruneStepResult));
 	PartitionBoundInfo boundinfo = context->boundinfo;
 	int			off,
 				minoff,
@@ -2147,7 +2147,7 @@ get_matching_list_bounds(PartitionPruneContext *context,
 		return result;
 	}
 
-	/* Speical case handling of values coming from a <> operator clause. */
+	/* Special case handling of values coming from a <> operator clause. */
 	if (opstrategy == InvalidStrategy)
 	{
 		/*
@@ -2295,7 +2295,7 @@ get_matching_range_bounds(PartitionPruneContext *context,
 		  int opstrategy, Datum *values, int nvalues,
 		  FmgrInfo *partsupfunc, Bitmapset *nullkeys)
 {
-	PruneStepResult *result = palloc0(sizeof(PruneStepResult));
+	PruneStepResult *result = (PruneStepResult *) palloc0(sizeof(PruneStepResult));
 	PartitionBoundInfo boundinfo = context->boundinfo;
 	Oid		   *partcollation = context->partcollation;
 	int			partnatts = context->partnatts;
diff --git a/src/backend/optimizer/util/partprune.c b/src/backend/optimizer/util/partprune.c
index 75b7232f5d..2d06c1a519 100644
--- a/src/backend/optimizer/util/partprune.c
+++ b/src/backend/optimizer/util/partprune.c
@@ -433,8 +433,8 @@ generate_partition_pruning_steps_internal(RelOptInfo *rel,
 			}
 
 			/*
-			 * Fall-through for a NOT clause, which if it's a Boolean clause
-			 * clause, will be handled in match_clause_to_partition_key(). We
+			 * Fall-through for a NOT clause, which if it's a Boolean clause,
+			 * will be handled in match_clause_to_partition_key(). We
 			 * currently don't perform any pruning for more complex NOT
 			 * clauses.
 			 */
@@ -665,7 +665,7 @@ match_clause_to_partition_key(RelOptInfo *rel,
 	 */
 	if (match_boolean_partition_clause(partopfamily, clause, partkey, &expr))
 	{
-		*pc = palloc(sizeof(PartClauseInfo));
+		*pc = (PartClauseInfo *) palloc(sizeof(PartClauseInfo));
 		(*pc)->keyno = partkeyidx;
 		/* Do pruning with the Boolean equality operator. */
 		(*pc)->opno = BooleanEqualOperator;
@@ -806,7 +806,7 @@ match_c

Re: [HACKERS] Runtime Partition Pruning

2018-04-04 Thread Jesper Pedersen

Hi David,

On 04/03/2018 10:10 PM, David Rowley wrote:

The attached case doesn't trigger a generic plan, so basically all time is
spent in GetCachedPlan.


Yeah, there's still no resolution to the fact that a generic plan +
runtime pruning might be cheaper than a custom plan.  The problem is
that the generic plan appears expensive to the custom-vs-generic plan
comparison due to it containing more Append subnodes, with the run-time
pruning not being taken into account by that comparison.

There's been some discussion about this on this thread somewhere.



Forgot about that, sorry.


I think the best solution is probably the one suggested by Robert [1],
and that's to alter the Append plan's cost when run-time pruning is
enabled to try to account for the run-time pruning. This would be a
bit of a blind guess akin to what we do for clause selectivity
estimates for Params, but it's likely better than doing nothing.



Yeah, something based on the number of WHERE clauses, and on whether 
the partitioned table has a DEFAULT / NULL partition, could help. 
Forcing choose_custom_plan() to return false helps a lot (> 400%) for 
the HASH case.


But maybe this area is best left for PG12.


Yeah, it's a bug in v46 faster partition pruning. Discussing a fix for
that with Amit over on [2].



I was running with a version of faster_part_prune_v45_fixups.patch.

Patch v49 with v18 (0001-0004) works. 0005 needs a rebase.

Thanks again,
 Jesper



Re: [HACKERS] MERGE SQL Statement for PG11

2018-04-05 Thread Jesper Pedersen

Hi Simon and Paven,

On 04/04/2018 08:46 AM, Jesper Pedersen wrote:

On 03/30/2018 07:10 AM, Simon Riggs wrote:

No problems found, but moving proposed commit to 2 April pm



There is a warning for this, as attached.



Updated version due to latest refactoring.

Best regards,
  Jesper

diff --git a/src/backend/executor/execMerge.c b/src/backend/executor/execMerge.c
index 471f64361d..e35025880d 100644
--- a/src/backend/executor/execMerge.c
+++ b/src/backend/executor/execMerge.c
@@ -48,7 +48,6 @@ ExecMerge(ModifyTableState *mtstate, EState *estate, TupleTableSlot *slot,
 	ItemPointer tupleid;
 	ItemPointerData tuple_ctid;
 	bool		matched = false;
-	char		relkind;
 	Datum		datum;
 	bool		isNull;
 


Re: [HACKERS] MERGE SQL Statement for PG11

2018-04-05 Thread Jesper Pedersen

Hi,

On 04/05/2018 07:48 AM, Simon Riggs wrote:

Updated version due to latest refactoring.


Thanks for your input. Removing that seems to prevent compilation.

Did something change in between?



Updated for non-assert build.

Best regards,
 Jesper
diff --git a/src/backend/executor/execMerge.c b/src/backend/executor/execMerge.c
index 471f64361d..53f4afff0f 100644
--- a/src/backend/executor/execMerge.c
+++ b/src/backend/executor/execMerge.c
@@ -48,13 +48,11 @@ ExecMerge(ModifyTableState *mtstate, EState *estate, TupleTableSlot *slot,
 	ItemPointer tupleid;
 	ItemPointerData tuple_ctid;
 	bool		matched = false;
-	char		relkind;
 	Datum		datum;
 	bool		isNull;
 
-	relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
-	Assert(relkind == RELKIND_RELATION ||
-		   relkind == RELKIND_PARTITIONED_TABLE);
+	Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind ||
+		   resultRelInfo->ri_RelationDesc->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/*
 	 * Reset per-tuple memory context to free any expression evaluation


Re: [HACKERS] MERGE SQL Statement for PG11

2018-04-05 Thread Jesper Pedersen

Hi,

On 04/05/2018 08:04 AM, Simon Riggs wrote:

On 5 April 2018 at 12:56, Jesper Pedersen  wrote:

Updated for non-assert build.


Thanks, pushed. Sorry to have you wait til v3



That patch was a bit rushed, and cut off too much.

As attached.

Best regards,
 Jesper
diff --git a/src/backend/executor/execMerge.c b/src/backend/executor/execMerge.c
index 53f4afff0f..d39ddd3034 100644
--- a/src/backend/executor/execMerge.c
+++ b/src/backend/executor/execMerge.c
@@ -51,7 +51,7 @@ ExecMerge(ModifyTableState *mtstate, EState *estate, TupleTableSlot *slot,
 	Datum		datum;
 	bool		isNull;
 
-	Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind ||
+	Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind == RELKIND_RELATION ||
 		   resultRelInfo->ri_RelationDesc->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/*


Re: [HACKERS] Runtime Partition Pruning

2018-04-05 Thread Jesper Pedersen

Hi David,

First of all: Solid patch set with good documentation.

On 04/05/2018 09:41 AM, David Rowley wrote:

Seems mostly fair. I'm not a fan of using the term "unpruned" though.
I'll have a think.  The "all" is meant in terms of what exists as
subnodes.



'included_parts' / 'excluded_parts' probably isn't better...


subplan_indexes and parent_indexes seem like better names, I agree.



More clear.


* Also in make_partition_pruneinfo()

     /* Initialize to -1 to indicate the rel was not found */
     for (i = 0; i < root->simple_rel_array_size; i++)
     {
         allsubnodeindex[i] = -1;
         allsubpartindex[i] = -1;
     }

Maybe, allocate the arrays above mentioned using palloc0 and don't do this
initialization.  Instead make the indexes that are stored in these start
with 1 and consider 0 as invalid entries.


0 is a valid subplan index. It is possible to make this happen, but
I'd need to subtract 1 everywhere I used the map. That does not seem
very nice. Seems more likely to result in bugs where we might forget
to do the - 1.

Did you want this because you'd rather have the palloc0() than the for
loop setting the array elements to -1? Or is there another reason?



I think that doing palloc0 would be confusing; -1 is more clear, 
especially since it is used with 'allpartindexes' which is a Bitmapset.


Doing the variable renames as Amit suggests would be good.

I ran some tests (v50_v20; make check-world passes), with and without 
choose_custom_plan() returning false, and I'm seeing good performance 
results without running into issues.


Maybe 0005 should be expanded in partition_prune.sql with the supported 
cases to make those more clear.


Thanks !

Best regards,
 Jesper



Re: [HACKERS] path toward faster partition pruning

2018-04-06 Thread Jesper Pedersen

Hi Alvaro,

On 04/06/2018 12:41 PM, Alvaro Herrera wrote:

Here's my proposed patch.

Idle thought: how about renaming the "constfalse" argument and variables
to "contradictory" or maybe just "contradict"?



Passes check-world.

The new directories and the variable rename seem like a good idea; either name is ok.

Best regards,
 Jesper



Re: [HACKERS] Runtime Partition Pruning

2018-04-07 Thread Jesper Pedersen

Hi,

On 04/07/2018 04:45 AM, David Rowley wrote:

Ok, so I've gone and done this.

PartitionPruning has become PartitionPruneState
PartitionRelPruning has become PartitionPruningData

I've changed pointers to PartitionPruneStates to be named prunestate,
sometimes having the node prefix; as_, ma_, in these cases prune and
state are separated with a _ which seems to be the general rule for
executor state struct members.

Generally, pointers to PartitionPruningData are now named pprune.
Hopefully, that's ok, as this was the name previously used for
PartitionPruning pointers.

I applied the patch to get rid of as_noop_scan in favour of using a
special as_whichplan value. There was already one special value
(INVALID_SUBPLAN_INDEX), so seemed better to build on that rather than
inventing something new. This also means we don't have to make the
AppendState struct any wider, which seems like a good thing to try
to do.

I made all the fixups which I mentioned in my review earlier and also
re-removed the resultRelation parameter from make_partition_pruneinfo.
It sneaked back into v22.

v23 is attached.



Passes check-world.

Changing explain.c to "Subplans Removed" as suggested by you in [1] is a 
good idea.


[1] 
https://www.postgresql.org/message-id/CAKJS1f99JnkbOshdV_4zoJZ96DPtKeHMHv43JRL_ZdHRkkVKCA%40mail.gmail.com


Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-01 Thread Jesper Pedersen

Hi Alvaro,

On 2/28/19 1:28 PM, Alvaro Herrera wrote:

Rebased to current master.  I'll reply later to handle the other issues
you point out.



The patch applies, though with some offset hunks.

With 0003 using

export CFLAGS="-DCOPY_PARSE_PLAN_TREES -O0 -fno-omit-frame-pointer" && 
CC="ccache gcc" ./configure --prefix /usr/local/packages/postgresql-12.x 
--enable-dtrace --with-openssl --with-gssapi --with-libxml --with-llvm 
--enable-debug --enable-depend --enable-tap-tests --enable-cassert


I'm getting a failure in the pg_upgrade test:

 --
+-- Name: pk5 pk5_pkey; Type: CONSTRAINT; Schema: regress_fk; Owner: 
jpedersen

+--
+
+ALTER TABLE ONLY regress_fk.pk5
+ADD CONSTRAINT pk5_pkey PRIMARY KEY (a);
+
+
+--

when running check-world.

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-04 Thread Jesper Pedersen

Hi Amit,

Passes check-world.

On 3/4/19 5:38 AM, Amit Langote wrote:

See patch 0001.



+* members of any appendrels we find here are added built later when

s/built//


See patch 0002.



+   grouping_planner(root, false, 0.0 /* retrieve all tuples */);

Move comment out of function call.

+   if (root->simple_rte_array[childRTindex])
+   elog(ERROR, "rel %d already exists", childRTindex);
+   root->simple_rte_array[childRTindex] = childrte;
+   if (root->append_rel_array[childRTindex])
+   elog(ERROR, "child relation %d already exists", 
childRTindex);
+   root->append_rel_array[childRTindex] = appinfo;


Could the "if"s be made into Assert's instead ?

+ * the newly added bytes with zero

Extra spaces

+   if (rte->rtekind == RTE_RELATION &&  !root->contains_inherit_children)

s/TAB/space


See patch 0003.



+* because they correspond to flattneed UNION ALL subqueries.  
Especially,

s/flattneed/flattened


See patch 0004.



+* no need to make any new entries, because anything that'd need those

Use "would" explicit

+ * this case, since it needn't be scanned.

, since it doesn't need to be scanned


See patch 0005.

See patch 0006.



I'll run some tests using a hash partitioned setup.

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-05 Thread Jesper Pedersen

On 3/5/19 5:24 AM, Amit Langote wrote:

Attached an updated version.  This incorporates fixes for both Jesper's
and Imai-san's review.  I haven't been able to pin down the bug (or
whatever) that makes throughput go down as the partition count increases,
when tested with a --enable-cassert build.



Thanks !

I'm seeing the throughput going down as well, but are you sure it isn't 
just the extra calls of MemoryContextCheck you are seeing ? A flamegraph 
diff highlights that area -- sent offline.


A non-cassert build shows the same profile for 64 and 1024 partitions.

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-07 Thread Jesper Pedersen

Hi,

On 3/5/19 5:24 AM, Amit Langote wrote:
Attached an updated version. 


Needs a rebase due to 898e5e3290.

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-12 Thread Jesper Pedersen

Hi Amit,

On 3/12/19 4:22 AM, Amit Langote wrote:

I wrestled with this idea a bit and concluded that we don't have to
postpone *all* of preprocess_targetlist() processing to query_planner,
only the part that adds row mark "junk" Vars, because only those matter
for the problem being solved.  To recap, the problem is that delaying
adding inheritance children (and hence their row marks if any) means we
can't add "junk" columns needed to implement row marks, because which ones
to add is not clear until we've seen all the children.

I propose that we delay the addition of "junk" Vars to query_planner() so
that it doesn't stand in the way of deferring inheritance expansion to
query_planner() too.  That means the order of reltarget expressions for
row-marked rels will change, but I assume that's OK.  At least it will be
consistent for both non-inherited baserels and inherited ones.

Attached updated version of the patches with the above proposal
implemented by patch 0002.  To summarize, the patches are as follows:

0001: make building of "other rels" a separate step that runs after
deconstruct_jointree(), implemented by a new subroutine of query_planner
named add_other_rels_to_query()

0002: delay adding "junk" Vars to after add_other_rels_to_query()

0003: delay inheritance expansion to add_other_rels_to_query()

0004, 0005: adjust inheritance_planner() to account for the fact that
inheritance is now expanded by query_planner(), not subquery_planner()

0006: perform partition pruning while inheritance is being expanded,
instead of during set_append_rel_size()

0007: add a 'live_parts' field to RelOptInfo to store partition indexes
(not RT indexes) of unpruned partitions, which speeds up looping over
part_rels array of the partitioned parent

0008: avoid copying PartitionBoundInfo struct for planning



After applying 0004 check-world fails with the attached. CFBot agrees [1].

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/505107884

Best regards,
 Jesper

diff -U3 
/home/jpedersen/PostgreSQL/postgresql/contrib/postgres_fdw/expected/postgres_fdw.out
 
/home/jpedersen/PostgreSQL/postgresql/contrib/postgres_fdw/results/postgres_fdw.out
--- 
/home/jpedersen/PostgreSQL/postgresql/contrib/postgres_fdw/expected/postgres_fdw.out
2019-03-12 07:46:08.430690229 -0400
+++ 
/home/jpedersen/PostgreSQL/postgresql/contrib/postgres_fdw/results/postgres_fdw.out
 2019-03-12 07:59:01.134285159 -0400
@@ -7190,8 +7190,8 @@
 -- Check UPDATE with inherited target and an inherited source table
 explain (verbose, costs off)
 update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
- QUERY PLAN
  
--
+  QUERY PLAN   

+---
  Update on public.bar
Update on public.bar
Foreign Update on public.bar2
@@ -7214,22 +7214,22 @@
  Output: foo2.f1, foo2.ctid, foo2.*, 
foo2.tableoid
  Remote SQL: SELECT f1, f2, f3, ctid FROM 
public.loct1
->  Hash Join
- Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, 
foo.*, foo.tableoid
+ Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.*
  Inner Unique: true
  Hash Cond: (bar2.f1 = foo.f1)
  ->  Foreign Scan on public.bar2
Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
  ->  Hash
-   Output: foo.f1, foo.ctid, foo.*, foo.tableoid
+   Output: foo.f1, foo.ctid, foo.*
->  HashAggregate
- Output: foo.f1, foo.ctid, foo.*, foo.tableoid
+ Output: foo.f1, foo.ctid, foo.*
  Group Key: foo.f1
  ->  Append
->  Seq Scan on public.foo
- Output: foo.f1, foo.ctid, foo.*, foo.tableoid
+ Output: foo.f1, foo.ctid, foo.*
->  Foreign Scan on public.foo2
- Output: foo2.f1, foo2.ctid, foo2.*, 
foo2.tableoid
+ Output: foo2.f1, foo2.ctid, foo2.*
  Remote SQL: SELECT f1, f2, f3, ctid FROM 
public.loct1
 (39 rows)
 


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-12 Thread Jesper Pedersen

Hi,

Thanks Kirk !

On 3/12/19 2:20 PM, Robert Haas wrote:

The words 'by default' should be removed here, because there is also
no non-default way to get that behavior, either.



Here is v9 based on Kirk's and your input.

Best regards,
 Jesper

From 5b879f79300412638705e32aa3ed51aff3cbe75c Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Tue, 12 Mar 2019 16:02:27 -0400
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Jesper Pedersen and Kirk Jamison
Reviewed-by: Peter Eisentraut, Jerry Sievers, Michael Paquier, Tom Lane, Fabrízio de Royes Mello, Álvaro Herrera and Robert Haas
Discussion: https://www.postgresql.org/message-id/flat/11b6b108-6ac0-79a6-785a-f13dfe60a...@redhat.com
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 23 +++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..5f2a94918d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-&majorversion;
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note that this option isn't passed to the
+ vacuumdb application.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..e4356f011c 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -493,17 +493,32 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 
 	fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %sthis script and run:%s\n",
+	fprintf(script, "echo %sthis script. You may add optional vacuumdb options (e.g. --jobs)%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+	fprintf(script, "echo %sby editing $VACUUMDB_OPTS, then run the command below.%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+#ifndef WIN32
+	fprintf(script, "echo %s\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all %s%s\n", ECHO_QUOTE,
 			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
+			/* Did we copy the free space files? */
 			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
 			"--analyze-only" : "--analyze", ECHO_QUOTE);
+#else
+	fprintf(script, "echo %s\"%s\\vacuumdb.exe\" %s %%VACUUMDB_OPTS%% --all %s%s\n", ECHO_QUOTE,
+			new_cluster.bindir, user_specification.data,
+			/* Did we copy the free space files? */
+			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+			"--analyze-only" : "--analyze", ECHO_QUOTE);
+#endif
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s %%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



Re: speeding up planning with partitions

2019-03-20 Thread Jesper Pedersen

Hi,

On 3/19/19 11:15 PM, Imai, Yoshikazu wrote:

Here are the details.

[creating partitioned tables (with 1024 partitions)]
drop table if exists rt;
create table rt (a int, b int, c int) partition by range (a);
\o /dev/null
select 'create table rt' || x::text || ' partition of rt for values from (' ||
  (x)::text || ') to (' || (x+1)::text || ');' from generate_series(1, 1024) x;
\gexec
\o

[select1024.sql]
\set a random (1, 1024)
select * from rt where a = :a;

[pgbench]
pgbench -n -f select1024.sql -T 60




My tests - using hash partitions - show that the extra time is spent in 
make_partition_pruneinfo() for the relid_subplan_map variable.

  64 partitions: make_partition_pruneinfo() 2.52%
8192 partitions: make_partition_pruneinfo() 5.43%

TPS goes down ~8% between the two. The test setup is similar to the above.

Given that Tom is planning to change the List implementation [1] in 13, 
I think using the palloc0 structure is ok for 12 before trying other 
implementation options.


perf sent off-list.

[1] https://www.postgresql.org/message-id/24783.1551568303%40sss.pgh.pa.us

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-21 Thread Jesper Pedersen

Hi Amit,

On 3/19/19 8:41 PM, Amit Langote wrote:

I have fixed all.  Attached updated patches.



The comments in the thread have been addressed, so I have put the CF 
entry into Ready for Committer.


Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-21 Thread Jesper Pedersen

Hi Alvaro,

On 3/18/19 6:02 PM, Alvaro Herrera wrote:

I spent a few hours studying this and my conclusion is the opposite of
yours: we should make addFkRecurseReferencing the recursive one, and
CloneFkReferencing a non-recursive caller of that.  So we end up with
both addFkRecurseReferenced and addFkRecurseReferencing as recursive
routines, and CloneFkReferenced and CloneFkReferencing being
non-recursive callers of those.  With this structure change there is one
more call to CreateConstraintEntry than before, and now there are two
calls of tryAttachPartitionForeignKey instead of one; I think with this
new structure things are much simpler.  I also changed
CloneForeignKeyConstraints's API: instead of returning a list of cloned
constraint giving its caller the responsibility of adding FK checks to
phase 3, we now give CloneForeignKeyConstraints the 'wqueue' list, so
that it can add the FK checks itself.  It seems much cleaner this way.



Using

-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);
CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);


\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

\o /dev/null
SELECT 'CREATE TABLE t2_p' || x::text || ' PARTITION OF t2
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);


ANALYZE;


with

-- select.sql --
\set a random(1, 10)
SELECT t1.i1 AS t1i1, t1.i2 AS t1i2, t2.i1 AS t2i1, t2.i2 AS t2i2 FROM 
t1, t2 WHERE t1.i1 = :a;


running

pgbench -M prepared -f select.sql 

I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.

Best regards,
 Jesper



Re: speeding up planning with partitions

2019-03-22 Thread Jesper Pedersen

Hi Amit,

On 3/22/19 3:39 AM, Amit Langote wrote:

I took a stab at this.  I think rearranging the code in
make_partitionedrel_pruneinfo() slightly will take care of this.

The problem is that make_partitionedrel_pruneinfo() does some work which
involves allocating arrays containing nparts elements and then looping
over the part_rels array to fill values in those arrays that will be used
by run-time pruning.  It does all that even *before* it has checked that
run-time pruning will be needed at all, which if it is not, it will simply
discard the result of the aforementioned work.

Patch 0008 of 0009 rearranges the code such that we check whether we will
need run-time pruning at all before doing any of the work that's necessary
for run-time to work.



Yeah, this is better :) Increasing the number of partitions now leads 
to a TPS difference within the noise level.


Passes check-world.

Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-22 Thread Jesper Pedersen

Hi Alvaro,

On 3/21/19 4:49 PM, Alvaro Herrera wrote:

I think the attached is a better solution, which I'll go push shortly.



I see you pushed this, plus 0002 as well.

Thanks !

Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-22 Thread Jesper Pedersen

Hi Alvaro,

On 3/21/19 6:18 PM, Alvaro Herrera wrote:

On 2019-Mar-21, Jesper Pedersen wrote:

pgbench -M prepared -f select.sql 

I'm seeing 82.64% spent in GetCachedPlan(). plan_cache_mode is auto.


Hmm, I can't reproduce this at all ...  I don't even see GetCachedPlan
in the profile.  Do you maybe have some additional patch in your tree?



No, with 7df159a62 and v7 compiled with "-O0 -fno-omit-frame-pointer" I 
still see it.


plan_cache_mode = auto
  2394 TPS w/ GetCachedPlan() @ 81.79%

plan_cache_mode = force_generic_plan
 10984 TPS w/ GetCachedPlan() @ 23.52%

perf sent off-list.

Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-26 Thread Jesper Pedersen

Hi Amit,

On 3/26/19 2:06 AM, Amit Langote wrote:

Wouldn't you get the same numbers on HEAD too?  IOW, I'm not sure how the
patch here, which seems mostly about getting DDL in order to support
foreign keys on partitioned tables, would have affected the result of this
benchmark.  Can you clarify your intention of running this benchmark
against these patches?



Yeah, you are right. As I'm also seeing this issue with a test case of

-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);

CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL);

\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);


ANALYZE;

we shouldn't focus on it in this thread.

Best regards,
 Jesper



Re: partitioned tables referenced by FKs

2019-03-27 Thread Jesper Pedersen

Hi,

On 3/26/19 5:39 PM, Alvaro Herrera wrote:

As I said before, I'm thinking of getting rid of the whole business of
checking partitions on the referenced side of an FK at DROP time, and
instead just forbid the DROP completely if any FKs reference an ancestor
of that partition.


Will that allow `DROP TABLE parted_pk CASCADE` to succeed even if
partitions still contain referenced data?  I suppose that's the example
you cited upthread as a bug that remains to be solved.


That's the idea, yes, it should do that: only allow a DROP of a
partition referenced by an FK if the topmost constraint is also being
dropped.  Maybe this means I need to get rid of 0002 completely.  But I
haven't got to doing that yet.



I think that is ok, if doc/src/sgml/ref/drop_table.sgml is updated with 
a description of how CASCADE works in this scenario.


Best regards,
 Jesper




Re: partitioned tables referenced by FKs

2019-03-29 Thread Jesper Pedersen

Hi Alvaro,

On 3/28/19 2:59 PM, Alvaro Herrera wrote:

I ended up revising the dependencies that we give to the constraint in
the partition -- instead of giving it partition-type dependencies, we
give it an INTERNAL dependency.  Now when you request to drop the
partition, it says this:

create table pk (a int primary key) partition by list (a);
create table fk (a int references pk);
create table pk1 partition of pk for values in (1);

alvherre=# drop table pk1;
ERROR:  cannot drop table pk1 because other objects depend on it
DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

If you do say CASCADE, the constraint is dropped.  Not really ideal (I
would prefer that the drop is prevented completely), but at least it's
not completely bogus.  If you do "DROP TABLE pk", it works sanely.
Also, if you DETACH the partition that pg_depend row goes away, so a
subsequent drop of the partition works sanely.

Fixed the psql issue pointed out by Amit L too.



Could you expand a bit on the change to DEPENDENCY_INTERNAL instead of 
DEPENDENCY_PARTITION_PRI / DEPENDENCY_PARTITION_SEC ?


If you run "DROP TABLE t2_p32 CASCADE" the foreign key constraint is 
removed from all of t1.


-- ddl.sql --
CREATE TABLE t1 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);
CREATE TABLE t2 (i1 INT PRIMARY KEY, i2 INT NOT NULL) PARTITION BY HASH 
(i1);


\o /dev/null
SELECT 'CREATE TABLE t1_p' || x::text || ' PARTITION OF t1
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

\o /dev/null
SELECT 'CREATE TABLE t2_p' || x::text || ' PARTITION OF t2
FOR VALUES WITH (MODULUS 64, REMAINDER ' || x::text || ');'
from generate_series(0,63) x;
\gexec
\o

ALTER TABLE t1 ADD CONSTRAINT fk_t1_i2_t2_i1 FOREIGN KEY (i2) REFERENCES 
t2(i1);


INSERT INTO t2 (SELECT i, i FROM generate_series(1, 1000) AS i);
INSERT INTO t1 (SELECT i, i FROM generate_series(1, 1000) AS i);

ANALYZE;
-- ddl.sql --

Detaching the partition for DROP seems safer to me.

Thanks in advance !

Best regards,
 Jesper




Re: partitioned tables referenced by FKs

2019-03-29 Thread Jesper Pedersen

Hi,

On 3/29/19 11:22 AM, Alvaro Herrera wrote:

On 2019-Mar-29, Jesper Pedersen wrote:


Could expand a bit on the change to DEPENDENCY_INTERNAL instead of
DEPENDENCY_PARTITION_PRI / DEPENDENCY_PARTITION_SEC ?


The PARTITION dependencies work in a way that doesn't do what we want.
Admittedly, neither does INTERNAL, but at least it's less bad.


If you run "DROP TABLE t2_p32 CASCADE" the foreign key constraint is removed
from all of t1.


Yes.  CASCADE is always a dangerous tool; if you run the DROP partition
without cascade, it explicitly lists that the constraint is going to be
dropped.

If you get in the habit of adding CASCADE to all your drops, you're going
to lose data pretty quickly.  In this case, no data is lost, only a
constraint.



Thanks !

Maybe the "(" / ")" in the CASCADE description should be removed from 
ref/drop_table.sgml as part of this patch.


Should catalogs.sgml be updated for this case ?

Best regards,
 Jesper




Re: partitioned tables referenced by FKs

2019-03-29 Thread Jesper Pedersen

Hi,

On 3/29/19 12:29 PM, Alvaro Herrera wrote:

On 2019-Mar-29, Jesper Pedersen wrote:

Maybe the "(" / ")" in the CASCADE description should be removed from
ref/drop_table.sgml as part of this patch.


I'm not sure what text you propose to remove?



Just the attached.


Should catalogs.sgml be updated for this case ?


I'm not adding any new dependency type, nor modifying any existing one.
Are you looking at anything that needs changing specifically?



No, I just mentioned it in case it was a requirement that the precise 
use-case for each dependency type be documented.


Best regards,
 Jesper
diff --git a/doc/src/sgml/ref/drop_table.sgml b/doc/src/sgml/ref/drop_table.sgml
index bf8996d198..328157c849 100644
--- a/doc/src/sgml/ref/drop_table.sgml
+++ b/doc/src/sgml/ref/drop_table.sgml
@@ -41,9 +41,9 @@ DROP TABLE [ IF EXISTS ] name [, ..
triggers, and constraints that exist for the target table.
However, to drop a table that is referenced by a view or a foreign-key
constraint of another table, CASCADE must be
-   specified. (CASCADE will remove a dependent view entirely,
+   specified. CASCADE will remove a dependent view entirely,
but in the foreign-key case it will only remove the foreign-key
-   constraint, not the other table entirely.)
+   constraint, not the other table entirely.
   
  
 


Re: partitioned tables referenced by FKs

2019-03-29 Thread Jesper Pedersen

Hi Alvaro,

On 3/28/19 2:59 PM, Alvaro Herrera wrote:

I ended up revising the dependencies that we give to the constraint in
the partition -- instead of giving it partition-type dependencies, we
give it an INTERNAL dependency.  Now when you request to drop the
partition, it says this:

create table pk (a int primary key) partition by list (a);
create table fk (a int references pk);
create table pk1 partition of pk for values in (1);

alvherre=# drop table pk1;
ERROR:  cannot drop table pk1 because other objects depend on it
DETAIL:  constraint fk_a_fkey on table fk depends on table pk1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

If you do say CASCADE, the constraint is dropped.  Not really ideal (I
would prefer that the drop is prevented completely), but at least it's
not completely bogus.  If you do "DROP TABLE pk", it works sanely.
Also, if you DETACH the partition that pg_depend row goes away, so a
subsequent drop of the partition works sanely.

Fixed the psql issue pointed out by Amit L too.



I ran my test cases for this feature, and haven't seen any issues.

Therefore I'm marking 1877 as Ready for Committer. If others have 
additional feedback feel free to switch it back.


Best regards,
 Jesper




Re: Index Skip Scan

2019-07-22 Thread Jesper Pedersen

Hi,

On 7/22/19 1:44 AM, David Rowley wrote:

Here are the comments I noted down during the review:

cost_index:

I know you've not finished here, but I think it'll need to adjust
tuples_fetched somehow to account for estimate_num_groups() on the
Path's unique keys. Any Eclass with an ec_has_const = true does not
need to be part of the estimate there as there can only be at most one
value for these.

For example, in a query such as:

SELECT x,y FROM t WHERE x = 1 GROUP BY x,y;

you only need to perform estimate_num_groups() on "y".

I'm really not quite sure what exactly will be required from
amcostestimate() here.  The cost of the skip scan is not the same as
the normal scan. So either that API needs adjusting to allow the caller
to mention that we want skip scans estimated, or there needs to be
another callback.



I think this part will become clearer once the index skip scan patch 
is rebased, as we now have the uniquekeys field on the Path, and the 
indexskipprefix info on the IndexPath node.




build_index_paths:

I don't quite see where you're checking if the query's unique_keys
match what unique keys can be produced by the index.  This is done for
pathkeys with:

pathkeys_possibly_useful = (scantype != ST_BITMAPSCAN &&
                            !found_lower_saop_clause &&
                            has_useful_pathkeys(root, rel));
index_is_ordered = (index->sortopfamily != NULL);
if (index_is_ordered && pathkeys_possibly_useful)
{
    index_pathkeys = build_index_pathkeys(root, index,
                                          ForwardScanDirection);
    useful_pathkeys = truncate_useless_pathkeys(root, rel,
                                                index_pathkeys);
    orderbyclauses = NIL;
    orderbyclausecols = NIL;
}

Here has_useful_pathkeys() checks if the query requires some ordering.
For unique keys you'll want to do the same. You'll have set the
query's unique key requirements in standard_qp_callback().

I think basically build_index_paths() should be building index paths
with unique keys, for all indexes that can support the query's unique
keys. I'm just a bit uncertain if we need to create both a normal
index path and another path for the same index with unique keys.
Perhaps we can figure that out down the line somewhere. Maybe it's
best to build path types for now, when possible, and we can consider
later if we can skip the non-uniquekey paths.  Likely that would
require a big XXX comment to explain we need to review that before the
code makes it into core.

get_uniquekeys_for_index:

I think this needs to follow more the lead from build_index_pathkeys.
Basically, ask the index what it's pathkeys are.

standard_qp_callback:

build_group_uniquekeys & build_distinct_uniquekeys could likely be one
function that takes a list of SortGroupClause. You just either pass
the groupClause or distinctClause in.  Pretty much the UniqueKey
version of make_pathkeys_for_sortclauses().

Yeah, I'll move this part of the index skip scan patch to the unique key 
patch.


Thanks for your feedback !

Best regards,
 Jesper




Re: pg_receivewal documentation

2019-07-22 Thread Jesper Pedersen

Hi,

On 7/21/19 9:48 PM, Michael Paquier wrote:

Since pg_receivewal does not apply WAL, you should not allow it to
become a synchronous standby when synchronous_commit = remote_apply.
If it does, it will appear to be a standby which never catches up,
which may cause commits to block.  To avoid this, you should either
configure an appropriate value for synchronous_standby_names, or
specify an application_name for pg_receivewal that does not match it,
or change the value of synchronous_commit to something other than
remote_apply.

I think that'd be a lot more useful than enumerating the total-failure
scenarios.


+1.  Thanks for the suggestions!  Your wording looks good to me.


+1
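
For illustration, a hypothetical configuration that follows this
advice (standby and application names are made up):

-- on the primary: list only the real standby, so that pg_receivewal
-- (run with a different application_name) can never be picked as a
-- synchronous standby while synchronous_commit = remote_apply
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (real_standby)';
SELECT pg_reload_conf();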

Here is the patch for it, with Robert as the author.

Best regards,
 Jesper

From a6512e79e9fd054b188489757ee622dbf98ddf2b Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Mon, 22 Jul 2019 13:19:56 -0400
Subject: [PATCH] Highlight that pg_receivewal doesn't apply WAL, and as such
 synchronous-commit needs to be remote_write or lower.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Robert Haas
Review-by: Michael Paquier, Álvaro Herrera, Laurenz Albe and Jesper Pedersen
---
 doc/src/sgml/ref/pg_receivewal.sgml | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index e96d753955..beab6f0391 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -52,7 +52,16 @@ PostgreSQL documentation
Unlike the WAL receiver of a PostgreSQL standby server, pg_receivewal
by default flushes WAL data only when a WAL file is closed.
The option --synchronous must be specified to flush WAL data
-   in real time.
+   in real time. Since pg_receivewal does not apply WAL,
+   you should not allow it to become a synchronous standby when
+   <xref linkend="guc-synchronous-commit"/> equals remote_apply.
+   If it does, it will appear to be a standby which never catches up,
+   which may cause commits to block.  To avoid this, you should either
+   configure an appropriate value for <xref linkend="guc-synchronous-standby-names"/>, or
+   specify an application_name for
+   pg_receivewal that does not match it, or change the value of
+   <xref linkend="guc-synchronous-commit"/> to something other than
+   remote_apply.
   
 
   
@@ -207,16 +216,6 @@ PostgreSQL documentation
 server as a synchronous standby, to ensure that timely feedback is
 sent to the server.

-
-   
-Note that while WAL will be flushed with this setting,
-pg_receivewal never applies it,
-so <xref linkend="guc-synchronous-commit"/> must not be set
-to remote_apply or on if
-pg_receivewal is a synchronous standby, be it a
-member of a priority-based (FIRST) or a
-quorum-based (ANY) synchronous replication setup.
-   
   
  
 
-- 
2.21.0



Re: pg_receivewal documentation

2019-07-23 Thread Jesper Pedersen

Hi,

On 7/22/19 8:08 PM, Michael Paquier wrote:

+to something other than


Looks fine to me.  Just a tiny nit.  For the second reference to
synchronous_commit, I would change the link to a <varname> markup.


Sure.

Best regards,
 Jesper


>From f6c5e9128e0779f928d94bf9bcc8813bf03150f0 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Tue, 9 Jul 2019 13:14:25 -0400
Subject: [PATCH] Highlight that pg_receivewal doesn't apply WAL, and as such
 synchronous-commit needs to be remote_write or lower.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Robert Haas
Review-by: Michael Paquier, Álvaro Herrera, Laurenz Albe and Jesper Pedersen
---
 doc/src/sgml/ref/pg_receivewal.sgml | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index 0506120c00..a7536bed92 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -52,7 +52,15 @@ PostgreSQL documentation
Unlike the WAL receiver of a PostgreSQL standby server, pg_receivewal
by default flushes WAL data only when a WAL file is closed.
The option --synchronous must be specified to flush WAL data
-   in real time.
+   in real time. Since pg_receivewal does not apply WAL,
+   you should not allow it to become a synchronous standby when
+   <xref linkend="guc-synchronous-commit"/> equals remote_apply.
+   If it does, it will appear to be a standby which never catches up,
+   which may cause commits to block.  To avoid this, you should either
+   configure an appropriate value for <xref linkend="guc-synchronous-standby-names"/>, or
+   specify an application_name for
+   pg_receivewal that does not match it, or change the value of
+   <varname>synchronous_commit</varname> to something other than remote_apply.
   
 
   
-- 
2.21.0



Re: pg_receivewal documentation

2019-07-24 Thread Jesper Pedersen

Hi,

On 7/23/19 10:29 PM, Michael Paquier wrote:

Thanks.  Applied down to 9.6 where remote_apply has been introduced,
with tweaks for 9.6 as the tool is named pg_receivexlog there.


Thanks to everybody involved !

Best regards,
 Jesper




Re: Index Skip Scan

2019-08-02 Thread Jesper Pedersen

Hi,

On 8/2/19 8:14 AM, Dmitry Dolgov wrote:

And this too (slightly rewritten:). We will soon post the new version of patch
with updates about UniqueKey from Jesper.



Yes.

We decided to send this now, although there is still feedback from David 
that needs to be considered/acted on.


The patches can be reviewed independently, but we will send them as a 
set from now on. Development of UniqueKey will be kept separate though [1].


Note that while UniqueKey can form the foundation of optimizations for 
GROUP BY queries, it isn't the focus of this patch series. Contributions 
are very welcome, of course.


[1] https://github.com/jesperpedersen/postgres/tree/uniquekey

Best regards,
 Jesper
>From 35018a382d792d6ceeb8d0e9d16bc14ea2e3f148 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 2 Aug 2019 07:52:08 -0400
Subject: [PATCH 1/2] Unique key

Design by David Rowley.

Author: Jesper Pedersen
---
 src/backend/nodes/outfuncs.c   |  14 +++
 src/backend/nodes/print.c  |  39 +++
 src/backend/optimizer/path/Makefile|   2 +-
 src/backend/optimizer/path/allpaths.c  |   8 ++
 src/backend/optimizer/path/costsize.c  |   5 +
 src/backend/optimizer/path/indxpath.c  |  41 +++
 src/backend/optimizer/path/pathkeys.c  |  72 ++--
 src/backend/optimizer/path/uniquekey.c | 147 +
 src/backend/optimizer/plan/planner.c   |  18 ++-
 src/backend/optimizer/util/pathnode.c  |  12 ++
 src/backend/optimizer/util/tlist.c |   1 -
 src/include/nodes/nodes.h  |   1 +
 src/include/nodes/pathnodes.h  |  18 +++
 src/include/nodes/print.h  |   2 +-
 src/include/optimizer/pathnode.h   |   1 +
 src/include/optimizer/paths.h  |  11 ++
 16 files changed, 377 insertions(+), 15 deletions(-)
 create mode 100644 src/backend/optimizer/path/uniquekey.c

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e6ce8e2110..9a4f3e8e4b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1720,6 +1720,7 @@ _outPathInfo(StringInfo str, const Path *node)
 	WRITE_FLOAT_FIELD(startup_cost, "%.2f");
 	WRITE_FLOAT_FIELD(total_cost, "%.2f");
 	WRITE_NODE_FIELD(pathkeys);
+	WRITE_NODE_FIELD(uniquekeys);
 }
 
 /*
@@ -2201,6 +2202,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_NODE_FIELD(eq_classes);
 	WRITE_BOOL_FIELD(ec_merging_done);
 	WRITE_NODE_FIELD(canon_pathkeys);
+	WRITE_NODE_FIELD(canon_uniquekeys);
 	WRITE_NODE_FIELD(left_join_clauses);
 	WRITE_NODE_FIELD(right_join_clauses);
 	WRITE_NODE_FIELD(full_join_clauses);
@@ -2210,6 +2212,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_NODE_FIELD(placeholder_list);
 	WRITE_NODE_FIELD(fkey_list);
 	WRITE_NODE_FIELD(query_pathkeys);
+	WRITE_NODE_FIELD(query_uniquekeys);
 	WRITE_NODE_FIELD(group_pathkeys);
 	WRITE_NODE_FIELD(window_pathkeys);
 	WRITE_NODE_FIELD(distinct_pathkeys);
@@ -2397,6 +2400,14 @@ _outPathKey(StringInfo str, const PathKey *node)
 	WRITE_BOOL_FIELD(pk_nulls_first);
 }
 
+static void
+_outUniqueKey(StringInfo str, const UniqueKey *node)
+{
+	WRITE_NODE_TYPE("UNIQUEKEY");
+
+	WRITE_NODE_FIELD(eq_clause);
+}
+
 static void
 _outPathTarget(StringInfo str, const PathTarget *node)
 {
@@ -4073,6 +4084,9 @@ outNode(StringInfo str, const void *obj)
 			case T_PathKey:
 _outPathKey(str, obj);
 break;
+			case T_UniqueKey:
+_outUniqueKey(str, obj);
+break;
 			case T_PathTarget:
 _outPathTarget(str, obj);
 break;
diff --git a/src/backend/nodes/print.c b/src/backend/nodes/print.c
index 4ecde6b421..62c9d4ef7a 100644
--- a/src/backend/nodes/print.c
+++ b/src/backend/nodes/print.c
@@ -459,6 +459,45 @@ print_pathkeys(const List *pathkeys, const List *rtable)
 	printf(")\n");
 }
 
+/*
+ * print_uniquekeys -
+ *	  uniquekeys list of UniqueKeys
+ */
+void
+print_uniquekeys(const List *uniquekeys, const List *rtable)
+{
+	ListCell   *l;
+
+	printf("(");
+	foreach(l, uniquekeys)
+	{
+		UniqueKey *unique_key = (UniqueKey *) lfirst(l);
+		EquivalenceClass *eclass = (EquivalenceClass *) unique_key->eq_clause;
+		ListCell   *k;
+		bool		first = true;
+
+		/* chase up */
+		while (eclass->ec_merged)
+			eclass = eclass->ec_merged;
+
+		printf("(");
+		foreach(k, eclass->ec_members)
+		{
+			EquivalenceMember *mem = (EquivalenceMember *) lfirst(k);
+
+			if (first)
+first = false;
+			else
+printf(", ");
+			print_expr((Node *) mem->em_expr, rtable);
+		}
+		printf(")");
+		if (lnext(uniquekeys, l))
+			printf(", ");
+	}
+	printf(")\n");
+}
+
 /*
  * print_tl
  *	  print targetlist in a more legible way.
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 6864a62132..8249a6b6db 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -13,6 +13,6 @@ top_builddir = .

Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-02 Thread Jesper Pedersen

Hi Alvaro,

On 12/22/2017 10:10 AM, Alvaro Herrera wrote:

I believe these are all fixed by the attached delta patch.



Thanks.


If you have wording suggestions for the doc changes, please send them
along.



Maybe we should make the default index name more explicit under the 
'name' parameter as attached.


Best regards,
 Jesper
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 5137fe6383..7d32c5556d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -146,7 +146,11 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] PostgreSQL chooses a
 suitable name based on the parent table's name and the indexed column
-name(s).
+name(s) using the following formula {tablename}_{columnname(s)}_{suffix}
+where {suffix} is pkey for a primary key constraint,
+key for a unique constraint, excl for an exclusion constraint,
+idx for any other kind of index, fkey for a foreign key and
+check for a check constraint.

   
  


Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-02 Thread Jesper Pedersen

Hi Alvaro,

On 12/29/2017 12:59 PM, Alvaro Herrera wrote:

This seems to work pretty well, much to my surprise.  I was a bit scared
of adding a new deptype, but actually the only affected code is
findDependentObjects() and the semantics of the new type is a subset of
the existing DEPTYPE_INTERNAL, so I think it's okay.  I need to fill in
its description in docs and comments though -- I left it out because the
real difference between INTERNAL and INTERNAL_AUTO is not something that
is covered by the existing description of INTERNAL, so maybe I'll need
to expand that one.

This version includes the fixes I posted as "delta" to the problems
Jesper reported, as well as fixes to the ones reported by Amit.  It's
looking pretty good to me -- I'm seeing it as a candidate to commit
early in the upcoming commitfest.



indexing.sql contains a \di command which lists the owner of the database, 
so make check-world fails.


Best regards,
 Jesper



Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

2018-01-02 Thread Jesper Pedersen

Hi,

On 11/27/2017 07:41 AM, Юрий Соколов wrote:

I looked at assembly, and remembered, that last commit simplifies
`init_local_spin_delay` to just two-three writes of zeroes (looks
like compiler combines 2*4byte write into 1*8 write). Compared to
code around (especially in LWLockAcquire itself), this overhead
is negligible.

Though, I found that there is benefit in calling LWLockAttemptLockOnce
before entering loop with calls to LWLockAttemptLockOrQueue in the
LWLockAcquire (in there is not much contention). And this way, `inline`
decorator for LWLockAttemptLockOrQueue could be omitted. Given, clang
doesn't want to inline this function, it could be the best way.


In attach version with LWLockAcquireOnce called before entering loop
in LWLockAcquire.



Oh... there were stupid error in previos file.
Attached fixed version.



As the patch still applies, make check-world passes, and I believe Yura 
has provided feedback for Andres' comments, I'll leave this entry in 
"Ready for Committer".


Best regards,
 Jesper



Re: [HACKERS] path toward faster partition pruning

2018-01-04 Thread Jesper Pedersen

Hi Amit,

On 12/21/2017 11:25 PM, Amit Langote wrote:

Thanks again.

Please find attached updated patches.



I have been looking at this patch from a simple hash partition point of 
view.


-- ddl.sql --
CREATE TABLE t1 (
a integer NOT NULL,
b integer NOT NULL
) PARTITION BY HASH (b);

CREATE TABLE t1_p00 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 0);
CREATE TABLE t1_p01 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 1);
CREATE TABLE t1_p02 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 2);
CREATE TABLE t1_p03 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 3);


CREATE INDEX idx_t1_b_a_p00 ON t1_p00 USING btree (b, a);
CREATE INDEX idx_t1_b_a_p01 ON t1_p01 USING btree (b, a);
CREATE INDEX idx_t1_b_a_p02 ON t1_p02 USING btree (b, a);
CREATE INDEX idx_t1_b_a_p03 ON t1_p03 USING btree (b, a);

INSERT INTO t1 (SELECT i, i FROM generate_series(1, 100) AS i);

ANALYZE;
-- ddl.sql --

w/

-- select.sql --
\set b random(1, 100)
BEGIN;
SELECT t1.a, t1.b FROM t1 WHERE t1.b = :b;
COMMIT;
-- select.sql --

using pgbench -c X -j X -M prepared -T X -f select.sql part-hash

On master we have a generic_cost of 33.75 and an avg_custom_cost of 
51.25, resulting in use of the generic plan and a TPS of 8893.


Using v17 we have a generic_cost of 33.75 and an avg_custom_cost of 
25.9375, resulting in use of the custom plan and a TPS of 7129 - lower, 
of course, due to the generation of a custom plan for each invocation.
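
For reference, the generic-vs-custom choice is made roughly like this
(simplified from choose_custom_plan() in plancache.c):

    if (plansource->num_custom_plans < 5)
        return true;            /* always try custom plans at first */

    avg_custom_cost = plansource->total_custom_cost /
        plansource->num_custom_plans;

    /* custom-plan costs include a charge for replanning */
    if (plansource->generic_cost < avg_custom_cost)
        return false;           /* switch to the generic plan */

    return true;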


Comparing master with a non-partitioned scenario, we have a TPS of 
12968, since there is no overhead of ExecInitAppend (PortalStart) and 
ExecAppend (PortalRun).


Could you share your thoughts on

1) if the generic plan mechanics should know about the pruning and hence 
give a lower planner cost


2) if the patch should be more aggressive in removing plan nodes 
that aren't necessary, e.g. going from Append -> IndexOnly to just 
IndexOnly.


I have tested with both [1] and [2], but would like to know about your 
thoughts on the above first.


Thanks in advance !

[1] https://commitfest.postgresql.org/16/1330/
[2] https://commitfest.postgresql.org/16/1353/

Best regards,
 Jesper



Re: [HACKERS] path toward faster partition pruning

2018-01-05 Thread Jesper Pedersen

Hi David,

On 01/04/2018 09:21 PM, David Rowley wrote:

On 5 January 2018 at 07:16, Jesper Pedersen  wrote:

Could you share your thoughts on

1) if the generic plan mechanics should know about the pruning and hence
give a lower planner cost


I think the problem here is that cached_plan_cost() is costing the
planning cost of the query too low. If this was costed higher then its
more likely the generic plan would have been chosen, instead of
generating a custom plan each time.

How well does it perform if you change cpu_operator_cost = 0.01?



It gives 38.82 for generic_cost, and 108.82 for avg_custom_cost on 
master (8249 TPS). And, 38.82 for generic_cost, and 79.705 for 
avg_custom_cost with v17 (7891 TPS). Non-partitioned is 11722 TPS.



I think cached_plan_cost() does need an overhaul, but I think it's not
anything that should be done as part of this patch. You've picked HASH
partitioning here just because the current master does not perform any
partition pruning for that partitioning strategy.



Well, I mainly picked HASH because that is my use-case :)

For a range based setup it gives 39.84 for generic_cost, and 89.705 for 
avg_custom_cost (7862 TPS).


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread Jesper Pedersen

Hi Alvaro,

On 01/04/2018 09:30 AM, Alvaro Herrera wrote:

v11 fixes the dependency problem I mentioned in
https://postgr.es/m/20171229203818.pqxf2cyl4g2wre6h@alvherre.pgsql
and adds some test to cover that stuff.



Thanks, make check-world passes.

Maybe a warning for existing indexes on the same column(s), but with a 
different type, should be emitted during ATTACH, e.g.


-- test.sql --
CREATE TABLE test (a integer NOT NULL) PARTITION BY HASH(a);
CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);
CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 1);

CREATE INDEX idx_test_a ON test (a);

INSERT INTO test (SELECT i FROM generate_series(1, 100) AS i);

ANALYZE;

ALTER TABLE test DETACH PARTITION test_p00;
DROP INDEX test_p00_a_idx;
CREATE INDEX test_p00_a_idx ON test_p00 USING hash (a);
ALTER TABLE test ATTACH PARTITION test_p00 FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);


-- test.sql --

Of course, this could also fall under index maintenance, with no warning 
emitted. The docs have "Each partition is first checked to determine whether 
an equivalent index already exists," so it is covered.


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-08 Thread Jesper Pedersen

Hi Alvaro,

On 01/08/2018 03:36 PM, Alvaro Herrera wrote:

Jesper Pedersen wrote:


Maybe a warning for existing indexes on the same column(s), but with a
different type, should be emitted during ATTACH, e.g.



[detach one partition, replace index with a different one, attach
partition]



Of course, this could also fall under index maintenance and no warning
emitted. Docs have "Each partition is first checked to determine whether an
equivalent index already exists," so it is covered.


Yeah, I'm of two minds about this also -- in the initial versions I had
a code comment wondering exactly about having a hash index in a
partition attached to a btree index on parent.

As another example, having a documents table with two partitions (one
"long term archival" and other "documents currently being messed with")
you could have a text search index which is GIN in the former and GiST
in the latter.  There is a performance argument for doing it that way,
so it's not merely academic.

Anyway, while I think attaching an index that doesn't match the
properties of the index on parent can be a useful feature, on the other
hand it could be surprising (you end up losing an index because it was
matched during attach that you didn't think was going to be matched).
One idea would be to have a way to specify at ATTACH time what to do
about indexes when they don't match exactly, but I think the user
interface would be pretty messy: exactly how different do you want to
allow the indexes to be?  Is an index having one more column than the
one in parent good enough?  I think the answer to this is mostly a
judgement call, and I'd rather not spend my time debating those.



Yeah, agreed - lets leave as is.

Migrating an index to another type would mean dropping the top-level 
definition anyway.


Best regards,
 Jesper



Re: [HACKERS] Proposal: Local indexes for partitioned table

2018-01-19 Thread Jesper Pedersen

Hi Alvaro,

On 01/18/2018 04:43 PM, Alvaro Herrera wrote:

Here's a patch to add pg_dump tests.  This verifies that the
CREATE INDEX on parent and partition appear, as well as ALTER INDEX ..
ATTACH PARTITION.

While doing this, I noticed a small bug: the ALTER INDEX would not be
dumped when only one of the schemas are specified to pg_dump (schema of
parent and schema of partition), but it would be if both were specified.
This patch fixes that problem too.  AFAICS the test is correct after
that fix (ie. the "like" and "unlike" sets are correct now.)



I get

pg_restore: creating INDEX "pg_catalog.pg_authid_oid_index"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 5493; 1259 2677 INDEX 
pg_authid_oid_index jpedersen
pg_restore: [archiver (db)] could not execute query: ERROR:  permission 
denied: "pg_authid" is a system catalog

Command was:
-- For binary upgrade, must preserve pg_class oids
SELECT 
pg_catalog.binary_upgrade_set_next_index_pg_class_oid('2677'::pg_catalog.oid);


CREATE UNIQUE INDEX "pg_authid_oid_index" ON "pg_authid" USING "btree" 
("oid");


during check-world.

Best regards,
 Jesper



Re: unique indexes on partitioned tables

2018-01-25 Thread Jesper Pedersen

Hi Alvaro,

On 01/22/2018 05:55 PM, Alvaro Herrera wrote:

Alvaro Herrera wrote:

Version 4 of this patch, rebased on today's master.





Passes make check-world.

Maybe add a test case to indexing.sql that highlights that hash indexes 
don't support UNIQUE; although that is not unique to partitioned indexes.


Thanks for working on this !

Best regards,
 Jesper



Re: [HACKERS] path toward faster partition pruning

2018-01-29 Thread Jesper Pedersen

Hi Amit,

On 01/26/2018 04:17 AM, Amit Langote wrote:

I made a few of those myself in the updated patches.

Thanks a lot again for your work on this.



This needs a rebase.

Best regards,
 Jesper





Re: [HACKERS] path toward faster partition pruning

2018-01-30 Thread Jesper Pedersen

Hi Amit,

On 01/29/2018 07:57 PM, Amit Langote wrote:

This needs a rebase.


AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
check passes.



It was a rebase error; I should have checked against a clean master.

Sorry for the noise.

Best regards,
 Jesper




Application name for pg_basebackup and friends

2019-10-31 Thread Jesper Pedersen

Hi,

The attached patch adds an -a / --appname command line switch to 
pg_basebackup, pg_receivewal and pg_recvlogical.


This is useful when f.ex. pg_receivewal needs to connect as a 
synchronous client (synchronous_standby_names),


 pg_receivewal -h myhost -p 5432 -S replica1 -a replica1 --synchronous 
-D /wal


I'll add the patch to the CommitFest for discussion, as there is overlap 
with the -d switch.


Best regards,
 Jesper
>From 3aee659423137a547ed178a1dab34fe3caf30702 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Thu, 31 Oct 2019 08:34:41 -0400
Subject: [PATCH] Add an -a / --appname command line switch to control the
 application_name property.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pg_basebackup.sgml| 11 +++
 doc/src/sgml/ref/pg_receivewal.sgml| 11 +++
 doc/src/sgml/ref/pg_recvlogical.sgml   | 11 +++
 src/bin/pg_basebackup/pg_basebackup.c  |  7 ++-
 src/bin/pg_basebackup/pg_receivewal.c  |  7 ++-
 src/bin/pg_basebackup/pg_recvlogical.c |  7 ++-
 src/bin/pg_basebackup/streamutil.c |  7 +++
 src/bin/pg_basebackup/streamutil.h |  1 +
 8 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index fc9e222f8d..28b5dee206 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -543,6 +543,17 @@ PostgreSQL documentation
 The following command-line options control the database connection parameters.
 
 
+ 
+  -a application_name
+  --appname=application_name
+  
+   
+Specifies the application name used to connect to the server.
+See  for more information.
+   
+  
+ 
+
  
   -d connstr
   --dbname=connstr
diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index 177e9211c0..0957e0a9f5 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -248,6 +248,17 @@ PostgreSQL documentation
 The following command-line options control the database connection parameters.
 
 
+ 
+  -a application_name
+  --appname=application_name
+  
+   
+Specifies the application name used to connect to the server.
+See  for more information.
+   
+  
+ 
+
  
   -d connstr
   --dbname=connstr
diff --git a/doc/src/sgml/ref/pg_recvlogical.sgml b/doc/src/sgml/ref/pg_recvlogical.sgml
index 4c79f90414..b2d9b35362 100644
--- a/doc/src/sgml/ref/pg_recvlogical.sgml
+++ b/doc/src/sgml/ref/pg_recvlogical.sgml
@@ -272,6 +272,17 @@ PostgreSQL documentation
 The following command-line options control the database connection parameters.
 
 
+  
+   -a application_name
+   --appname=application_name
+   
+
+ Specifies the application name used to connect to the server.
+ See  for more information.
+
+   
+  
+
   
-d database
--dbname=database
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index a9d162a7da..237945f879 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -351,6 +351,7 @@ usage(void)
 			 " do not verify checksums\n"));
 	printf(_("  -?, --help show this help, then exit\n"));
 	printf(_("\nConnection options:\n"));
+	printf(_("  -a, --appname=NAME application name\n"));
 	printf(_("  -d, --dbname=CONNSTR   connection string\n"));
 	printf(_("  -h, --host=HOSTNAMEdatabase server host or socket directory\n"));
 	printf(_("  -p, --port=PORTdatabase server port number\n"));
@@ -2031,6 +2032,7 @@ main(int argc, char **argv)
 		{"label", required_argument, NULL, 'l'},
 		{"no-clean", no_argument, NULL, 'n'},
 		{"no-sync", no_argument, NULL, 'N'},
+		{"appname", required_argument, NULL, 'a'},
 		{"dbname", required_argument, NULL, 'd'},
 		{"host", required_argument, NULL, 'h'},
 		{"port", required_argument, NULL, 'p'},
@@ -2070,7 +2072,7 @@ main(int argc, char **argv)
 
 	atexit(cleanup_directories_atexit);
 
-	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP",
+	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:a:d:c:h:p:U:s:wWkvP",
 			long_options, &option_index)) != -1)
 	{
 		switch (c)
@@ -2176,6 +2178,9 @@ main(int argc, char **argv)
 	exit(1);
 }
 break;
+			case 'a':
+application_name = pg_strdup(optarg);
+break;
 			case 'd':
 connection_string = pg_strdup(optarg);
 break;
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index c0c8747982..8

Re: Application name for pg_basebackup and friends

2019-11-07 Thread Jesper Pedersen

On 11/7/19 1:51 AM, Michael Paquier wrote:

I don't think we need a new command line switch for it.


+1.


Please note that I have marked this patch as rejected in the CF app,
per the arguments upthread.


Ok, agreed.

Thanks for the feedback !

Best regards,
 Jesper





Re: Index Skip Scan

2019-11-11 Thread Jesper Pedersen

Hi Tomas,

On 11/10/19 4:18 PM, Tomas Vondra wrote:

I've looked at the patch again - in general it seems in pretty good
shape, all the issues I found are mostly minor.

Firstly, I'd like to point out that not all of the things I complained
about in my 2019/06/23 review got addressed. Those were mostly related
to formatting and code style, and the attached patch fixes some (but
maybe not all) of them.



Sorry about that !


The patch also tweaks wording of some bits in the docs and comments that
I found unclear. Would be nice if a native speaker could take a look.

A couple more comments:


1) pathkey_is_unique

The one additional issue I found is pathkey_is_unique - it's not really
explained what "unique" means and how it's different from "redundant"
(which has quite a long explanation before pathkey_is_redundant).

My understanding is that a pathkey is "unique" when its EC does not match
an EC of another pathkey in the list. But if that's the case, then the
function name is wrong - it does exactly the opposite (i.e. it returns
'true' when the pathkey is *not* unique).



Yeah, you are correct - I forgot to move that part from the _uniquekey 
version of the patch.
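
For reference, the intended semantics would be something like this
minimal sketch (mirroring pathkey_is_redundant(), minus its constant-EC
test):

static bool
pathkey_is_unique(PathKey *new_pathkey, List *pathkeys)
{
    EquivalenceClass *new_ec = new_pathkey->pk_eclass;
    ListCell   *lc;

    foreach(lc, pathkeys)
    {
        PathKey    *old_pathkey = (PathKey *) lfirst(lc);

        if (new_ec == old_pathkey->pk_eclass)
            return false;       /* shares an EC, so not unique */
    }

    return true;
}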




2) explain

I wonder if we should print the "Skip scan" info always, or similarly to
"Inner Unique" which does this:

/* try not to be too chatty about this in text mode */
     if (es->format != EXPLAIN_FORMAT_TEXT ||
     (es->verbose && ((Join *) plan)->inner_unique))
     ExplainPropertyBool("Inner Unique",
     ((Join *) plan)->inner_unique,
     es);
     break;

I'd do the same thing for skip scan - print it only in verbose mode, or
when using non-text output format.



I think it is of benefit to see if skip scan kicks in, but I used your 
"Skip scan" suggestion.




3) There's an annoying limitation that for this to kick in, the order of
expressions in the DISTINCT clause has to match the index, i.e. with
index on (a,b,c) the skip scan will only kick in for queries using

    DISTINCT a
    DISTINCT a,b
    DISTINCT a,b,c

but not e.g. DISTINCT a,c,b. I don't think there's anything forcing us
to sort result of DISTINCT in any particular case, except that we don't
consider the other orderings "interesting" so we don't really consider
using the index (so no chance of using the skip scan).

That leads to pretty annoying speedups/slowdowns due to seemingly
irrelevant changes:

-- everything great, a,b,c matches an index
test=# explain (analyze, verbose) select distinct a,b,c from t;
  QUERY PLAN 
- 

  Index Only Scan using t_a_b_c_idx on public.t  (cost=0.42..565.25 
rows=1330 width=12) (actual time=0.016..10.387 rows=1331 loops=1)

    Output: a, b, c
    Skip scan: true
    Heap Fetches: 1331
  Planning Time: 0.106 ms
  Execution Time: 10.843 ms
(6 rows)

-- slow, mismatch with index
test=# explain (analyze, verbose) select distinct a,c,b from t;
     QUERY PLAN 
--- 

  HashAggregate  (cost=22906.00..22919.30 rows=1330 width=12) (actual 
time=802.067..802.612 rows=1331 loops=1)

    Output: a, c, b
    Group Key: t.a, t.c, t.b
    ->  Seq Scan on public.t  (cost=0.00..15406.00 rows=100 
width=12) (actual time=0.010..355.361 rows=100 loops=1)

  Output: a, b, c
  Planning Time: 0.076 ms
  Execution Time: 803.078 ms
(7 rows)

-- fast again, the extra ordering allows using the index again
test=# explain (analyze, verbose) select distinct a,c,b from t order by 
a,b,c;
  QUERY PLAN 
- 

  Index Only Scan using t_a_b_c_idx on public.t  (cost=0.42..565.25 
rows=1330 width=12) (actual time=0.035..12.120 rows=1331 loops=1)

    Output: a, c, b
    Skip scan: true
    Heap Fetches: 1331
  Planning Time: 0.053 ms
  Execution Time: 12.632 ms
(6 rows)


This is a more generic issue, not specific to this patch, of course. I
think we saw it with the incremental sort patch, IIRC. I wonder how
difficult would it be to fix this here (not necessarily in v1).



Yeah, I see it as separate from this patch as well. But it is definitely 
something that should be revisited.


Thanks for your patch ! v29 using UniqueKey attached.

Best regards,
 Jesper
>From 4e27a04702002d06f60468f8a9033d2ac2e12d8a Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Mon, 11 Nov 2019 0

Re: Index Skip Scan

2019-11-15 Thread Jesper Pedersen

Hi,

On 11/11/19 1:24 PM, Jesper Pedersen wrote:

v29 using UniqueKey attached.



Just a small update to the UniqueKey patch to hopefully keep CFbot happy.

Feedback, especially on the planner changes, would be greatly appreciated.

Best regards,
 Jesper
>From b1a69c2791c8aba6caa85d7f24b9836641150875 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Tue, 9 Jul 2019 06:44:57 -0400
Subject: [PATCH] Unique key

Design by David Rowley.

Author: Jesper Pedersen
---
 src/backend/nodes/outfuncs.c   |  14 +++
 src/backend/nodes/print.c  |  39 +++
 src/backend/optimizer/path/Makefile|   3 +-
 src/backend/optimizer/path/allpaths.c  |   8 ++
 src/backend/optimizer/path/indxpath.c  |  41 +++
 src/backend/optimizer/path/pathkeys.c  |  71 ++--
 src/backend/optimizer/path/uniquekey.c | 147 +
 src/backend/optimizer/plan/planagg.c   |   1 +
 src/backend/optimizer/plan/planmain.c  |   1 +
 src/backend/optimizer/plan/planner.c   |  17 ++-
 src/backend/optimizer/util/pathnode.c  |  12 ++
 src/include/nodes/nodes.h  |   1 +
 src/include/nodes/pathnodes.h  |  18 +++
 src/include/nodes/print.h  |   1 +
 src/include/optimizer/pathnode.h   |   1 +
 src/include/optimizer/paths.h  |  11 ++
 16 files changed, 373 insertions(+), 13 deletions(-)
 create mode 100644 src/backend/optimizer/path/uniquekey.c

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b0dcd02ff6..1ccd68d3aa 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1720,6 +1720,7 @@ _outPathInfo(StringInfo str, const Path *node)
 	WRITE_FLOAT_FIELD(startup_cost, "%.2f");
 	WRITE_FLOAT_FIELD(total_cost, "%.2f");
 	WRITE_NODE_FIELD(pathkeys);
+	WRITE_NODE_FIELD(uniquekeys);
 }
 
 /*
@@ -2201,6 +2202,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_NODE_FIELD(eq_classes);
 	WRITE_BOOL_FIELD(ec_merging_done);
 	WRITE_NODE_FIELD(canon_pathkeys);
+	WRITE_NODE_FIELD(canon_uniquekeys);
 	WRITE_NODE_FIELD(left_join_clauses);
 	WRITE_NODE_FIELD(right_join_clauses);
 	WRITE_NODE_FIELD(full_join_clauses);
@@ -2210,6 +2212,7 @@ _outPlannerInfo(StringInfo str, const PlannerInfo *node)
 	WRITE_NODE_FIELD(placeholder_list);
 	WRITE_NODE_FIELD(fkey_list);
 	WRITE_NODE_FIELD(query_pathkeys);
+	WRITE_NODE_FIELD(query_uniquekeys);
 	WRITE_NODE_FIELD(group_pathkeys);
 	WRITE_NODE_FIELD(window_pathkeys);
 	WRITE_NODE_FIELD(distinct_pathkeys);
@@ -2397,6 +2400,14 @@ _outPathKey(StringInfo str, const PathKey *node)
 	WRITE_BOOL_FIELD(pk_nulls_first);
 }
 
+static void
+_outUniqueKey(StringInfo str, const UniqueKey *node)
+{
+	WRITE_NODE_TYPE("UNIQUEKEY");
+
+	WRITE_NODE_FIELD(eq_clause);
+}
+
 static void
 _outPathTarget(StringInfo str, const PathTarget *node)
 {
@@ -4083,6 +4094,9 @@ outNode(StringInfo str, const void *obj)
 			case T_PathKey:
 _outPathKey(str, obj);
 break;
+			case T_UniqueKey:
+_outUniqueKey(str, obj);
+break;
 			case T_PathTarget:
 _outPathTarget(str, obj);
 break;
diff --git a/src/backend/nodes/print.c b/src/backend/nodes/print.c
index 4ecde6b421..435b32063c 100644
--- a/src/backend/nodes/print.c
+++ b/src/backend/nodes/print.c
@@ -459,6 +459,45 @@ print_pathkeys(const List *pathkeys, const List *rtable)
 	printf(")\n");
 }
 
+/*
+ * print_uniquekeys -
+ *	  uniquekeys list of UniqueKeys
+ */
+void
+print_uniquekeys(const List *uniquekeys, const List *rtable)
+{
+	ListCell   *l;
+
+	printf("(");
+	foreach(l, uniquekeys)
+	{
+		UniqueKey *unique_key = (UniqueKey *) lfirst(l);
+		EquivalenceClass *eclass = (EquivalenceClass *) unique_key->eq_clause;
+		ListCell   *k;
+		bool		first = true;
+
+		/* chase up */
+		while (eclass->ec_merged)
+			eclass = eclass->ec_merged;
+
+		printf("(");
+		foreach(k, eclass->ec_members)
+		{
+			EquivalenceMember *mem = (EquivalenceMember *) lfirst(k);
+
+			if (first)
+first = false;
+			else
+printf(", ");
+			print_expr((Node *) mem->em_expr, rtable);
+		}
+		printf(")");
+		if (lnext(uniquekeys, l))
+			printf(", ");
+	}
+	printf(")\n");
+}
+
 /*
  * print_tl
  *	  print targetlist in a more legible way.
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f..63cc1505d9 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	joinpath.o \
 	joinrels.o \
 	pathkeys.o \
-	tidpath.o
+	tidpath.o \
+	uniquekey.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index db3a68a51d..5fc9b81746 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3954,6 +3954,14 @@ print_path(PlannerInfo *root, Path *path, int indent)
 		print_pathkeys(path->pathkeys, root->

Re: Index Skip Scan

2020-01-20 Thread Jesper Pedersen

Hi Floris,

On 1/15/20 8:33 AM, Floris Van Nee wrote:

I reviewed the latest version of the patch. Overall some good improvements I 
think. Please find my feedback below.



Thanks for your review !


- I think I mentioned this before - it's not that big of a deal, but it just 
looks weird and inconsistent to me:
create table t2 as (select a, b, c, 10 d from generate_series(1,5) a, 
generate_series(1,100) b, generate_series(1,1) c); create index on t2 
(a,b,c desc);

postgres=# explain select distinct on (a,b) a,b,c from t2 where a=2 and b>=5 and 
b<=5 order by a,b,c desc;
QUERY PLAN
-
  Index Only Scan using t2_a_b_c_idx on t2  (cost=0.43..216.25 rows=500 
width=12)
Skip scan: true
Index Cond: ((a = 2) AND (b >= 5) AND (b <= 5))
(3 rows)

postgres=# explain select distinct on (a,b) a,b,c from t2 where a=2 and b=5 
order by a,b,c desc;
QUERY PLAN
-
  Unique  (cost=0.43..8361.56 rows=500 width=12)
->  Index Only Scan using t2_a_b_c_idx on t2  (cost=0.43..8361.56 rows=9807 
width=12)
  Index Cond: ((a = 2) AND (b = 5))
(3 rows)

When doing a distinct on (params) and having equality conditions for all 
params, it falls back to the regular index scan even though there's no reason 
not to use the skip scan here. It's much faster to write b between 5 and 5 now 
rather than writing b=5. I understand this was a limitation of the unique-keys 
patch at the moment which could be addressed in the future. I think for the 
sake of consistency it would make sense if this eventually gets addressed.



Agreed that it is an improvement that should be made. I would like 
David's view on this, since it relates to the UniqueKey patch.



- nodeIndexScan.c, line 126
This sets xs_want_itup to true in all cases (even for non skip-scans). I don't 
think this is acceptable given the side-effects this has (page will never be 
unpinned in between returned tuples in _bt_drop_lock_and_maybe_pin)



Correct - fixed.


- nbsearch.c, _bt_skip, line 1440
_bt_update_skip_scankeys(scan, indexRel); This function is called twice now - 
once in the else {} and immediately after that outside of the else. The second 
call can be removed I think.



Yes, removed the "else" call site.


- nbtsearch.c _bt_skip line 1597
LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
scan->xs_itup = (IndexTuple) PageGetItem(page, 
itemid);

This is an UNLOCK followed by a read of the unlocked page. That looks incorrect?



Yes, that needed to be changed.


- nbtsearch.c _bt_skip line 1440
if (BTScanPosIsValid(so->currPos) &&
_bt_scankey_within_page(scan, so->skipScanKey, so->currPos.buf, 
dir))

Is it allowed to look at the high key / low key of the page without have a read 
lock on it?



In case of a split the page will still contain a high key and a low key, 
so this should be ok.



- nbtsearch.c line 1634
if (_bt_readpage(scan, indexdir, offnum))  ...
else
  error()

Is it really guaranteed that a match can be found on the page itself? Isn't it 
possible that an extra index condition, not part of the scan key, makes none of 
the keys on the page match?



The logic for this has been changed.


- nbtsearch.c in general
Most of the code seems to rely quite heavily on the fact that xs_want_itup 
forces _bt_drop_lock_and_maybe_pin to never release the buffer pin. Have you 
considered that compacting of a page may still happen even if you hold the pin? 
[1] I've been trying to come up with cases in which this may break the patch, 
but I haven't able to produce such a scenario - so it may be fine. But it would 
be good to consider again. One thing I was thinking of was a scenario where 
page splits and/or compacting would happen in between returning tuples. Could 
this break the _bt_scankey_within_page check such that it thinks the scan key 
is within the current page, while it actually isn't? Mainly for backward and/or 
cursor scans. Forward scans shouldn't be a problem I think. Apologies for being 
a bit vague as I don't have a clear example ready when it would go wrong. It 
may well be fine, but it was one of the things on my mind.



There is a BT_READ lock in place when finding the correct leaf page, or 
searching within the leaf page itself. _bt_vacuum_one_page deletes only 
LP_DEAD tuples, but those are already ignored in _bt_readpage. Peter, do 
you have some feedback for this ?



Please, find the updated patches attached that Dmitry and I made.

Thanks again !

Best regards,
 Jesper
>From 3c540c93307e6cbe792b31b12d4ecb025cd6b327 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 15 Nov 2019 09:46:0

Re: Index Skip Scan

2020-01-21 Thread Jesper Pedersen

Hi Peter,

Thanks for your feedback; Dmitry has followed up with some additional 
questions.


On 1/20/20 8:05 PM, Peter Geoghegan wrote:

This is definitely not okay.


I suggest that you find a way to add assertions to code like
_bt_readpage() that verify that we do in fact have the buffer content
lock. 


If you apply the attached patch on master it will fail the test suite; 
did you mean something else ?


Best regards,
 Jesper
diff --git a/src/backend/access/nbtree/nbtsearch.c 
b/src/backend/access/nbtree/nbtsearch.c
index 4189730f3a..57882f0b8d 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1721,6 +1721,7 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, 
OffsetNumber offnum)
 * used here; this function is what makes it good for currPos.
 */
Assert(BufferIsValid(so->currPos.buf));
+   Assert(IsBufferPinnedAndLocked(so->currPos.buf));
 
page = BufferGetPage(so->currPos.buf);
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
diff --git a/src/backend/storage/buffer/bufmgr.c 
b/src/backend/storage/buffer/bufmgr.c
index aba3960481..f29f40f9b6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -3774,6 +3774,23 @@ HoldingBufferPinThatDelaysRecovery(void)
return false;
 }
 
+/*
+ * Assert that the buffer is pinned and locked
+ */
+bool
+IsBufferPinnedAndLocked(Buffer buffer)
+{
+   BufferDesc *bufHdr;
+
+   Assert(BufferIsPinned(buffer));
+
+   bufHdr = GetBufferDescriptor(buffer - 1);
+
+   Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));
+
+   return true;
+}
+
 /*
  * ConditionalLockBufferForCleanup - as above, but don't wait to get the lock
  *
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 73c7e9ba38..46a6aa6560 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -217,6 +217,7 @@ extern void LockBufferForCleanup(Buffer buffer);
 extern bool ConditionalLockBufferForCleanup(Buffer buffer);
 extern bool IsBufferCleanupOK(Buffer buffer);
 extern bool HoldingBufferPinThatDelaysRecovery(void);
+extern bool IsBufferPinnedAndLocked(Buffer buffer);
 
 extern void AbortBufferIO(void);
 


Re: partitioned tables referenced by FKs

2019-04-02 Thread Jesper Pedersen

Hi,

On 4/1/19 4:03 PM, Alvaro Herrera wrote:

I'm satisfied with this patch now, so I intend to push early tomorrow.



Passes check-world, and my tests.

Best regards,
 Jesper




Re: partitioned tables referenced by FKs

2019-04-03 Thread Jesper Pedersen

On 4/3/19 1:52 PM, Alvaro Herrera wrote:

Pushed, many thanks Amit and Jesper for reviewing.



Thank you for working on this feature.

Best regards,
 Jesper





COLLATE: Hash partition vs UPDATE

2019-04-08 Thread Jesper Pedersen

Hi,

The following case

-- test.sql --
CREATE TABLE test (a text PRIMARY KEY, b text) PARTITION BY HASH (a);
CREATE TABLE test_p0 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);
CREATE TABLE test_p1 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 1);

-- CREATE INDEX idx_test_b ON test USING HASH (b);

INSERT INTO test VALUES ('', '');

-- Regression
UPDATE test SET b = '' WHERE a = '';
-- test.sql --

fails on master, which includes [1], with


psql:test.sql:9: ERROR:  could not determine which collation to use for 
string hashing

HINT:  Use the COLLATE clause to set the collation explicitly.


It passes on 11.x.

I'll add it to the open items list.

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=5e1963fb764e9cc092e0f7b58b28985c311431d9


Best regards,
 Jesper




Re: COLLATE: Hash partition vs UPDATE

2019-04-09 Thread Jesper Pedersen

Hi Amit,

On 4/8/19 11:18 PM, Amit Langote wrote:

As of this commit, hashing functions hashtext() and hashtextextended()
require a valid collation to be passed in.  ISTM,
satisfies_hash_partition() that's called by hash partition constraint
checking should have been changed to use FunctionCall2Coll() interface to
account for the requirements of the above commit.  I see that it did that
for compute_partition_hash_value(), which is used by hash partition tuple
routing.  That also seems to be covered by regression tests, but there are
no tests that cover satisfies_hash_partition().

Attached patch is an attempt to fix this.  I've also added Amul Sul who
can maybe comment on the satisfies_hash_partition() changes.



Yeah, that works here - apart from an issue with the test case; fixed in 
the attached.


Best regards,
 Jesper
>From 1902dcf1fd31c95fd95e1231630b3c446857cd59 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Tue, 9 Apr 2019 07:35:28 -0400
Subject: [PATCH] 0001

---
 src/backend/partitioning/partbounds.c   | 12 +---
 src/test/regress/expected/hash_part.out | 14 ++
 src/test/regress/sql/hash_part.sql  | 10 ++
 3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c
index c8770ccfee..331dab3954 100644
--- a/src/backend/partitioning/partbounds.c
+++ b/src/backend/partitioning/partbounds.c
@@ -2778,6 +2778,7 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
 		bool		variadic_typbyval;
 		char		variadic_typalign;
 		FmgrInfo	partsupfunc[PARTITION_MAX_KEYS];
+		Oid			partcollid[PARTITION_MAX_KEYS];
 	} ColumnsHashData;
 	Oid			parentId;
 	int			modulus;
@@ -2865,6 +2866,7 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
 fmgr_info_copy(&my_extra->partsupfunc[j],
 			   &key->partsupfunc[j],
 			   fcinfo->flinfo->fn_mcxt);
+my_extra->partcollid[j] = key->partcollation[j];
 			}
 
 		}
@@ -2888,6 +2890,7 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
 
 			/* check argument types */
 			for (j = 0; j < key->partnatts; ++j)
+			{
 if (key->parttypid[j] != my_extra->variadic_type)
 	ereport(ERROR,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -2895,6 +2898,8 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
 	j + 1,
 	format_type_be(key->parttypid[j]),
 	format_type_be(my_extra->variadic_type;
+my_extra->partcollid[j] = key->partcollation[j];
+			}
 
 			fmgr_info_copy(&my_extra->partsupfunc[0],
 		   &key->partsupfunc[0],
@@ -2928,9 +2933,10 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
 
 			Assert(OidIsValid(my_extra->partsupfunc[i].fn_oid));
 
-			hash = FunctionCall2(&my_extra->partsupfunc[i],
- PG_GETARG_DATUM(argno),
- seed);
+			hash = FunctionCall2Coll(&my_extra->partsupfunc[i],
+	 my_extra->partcollid[i],
+	 PG_GETARG_DATUM(argno),
+	 seed);
 
 			/* Form a single 64-bit hash value */
 			rowHash = hash_combine64(rowHash, DatumGetUInt64(hash));
diff --git a/src/test/regress/expected/hash_part.out b/src/test/regress/expected/hash_part.out
index 731d26fc3d..adb4410b4b 100644
--- a/src/test/regress/expected/hash_part.out
+++ b/src/test/regress/expected/hash_part.out
@@ -99,6 +99,20 @@ ERROR:  number of partitioning columns (2) does not match number of partition ke
 SELECT satisfies_hash_partition('mcinthash'::regclass, 4, 0,
 variadic array[now(), now()]);
 ERROR:  column 1 of the partition key has type "integer", but supplied value is of type "timestamp with time zone"
+-- check satisfies_hash_partition passes correct collation
+create table text_hashp (a text) partition by hash (a);
+create table text_hashp0 partition of text_hashp for values with (modulus 2, remainder 0);
+create table text_hashp1 partition of text_hashp for values with (modulus 2, remainder 1);
+-- The result here should always be true, because 'xxx' must belong to at least
+-- one of the two defined partitions
+select satisfies_hash_partition('text_hashp'::regclass, 2, 0, 'xxx'::text) OR
+	   satisfies_hash_partition('text_hashp'::regclass, 2, 1, 'xxx'::text) AS satisfies;
+ satisfies 
+---
+ t
+(1 row)
+
 -- cleanup
 DROP TABLE mchash;
 DROP TABLE mcinthash;
+DROP TABLE text_hashp;
diff --git a/src/test/regress/sql/hash_part.sql b/src/test/regress/sql/hash_part.sql
index f457ac344c..8cd9e8a8a2 100644
--- a/src/test/regress/sql/hash_part.sql
+++ b/src/test/regress/sql/hash_part.sql
@@ -75,6 +75,16 @@ SELECT satisfies_hash_partition('mcinthash'::regclass, 4, 0,
 SELECT satisfies_hash_partition('mcinthash'::regclass, 4, 0,
 variadic array[now(), now()]);
 
+-- check satisfies_hash_partition passes correct collation
+create table text_hashp (a text) partition by hash (a);
+create table text_hashp0 partition of text_hashp for values with (modulus 2, remainder 0);
+create table text_hashp1 partition of text_hashp for values with (modulus 2, remainder 1);
+-- The result here should always be tru

Re: New committer: David Rowley

2019-05-31 Thread Jesper Pedersen

On 5/30/19 11:39 AM, Magnus Hagander wrote:

Congratulations to David, may the buildfarm be gentle to him, and his first
revert far away!



Congrats !

Best regards,
 Jesper





Re: Index Skip Scan

2019-06-03 Thread Jesper Pedersen

Hi Floris,

On 6/1/19 12:10 AM, Floris Van Nee wrote:

Given a table definition of (market text, feedcode text, updated_at 
timestamptz, value float8) and an index on (market, feedcode, updated_at desc) 
(note that this table slightly deviates from what I described in my previous 
mail), and after filling it with data:


The following query uses an index skip scan and returns just 1 row (incorrect!)

select distinct on (market, feedcode) market, feedcode
from streams.base_price
where market='TEST'

The following query still uses the regular index scan and returns many more 
rows (correct)
select distinct on (market, feedcode) *
from streams.base_price
where market='TEST'


It seems that partially filtering on one of the distinct columns triggers 
incorrect behavior where too many rows in the index are skipped.




Thanks for taking a look at the patch, and your feedback on it.

I'll def look into this once I'm back from my travels.

Best regards,
 Jesper




Re: Index Skip Scan

2019-06-03 Thread Jesper Pedersen

Hi Rafia,

On 6/1/19 6:03 AM, Rafia Sabih wrote:

Here is my repeatable test case,

create table t (market text, feedcode text, updated_at timestamptz,
value float8) ;
create index on t (market, feedcode, updated_at desc);
insert into t values('TEST', 'abcdef', (select timestamp '2019-01-10
20:00:00' + random() * (timestamp '2014-01-20 20:00:00' - timestamp
'2019-01-20 20:00:00') ), generate_series(1,100)*9.88);
insert into t values('TEST', 'jsgfhdfjd', (select timestamp
'2019-01-10 20:00:00' + random() * (timestamp '2014-01-20 20:00:00' -
timestamp '2019-01-20 20:00:00') ), generate_series(1,100)*9.88);

Now, without the patch,
select distinct on (market, feedcode) market, feedcode from t  where
market='TEST';
  market | feedcode
+---
  TEST   | abcdef
  TEST   | jsgfhdfjd
(2 rows)
explain select distinct on (market, feedcode) market, feedcode from t
where market='TEST';
QUERY PLAN

  Unique  (cost=12.20..13.21 rows=2 width=13)
->  Sort  (cost=12.20..12.70 rows=201 width=13)
  Sort Key: feedcode
  ->  Seq Scan on t  (cost=0.00..4.51 rows=201 width=13)
Filter: (market = 'TEST'::text)
(5 rows)

And with the patch,
select distinct on (market, feedcode) market, feedcode from t   where
market='TEST';
  market | feedcode
+--
  TEST   | abcdef
(1 row)

explain select distinct on (market, feedcode) market, feedcode from t
where market='TEST';
QUERY PLAN

  Index Only Scan using t_market_feedcode_updated_at_idx on t
(cost=0.14..0.29 rows=2 width=13)
Scan mode: Skip scan
Index Cond: (market = 'TEST'::text)
(3 rows)

Notice that in the explain statement it shows correct number of rows
to be skipped.



Thanks for your test case; this is very helpful.

For now, I would like to highlight that

 SET enable_indexskipscan = OFF

can be used for testing with the patch applied.

Dmitry and I will look at the feedback provided.

Best regards,
 Jesper




Re: partition tree inspection functions

2018-10-05 Thread Jesper Pedersen

On 10/5/18 2:52 AM, Michael Paquier wrote:

On Fri, Oct 05, 2018 at 03:31:49PM +0900, Amit Langote wrote:

Thanks for making those changes yourself and posting the new version.

Can you check the attached diff file for some updates to the documentation
part of the patch.  Other parts look fine.


OK, I merged that into my local branch.  From what I can see Mr Robot is
happy as well:
http://cfbot.cputube.org/amit-langote.html


Looks good.

Best regards,
 Jesper




Re: Index Skip Scan

2018-10-09 Thread Jesper Pedersen

Hi Pavel,

On 10/9/18 9:42 AM, Pavel Stehule wrote:

I tested last patch and I have some notes:

1.

postgres=# explain select distinct a1 from foo;
+---+
|QUERY PLAN 
|
+---+
| Unique  (cost=0.43..4367.56 rows=9983 width=4)
|
|   ->  Index Skip Scan using foo_a1_idx on foo  (cost=0.43..4342.60 
rows=9983 width=4) |
+---+
(2 rows)

In this case the Unique node is useless and can be removed

2. It would be nice to have COUNT(DISTINCT) support, similar to the MIN, MAX support

3. One time the patched postgres crashed, but I am not able to reproduce it.



Please, send that query through if you can replicate it. The patch 
currently passes an assert'ed check-world, so your query clearly 
triggered something that isn't covered yet.



Looks like a very interesting patch, and important for some BI platforms



Thanks for your review !

Best regards,
 Jesper



Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Jesper Pedersen

On 10/25/18 2:33 PM, Andrew Dunstan wrote:


Yeah, very good point. xqilla/xerces-C appears to be widely available 
(Centos and ubuntu, at least).




xqilla/xerces-c are in the Fedora/RHEL repo too.

Best regards,
 Jesper



Re: pread() and pwrite()

2018-11-02 Thread Jesper Pedersen

Hi Thomas,

On 10/9/18 4:56 PM, Thomas Munro wrote:

Thanks, much nicer.  Rebased.



This still applies, and passes make check-world.

I wonder what the commit policy is on this, if the Windows part isn't 
included. I read Heikki's comment [1] as it would be ok to commit 
benefiting all platforms that has pread/pwrite.


The functions in [2] could be a follow-up patch as well.

[1] 
https://www.postgresql.org/message-id/6cc7c8dd-29f9-7d75-d18a-99f19c076d10%40iki.fi
[2] 
https://www.postgresql.org/message-id/c2f56d0a-cadd-3df1-ae48-b84dc8128c37%40redhat.com


Best regards,
 Jesper




Re: pread() and pwrite()

2018-11-05 Thread Jesper Pedersen

Hi Thomas,

On 11/5/18 7:08 AM, Thomas Munro wrote:

On Sun, Nov 4, 2018 at 12:03 AM Thomas Munro
 wrote:

On Sat, Nov 3, 2018 at 2:07 AM Jesper Pedersen
 wrote:

This still applies, and passes make check-world.

I wonder what the commit policy is on this, if the Windows part isn't
included. I read Heikki's comment [1] as it would be ok to commit
benefiting all platforms that has pread/pwrite.


Here's a patch to add Windows support by supplying
src/backend/port/win32/pread.c.  Thoughts?


If we do that, I suppose we might as well supply implementations for
HP-UX 10.20 as well, and then we can get rid of the conditional macro
stuff at various call sites and use pread() and pwrite() freely.
Here's a version that does it that way.  One question is whether the
caveat mentioned in patch 0001 is acceptable.



Passed check-world, but I can't verify the 0001 patch. Reading the 
API it looks ok to me.


I guess the caveat in 0001 is ok, as it is a side-effect of the 
underlying API.
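
For reference, the non-native fallback under discussion is essentially
the classic lseek()-based emulation - a minimal sketch, with the caveat
that, unlike a native pread(), it moves the file position:

ssize_t
pg_pread(int fd, void *buf, size_t nbyte, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) < 0)
        return -1;

    return read(fd, buf, nbyte);
}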


Best regards,
 Jesper



Re: pread() and pwrite()

2018-11-06 Thread Jesper Pedersen

Hi,

On 11/5/18 9:10 PM, Thomas Munro wrote:

On Tue, Nov 6, 2018 at 5:07 AM Alvaro Herrera  wrote:

Please remove Tell from line 18 in fd.h.  To Küssnacht with him!


Thanks, done.  But what is this arrow sticking through my Mac laptop's
screen...?

On Tue, Nov 6, 2018 at 6:23 AM Tom Lane  wrote:

Alvaro Herrera  writes:

On 2018-Nov-04, Thomas Munro wrote:

Here's a patch to add Windows support by supplying
src/backend/port/win32/pread.c.  Thoughts?



Hmm, so how easy is to detect that somebody runs read/write on fds where
pread/pwrite have occurred?  I guess for data files it's easy to detect
since you'd quickly end up with corrupted files, but what about other
kinds of files?  I wonder if we should be worrying about using this
interface somewhere other than fd.c and forgetting about the limitation.


Yeah.  I think the patch as presented is OK; it uses pread/pwrite only
inside fd.c, which is a reasonably non-leaky abstraction.  But there's
definitely a hazard of somebody submitting a patch that depends on
using pread/pwrite elsewhere, and then that maybe not working.

What I suggest is that we *not* try to make this a completely transparent
substitute.  Instead, make the functions exported by src/port/ be
"pg_pread" and "pg_pwrite", and inside fd.c we'd write something like

#ifdef HAVE_PREAD
#define pg_pread pread
#endif

and then refer to pg_pread/pg_pwrite in the body of that file.  That
way, if someone refers to pread and expects standard functionality
from it, they'll get a failure on platforms not supporting it.


OK.  But since we're using this from both fd.c and xlog.c, I put that
into src/include/port.h.
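
Presumably the port.h arrangement then ends up roughly like this (a
sketch, not the committed text):

    #ifndef HAVE_PREAD
    extern ssize_t pg_pread(int fd, void *buf, size_t nbyte, off_t offset);
    #else
    #define pg_pread pread
    #endif

    #ifndef HAVE_PWRITE
    extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbyte, off_t offset);
    #else
    #define pg_pwrite pwrite
    #endif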


FWIW, I tested the given patches on HPUX 10.20; they compiled cleanly
and pass the core regression tests.




Passes check-world, and includes the feedback on this thread.

New status: Ready for Committer

Best regards,
 Jesper




Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-11-06 Thread Jesper Pedersen

On 11/4/18 5:07 AM, David Rowley wrote:

I've attached v13 which hopefully addresses these.



I ran a test against the INSERT case using 64 hash partitions.

Non-partitioned table: ~73k TPS
Master: ~25k TPS
0001: ~26k TPS
0001 + 0002: ~68k TPS

The profile of 0001 shows that almost all of 
ExecSetupPartitionTupleRouting() is find_all_inheritors(), hence the 
last test with a rebase of the original v1-0002 patch.
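
For reference, a 64-way hash-partitioned table along these lines can be 
generated as follows (a sketch; the exact DDL behind the numbers above was 
not posted):

-- test.sql --
CREATE TABLE test (a integer NOT NULL, b integer) PARTITION BY HASH (a);

SELECT format('CREATE TABLE test_p%s PARTITION OF test FOR VALUES WITH (MODULUS 64, REMAINDER %s);', i, i)
  FROM generate_series(0, 63) AS i
\gexec
-- test.sql --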


Best regards,
 Jesper



Re: pread() and pwrite()

2018-11-07 Thread Jesper Pedersen

Hi Thomas,

On 11/6/18 4:04 PM, Thomas Munro wrote:

On Wed, Nov 7, 2018 at 4:42 AM Jesper Pedersen
Thanks!  Pushed.  I'll keep an eye on the build farm to see if
anything breaks on Cygwin or some other frankenOS.



There is [1] on Andres' skink setup. Looking.

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-11-07%2001%3A01%3A01


Best regards,
 Jesper



valgrind on initdb

2018-11-07 Thread Jesper Pedersen

Hi,

While looking at [1] (included in 23315.log) there are other warnings as 
well.


I ran with

valgrind --leak-check=full --show-leak-kinds=all --gen-suppressions=all 
--suppressions=/path/to/postgresql/src/tools/valgrind.supp 
--time-stamp=yes --log-file=/tmp/valgrind/%p.log --trace-children=yes 
--track-origins=yes --read-var-info=yes initdb -D /tmp/pgsql --no-clean 
--no-sync --no-locale 2>&1 | tee /tmp/valgrind/initdb.log


[1] 
https://www.postgresql.org/message-id/cb062f4d-55b8-28be-b55f-2e593a3b3755%40redhat.com


Best regards,
 Jesper


Attachment: initdb.tgz (application/compressed-tar)


Re: valgrind on initdb

2018-11-07 Thread Jesper Pedersen

On 11/7/18 8:08 AM, Jesper Pedersen wrote:
While looking at [1] (included in 23315.log) there are other warnings as 
well.




On 77366d90.

Best regards,
 Jesper



Re: pread() and pwrite()

2018-11-07 Thread Jesper Pedersen

Hi,

On 11/7/18 7:26 AM, Jesper Pedersen wrote:

On 11/6/18 4:04 PM, Thomas Munro wrote:

On Wed, Nov 7, 2018 at 4:42 AM Jesper Pedersen
Thanks!  Pushed.  I'll keep an eye on the build farm to see if
anything breaks on Cygwin or some other frankenOS.



There is [1] on Andres' skink setup. Looking.



Attached is a reproducer.

Adding the memset() command for the page makes valgrind happy.

Thoughts on how to proceed with this ? The report in [1] shows that 
there are a number of call sites where the page(s) aren't fully initialized.


[1] 
https://www.postgresql.org/message-id/3fe1e38a-fb70-6260-9300-ce67ede21c32%40redhat.com


Best regards,
 Jesper
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, const char* argv[])
{
   FILE* f;
   int fd;
   void* m;
   ssize_t written;

   f = fopen("test.bin", "w+");
   if (f == NULL)
   {
  printf("Cannot open file\n");
  exit(0);
   }
   fd = fileno(f);
   m = malloc((size_t)8192);
   // memset(m, 0, (size_t)8192); <-- will make valgrind succeed

   written = pwrite(fd, m, (size_t)8192, (off_t)0);

   printf("Written: %zd\n", written);

   free(m);
   fclose(f);
   
   return 0;
}


Re: pread() and pwrite()

2018-11-07 Thread Jesper Pedersen

Hi Tom,

On 11/7/18 9:30 AM, Tom Lane wrote:

I'm confused by this.  Surely the pwrite-based code is writing exactly the
same data as before.  Do we have to conclude that valgrind is complaining
about passing uninitialized data to pwrite() when it did not complain
about exactly the same thing for write()?

[ looks ... ]  No, what we have to conclude is that the write-related
suppressions in src/tools/valgrind.supp need to be replaced or augmented
with pwrite-related ones.



The attached patch fixes this for me.

Unfortunately pwrite* doesn't work for the pwrite64(buf) line.

Best regards,
 Jesper
diff --git a/src/tools/valgrind.supp b/src/tools/valgrind.supp
index af03051260..2f3b602773 100644
--- a/src/tools/valgrind.supp
+++ b/src/tools/valgrind.supp
@@ -52,9 +52,10 @@
 {
 	padding_XLogRecData_write
 	Memcheck:Param
-	write(buf)
+	pwrite64(buf)
 
-...
+	...
+	fun:pwrite
 	fun:XLogWrite
 }
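
For reference, applying the hunk leaves the suppression reading as follows 
(assuming no context lines were lost in transit):

{
	padding_XLogRecData_write
	Memcheck:Param
	pwrite64(buf)

	...
	fun:pwrite
	fun:XLogWrite
}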
 


Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-11-07 Thread Jesper Pedersen

Hi David,

On 11/7/18 6:46 AM, David Rowley wrote:

I've attached a patch which does this.  It adds a new struct named
PartitionRoutingInfo into ResultRelInfo and pulls 3 of the 4 arrays
out of PartitionTupleRouting. There are fields for each of what these
arrays used to store inside the PartitionRoutingInfo struct.

While doing this I kept glancing back over at ModifyTableState and at
the mt_per_subplan_tupconv_maps array. I wondered if it would be
better to make the PartitionRoutingInfo a bit more generic, perhaps
call it TupleConversionInfo and have fields like ti_ToGeneric and
ti_FromGeneric, with the idea that "generic" would be the root
partition or the first subplan for inheritance updates. This would
allow us to get rid of a good chunk of code inside nodeModifyTable.c.
However, I ended up not doing this and left PartitionRoutingInfo to be
specifically for Root to Partition conversion.



Yeah, it doesn't necessarily have to be part of this patch.


Also, on the topic about what to name the conversion maps from a few
days ago; After looking at this a bit more I decided that having them
named ParentToChild and ChildToParent is misleading.  If the child is
the child of some sub-partitioned table then the parent that the map
is talking about is not the partition's parent, but the root
partitioned table. So really RootToPartition and PartitionToRoot seem
like much more accurate names for the maps.



Agreed.
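
Putting the struct and the naming together, a rough sketch of what 
PartitionRoutingInfo might hold (field names here are guesses based on the 
description, not the patch's actual definition):

/* Sketch only -- see execPartition.h in the actual patch. */
typedef struct PartitionRoutingInfo
{
	/* Convert tuples from the root table's rowtype to the partition's */
	TupleConversionMap *pi_RootToPartitionMap;
	/* The reverse direction, e.g. for RETURNING */
	TupleConversionMap *pi_PartitionToRootMap;
	/* Slot to hold a tuple after conversion to the partition's rowtype */
	TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;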


I made a few other changes along the way; I thought that
ExecFindPartition() would be a good candidate to take on the
responsibility of validating the partition is valid for INSERTs when
it uses a partition out of the subplan_resultrel_hash. I thought it
was better to check this once when we're in the code path of grabbing
the ResultRelInfo out that hash table rather than in a code path that
must check if it's been done already each time we route a tuple into a
partition. It also allowed me to get rid of
ri_PartitionReadyForRouting. I also moved the call to
ExecInitRoutingInfo() into ExecFindPartition() which allowed me to
make that function static, which could result in the generation of
slightly more optimal compiled code.

Please find attached the v14 patch.



Passes check-world, and has detailed documentation about the changes :)

Best regards,
 Jesper



Re: speeding up planning with partitions

2018-11-09 Thread Jesper Pedersen

Hi Amit,

On 11/9/18 3:55 AM, Amit Langote wrote:

And here are patches structured that way.  I've addressed some of the
comments posted by Imai-san upthread.  Also, since David's patch to
postpone PlannerInfo.total_pages calculation went into the tree earlier
this week, it's no longer included in this set.



Thanks for doing the split this way. The patch passes check-world.

I ran a SELECT test using hash partitions, and got

        Master   v5
64:     6k       59k
1024:   283      59k

The non-partitioned case gives 77k. The difference in TPS between the 
partitioned case and the non-partitioned case comes down to 
set_plain_rel_size() vs. set_append_rel_size() under 
set_base_rel_sizes(); flamegraphs for this were sent off-list.
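
A sketch of how such a prepared SELECT test is typically driven 
(hypothetical script and options; the actual ones were not posted):

-- select.sql --
\set a random(1, 1000000)
SELECT * FROM test WHERE a = :a;
-- select.sql --

pgbench -n -M prepared -f select.sql -c 8 -j 8 -T 60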


Best regards,
 Jesper



Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-11-13 Thread Jesper Pedersen

Hi,

On 11/12/18 6:17 PM, David Rowley wrote:

On 9 November 2018 at 19:18, Amit Langote  wrote:

I have a comment regarding how you chose to make
PartitionTupleRouting private.

Using the v14_to_v15 diff, I could quickly see that there are many diffs
changing PartitionTupleRouting to struct PartitionTupleRouting, but they
would be unnecessary if you had added the following in execPartition.h, as
my upthread had done.

-/* See execPartition.c for the definition. */
+/* See execPartition.c for the definitions. */
  typedef struct PartitionDispatchData *PartitionDispatch;
+typedef struct PartitionTupleRouting PartitionTupleRouting;


Okay, done that way. v16 attached.



Still passes check-world.

Best regards,
 Jesper




Re: Index Skip Scan

2018-11-16 Thread Jesper Pedersen

Hi,

On 11/15/18 6:41 AM, Alexander Kuzmenkov wrote:
But having this logic inside _bt_next means that it will make a non-skip 
index only scan a bit slower, am I right?


Correct.


Well, it depends on how the skip scan is implemented. We don't have to 
make normal scans slower, because skip scan is just a separate thing.


My main point was that current implementation is good as a proof of 
concept, but it is inefficient for some data and needs some unreliable 
planner logic to work around this inefficiency. And now we also have 
execution-time fallback because planning-time fallback isn't good 
enough. This looks overly complicated. Let's try to design an algorithm 
that works well regardless of the particular data and doesn't need these 
heuristics. It should be possible to do so for btree.


Of course, I understand the reluctance to implement an entire new type 
of btree traversal. Anastasia Lubennikova suggested a tweak for the 
current method that may improve the performance for small groups of 
equal values. When we search for the next unique key, first check if it 
is contained on the current btree page using its 'high key'. If it is, 
find it on the current page. If not, search from the root of the tree 
like we do now. This should improve the performance for small equal 
groups, because there are going to be several such groups on the page. 
And this is exactly where we have the regression compared to unique + 
index scan.
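
In rough pseudocode the tweak looks like this (all helper names below are 
placeholders, not actual nbtree functions):

/* Sketch of the suggested tweak; helper names are placeholders. */
key = build_search_key_for_next_distinct_value(scan);

if (compare_key_to_high_key(page, key) <= 0)
{
	/* The next distinct value must live on the current page. */
	offnum = binary_search_current_page(page, key);
}
else
{
	/* Otherwise descend from the root, as the patch already does. */
	buf = descend_from_root(index, key);
}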




Robert suggested something similar in [1]. I'll try and look at that 
when I'm back from my holiday.


By the way, what is the data for which we intend this feature to work? 
Obviously a non-unique btree index, but how wide are the tuples, and how 
big the equal groups? It would be good to have some real-world examples.




Although my primary use-case is int, I agree that we should test the 
different data types and tuple widths.


[1] 
https://www.postgresql.org/message-id/CA%2BTgmobb3uN0xDqTRu7f7WdjGRAXpSFxeAQnvNr%3DOK5_kC_SSg%40mail.gmail.com


Best regards,
 Jesper



Re: JIT compiling with LLVM v10.1

2018-02-19 Thread Jesper Pedersen

Hi,

On 02/14/2018 01:17 PM, Andres Freund wrote:

On 2018-02-07 06:54:05 -0800, Andres Freund wrote:

I've pushed v10.0. The big (and pretty painful to make) change is that
now all the LLVM specific code lives in src/backend/jit/llvm, which is
built as a shared library which is loaded on demand.



I thought

https://db.in.tum.de/~leis/papers/adaptiveexecution.pdf?lang=en

was relevant for this thread.

Best regards,
 Jesper



Re: [HACKERS] Runtime Partition Pruning

2018-02-22 Thread Jesper Pedersen

Hi David,

On 02/21/2018 04:06 AM, David Rowley wrote:

I've attached v11 of the patch.



Are UPDATE and DELETE supposed to be supported ?

With

-- test.sql --
CREATE TABLE test (a integer NOT NULL, b integer) PARTITION BY HASH(a);
CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 0);
CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2, 
REMAINDER 1);

CREATE INDEX idx_test_a ON test (a);
CREATE INDEX idx_test_b ON test (b);

INSERT INTO test (SELECT i,i FROM generate_series(1, 100) AS i);

ANALYZE;
-- test.sql --

and

UPDATE test SET b = 1 WHERE a = ?
DELETE FROM test WHERE a = ?

both show that all partitions are scanned:

 Update on test
   Update on test_p00
   Update on test_p01
   ->  Index Scan using test_p00_a_idx on test_p00
 Index Cond: (a = 1)
   ->  Index Scan using test_p01_a_idx on test_p01
 Index Cond: (a = 1)

Using prune_v32 and runtime_v11 with conflicts resolved.

Best regards,
 Jesper



Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

2018-03-01 Thread Jesper Pedersen

Hi,

On 01/02/2018 11:09 AM, Jesper Pedersen wrote:

Oh... there were stupid error in previos file.
Attached fixed version.



As the patch still applies, make check-world passes, and I believe that 
Yura has provided feedback for Andres' comments, I'll leave this entry in 
"Ready for Committer".




The patch from November 27, 2017 still applies (with hunks), passes 
"make check-world" and shows performance improvements.


Keeping it in "Ready for Committer".

Best regards,
 Jesper



Re: [HACKERS] Runtime Partition Pruning

2018-03-01 Thread Jesper Pedersen

Hi David,

On 03/01/2018 08:04 AM, David Rowley wrote:

I've also split the patch out a bit more into logical parts in the
hope it makes things easier to review.



A small typo in 0001:

+ * leftmost_ons_pos[x] gives the bit number (0-7) of the leftmost one bit in a

..."_ons_" should be "_one_"...

0004 fails "make check-world" due to

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 670; 1259 49954 TABLE 
boolp_f jpedersen
pg_restore: [archiver (db)] could not execute query: ERROR:  syntax 
error at or near "false"

LINE 24: ..." ATTACH PARTITION "public"."boolp_f" FOR VALUES IN (false);

Do you require https://commitfest.postgresql.org/17/1410/ as well ?

I'll look more at 0002-0005 over the coming days.

Thanks for working on this !

Best regards,
 Jesper



Re: [HACKERS] Runtime Partition Pruning

2018-03-09 Thread Jesper Pedersen

Hi David,

On 03/01/2018 08:29 PM, David Rowley wrote:

0004 fails "make check-world" due to

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 670; 1259 49954 TABLE
boolp_f jpedersen
pg_restore: [archiver (db)] could not execute query: ERROR:  syntax error at
or near "false"
LINE 24: ..." ATTACH PARTITION "public"."boolp_f" FOR VALUES IN (false);


The tests seem to have stumbled on a pg_dump bug which causes it to
produce syntax that's not valid (currently)

I should be able to stop my patch failing the test by dropping that
table, which I should have been doing anyway.



I've added that thread to the Open Items list.


Thanks for the review and in advance for the future review.

I'll delay releasing a new patch as there's some discussion over on
the faster partition pruning thread which affects this too [1]

[1] 
https://www.postgresql.org/message-id/CA+Tgmoa4D1c4roj7L8cx8gkkeBWAZD=mtcxkxtwbnslrhd3...@mail.gmail.com



Sure, 0003-0005 depends on that thread. 0002 is refactoring so that one 
is ready.


In 0004 should we add a HASH based test case,

-- test.sql --
-- verify pruning in hash partitions
create table hashp (a int primary key, b int) partition by hash (a);
create table hashp_0 partition of hashp for values with (modulus 2, 
remainder 0);
create table hashp_1 partition of hashp for values with (modulus 2, 
remainder 1);

insert into hashp values (0,0), (1,1), (2,2), (3,3);
prepare q1 (int) as select * from hashp where a = $1;
execute q1 (1);
execute q1 (1);
execute q1 (1);
execute q1 (1);
execute q1 (1);
explain (analyze, costs off, summary off, timing off)  execute q1 (1);
explain (analyze, costs off, summary off, timing off)  execute q1 (3);
deallocate q1;
drop table hashp;
-- test.sql --

Also, should 0004 consider the "Parallel Append" case, aka

-- parallel.sql --
CREATE TABLE t1 (
a integer NOT NULL,
b integer NOT NULL
) PARTITION BY HASH (b);

CREATE TABLE t1_p00 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 0);
CREATE TABLE t1_p01 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 1);
CREATE TABLE t1_p02 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 2);
CREATE TABLE t1_p03 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 3);

INSERT INTO t1 (SELECT i, i FROM generate_series(1, 100) AS i);
PREPARE q1 (int) AS SELECT * FROM t1 WHERE a = $1;
EXECUTE q1 (5432);
EXECUTE q1 (5432);
EXECUTE q1 (5432);
EXECUTE q1 (5432);
EXECUTE q1 (5432);
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)  EXECUTE q1 (5432);
DEALLOCATE q1;
DROP TABLE t1;
-- parallel.sql --

Best regards,
 Jesper



Re: [HACKERS] path toward faster partition pruning

2018-03-13 Thread Jesper Pedersen

Hi Amit,

On 03/13/2018 07:37 AM, Amit Langote wrote:

I will continue working on improving the comments / cleaning things up and
post a revised version soon, but until then please look at the attached.



Passes check-world.

Some minor comments:

0001: Ok

0002: Ok

0003:
* Trailing white space
* pruning.c
  - partkey_datum_from_expr
* "Add more expression types..." -- Are you planning to add more of 
these ? Otherwise change the comment

  - get_partitions_for_null_keys
* Documentation for method
* 'break;' missing for _HASH and default case
  - get_partitions_for_keys
* 'break;'s are outside of the 'case' blocks
* The 'switch(opstrategy)'s could use some {} blocks
  * 'break;' missing from default
  - perform_pruning_combine_step
* Documentation for method
* nodeFuncs.c
  - Missing 'break;'s to follow style

0004: Ok

Best regards,
 Jesper



EXPLAIN of Parallel Append

2018-03-14 Thread Jesper Pedersen

Hi,

Given

-- test.sql --
CREATE TABLE t1 (
a integer NOT NULL,
b integer NOT NULL
) PARTITION BY HASH (b);
CREATE TABLE t1_p00 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 0);
CREATE TABLE t1_p01 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 1);
CREATE TABLE t1_p02 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 2);
CREATE TABLE t1_p03 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 3);

INSERT INTO t1 (SELECT i, i FROM generate_series(1, 100) AS i);
ANALYZE;
-- test.sql --

Running

EXPLAIN (ANALYZE) SELECT * FROM t1 WHERE a = 5432;

gives

 Gather  (cost=1000.00..12780.36 rows=4 width=8) (actual time=61.270..61.309 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=0.00..11779.96 rows=4 width=8) (actual time=38.915..57.209 rows=0 loops=3)
         ->  Parallel Seq Scan on t1_p01  (cost=0.00..2949.00 rows=1 width=8) (actual time=38.904..38.904 rows=0 loops=1)
               Filter: (a = 5432)
               Rows Removed by Filter: 250376
         ->  Parallel Seq Scan on t1_p03  (cost=0.00..2948.07 rows=1 width=8) (actual time=0.369..47.909 rows=1 loops=1)
               Filter: (a = 5432)
               Rows Removed by Filter: 250248
         ->  Parallel Seq Scan on t1_p02  (cost=0.00..2942.66 rows=1 width=8) (actual time=11.354..11.354 rows=0 loops=3)
               Filter: (a = 5432)
               Rows Removed by Filter: 83262
         ->  Parallel Seq Scan on t1_p00  (cost=0.00..2940.21 rows=1 width=8) (actual time=50.745..50.745 rows=0 loops=1)
               Filter: (a = 5432)
               Rows Removed by Filter: 249589
 Planning time: 0.381 ms
 Execution time: 62.810 ms
(18 rows)

Parallel Append's ntuples is 1, but given nloops is 3 you end up with 
the slightly confusing "(actual ... *rows=0* loops=3)" - presumably 
because EXPLAIN reports rows as the per-loop average, and 1 / 3 displays 
as 0.


Using master (3b7ab438).

Thoughts ?

Best regards,
 Jesper



Re: PostgreSQL 17 Release Management Team & Feature Freeze

2024-04-08 Thread Jesper Pedersen

Hi,

On 4/8/24 14:15, Tomas Vondra wrote:

I think we need to
fix & improve that - not to rework/push it at the very end.



This is going to be very extreme...

Either a patch is ready for merge or it isn't - when two or more 
Committers agree on it, it can be merged - the policy has to be 
discussed, of course.


However, development happens all throughout the year, so having to wait 
for potential feedback during CommitFest windows can stall development 
during the other months - I'm talking about non-Committers here...


Having patches on -hackers@ is best, but maybe there is a hybrid model 
where they exist in pull requests, tested through CfBot, and merged 
when ready - no matter what month it is.


Pull requests could still have labels that tie them to a "CommitFest" 
to highlight them, but it would make it easier to have a much clearer 
cut-off date for feature freeze.


And, pull request labels are easily changed.

March is seeing a lot of last-minute changes...

Just something to think about.

Best regards,
 Jesper





Re: GSOC2023

2023-01-04 Thread Jesper Pedersen

Hi,

On 12/28/22 21:10, diaa wrote:

*Hi Sir.*
I'm a computer engineering student from Egypt, interested in Database Management
Systems.
Can I get details about the list of ideas for the 2023 projects, and how to
prepare myself with the required knowledge?



Thanks for your interest in GSoC 2023 !


The GSoC timeline is described here [1], and the PostgreSQL projects - 
if approved - will be listed here [2].



So, check back in February and see which projects have been proposed.


[1] https://developers.google.com/open-source/gsoc/timeline

[2] https://wiki.postgresql.org/wiki/GSoC_2023


Best regards,

 Jesper






Re: New committers: Nathan Bossart, Amit Langote, Masahiko Sawada

2023-04-24 Thread Jesper Pedersen

On 4/20/23 13:40, Tom Lane wrote:

The Core Team would like to extend our congratulations to
Nathan Bossart, Amit Langote, and Masahiko Sawada, who have
accepted invitations to become our newest Postgres committers.

Please join me in wishing them much success and few reverts.



Huge congrats !


Best regards,

 Jesper






Re: MDAM techniques and Index Skip Scan patch

2022-03-24 Thread Jesper Pedersen

On 3/23/22 18:22, Dmitry Dolgov wrote:


The CF item could be set to RwF, what would you say, Jesper?



We want to thank the community for the feedback that we have received 
over the years for this feature. Hopefully a future implementation can 
use Tom's suggestions to get closer to a committable solution.


Here is the last CommitFest entry [1] for the archives.

RwF

[1] https://commitfest.postgresql.org/37/1741/

Best regards,
 Dmitry & Jesper





Re: Index Skip Scan (new UniqueKeys)

2022-03-24 Thread Jesper Pedersen

On 6/9/20 06:22, Dmitry Dolgov wrote:

Here is a new version of index skip scan patch, based on v8 patch for
UniqueKeys implementation from [1]. I want to start a new thread to
simplify navigation, hopefully I didn't forget anyone who actively
participated in the discussion.



This CommitFest entry has been closed with RwF at [1].

Thanks for all the feedback given !

[1] 
https://www.postgresql.org/message-id/ab8636e7-182f-886a-3a39-f3fc279ca45d%40redhat.com


Best regards,
 Dmitry & Jesper





Re: GSoC: pgmoneta: Write-Ahead Log (WAL) infrastructure (2022)

2022-04-01 Thread Jesper Pedersen

Hi Jinlang,

On 4/1/22 12:39, Jinlang Wang wrote:

To whom it may concern:

Here is my proposal 
(https://docs.google.com/document/d/1KKKDU6iP0GOkAMSdGRyxFJRgW964JFVROnpKkbzWyNw/edit?usp=sharing)
for GSoC. Please check!



Thanks for your proposal to Google Summer of Code 2022 !


We'll follow up off-list to get this finalized.


Best regards,

 Jesper






Re: GSoC: pgmoneta, storage API

2022-04-04 Thread Jesper Pedersen

Hi Mariam,

On 4/4/22 09:16, Mariam Fahmy wrote:

Hope you’re having a good day!
I am Mariam Fahmy, A senior computer and systems engineering student at
faculty of engineering, AinShams university.

I am interested in working with pgmoneta during GSoC 2022.

Here is a link to the draft proposal for implementing storage engine in
pgmoneta:
https://docs.google.com/document/d/1EbRzgfZCDWG6LCD0puil8bUt10aehT4skfZ1YFlbNKk/edit?usp=drivesdk

I would be grateful if you can have a look at it and give me your feedback.



Thanks for your proposal to Google Summer of Code 2022 !


We'll follow up off-list to get this finalized.


Best regards,
 Jesper






Re: GSoC 2022: Proposal of pgmoneta on-disk encryption

2022-04-13 Thread Jesper Pedersen

Hi Jichen,

On 4/13/22 10:40, Solo Kyo wrote:

To whom it may concern:

Hi, I am Jichen Xu. I am a first year master computer science student at
Waseda University.
Here is a link to my proposal:
https://docs.google.com/document/d/1vdgPY5wvhjhrX9aSUw5mTDOnXwEQNcr4cxfzuSHMWlg
Looking forward to working with you in next few months.



Thanks for your proposal to Google Summer of Code 2022 !


We'll follow up off-list to get this finalized.


Best regards,

 Jesper





Re: GSoC: pgmoneta: Write-Ahead Log (WAL) infrastructure (2022)

2022-04-16 Thread Jesper Pedersen

Hi,

On 4/16/22 13:06, dl x wrote:

My name is Donglin Xie, a MSc. student at Zhejiang University, in China. I
am interested in the project *pgmoneta: Write-Ahead Log (WAL)
infrastructure.*

The proposal is attached to this email. Looking forward to the suggestions!



Thanks for your proposal to Google Summer of Code 2022 !

We'll follow up off-list to get this finalized.

Best regards,
 Jesper






Re: GsoC: pgexporter: Custom file

2022-04-17 Thread Jesper Pedersen

Hi,

On 4/17/22 04:43, dl x wrote:

My name is Donglin Xie, a MSc. student at Zhejiang University, in China. I
am interested in the project *pgexporter: Custom_file.*

The proposal is attached to this email. Looking forward to the suggestions!




Thanks for your proposal to Google Summer of Code 2022 !

We'll follow up off-list to get this finalized.

Best regards,
 Jesper






  1   2   3   >