autoprewarm is forgetting to register a tranche.
Hello, I noticed during an investigation that pg_prewarm is forgetting to register a tranche. Before the second parameter of LWLockRegisterTranche() became char * in 3761fe3, that would lead to a crash in a --enable-dtrace build, but currently it is not likely to. On the other hand, as far as I read the comment in lwlock.h, we ought to register a tranche obtained by LWLockNewTrancheId(), and that is still convincing. So I made autoprewarm.c register it. (patch 0002)

> * There is another, more flexible method of obtaining lwlocks. First, call
> * LWLockNewTrancheId just once to obtain a tranche ID; this allocates from a
> * shared counter. Next, LWLockInitialize should be called just once per
> * lwlock, passing the tranche ID as an argument. Finally, each individual
> * process using the tranche should call LWLockRegisterTranche() or
> * LWLockRegisterTrancheOfLock() to associate that tranche ID with a name.

In passing, I found it hard to register a tranche in apw_init_shmem() because a process using the initialized shared memory struct cannot find the tranche id (without intruding into the internals of the LWLock struct). So I'd like to propose another tranche-registering interface, LWLockRegisterTrancheOfLock(LWLock *, int). The interface removes the need to store the tranche id separately from the LWLock. (in patch 0001)

The comment cited above looks a bit different from how the interface is actually used, so I rearranged the comment to follow a typical use I have seen in several cases. (in patch 0001)

+ * There is another, more flexible method of obtaining lwlocks. First, call
+ * LWLockNewTrancheId just once to obtain a tranche ID; this allocates from a
+ * shared counter. Next, LWLockInitialize should be called just once per
+ * lwlock, passing the tranche ID as an argument. Finally, each individual
+ * process using the tranche should call LWLockRegisterTranche() or
+ * LWLockRegisterTrancheOfLock() to associate that tranche ID with a name.

This might seem somewhat odd, but things should happen in this order in real code, since tranche registration usually happens after a lock-initializing block.

The final patch, 0003, is arguable, or anticipated to be rejected. It cannot be detected that a tranche is not registered until its name is actually accessed (and many won't care even if the name were printed as '(null)' in an error message that is the result of the developer's own bug). On the other hand, there's no chance to detect that before the lock is actually used. In the last patch I added an assertion in LWLockAcquire(), but it must hit performance to a certain degree (even if only in --enable-cassert builds), so I don't insist that it must be there.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From 3ac3de48c9c6992bf9137dc65362ada502100c3b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Fri, 15 Dec 2017 14:08:18 +0900
Subject: [PATCH 1/3] Add a new tranche register API

In typical usage of named tranches, it would be useful if we could register
a tranche without needing to remember the tranche id separately in shared
memory. This new function accepts the LWLock itself to specify the tranche
id.
---
 src/backend/storage/lmgr/lwlock.c | 15 +++
 src/include/storage/lwlock.h      | 11 ++-
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 46f5c42..db197a6 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -619,6 +619,21 @@ LWLockRegisterTranche(int tranche_id, const char *tranche_name)
 }
 
 /*
+ * Register the tranche ID of the specified lock in the lookup table for the
+ * current process.
+ */
+void
+LWLockRegisterTrancheOfLock(LWLock *lock, const char *tranche_name)
+{
+	/*
+	 * Different from LWLockRegisterTranche, users can easily give a LWLock of
+	 * builtin tranches.
+	 */
+	Assert(lock->tranche >= LWTRANCHE_FIRST_USER_DEFINED);
+	LWLockRegisterTranche(lock->tranche, tranche_name);
+}
+
+/*
  * RequestNamedLWLockTranche
  *		Request that extra LWLocks be allocated during postmaster
  *		startup.
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 460843d..9f02329 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -172,11 +172,11 @@ extern LWLockPadded *GetNamedLWLockTranche(const char *tranche_name);
 
 /*
  * There is another, more flexible method of obtaining lwlocks. First, call
- * LWLockNewTrancheId just once to obtain a tranche ID; this allocates from
- * a shared counter. Next, each individual process using the tranche should
- * call LWLockRegisterTranche() to associate that tranche ID with a name.
- * Finally, LWLockInitialize should be called just once per lwlock, passing
- * the tranche ID as an argument.
+ * LWLockNewTrancheId just once to obtain a tranche ID; this allocates from a
+ * shared counter. Next, LWLockInitialize should be called just once per
+ * lwlo
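To make the proposed interface concrete, here is a minimal sketch of how an extension's shared-memory setup could use it. Only LWLockNewTrancheId(), LWLockInitialize(), and the proposed LWLockRegisterTrancheOfLock() come from the patch; the struct and function names are illustrative, and locking around the initialization is omitted for brevity.

    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    /* Hypothetical shared struct for an extension like autoprewarm */
    typedef struct MySharedState
    {
    	LWLock		lock;			/* protects the fields below */
    	int			counter;
    } MySharedState;

    static void
    my_init_shmem(void)
    {
    	MySharedState *state;
    	bool		found;

    	state = ShmemInitStruct("my_shared_state", sizeof(MySharedState), &found);
    	if (!found)
    	{
    		/* First process: allocate a tranche id and stash it in the lock */
    		LWLockInitialize(&state->lock, LWLockNewTrancheId());
    		state->counter = 0;
    	}

    	/*
    	 * Every attaching process can now register the tranche name directly
    	 * from the lock, without storing the tranche id separately -- which
    	 * is the point of the proposed API.
    	 */
    	LWLockRegisterTrancheOfLock(&state->lock, "my_shared_state");
    }

Without the new function, the tranche id would have to be kept in the shared struct solely so that other backends could pass it to LWLockRegisterTranche().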
Re: [HACKERS] Surjective functional indexes
On 15.12.2017 01:21, Michael Paquier wrote:
> On Fri, Dec 15, 2017 at 6:15 AM, Alvaro Herrera wrote:
>> Konstantin Knizhnik wrote:
>>> If you still think that the additional 16 bytes per relation in
>>> statistics is too high an overhead, then I will also remove autotune.
>>
>> I think it's pretty clear that these additional bytes are excessive.
>
> The bar to add new fields in PgStat_TableCounts is very high, and one
> attempt to tackle its scaling problems with many relations is here by
> Horiguchi-san:
> https://www.postgresql.org/message-id/20171211.201523.24172046.horiguchi.kyot...@lab.ntt.co.jp
> His patch may be worth a look if you need more fields for your feature.
> So it seems to me that the patch as currently presented has close to zero
> chance to be committed as long as you keep your changes to pgstat.c.

Ok, it looks like everybody thinks that autotune based on statistics is a bad idea. Attached please find the patch without autotune.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 0255375..c4ee15e 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ]
     The optional WITH clause specifies storage parameters for
     the index. Each index method has its own set of allowed
-    storage parameters. The B-tree, hash, GiST and SP-GiST index methods all
-    accept this parameter:
+    storage parameters. All indexes accept the following parameter:
+
+    recheck_on_update
+
+      A functional index is based on a projection function: a function which
+      extracts a subset of its argument. In mathematics such functions are
+      called non-injective. For an injective function, if any attribute used
+      in the indexed expression is changed, then the value of the index
+      expression is also changed, so to check whether the index is affected
+      by an update it is enough to check the set of changed fields. By
+      default this parameter is set to true and the function is considered
+      non-injective. In this case a change to one of the indexed keys does
+      not necessarily mean that the value of the function is changed. For
+      example, the expression expression(bookinfo->>'isbn') defined for a
+      column of JSON type changes only when the ISBN is changed, which
+      rarely happens. The same is true for most functional indexes. For
+      non-injective functions, Postgres compares the values of the indexed
+      expression for the old and updated tuple and updates the index only
+      when the function results differ. This makes it possible to skip the
+      index update and use a HOT update, but it requires extra evaluations
+      of the function. So if the function is expensive, or the probability
+      that a change of an indexed column will not affect the function value
+      is small, then setting recheck_on_update to false may increase update
+      speed.
+
+    The B-tree, hash, GiST and SP-GiST index methods all accept this
+    parameter:
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index aa9c0f1..1aaab78 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -131,6 +131,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"recheck_on_update",
+			"Evaluate functional index expression on update to check if its value is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1311,7 +1320,7 @@ fillRelOptions(void *rdopts, Size basesize,
 				break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
 				 options[i].gen->name);
 	}
@@ -1468,6 +1477,40 @@ index_reloptions(amoptions_function amoptions, Datum reloptions, bool validate)
 }
 
 /*
+ * Parse generic options for all indexes.
+ *
+ *	reloptions	options as text[] datum
+ *	validate	error flag
+ */
+bytea *
+index_generic_reloptions(Datum reloptions, bool validate)
+{
+	int			numoptions;
+	GenericIndexOpts *idxopts;
+	relopt_value *options;
+	static const relopt_parse_elt tab[] = {
+		{"recheck_on_update", RELOPT_TYPE_BOOL,
+		 offsetof(GenericIndexOpts, recheck_on_update)}
+	};
+
+	options = parseRelOptions(reloptions, validate,
+							  RELOPT_KIND_INDEX,
+							  &numoptions);
+
+	/* if none set, we're done */
+	if (numoptions == 0)
+		return NULL;
+
+	idxopts = allocateReloptStruct(sizeof(GenericIndexOpts), options, numoptions);
+
+	fillRelOptions((void *) idxopts, sizeof(GenericIndexOpts), options, numoptions,
+				   validate, tab, lengthof(tab));
+
+	pfree(options);
+
+	return (bytea *) idxopts;
+}
+
+/*
  * Option parser for attribute reloptions
  */
 bytea *
diff --git a/src/backend/acces
Re: GSoC 2018
Hi Stephen,

> New entries are certainly welcome and encouraged, just be sure to note
> them as '2018' when you add it.

I proposed a few ideas:

* High availability / failover based on logical replication
* Thrift datatype support

Hope these ideas are good enough for GSoC.

-- 
Best regards,
Aleksander Alekseev

signature.asc Description: PGP signature
Re: GSoC 2018
Hi,

2017-12-15 4:14 GMT+01:00 Stephen Frost:
> Unsurprisingly, we'll need to have an Ideas page again, so I've gone
> ahead and created one (copying last year's):

What about adding "Learned Index" as a project task [*]? This type of index looks promising for certain properties.

:Stefan

[*] "The Case for Learned Index Structures", Kraska et al. (Dec 2017), https://arxiv.org/abs/1712.01208

2017-12-15 4:14 GMT+01:00 Stephen Frost:
> Greetings -hackers,
>
> Google Summer of Code 2018 was announced back in September and they've
> changed up the timeline a bit [1]. Specifically, they moved up the
> dates for things like the mentoring organization application deadline,
> so it's time to start working on our Ideas page for 2018 in earnest.
>
> The deadline for Mentoring organizations to apply is: January 23.
>
> The list of accepted organizations will be published around February 12.
>
> Unsurprisingly, we'll need to have an Ideas page again, so I've gone
> ahead and created one (copying last year's):
>
> https://wiki.postgresql.org/wiki/GSoC_2018
>
> Google discusses what makes a good "Ideas" list here:
>
> https://google.github.io/gsocguides/mentor/defining-a-project-ideas-list.html
>
> I marked all the entries with '2017' to indicate they were pulled from
> last year. If the project from last year is still relevant, please
> update it to be '2018' and make sure to update all of the information
> (in particular, make sure to list yourself as a mentor and remove the
> other mentors, as appropriate).
>
> New entries are certainly welcome and encouraged, just be sure to note
> them as '2018' when you add them.
>
> Projects from last year which were worked on but have significant
> follow-on work to be completed are absolutely welcome as well - simply
> update the description appropriately and mark it as being for '2018'.
>
> When we get closer to actually submitting our application, I'll clean
> out the '2017' entries that didn't get any updates.
>
> As a reminder, each idea on the page should be in the format that the
> other entries are in and should include:
>
> - Project title/one-line description
> - Brief, 2-5 sentence description of the project (remember, these are
>   12-week projects)
> - Description of programming skills needed and estimation of the
>   difficulty level
> - List of potential mentors
> - Expected Outcomes
>
> As with last year, please consider PostgreSQL to be an "Umbrella"
> project and that anything which would be considered "PostgreSQL Family"
> per the News/Announce policy [2] is likely to be acceptable as a
> PostgreSQL GSoC project.
>
> In other words, if you're a contributor or developer on barman,
> pgBackRest, the PostgreSQL website (pgweb), the PgEU/PgUS website code
> (pgeu-website), pgAdmin4, PostgresXL, pgbouncer, Citus, pldebugger, the
> PG RPMs (pgrpms), the JDBC driver, the ODBC driver, or any of the many
> other PG Family projects, please feel free to add a project for
> consideration! If we get quite a few, we can organize the page further
> based on which project or maybe what skills are needed or similar.
>
> Let's have another great year of GSoC with PostgreSQL!
>
> Thanks!
>
> Stephen
>
> [1]: https://developers.google.com/open-source/gsoc/timeline
> [2]: https://wiki.postgresql.org/wiki/NewsEventsApproval
Reproducible builds: genbki.pl vs schemapg.h
Hi,

Debian's reproducible builds project has revealed that the full build path gets embedded into server/catalog/schemapg.h:

/*-------------------------------------------------------------------------
 *
 * schemapg.h
 *	  Schema_pg_xxx macros for use by relcache.c
 *
 * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * NOTES
 *	******************************
 *	*** DO NOT EDIT THIS FILE! ***
 *	******************************
 *
 *	It has been GENERATED by /build/postgresql-11-05gRdu/build/../src/backend/catalog/genbki.pl
 *
 *-------------------------------------------------------------------------
 */

This information will vary on rebuilding the binaries, needlessly causing different checksums. I'm proposing to strip that down to "It has been GENERATED by genbki.pl". Patch attached.

Christoph

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
new file mode 100644
index 256a9c9..30153e6
*** a/src/backend/catalog/genbki.pl
--- b/src/backend/catalog/genbki.pl
*** print $schemapg <
Re: GSoC 2018
Hi!

> On 15 Dec 2017, at 15:03, Stefan Keller wrote:
> What about adding "Learned Index" as a project task [*]?
> This type of index looks promising for certain properties.

I have no high expectations for this idea, but I feel it is necessary at least to validate it. I'd sign up as co-mentor for this. Moreover, this will attract students from the huge pool of ML hackers.

Also, I'm planning to add my previous-year project about sorting. Or add something WAL-G-related under the Postgres umbrella (probably WAL-scanning for delta-backups).

Best regards,
Andrey Borodin
Package version in PG_VERSION and version()
To be able to identify more easily which package a connected server is coming from, I would like to embed the (Debian) package version in the version() output which is coming from PG_VERSION. It is fairly easy to do that, but it requires patching configure(.in):

$ ./configure VENDOR_VERSION="Debian 10.1-2"

# select version();
                                                      version
═══════════════════════════════════════════════════════════════════════════════════════════════════════════════
 PostgreSQL 10.1 on x86_64-pc-linux-gnu (Debian 10.1-2), compiled by gcc (Debian 7.2.0-17) 7.2.1 20171205, 64-bit

PoC patch against HEAD attached - if the approach is deemed acceptable, I'll also update the relevant documentation bits.

Thanks,
Christoph

diff --git a/configure b/configure
new file mode 100755
index 58eafd3..d68fecd
*** a/configure
--- b/configure
*** fi
*** 16773,16779 ****
  cat >>confdefs.h <<_ACEOF
! #define PG_VERSION_STR "PostgreSQL $PG_VERSION on $host, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"
  _ACEOF
--- 16773,16779 ----
  cat >>confdefs.h <<_ACEOF
! #define PG_VERSION_STR "PostgreSQL $PG_VERSION on $host${VENDOR_VERSION:+ ($VENDOR_VERSION)}, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"
  _ACEOF
diff --git a/configure.in b/configure.in
new file mode 100644
index 5245899..dccf5b2
*** a/configure.in
--- b/configure.in
*** else
*** 2185,2191 ****
  fi
  AC_DEFINE_UNQUOTED(PG_VERSION_STR,
! ["PostgreSQL $PG_VERSION on $host, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"],
  [A string containing the version number, platform, and C compiler])
--- 2185,2191 ----
  fi
  AC_DEFINE_UNQUOTED(PG_VERSION_STR,
! ["PostgreSQL $PG_VERSION on $host${VENDOR_VERSION:+ ($VENDOR_VERSION)}, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"],
  [A string containing the version number, platform, and C compiler])

  # Supply a numeric version string for use by 3rd party add-ons
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Fri, Dec 15, 2017 at 11:30 AM, Andres Freund wrote: > Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you > could have a look at the backported version, just about everything but > v10 had conflicts, some of them not insubstantial. I have gone through the backpatched versions for the fixes in tuple pruning, running some tests on the way and those look good to me. I have not taken the time to go through the ones changing the assertions to ereport() though... -- Michael
Re: Package version in PG_VERSION and version()
On Fri, Dec 15, 2017 at 7:46 PM, Christoph Berg wrote:
> To be able to identify more easily which package a connected server is
> coming from, I would like to embed the (Debian) package version in the
> version() output which is coming from PG_VERSION. It is fairly easy to
> do that, but it requires patching configure(.in):

Why reinvent the wheel when there is already --with-extra-version that you can use for the same purpose? And I think that I can see a bug here on HEAD: src/tools/msvc/Solution.pm correctly uses --with-extra-version in PG_VERSION_STR, but ./configure is forgetting it. If you want to push for this proposal anyway, I think that you should update the msvc scripts as well.
-- 
Michael
Re: GSoC 2018
Stefan,

* Stefan Keller (sfkel...@gmail.com) wrote:
> 2017-12-15 4:14 GMT+01:00 Stephen Frost:
>> Unsurprisingly, we'll need to have an Ideas page again, so I've gone
>> ahead and created one (copying last year's):
>
> What about adding "Learned Index" as a project task [*]?
> This type of index looks promising for certain properties.

If you can provide a *specific* description with expected outcomes and we have someone who can mentor such a project, then I think it would be acceptable. We would need to get any potential students on the mailing list before the project is submitted, to try and fill out the specifics around what such an index would actually look like too, I think. Without that guidance, the student proposal would be pretty hard to accept, I suspect.

Thanks!

Stephen

signature.asc Description: Digital signature
Re: Package version in PG_VERSION and version()
Re: Michael Paquier 2017-12-15
> On Fri, Dec 15, 2017 at 7:46 PM, Christoph Berg wrote:
>> To be able to identify more easily which package a connected server is
>> coming from, I would like to embed the (Debian) package version in the
>> version() output which is coming from PG_VERSION. It is fairly easy to
>> do that, but it requires patching configure(.in):
>
> Why reinvent the wheel when there is already --with-extra-version
> that you can use for the same purpose?

That modifies the PG version number as such, i.e. what psql shows on connect. I'd think that is too intrusive. And it doesn't work anyway:

$ ./configure --with-extra-version ' (Debian 10.1-2)'
configure: WARNING: you should use --build, --host, --target
configure: WARNING: invalid host type:  (Debian 10.1-2)
configure: error: argument required for --with-extra-version option

> And I think that I can see a bug here on HEAD:
> src/tools/msvc/Solution.pm correctly uses --with-extra-version in
> PG_VERSION_STR, but ./configure is forgetting it. If you want to push
> for this proposal anyway, I think that you should update the msvc
> scripts as well.

configure.in looks right, it includes the extra version right in PG_VERSION:

PGAC_ARG_REQ(with, extra-version, [STRING], [append STRING to version],
             [PG_VERSION="$PACKAGE_VERSION$withval"],
             [PG_VERSION="$PACKAGE_VERSION"])
AC_DEFINE_UNQUOTED(PG_VERSION, "$PG_VERSION", [PostgreSQL version as a string])
AC_DEFINE_UNQUOTED(PG_VERSION_STR,
                   ["PostgreSQL $PG_VERSION on $host, compiled by $cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit"],
                   [A string containing the version number, platform, and C compiler])

I'll update the msvc scripts once we figure out where my idea of "vendor package version" is best placed.

Best regards,
Christoph Berg
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Re: GSoC 2018
Aleksander, * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote: > > New entries are certainly welcome and encouraged, just be sure to note > > them as '2018' when you add it. > > I proposed a few ideas: Thanks! > * High availability / failover based on logical replication > * Thrift datatype support > > Hope these ideas are good enough for GSoC. The main things are to have a detailed enough description that someone can write a project plan about what they're going to do over the summer on that project and the project is of a reasonable size. HA/fail-over is a very broad topic, with a lot of pieces that need to be done such that I'm not sure it's really viable, but perhaps a precursor project (synchronous logical replication seems like a prereq, no?) would make more sense. Or, perhaps, a different piece of the HA question, but solving the whole thing in a summer strikes me as unlikely to be reasonable. Regarding the thrift data type, that seems like a pretty good GSoC project, though I'm not sure why you suggest having pl/pgsql functions for accessing data from .thrift files- plpgsql can't directly access the filesystem and the input routines for .thrift-style data would certainly be best in C. Thanks! Stephen signature.asc Description: Digital signature
Re: GSoC 2018
Hi Stephen,

> HA/fail-over is a very broad topic, with a lot of pieces that need to be
> done such that I'm not sure it's really viable, but perhaps a precursor
> project (synchronous logical replication seems like a prereq, no?) would
> make more sense. Or, perhaps, a different piece of the HA question, but
> solving the whole thing in a summer strikes me as unlikely to be
> reasonable.

Frankly, I got the impression that logical replication in PostgreSQL 10 is as synchronous as physical replication, in the sense that it treats synchronous_commit the same way and gives exactly the same guarantees. Is there a problem with the current implementation of logical replication I'm not aware of? Or by synchronous logical replication did you mean something different?

Regarding the difficulty of the project - in fact, it's not that difficult. In particular, this project can rely on external tools, e.g. use Consul for service discovery and leader election based on the leader-lease approach (implementation [1]). With that, the only thing missing is automatic replica promotion and (optionally) re-configuration of HAProxy / pgbouncer / whatever. Yes, and lots of Jepsen-like tests, of course. I believe it's not such a complicated project.

> Regarding the thrift data type, that seems like a pretty good GSoC
> project, though I'm not sure why you suggest having pl/pgsql functions
> for accessing data from .thrift files- plpgsql can't directly access the
> filesystem and the input routines for .thrift-style data would certainly
> be best in C.

What I meant was generating PL/pgSQL procedures from .thrift files, like `MyStructure_getFieldX(bytea) returns text`. It took me a few hours to write pg_protobuf, so I decided the project would be way too easy without implementing such a tool. I changed the text on the wiki; hopefully it's better now.

[1]: https://github.com/afiskon/go-elector

-- 
Best regards,
Aleksander Alekseev

signature.asc Description: PGP signature
Re: [HACKERS] UPDATE of partition key
On Wed, Dec 13, 2017 at 5:18 AM, Amit Khandekar wrote:
> Amit Langote informed me off-list - along with suggestions for
> changes - that my patch needs a rebase. Attached is the rebased
> version. I have also bumped the patch version number (now v29),
> because this has additional changes, again suggested by Amit L:
> Because ExecSetupPartitionTupleRouting() has an mtstate parameter now,
> there is no need to pass update_rri and num_update_rri, since they can
> be retrieved from mtstate.
>
> Also, the preparatory patch is also rebased.

Reviewing the preparatory patch:

+	PartitionTupleRouting *partition_tuple_routing;
+	/* Tuple-routing support info */

Something's wrong with the formatting here.

-						PartitionDispatch **pd,
-						ResultRelInfo ***partitions,
-						TupleConversionMap ***tup_conv_maps,
-						TupleTableSlot **partition_tuple_slot,
-						int *num_parted, int *num_partitions)
+						PartitionTupleRouting **partition_tuple_routing)

Since we're consolidating all of ExecSetupPartitionTupleRouting's output parameters into a single structure, I think it might make more sense to have it just return that value. I think it's only done with output parameters today because there are so many different things being produced, and we can't return them all.

+	PartitionTupleRouting *ptr = mtstate->mt_partition_tuple_routing;

This is just nitpicking, but I don't find "ptr" to be the greatest variable name; it looks too much like "pointer". Maybe we could use "routing" or "proute" or something.

It seems to me that we could improve things here by adding a function ExecCleanupTupleRouting(PartitionTupleRouting *) which would do the various heap_close(), ExecDropSingleTupleTableSlot(), and ExecCloseIndices() operations which are currently performed in CopyFrom() and, by separate code, in ExecEndModifyTable().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
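To sketch what that last consolidation might look like: the following is a rough, hypothetical version of ExecCleanupTupleRouting(). The PartitionTupleRouting field names (num_dispatch, num_partitions, and so on) are invented here, since the struct layout is not shown in this thread.

    /*
     * Hypothetical consolidation of the cleanup currently duplicated in
     * CopyFrom() and ExecEndModifyTable().  Field names are illustrative only.
     */
    static void
    ExecCleanupTupleRouting(PartitionTupleRouting *proute)
    {
    	int			i;

    	/* Close the partition-dispatch relations; the root is the caller's */
    	for (i = 1; i < proute->num_dispatch; i++)
    		heap_close(proute->partition_dispatch_info[i]->reldesc, NoLock);

    	/* Close the leaf partition relations and their indexes */
    	for (i = 0; i < proute->num_partitions; i++)
    	{
    		ResultRelInfo *resultRelInfo = proute->partitions + i;

    		ExecCloseIndices(resultRelInfo);
    		heap_close(resultRelInfo->ri_RelationDesc, NoLock);
    	}

    	/* Release the dedicated slot used while routing tuples */
    	if (proute->partition_tuple_slot)
    		ExecDropSingleTupleTableSlot(proute->partition_tuple_slot);
    }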
Re: GSoC 2018
Aleksander, * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote: > > HA/fail-over is a very broad topic, with a lot of pieces that need to be > > done such that I'm not sure it's really viable, but perhaps a precursor > > project (synchronous logical replication seems like a prereq, no?) would > > make more sense. Or, perhaps, a different piece of the HA question, but > > solving the whole thing in a summer strikes me as unlikely to be > > reasonable. > > Frankly I got the impression that logical replication in PostgreSQL 10 > is as synchronous as physical replication in terms that it treats > synchronous_commit the same way and gives exactly same guarantees. Is > there a problem with current implementation of logical replication I'm > not aware of? Or by synchronous logical replication you meant something > different? synchronous_commit isn't the same as synchronous_standby_names ... We could have used better names for those GUCs, I suspect. > Regarding the difficulty of the project - in fact it's not that > difficult. Particularly this project can rely on external tools, e.g. > use Consul for service discovery and leader election based on > leader-lease approach (implementation [1]). Having this the only thing > is missing is automatic replica promoting and (optionally) > re-configuring of HAProxy / pgbouncer / whatever. Yes, and lots of > Jepsen-like test of course. I believe it's not such a complicated > project. What you're talking about is rebuilding Patroni, but adding more into it than even Patroni tries to do, building it on Logical Replication instead of physical replication, and calling it simple and something that could get into core PostgreSQL over a 12 week GSoC project. I've certainly got doubts about that, even if we decide that it'd be an external-to-PG project (like Patroni). What might be interesting is seeing if Logical Replication could be added to Patroni as an option and then building on that.. Having someone involved in the Patroni project would be the right way to go about proposing that though to see what they think of it. That would also be much more sensible as a GSoC project, since it'd be an addition to an existing project and not more-or-less starting a whole new project. > > Regarding the thrift data type, that seems like a pretty good GSoC > > project, though I'm not sure why you suggest having pl/pgsql functions > > for accessing data from .thrift files- plpgsql can't directly access the > > filesystem and the input routines for .thrift-style data would certainly > > be best in C. > > What I meant was generating PL/pgSQL procedures from .thift files, like > `MyStructure_getFieldX(bytea) returns text`. It took me a few hours to > write pg_protobuf so I decided the project would be way to easy without > implementing such tool. I changed the text on wiki, hopefully it's > better now. Wouldn't it make more sense to have something like what we have for JSONB where there's operators that are used to pull out specific fields instead of generated pl/pgsql procedures..? Thanks! Stephen signature.asc Description: Digital signature
Re: perltidy source code has been removed (pgindent)
On 12/14/2017 08:55 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> On 12/14/2017 08:37 PM, Jordan Deitch wrote:
>>> I am unable to build pgindent as it appears the perltidy source has
>>> been removed from sourceforge:
>>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/pgindent/README;hb=HEAD
>> I just downloaded it from there.
>
> The 20090616 release we recommend? I don't see it there either:
>
> https://sourceforge.net/projects/perltidy/files/
>
> Maybe it's time to update to a less hoary version?
>
> In any case, you don't need perltidy unless you intend to reindent
> the Perl code.

Oh, oops, got the wrong version, you're right, sorry for the noise. Yeah, seems like a good time to update it. I'm currently using 20170501 for the buildfarm code.

cheers

andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pgbench more operators & functions
>> 2) In makeVariableValue():
>>     if (pg_strcasecmp(var->svalue, "null") == 0)
>>     ...
>>     else if (pg_strncasecmp(var->svalue, "true", slen)
>>
>> mixing of pg_strcasecmp and pg_strNcasecmp. And, IMHO, the result of
>> pg_strncasecmp("tru", "true", 1) will be 0.
>
> Yep, but it cannot be called like that because slen == strlen(var->svalue).

Sorry, I mistyped: pg_strncasecmp("tru", "true", 3) will be 0.

>> It may be good for 't' or 'f', but it seems too loose a grammar to accept
>> 'tr' or 'fa' or even 'o', which is not actually known to be on or off.
>
> Yes, it really works like that. I tried to make something clearer than
> "src/bin/psql/variable.c". Maybe I did not succeed.

Ok, I see. The current coding accepts truexxx, falsexxx, yesxx, noxxx but doesn't accept offxxx and onxxx - not as consistent as it could be. Also it doesn't accept 1 and 0 as psql does, but it's obvious why.

> I used "PGBT_NO_VALUE" which seemed clearer, since otherwise a variable
> may be set while its value says "not set", which would look strange.

Agree.

Sorry, but I found more things to note:

1) '~' and unary '-' should be commented; it's not easy to guess how they are actually implemented (-1 XOR value, remember that -1 is 0xf...f).

2)
-	| expr '%' expr { $$ = make_op(yyscanner, "%", $1, $3); }
+	| expr '%' expr { $$ = make_op(yyscanner, "mod", $1, $3); }

Why is the MOD operation renamed? Did I miss something in the thread?

Looking at the psql and pgbench scripting implementations, wouldn't it be better to integrate Lua into psql & pgbench?

-- 
Teodor Sigaev  E-mail: teo...@sigaev.ru
               WWW: http://www.sigaev.ru/
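For readers following the pg_strncasecmp() point above, a self-contained demonstration of why any prefix of the keyword matches when the comparison length is the length of the user's string. The helper name here is made up; the actual pgbench code differs.

    #include "postgres_fe.h"		/* pulls in port.h, which declares pg_strncasecmp() */

    #include <string.h>

    static bool
    looks_like_true(const char *svalue)
    {
    	size_t		slen = strlen(svalue);

    	/*
    	 * Only the first slen bytes are compared, so any non-empty prefix of
    	 * "true" -- "t", "tr", "tru", "true", in any casing -- returns 0 here.
    	 */
    	return slen > 0 && pg_strncasecmp(svalue, "true", slen) == 0;
    }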
Re: GSoC 2018
Hi Stephen,

> synchronous_commit isn't the same as synchronous_standby_names ...

What about synchronous_standby_names? Does logical replication ignore it and not wait for acks from replicas, or something?

> What might be interesting is seeing if Logical Replication could be
> added to Patroni as an option and then building on that.. Having
> someone involved in the Patroni project would be the right way to go
> about proposing that though to see what they think of it. That would
> also be much more sensible as a GSoC project, since it'd be an addition
> to an existing project and not more-or-less starting a whole new
> project.

Completely agree, this project could be an improvement for Stolon (or Patroni, though I have personally never tested or used it; also, I have a feeling that the Google folks will prefer a project written in Go). This would make much more sense.

> Wouldn't it make more sense to have something like what we have for
> JSONB where there's operators that are used to pull out specific fields
> instead of generated pl/pgsql procedures..?

I don't see why we can't have both procedures and operators like:

```
x -> 'field1'   -- returns int
x ->> 'field2'  -- returns string
```

The reason why some users may prefer procedures to operators is that it's easy to make a typo in a field name. Procedures are safer in this regard.

-- 
Best regards,
Aleksander Alekseev

signature.asc Description: PGP signature
[HACKERS] wrong t_bits alignment in pageinspect
Hi!

I found a problem in how the heap_page_items function exposes the value of the t_bits field. When the number of attributes in a table is a multiple of eight, the t_bits column shows double the number of bits, with bits from the data fields included. For example:

# create table tbl(f1 int, f2 int, f3 int, f4 int, f5 int, f6 int, f7 int, f8 int);
# insert into tbl(f1, f8) values (x'f1'::int, 0);
# select t_bits, t_data from heap_page_items(get_raw_page('tbl', 0));
  t_bits  | t_data
----------+--------
 10011000 | \xf100

I suppose the prefix 1001 corresponds to the real value of t_bits, and the rest, 1000, to the lower byte of the f1 field of tbl. The attached patch fixes this issue.

-- 
Regards,
Maksim Milyutin

diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index ca4d3f5..fe53a1a 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -234,7 +234,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 				int			bits_len;
 
 				bits_len =
-					((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
+					TYPEALIGN(8, (tuphdr->t_infomask2 & HEAP_NATTS_MASK));
 				values[11] = CStringGetTextDatum(
 								 bits_to_text(tuphdr->t_bits, bits_len));
 			}
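To see the off-by-eight concretely: the old formula always rounds up to the next multiple of eight, even when the attribute count is already a multiple of eight, while TYPEALIGN only rounds up when needed. A standalone check (the TYPEALIGN definition is copied from src/include/c.h):

    #include <stdio.h>
    #include <stdint.h>

    /* as defined in src/include/c.h */
    #define TYPEALIGN(ALIGNVAL, LEN) \
    	(((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

    int
    main(void)
    {
    	for (int natts = 7; natts <= 9; natts++)
    		printf("natts=%d  old=%d  new=%d\n",
    			   natts,
    			   (natts / 8 + 1) * 8,			/* old formula */
    			   (int) TYPEALIGN(8, natts));	/* patched formula */
    	return 0;
    }

    /*
     * Output:
     *   natts=7  old=8   new=8
     *   natts=8  old=16  new=8    <-- the bug: eight extra bits displayed
     *   natts=9  old=16  new=16
     */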
Re: GSoC 2018
Aleksander Alekseev writes:
> Hi Stephen,
>
>> synchronous_commit isn't the same as synchronous_standby_names ...
>
> What about synchronous_standby_names? Logical replication ignores it and
> doesn't wait for acks from replicas or something?

It actually works, see [1]:

    The name of a standby server for this purpose is the application_name
    setting of the standby, as set in the standby's connection information.
    In case of a physical replication standby, this should be set in the
    primary_conninfo setting in recovery.conf; the default is walreceiver.
    For logical replication, this can be set in the connection information
    of the subscription, and it defaults to the subscription name.

[1] https://www.postgresql.org/docs/current/static/runtime-config-replication.html

-- 
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: GSoC 2018
Hello,

On Fri, Dec 15, 2017, at 14:30, Stephen Frost wrote:
> Aleksander,
>
> * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote:
>> Regarding the difficulty of the project - in fact it's not that
>> difficult. Particularly this project can rely on external tools, e.g.
>> use Consul for service discovery and leader election based on the
>> leader-lease approach (implementation [1]). Having this, the only thing
>> missing is automatic replica promotion and (optionally) re-configuration
>> of HAProxy / pgbouncer / whatever. Yes, and lots of Jepsen-like tests of
>> course. I believe it's not such a complicated project.

Does it make sense to address the limitations of logical replication first, i.e. its inability to replicate DDL, sequences and so on?

> What you're talking about is rebuilding Patroni, but adding more into it
> than even Patroni tries to do, building it on Logical Replication
> instead of physical replication, and calling it simple and something
> that could get into core PostgreSQL over a 12 week GSoC project. I've
> certainly got doubts about that, even if we decide that it'd be an
> external-to-PG project (like Patroni).
>
> What might be interesting is seeing if Logical Replication could be
> added to Patroni as an option and then building on that.. Having
> someone involved in the Patroni project would be the right way to go
> about proposing that though to see what they think of it. That would
> also be much more sensible as a GSoC project, since it'd be an addition
> to an existing project and not more-or-less starting a whole new
> project.

Right now logical replication and physical replication-based HA tools don't work together nicely, since the logical replication position is not propagated to the promoted replica (I think Craig Ringer has been tackling this issue for a few releases already; the latest set of patches I could find is https://commitfest.postgresql.org/15/788/). Perhaps there is an opportunity for a GSoC student to help fix it. Until then we cannot use logical replication for HA, and even doing something simpler, like automating the creation of logical replicas in Patroni, makes little sense, as they are doomed to be reinitialized on every failover.

-- 
Sincerely,
Alex
Re: GSoC 2018
Alex, * Alex Kliukin (al...@hintbits.com) wrote: > On Fri, Dec 15, 2017, at 14:30, Stephen Frost wrote: > > * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote: > > > > > Regarding the difficulty of the project - in fact it's not that > > > difficult. Particularly this project can rely on external tools, e.g. > > > use Consul for service discovery and leader election based on > > > leader-lease approach (implementation [1]). Having this the only thing > > > is missing is automatic replica promoting and (optionally) > > > re-configuring of HAProxy / pgbouncer / whatever. Yes, and lots of > > > Jepsen-like test of course. I believe it's not such a complicated > > > project. > > Does it make sense to address the limitations of the logical > replication first, i.e. inability to replicate DDL, sequences and so on? This is what I was trying to get at earlier, which was mostly just around trying to scope the project down to something a bit more reasonable since the initial proposal sounded like it could be a whole new and independent OSS project on its own. > > What you're talking about is rebuilding Patroni, but adding more into it > > than even Patroni tries to do, building it on Logical Replication > > instead of physical replication, and calling it simple and something > > that could get into core PostgreSQL over a 12 week GSoC project. I've > > certainly got doubts about that, even if we decide that it'd be an > > external-to-PG project (like Patroni). > > > > What might be interesting is seeing if Logical Replication could be > > added to Patroni as an option and then building on that.. Having > > someone involved in the Patroni project would be the right way to go > > about proposing that though to see what they think of it. That would > > also be much more sensible as a GSoC project, since it'd be an addition > > to an existing project and not more-or-less starting a whole new > > project. > > Right now logical replication and physical replication-based HA tools > don't work together nicely, since logical replication position is not > propagated to the promoted replica (I think Craig Ringer has been > tackling this issue for a few releases already, the latest set of > patches I could find is https://commitfest.postgresql.org/15/788/). > Perhaps there is opportunity for a GSoC student to help fixing it. Until > then we cannot use logical replication for HA, and even doing something > simpler like automating creation of logical replicas in Patroni makes > little sense, as they are doomed to be reinitialized on every failover. This also sounds like a good project for a GSoC student. To be clear, we can certainly have project ideas on the Ideas page which build on things that aren't yet committed or to help with existing projects that are already underway. The project needs to be the right level of effort (remember- this will be the student's job for 12 weeks) and needs to be something that a student would be able to come up to speed on and make serious and meaningful progress on during their summer. When it comes to timeline, student proposals start being accepted in March. Thanks! Stephen signature.asc Description: Digital signature
Re: GSoC 2018
On Fri, Dec 15, 2017, at 14:52, Aleksander Alekseev wrote: > Completely agree, this project can be an improvement for Stolon (or > Patroni, but I personally never tested or used it, also I got a feeling > that Google guys will prefer a project that is written in Go). This > would make much more sense. I don't believe Google will reject a project based on the fact that it is written in Python (in fact, Python Software Foundation has successfully participated in GSoC for many years). Based on the github statistics, Patroni has started earlier and has more contributors than Stolon (including those contributed more than one patch/pull-request.) -- Sincerely, Alex
Re: GSoC 2018
Alex, * Alex Kliukin (al...@hintbits.com) wrote: > On Fri, Dec 15, 2017, at 14:52, Aleksander Alekseev wrote: > > Completely agree, this project can be an improvement for Stolon (or > > Patroni, but I personally never tested or used it, also I got a feeling > > that Google guys will prefer a project that is written in Go). This > > would make much more sense. > > I don't believe Google will reject a project based on the fact that it > is written in Python (in fact, Python Software Foundation has > successfully participated in GSoC for many years). Based on the github > statistics, Patroni has started earlier and has more contributors than > Stolon (including those contributed more than one patch/pull-request.) I am 100% certain that Google will be more than happy to accept a Python-based project and that they won't favor a Go project over a Python one. That said, I don't think we need to get into a discussion about which project would be best- I'd be happy to have them both listed on the Ideas page (perhaps independently would be best though) provided there are mentors for each. Also, remember, these are just our ideas- students are welcome to pick which they're interested in or to choose their own. When we get to the point of evaluating the student proposals, it will be a discussion among the GSoC admins and the individuals who have signed up to be mentors. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pgbench more operators & functions
2017-12-15 14:47 GMT+01:00 Teodor Sigaev:
>>> 2) In makeVariableValue():
>>>     if (pg_strcasecmp(var->svalue, "null") == 0)
>>>     ...
>>>     else if (pg_strncasecmp(var->svalue, "true", slen)
>>>
>>> mixing of pg_strcasecmp and pg_strNcasecmp. And, IMHO, the result of
>>> pg_strncasecmp("tru", "true", 1) will be 0.
>>
>> Yep, but it cannot be called like that because slen == strlen(var->svalue).
>
> Sorry, I mistyped: pg_strncasecmp("tru", "true", 3) will be 0.
>
>>> It may be good for 't' or 'f', but it seems too loose a grammar to accept
>>> 'tr' or 'fa' or even 'o', which is not actually known to be on or off.
>>
>> Yes, it really works like that. I tried to make something clearer than
>> "src/bin/psql/variable.c". Maybe I did not succeed.
>
> Ok, I see. The current coding accepts truexxx, falsexxx, yesxx, noxxx but
> doesn't accept offxxx and onxxx - not as consistent as it could be. Also
> it doesn't accept 1 and 0 as psql does, but it's obvious why.
>
>> I used "PGBT_NO_VALUE" which seemed clearer, since otherwise a variable
>> may be set while its value says "not set", which would look strange.
>
> Agree.
>
> Sorry, but I found more things to note:
>
> 1) '~' and unary '-' should be commented; it's not easy to guess how they
> are actually implemented (-1 XOR value, remember that -1 is 0xf...f).
>
> 2)
> -	| expr '%' expr { $$ = make_op(yyscanner, "%", $1, $3); }
> +	| expr '%' expr { $$ = make_op(yyscanner, "mod", $1, $3); }
>
> Why is the MOD operation renamed? Did I miss something in the thread?
>
> Looking at the psql and pgbench scripting implementations, wouldn't it be
> better to integrate Lua into psql & pgbench?

I don't think so - I like the Lua language, but it would mean a new dependency, and not a small one. These changes are only a few lines, and nobody expects to build complex applications in pgbench or psql.

Regards

Pavel

> --
> Teodor Sigaev  E-mail: teo...@sigaev.ru
>                WWW: http://www.sigaev.ru/
Re: Package version in PG_VERSION and version()
Christoph Berg writes: > Re: Michael Paquier 2017-12-15 > >> Why reinventing the wheel when there is already --with-extra-version >> that you can use for the same purpose? > That modifies the PG version number as such, as what psql is showing > on connect. I'd think that is too intrusive. I'm really pretty much -1 on having two different ways to do very nearly the same thing, with the differences determined only by somebody's arbitrary choices of where they think the modified version should be exposed. IMO, either you think the Debian package version is important enough to show, or you don't. (I'd incline to the "don't" side anyway.) regards, tom lane
Re: Reproducible builds: genbki.pl vs schemapg.h
Christoph Berg writes: > Debian's reproducible builds project has revealed that the full build > path gets embedded into server/catalog/schemapg.h: genbki.pl is hardly our only script that prints its $0 ... regards, tom lane
[PROPOSAL] bracketed-paste support for psql
Hi,

It occurred to me the other day, while people were talking about pasting blocks of text creating problems (especially with tabs), that xterm's bracketed-paste support, which would block curses handling and just paste as-is (it also works in at least PuTTY and probably others), would be a useful (and I'm guessing relatively simple) thing to add.

Is anyone already working on this? If not, does anyone foresee any problems with the idea?

Geoff
Re: [PROPOSAL] bracketed-paste support for psql
On 12/15/17 11:22, Geoff Winkless wrote:
> It occurred to me the other day while people were talking about
> pasting blocks of text creating problems, especially with tabs, that
> xterm bracketed-paste support (also works in at least putty and
> probably others) that would block curses handling and just paste as-is
> would be a useful (and I'm guessing relatively simple) thing to add.

You need to put

    set enable-bracketed-paste on

into ~/.inputrc, then it works.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] replace GrantObjectType with ObjectType
On 12/13/17 02:35, Michael Paquier wrote: > Patch 0001 is simply removing EventTriggerSupportsGrantObjectType(), > but shouldn't we keep it and return an error for objects that have no > GRANT support? Returning conditionally true looks like a trap waiting > to take someone in. I don't understand the motivation for this. It would just be two lists for the same thing. I think the potential for omission would be much greater that way. > Similarly, not using default in the switches of > stringify_adefprivs_objtype() and stringify_grantobjtype() would be > nice to grab warnings during compilation. And patch 0002 is doing it > the correct way in aclcheck_error(). I'll fix that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] replace GrantObjectType with ObjectType
On 12/14/17 22:59, Rushabh Lathia wrote: > I noted that no_priv_msg and not_owner_msg array been removed > and code fitted the code into aclcheck_error(). Actually that > makes the code more complex then what it used to be. I would > prefer the array rather then code been fitted into the function. There is an argument for having a big array versus the switch/case approach. But most existing code around object addresses uses the switch/case approach, so it's better to align it that way, I think. It's weird to have to maintain two different styles. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PROPOSAL] bracketed-paste support for psql
On 15 December 2017 at 17:13, Peter Eisentraut wrote: > You need to put > > set enable-bracketed-paste on > > into ~/.inputrc, then it works. Hmm, looks like that requires a newer version of readline (v7) than I have here. Oh well, if support is already there (albeit unavailable) then I'll leave it. No point duplicating effort. Geoff
Re: Reproducible builds: genbki.pl vs schemapg.h
Re: Tom Lane 2017-12-15 <9616.1513351...@sss.pgh.pa.us> > Christoph Berg writes: > > Debian's reproducible builds project has revealed that the full build > > path gets embedded into server/catalog/schemapg.h: > > genbki.pl is hardly our only script that prints its $0 ... As per https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, that's the only place that makes it into the resulting binary. I wouldn't be sending a patch if it didn't fix the issue. Christoph -- Senior Berater, Tel.: +49 2166 9901 187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Dec 14, 2017 at 6:30 PM, Andres Freund wrote:
> Pushed this way. Moved some more relfrozenxid/relminmxid tests outside
> of the cutoff changes, polished some error messages.
>
> Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you
> could have a look at the backported version, just about everything but
> v10 had conflicts, some of them not insubstantial.

I have one minor piece of feedback on the upgrading of assertions to ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could upgrade the raw elog() "can't happen" error within IndexBuildHeapRangeScan() to be an ereport() with ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent tuple for heap-only tuple" error, which I think merits being a real user-visible error, just like the relfrozenxid/relminmxid tests are now. As you know, the enhanced amcheck will sometimes detect corruption due to this bug by hitting that error.

It would be nice if we could tighten up the number of errcodes that can be involved in an error that amcheck detects. I know that elog() implicitly has an errcode that could be included in the errcodes to check when verification occurs in an automated fashion across a large number of databases. However, this is a pretty esoteric point, and I'd prefer to just try to limit it to ERRCODE_DATA_CORRUPTION and ERRCODE_INDEX_CORRUPTED, insofar as that's practical. When we ran amcheck on the Heroku fleet, back when I worked there, there were all kinds of uninteresting errors that could occur that needed to be filtered out. I want to try to make that process somewhat less painful.

-- 
Peter Geoghegan
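The requested change is mechanical; a hedged sketch of the before and after (the message text and variable names are approximate and may differ slightly across branches):

    /* before: elog() implicitly reports the generic internal-error errcode */
    elog(ERROR, "failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"",
    	 ItemPointerGetBlockNumber(&heapTuple->t_self),
    	 ItemPointerGetOffsetNumber(&heapTuple->t_self),
    	 RelationGetRelationName(heapRelation));

    /*
     * after: classify it as detected data corruption; errmsg_internal()
     * keeps the message untranslated, as befits a can't-happen condition
     */
    ereport(ERROR,
    		(errcode(ERRCODE_DATA_CORRUPTION),
    		 errmsg_internal("failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"",
    						 ItemPointerGetBlockNumber(&heapTuple->t_self),
    						 ItemPointerGetOffsetNumber(&heapTuple->t_self),
    						 RelationGetRelationName(heapRelation))));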
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-12-15 20:25:22 +0900, Michael Paquier wrote: > On Fri, Dec 15, 2017 at 11:30 AM, Andres Freund wrote: > > Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you > > could have a look at the backported version, just about everything but > > v10 had conflicts, some of them not insubstantial. > > I have gone through the backpatched versions for the fixes in tuple > pruning, running some tests on the way and those look good to me. Thanks. > I have not taken the time to go through the ones changing the > assertions to ereport() though... Those were the ones with a lot of conflicts tho - I'd temporarily broken freezing for 9.3, but both review and testing caught it... Greetings, Andres Freund
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-12-15 10:46:05 -0800, Peter Geoghegan wrote: > On Thu, Dec 14, 2017 at 6:30 PM, Andres Freund wrote: > > Pushed this way. Moved some more relfrozenxid/relminmxid tests outside > > of the cutoff changes, polished some error messages. > > > > > > Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you > > could have a look at the backported version, just about everything but > > v10 had conflicts, some of them not insubstantial. > > I have one minor piece of feedback on the upgrading of assertions to > ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could > upgrade the raw elog() "can't happen" error within > IndexBuildHeapRangeScan() to be an ereport() with > ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent > tuple for heap-only tuple" error, which I think merits being a real > user-visible error, just like the relfrozenxid/relminmxid tests are > now. As you know, the enhanced amcheck will sometimes detect > corruption due to this bug by hitting that error. I'm not opposed to that, it just seems independent from this thread. Not sure I really want to go around and backpatch such a change, that code has changed a bit between branches. Happy to do so on master. Greetings, Andres Freund
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Fri, Dec 15, 2017 at 10:58 AM, Andres Freund wrote: >> I have one minor piece of feedback on the upgrading of assertions to >> ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could >> upgrade the raw elog() "can't happen" error within >> IndexBuildHeapRangeScan() to be an ereport() with >> ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent >> tuple for heap-only tuple" error, which I think merits being a real >> user-visible error, just like the relfrozenxid/relminmxid tests are >> now. As you know, the enhanced amcheck will sometimes detect >> corruption due to this bug by hitting that error. > > I'm not opposed to that, it just seems independent from this thread. Not > sure I really want to go around and backpatch such a change, that code > has changed a bit between branches. Happy to do so on master. The elog(), which was itself upgraded from a simple Assert by commit d70cf811, appears in exactly the same form in 9.3+. Things did change there, but they were kept in sync. -- Peter Geoghegan
Re: Top-N sorts versus parallelism
On Thu, Dec 14, 2017 at 5:12 PM, Thomas Munro wrote:
>> This looks like a costing bug. The estimated cost of sorting 416,667
>> estimated tuples in one parallel worker is almost identical to the
>> estimated cost of sorting 1,000,000 tuples when parallelism is turned
>> off. Adding some logging to the cost_sort function, it looks like the
>> limit does not get sent down for the parallel estimate:
>>
>> NOTICE: JJ cost_sort tuples 100.00, limit 61.00, sort_mem 65536
>> NOTICE: JJ cost_sort tuples 416667.00, limit -1.00, sort_mem 65536
>>
>> So the LIMIT gets passed down for the execution step, but not for the
>> planning step.
>
> Oh, well spotted. I was looking in the wrong place. Maybe the fix is
> as simple as the attached? It certainly helps in the cases I tested,
> including with wider tables.

I had hit on the same change, and was also surprised that it was located where it was. With the change, it uses the parallel plan all the way down to LIMIT 1.

With the patch, it still satisfies make check, so if it introduces errors they are subtle ones. If we can't actually do this and it needs to stay -1, then I think we need a comment to explain why.

Cheers,

Jeff
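The attachment itself is not reproduced in this digest, so as a hedged guess at the shape of the one-line change being discussed: in create_ordered_paths() in planner.c, the sort built on top of the cheapest partial path is costed with no row limit, while the non-parallel path receives limit_tuples, so the fix would presumably look something like this.

    /*
     * Sketch only; the actual patch is the attachment to the quoted mail.
     * In the parallel branch of create_ordered_paths(), pass the limit to
     * the sort's costing instead of -1.0:
     */
    path = (Path *) create_sort_path(root,
    								 ordered_rel,
    								 cheapest_partial_path,
    								 root->sort_pathkeys,
    								 limit_tuples);		/* was -1.0 */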
Re: [HACKERS] replace GrantObjectType with ObjectType
On Fri, Dec 15, 2017 at 12:40 PM, Peter Eisentraut wrote: > On 12/14/17 22:59, Rushabh Lathia wrote: >> I noted that no_priv_msg and not_owner_msg array been removed >> and code fitted the code into aclcheck_error(). Actually that >> makes the code more complex then what it used to be. I would >> prefer the array rather then code been fitted into the function. > > There is an argument for having a big array versus the switch/case > approach. But most existing code around object addresses uses the > switch/case approach, so it's better to align it that way, I think. > It's weird to have to maintain two different styles. Eh, really? What about the two big arrays at the top of objectaddress.c? (This is just a drive-by comment; I haven't looked at the details of this patch.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Fri, Dec 15, 2017 at 11:03 AM, Peter Geoghegan wrote: > The elog(), which was itself upgraded from a simple Assert by commit > d70cf811, appears in exactly the same form in 9.3+. Things did change > there, but they were kept in sync. BTW, if you're going to do it, I would target the similar error within validate_index_heapscan(), too. That was also added by d70cf811. -- Peter Geoghegan
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-12-15 11:15:47 -0800, Peter Geoghegan wrote: > On Fri, Dec 15, 2017 at 11:03 AM, Peter Geoghegan wrote: > > The elog(), which was itself upgraded from a simple Assert by commit > > d70cf811, appears in exactly the same form in 9.3+. Things did change > > there, but they were kept in sync. > > BTW, if you're going to do it, I would target the similar error within > validate_index_heapscan(), too. That was also added by d70cf811. Please send a patch for master on a *new* thread. Greetings, Andres Freund
[sqlsmith] Parallel worker executor crash on master
Hi,

sqlsmith just crashed a parallel worker while testing master at
699bf7d05c. I can reproduce it with the following recipe on a fresh
regression database. Backtrace and query plan below as well.

regards,
Andreas

--8<---cut here---start->8---
set min_parallel_table_scan_size = '8kB';
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

select * from public.prt1_m_p3 as ref_3
    inner join pg_catalog.pg_class as sample_1 tablesample bernoulli (2.1)
        on (ref_3.c = sample_1.relpages)
    inner join public.rtest_view4 as ref_4
        on (ref_4.b @@ (select keyword from public.test_tsquery limit 1 offset 2)),
    lateral (select ref_3.a as c0
             from public.lseg_tbl as ref_5
             where (select f1 from public.timetz_tbl limit 1 offset 5)
                 > (select pg_catalog.min(timetzcol) from public.brintest)
             limit 18) as subq_2
where true
limit 102;
--8<---cut here---end--->8---

                                          QUERY PLAN
---------------------------------------------------------------------------------------------
 Limit  (cost=107.69..1421.69 rows=102 width=315)
   InitPlan 1 (returns $1)
     ->  Limit  (cost=0.35..0.52 rows=1 width=32)
           ->  Gather  (cost=0.00..1.04 rows=6 width=32)
                 Workers Planned: 1
                 ->  Parallel Seq Scan on test_tsquery  (cost=0.00..1.04 rows=4 width=32)
   ->  Nested Loop  (cost=107.17..5775.39 rows=440 width=315)
         Join Filter: (ref_3.c = sample_1.relpages)
         ->  Nested Loop  (cost=107.17..5416.40 rows=250 width=16)
               ->  Gather  (cost=0.00..1.29 rows=50 width=12)
                     Workers Planned: 1
                     ->  Parallel Seq Scan on prt1_m_p3 ref_3  (cost=0.00..1.29 rows=29 width=12)
               ->  Limit  (cost=107.17..108.20 rows=5 width=4)
                     InitPlan 2 (returns $4)
                       ->  Limit  (cost=0.45..0.54 rows=1 width=12)
                             ->  Gather  (cost=0.00..1.07 rows=12 width=12)
                                   Workers Planned: 1
                                   ->  Parallel Seq Scan on timetz_tbl  (cost=0.00..1.07 rows=7 width=12)
                     InitPlan 3 (returns $5)
                       ->  Aggregate  (cost=106.62..106.64 rows=1 width=12)
                             ->  Seq Scan on brintest  (cost=0.00..106.30 rows=130 width=12)
                     ->  Gather  (cost=0.00..1.03 rows=5 width=4)
                           Workers Planned: 1
                           Params Evaluated: $4, $5
                           ->  Result  (cost=0.00..1.03 rows=3 width=0)
                                 One-Time Filter: ($4 > $5)
                                 ->  Parallel Seq Scan on lseg_tbl ref_5  (cost=0.00..1.03 rows=3 width=0)
         ->  Materialize  (cost=0.00..194.10 rows=44 width=299)
               ->  Nested Loop  (cost=0.00..193.88 rows=44 width=299)
                     ->  Gather  (cost=0.00..140.00 rows=1 width=40)
                           Workers Planned: 2
                           Params Evaluated: $1
                           ->  Parallel Seq Scan on rtest_view4 ref_4  (cost=0.00..140.00 rows=1 width=40)
                                 Filter: (b @@ $1)
                     ->  Sample Scan on pg_class sample_1  (cost=0.00..53.44 rows=44 width=259)
                           Sampling: bernoulli ('2.1'::real)

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x55c9dba2d7f8 in timetz_gt (fcinfo=0x55c9dd7295a0) at date.c:2183
2183        TimeTzADT  *time2 = PG_GETARG_TIMETZADT_P(1);
#1  0x55c9db8ae8b2 in ExecInterpExpr (state=0x55c9dd729f00, econtext=0x55c9dd7299b8, isnull=<optimized out>) at execExprInterp.c:692
#2  0x55c9db8d6753 in ExecEvalExprSwitchContext (isNull=0x7ffd2f99e55f "", econtext=0x55c9dd7299b8, state=<optimized out>) at ../../../src/include/executor/executor.h:300
#3  ExecQual (econtext=0x55c9dd7299b8, state=<optimized out>) at ../../../src/include/executor/executor.h:369
#4  ExecResult (pstate=0x55c9dd729c38) at nodeResult.c:84
#5  0x55c9db8b21ea in ExecProcNode (node=0x55c9dd729c38) at ../../../src/include/executor/executor.h:242
#6  ExecutePlan (execute_once=<optimized out>, dest=0x55c9dd6fff78, direction=<optimized out>, numberTuples=18, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x55c9dd729c38, estate=0x55c9dd729118) at execMain.c:1718
#7  standard_ExecutorRun (queryDesc=0x55c9dd742b48, direction=<optimized out>, count=18, execute_once=<optimized out>) at execMain.c:361
#8  0x55c9db8b6e99 in ParallelQueryMain (seg=0x55c9dd6a8ea8, toc=0x7f9109496000) at execParallel.c:1294
#9  0x55c9db7911d9 in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1150
#10 0x55c9db975a63 in StartBackgroundWorker () at bgworker.c:841
#11 0x55c9db982015 in do_start_bgworker (rw=0x55c9dd6a0380)
Bug: Ambiguous Column Reference Allowed When Joining to pg_roles.oid
I recently fell afoul of a weird edge case while writing an extension.
It seems Postgres allows for an ambiguous column reference to oid in the
where clause when joining to pg_roles. It just arbitrarily chooses
pg_roles.oid and ignores the conflicting name. Example:

postgres=# CREATE TABLE t_demo();
CREATE TABLE
postgres=# SELECT r.rolname FROM pg_class c JOIN pg_roles r ON (c.relowner = r.oid) WHERE oid = 't_demo'::regclass;
 rolname
---------
(0 rows)

postgres=# SELECT r.rolname FROM pg_class c JOIN pg_roles r ON (c.relowner = r.oid) WHERE c.oid = 't_demo'::regclass;
 rolname
----------
 postgres
(1 row)

postgres=# SELECT r.rolname FROM pg_class c JOIN pg_roles r ON (c.relowner = r.oid) WHERE r.oid = 't_demo'::regclass;
 rolname
---------
(0 rows)

It seems like ambiguous oid references generally hit a different error
message than normal ambiguous column references.

postgres=# CREATE TABLE t1(x int, y int) WITH OIDS;
CREATE TABLE
postgres=# CREATE TABLE t2(x int, y int) WITH OIDS;
CREATE TABLE
postgres=# SELECT * FROM t1 JOIN t2 ON (t1.x = t2.x) WHERE y = 5;
ERROR:  column reference "y" is ambiguous
LINE 1: SELECT * FROM t1 JOIN t2 ON (t1.x = t2.x) WHERE y = 5;
postgres=# SELECT * FROM t1 JOIN t2 ON (t1.x = t2.x) WHERE oid = 5;
ERROR:  column "oid" does not exist
LINE 1: SELECT * FROM t1 JOIN t2 ON (t1.x = t2.x) WHERE oid = 5;
                                                        ^
HINT:  There is a column named "oid" in table "t1", but it cannot be referenced from this part of the query.

It's clear that oids are getting to another code path normally, and I
suspected that it is related to the fact that pg_roles is a view with an
explicit oid column. So I tried this test:

postgres=# CREATE VIEW v1 AS SELECT x, y, oid FROM t1;
CREATE VIEW
postgres=# SELECT * FROM v1 JOIN t2 ON (v1.x = t2.x) WHERE oid = 5;
 x | y | oid | x | y
---+---+-----+---+---
(0 rows)

It would appear that unqualified oid columns do not make it all the way
to where clause evaluation, whereas columns that happen to be named oid
do survive that far. Therefore, postgres does not realize that it has an
ambiguous column reference on its hands and binds to the column
presented by the view. I could definitely be wrong because I haven't
looked at the code, but that is what the behavior looks like.

This bug was first found on 9.3.19, but I just tested this against 10.1
as well.

- Matt K
genomic locus
Greetings everyone,

I need a data type to represent genomic positions, which will consist of
a string and a pair of integers, with interval logic and access methods.
Sort of like my seg type, but more straightforward.

I noticed somebody took good care of seg while I was away for the last
20 years, and I am extremely grateful for that. I have been using it. In
the meantime, things have changed, and now I am almost clueless about
how you deal with contributed modules and what steps I should take once
I get it to work. Also, is it a good idea to clone and fix seg for this
purpose, or is there a more appropriate template? Or maybe just building
it from scratch will be a better idea?

I have seen a lot of bit rot in other extensions (never contributed)
that I have not maintained since 2009, and I am now unable to fix some
of them, so I wonder how much of the old knowledge is still applicable.
In other words, is what I see in new code just a change of macros, or a
change of principles?

Thanks,

--Gene
Re: Top-N sorts verses parallelism
On Fri, Dec 15, 2017 at 2:10 PM, Jeff Janes wrote: > I had hit on the same change. And was also surprised that it was located > where it was. With the change, it uses the parallel plan all the way down > to LIMIT 1. > > With the patch, it still satisfies make check, so if it introduces errors > they are subtle ones. If we can't actually do this and it needs to stay -1, > then I think we need a comment to explain why. Interesting. I suspect this is correct now, but would not have been before commit 3452dc5240da43e833118484e1e9b4894d04431c. AFAICS, this doesn't affect any execution-time behavior, just the cost estimate. And, prior to that commit, the execution-time behavior was different: there would not have been any way for the worker to do a top-N sort, because the LIMIT was not pushed through the Gather. Does that sound right, or am I still confused? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: portal pinning
On 12/12/17 10:34, Peter Eisentraut wrote: > But I also wonder whether we shouldn't automatically pin/unpin portals > in SPI_cursor_open() and SPI_cursor_close(). This makes sense if you > consider "pinned" to mean "internally generated". I don't think there > is a scenario in which user code should directly operate on a portal > created by SPI. Here is a patch for this option. The above sentence was not quite correct. Only unnamed portals should be pinned automatically. Named portals are of course possible to be passed around as refcursors for example. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From f9aaa47cf4e46aac973e532a790ac2099d017523 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 15 Dec 2017 15:24:10 -0500 Subject: [PATCH] Move portal pinning from PL/pgSQL to SPI PL/pgSQL "pins" internally generated (unnamed) portals so that user code cannot close them by guessing their names. This logic is also useful in other languages and really for any code. So move that logic into SPI. An unnamed portal obtained through SPI_cursor_open() and related functions is now automatically pinned, and SPI_cursor_close() automatically unpins a portal that is pinned. --- src/backend/executor/spi.c | 9 + src/pl/plpgsql/src/pl_exec.c | 8 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index f3da2ddd08..1f0a07ce0b 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -1175,6 +1175,12 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan, { /* Use a random nonconflicting name */ portal = CreateNewPortal(); + + /* +* Make sure the portal doesn't get closed by the user statements we +* execute. +*/ + PinPortal(portal); } else { @@ -1413,6 +1419,9 @@ SPI_cursor_close(Portal portal) if (!PortalIsValid(portal)) elog(ERROR, "invalid portal in SPI cursor operation"); + if (portal->portalPinned) + UnpinPortal(portal); + PortalDrop(portal, false); } diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index fa4d573e50..243396abd1 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -5330,12 +5330,6 @@ exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt, /* Fetch loop variable's datum entry */ var = (PLpgSQL_variable *) estate->datums[stmt->var->dno]; - /* -* Make sure the portal doesn't get closed by the user statements we -* execute. -*/ - PinPortal(portal); - /* * Fetch the initial tuple(s). If prefetching is allowed then we grab a * few more rows to avoid multiple trips through executor startup @@ -5450,8 +5444,6 @@ exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt, */ SPI_freetuptable(tuptab); - UnpinPortal(portal); - /* * Set the FOUND variable to indicate the result of executing the loop * (namely, whether we looped one or more times). This must be set last so -- 2.15.1
Re: Bug: Ambiguous Column Reference Allowed When Joining to pg_roles.oid
Matthew Kelly writes:
> I recently fell afoul of a weird edge case while writing an extension.
> It seems Postgres allows for an ambiguous column reference to oid in the
> where clause when joining to pg_roles. It just arbitrarily chooses
> pg_roles.oid and ignores the conflicting name. Example:

> postgres=# CREATE TABLE t_demo();
> CREATE TABLE
> postgres=# SELECT r.rolname FROM pg_class c JOIN pg_roles r ON (c.relowner = r.oid) WHERE oid = 't_demo'::regclass;
>  rolname
> ---------
> (0 rows)

I do not think that's a bug exactly. There's only one column named "oid"
exposed by the join, and once you're above the join it hides the
column(s) supplied by the input relations --- were that not so, you
could never reference a join output column without qualifying it.

If you try this with just regular OID columns, you get

regression=# create table t1 (f1 int);
CREATE TABLE
regression=# create table t2 (f2 int) with oids;
CREATE TABLE
regression=# select * from t1 join t2 on (f1=f2) where oid = 42;
ERROR:  column "oid" does not exist
LINE 1: select * from t1 join t2 on (f1=f2) where oid = 42;
                                                  ^
HINT:  There is a column named "oid" in table "t2", but it cannot be referenced from this part of the query.

which indicates that you have to qualify the table's column if you want
to reference it above the join. But if there's a matching user-defined
column in the join output then that doesn't happen.

It definitely is a bit unfortunate that the pg_roles view exposes a
user-defined column named "oid", but we felt we had to do that to avoid
breaking user queries that date from when pg_roles was a plain table.

regards, tom lane
Re: [HACKERS] Proposal: Local indexes for partitioned table
Hmm, so I'm now unsure what the actual proposals for handling pg_dump
are. We seem to have the following three proposals:

1. Alvaro's: use CREATE INDEX ON ONLY <parent> (not recursive), followed
   by CREATE INDEX ON <partition>, followed by ALTER INDEX <parent index>
   ATTACH PARTITION <partition index>. I provide an ALTER INDEX DETACH
   PARTITION for symmetry and because it can be used to replace the index.
   Pros: the database is always restored identically to what was in the
   original.
   Con: The index hierarchy might be "partial", that is, lack a component
   index on some partition.

2. David's: use CREATE INDEX ON <parent>, followed by CREATE INDEX ON
   <partition>. This will use the matching mechanism to automatically
   attach the index on partition to index on parent. If some partition
   lacks a matching index, one is created automatically by the creation
   on parent. If you want to replace the index on a partition, use a new
   (as yet unimplemented) ALTER INDEX REPLACE. No need to add ONLY to the
   table name in CREATE INDEX, since the command always recurses. (This
   seems good to me, because I
   Pro: the index is never "partial" (missing a partition).
   Con: the matching mechanism might choose a different index on restore
   than what was selected in the database being dumped.

3. Robert's: use CREATE INDEX ON ONLY <parent>, which creates a shell
   index on parent only (no recursion), followed by CREATE INDEX ON
   <partition>. DETACH is not provided. If you ATTACH an index for a
   partition that already has one index attached, then (1) the newly
   attached one replaces the original (i.e. effectively REPLACE) or (2)
   you get an error and we implement a separate ALTER INDEX REPLACE
   command. It's not clear to me how or when the shell index becomes a
   real index.

Robert, can you please clarify the terms of your proposal? How is it
better than mine? Is David's concern about a "partial" index (i.e. an
index that doesn't exist in some partition) solved by it?

I have code for proposals 1 and 2. I don't like proposal 2, and David &
Ashutosh don't like (1). Maybe if we all understand (3) we can agree on
using that one?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Top-N sorts verses parallelism
On Sat, Dec 16, 2017 at 9:13 AM, Robert Haas wrote: > On Fri, Dec 15, 2017 at 2:10 PM, Jeff Janes wrote: >> I had hit on the same change. And was also surprised that it was located >> where it was. With the change, it uses the parallel plan all the way down >> to LIMIT 1. >> >> With the patch, it still satisfies make check, so if it introduces errors >> they are subtle ones. If we can't actually do this and it needs to stay -1, >> then I think we need a comment to explain why. > > Interesting. I suspect this is correct now, but would not have been > before commit 3452dc5240da43e833118484e1e9b4894d04431c. AFAICS, this > doesn't affect any execution-time behavior, just the cost estimate. > And, prior to that commit, the execution-time behavior was different: > there would not have been any way for the worker to do a top-N sort, > because the LIMIT was not pushed through the Gather. > > Does that sound right, or am I still confused? Looks right to me. Commit 3452dc52 just forgot to tell the planner. I'm pleased about that because it makes this a slam-dunk bug-fix and not some confusing hard to justify costing problem. BTW with this patch, the same test using a smaller foo table with 250k rows limit 1 runs a parallel plan in 45ms here, but 220k rows limit 1 runs a non-parallel plan in 80ms, but that's just a regular costing problem where the point at which the curves cross over will be hard to get right due to all kinds of other variables, so I guess that kind of thing is expected. PS Oops, I do actually know how to spell "versus". Typos are so much more embarrassing in subject lines! -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] UPDATE of partition key
On Fri, Dec 15, 2017 at 7:58 AM, Robert Haas wrote: > Reviewing the preparatory patch: I started another review pass over the main patch, so here are some comments about that. This is unfortunately not a complete review, however. - map = ptr->partition_tupconv_maps[leaf_part_index]; + map = ptr->parentchild_tupconv_maps[leaf_part_index]; I don't think there's any reason to rename this. In previous patch versions, you had multiple arrays of tuple conversion maps in this structure, but the refactoring eliminated that. Likewise, I'm not sure I get the point of mt_transition_tupconv_maps -> mt_childparent_tupconv_maps. That seems like it could similarly be left alone. + /* + * If transition tables are the only reason we're here, return. As + * mentioned above, we can also be here during update tuple routing in + * presence of transition tables, in which case this function is called + * separately for oldtup and newtup, so either can be NULL, not both. + */ if (trigdesc == NULL || (event == TRIGGER_EVENT_DELETE && !trigdesc->trig_delete_after_row) || (event == TRIGGER_EVENT_INSERT && !trigdesc->trig_insert_after_row) || - (event == TRIGGER_EVENT_UPDATE && !trigdesc->trig_update_after_row)) + (event == TRIGGER_EVENT_UPDATE && !trigdesc->trig_update_after_row) || + (event == TRIGGER_EVENT_UPDATE && ((oldtup == NULL) ^ (newtup == NULL I guess this is correct, but it seems awfully fragile. Can't we have some more explicit signaling about whether we're only here for transition tables, rather than deducing it based on exactly one of oldtup and newtup being NULL? + /* Initialization specific to update */ + if (mtstate && mtstate->operation == CMD_UPDATE) + { + ModifyTable *node = (ModifyTable *) mtstate->ps.plan; + + is_update = true; + update_rri = mtstate->resultRelInfo; + num_update_rri = list_length(node->plans); + } I guess I don't see why we need a separate "if" block for this. Neither is_update nor update_rri nor num_update_rri are used until we get to the block that begins with "if (is_update)". Why not just change that block to test if (mtstate && mtstate->operation == CMD_UPDATE)" and put the rest of these initializations inside that block? + int num_update_rri = 0, + update_rri_index = 0; ... + update_rri_index = 0; It's already 0. + leaf_part_rri = &update_rri[update_rri_index]; ... + leaf_part_rri = leaf_part_arr + i; These are doing the same kind of thing, but using different styles. I prefer the former style, so I'd change the second one to &leaf_part_arr[i]. Alternatively, you could change the first one to update_rri + update_rri_indx. But it's strange to see the same variable initialized in two different ways just a few lines apart. + if (!partrel) + { + /* + * We locked all the partitions above including the leaf + * partitions. Note that each of the newly opened relations in + * *partitions are eventually closed by the caller. + */ + partrel = heap_open(leaf_oid, NoLock); + InitResultRelInfo(leaf_part_rri, + partrel, + resultRTindex, + rel, + estate->es_instrument); + } Hmm, isn't there a problem here? Before, we opened all the relations here and the caller closed them all. But now, we're only opening some of them. If the caller closes them all, then they will be closing some that we opened and some that we didn't. That seems quite bad, because the reference counts that are incremented and decremented by opening and closing should all end up at 0. 
Maybe I'm confused because it seems like this would break in any
scenario where even 1 relation was already opened, and surely you must
have tested that case... but if there's some reason this works, I don't
know what it is, and the comment doesn't tell me.

+static HeapTuple
+ConvertPartitionTupleSlot(ModifyTableState *mtstate,
+                          TupleConversionMap *map,
+                          HeapTuple tuple,
+                          TupleTableSlot *new_slot,
+                          TupleTableSlot **p_my_slot)

This function doesn't use the mtstate argument at all.

+ * (Similarly we need to add the deleted row in OLD TABLE). We need to do

The period should be before, not after, the closing parenthesis.

+ * Now that we have already captured NEW TABLE row, any AR INSERT
+ * trigger should not again capture it below. Arrange for the same.

A more American style would be something like "We've already captured
the NEW TABLE row, so make sure any AR INSERT trigger fired below
doesn't capture it again." (Similarly for the other case.)

+ /* The delete has actually happened, so inform that to the caller */
+ if (tuple_deleted)
+     *tuple_deleted = true;

In the US, we inform the caller, not inform that to the caller. In
other words, here the direct object of "inform" is the person or thing
getting the information (in this case, "the caller"), not the
information being conveyed (in this case, "that"). I realize your usage
is probably typical for your country...

+ Assert(mtstate->mt_is_tupconv_perpart == true);

We usually just Assert(thing_that_should_be_true), not
Assert(thing_that_should_be_true == true).
Re: [HACKERS] pgbench more operators & functions
Hello Teodor, It may be good for 't' of 'f' but it seems too free grammar to accept 'tr' or 'fa' or even 'o' which actually not known to be on or off. Yes, it really works like that. I tried to make something clearer than "src/bin/psql/variable.c". Maybe I did not succeed. Ok, I see. Current coding accepts truexxx, falsexxx, yesxx, noxxx but doesn't accept offxxx and onxxx. Not so consistent as it could be. I've checked, but truexxx is not accepted as true. I have added a test case which fails on "malformed variable", i.e. it went up to scanning a double. When comparing ("truexxx", "true", 7) the fifth char is different, so it is != 0. Or I'm missing something. Also it doesn't accept 1 and 0 as psql does, but it's obviously why. Yep. I have added a comment that it will be an int, and if a boolean is needed it would work as expected. Sorry, but I found more notices: 1) '~' and unary '-' should be commented, it's not so easy to guess about how they actually implemented (-1 XOR value, remember that -1 is 0xf) Ok, I agree that it looks strange. I have added comments for both. I have replaced -1 by 0x so that the code is hopefully clearer. 2) - | expr '%' expr { $$ = make_op(yyscanner, "%", $1, $3); } + | expr '%' expr { $$ = make_op(yyscanner, "mod", $1, $3); } why is MOD operation renamed? Do I miss something in thread? Because I have added MOD as an explicit function to match SQL, so now % is just a shorthand for calling it. Before the patch there was only the '%' operator. Changing the name is enough for the function to be found. pg> SELECT 11 % 3, MOD(11, 3); 2 | 2 Looking to psql and pgbench scripting implementation, isn't it better to integrate lua in psql & pgbench? Hmmm... if it starts on this slope, everyone will have its opinion (lua, tcl, python, ruby, perl, insert-script-name-here...) and it must interact with SQL, I'm not sure how to embed SQL & another language cleanly. So the idea is just to extend backslash command capabilities of psql & pgbench, preferably consistently, when need (i.e. use cases) arises. Attached a new version with these changes. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 4431fc3..ea8f305 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -904,14 +904,32 @@ pgbench options d Sets variable varname to a value calculated from expression. - The expression may contain integer constants such as 5432, + The expression may contain the NULL constant, + boolean constants TRUE and FALSE, + integer constants such as 5432, double constants such as 3.14159, references to variables :variablename, - unary operators (+, -) and binary operators - (+, -, *, /, - %) with their usual precedence and associativity, - function calls, and - parentheses. + operators + with their usual SQL precedence and associativity, + function calls, + SQL CASE generic conditional + expressions and parentheses. + + + + Functions and most operators return NULL on + NULL input. + + + + For conditional purposes, non zero numerical values are + TRUE, zero numerical values and NULL + are FALSE. + + + + When no final ELSE clause is provided to a + CASE, the default value is NULL. 
@@ -920,6 +938,7 @@ pgbench options d \set ntellers 10 * :scale \set aid (1021 * random(1, 10 * :scale)) % \ (10 * :scale) + 1 +\set divx CASE WHEN :x <> 0 THEN :y/:x ELSE NULL END @@ -996,6 +1015,177 @@ pgbench options d + + Built-In Operators + + + The arithmetic, bitwise, comparison and logical operators listed in +are built into pgbench + and may be used in expressions appearing in + \set. + + + + pgbench Operators by increasing precedence + + + + Operator + Description + Example + Result + + + + + OR + logical or + 5 or 0 + TRUE + + + AND + logical and + 3 and 0 + FALSE + + + NOT + logical not + not false + TRUE + + + IS [NOT] (NULL|TRUE|FALSE) + value tests + 1 is null + FALSE + + + ISNULL|NOTNULL + null tests + 1 notnull + TRUE + + + = + is equal + 5 = 4 + FALSE + + + <> + is not equal + 5 <> 4 + TRUE + + + != + is not equal + 5 != 5 + FALSE + + + < + lower than + 5 < 4 + FALSE + + + <= + lower or equal + 5 <= 4 + FALSE + + + > + greater than + 5 > 4 + TRUE + + + >= + greater or equal + 5 >= 4 + TRUE + + + | + in
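To illustrate, a short script exercising the documented operators might
look like the following (the file name and values are mine, and this
assumes the patch above is applied; run it with something like
"pgbench -f ops.sql"):

\set x random(-5, 5)
\set divx CASE WHEN :x <> 0 THEN 100 / :x ELSE NULL END
\set flag :x > 0 AND NOT (:x >= 4)
SELECT :divx, :flag;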
Re: [HACKERS] Proposal: Local indexes for partitioned table
Alvaro Herrera wrote:
> 3. Robert's: use CREATE INDEX ON ONLY <parent>, which creates a shell
>    index on parent only (no recursion), followed by CREATE INDEX ON
>    <partition>. DETACH is not provided. If you ATTACH an index for a
>    partition that already has one index attached, then (1) the newly
>    attached one replaces the original (i.e. effectively REPLACE) or (2)
>    you get an error and we implement a separate ALTER INDEX REPLACE
>    command. It's not clear to me how or when the shell index becomes a
>    real index.

As I understand it, the whole purpose of this design is that there is no
point during the restore at which the index lacks indexes on partitions:
it is either complete, or it doesn't exist yet. If we create the index
on parent first, and later the indexes on partitions, that condition is
not satisfied.

To solve this, we could create the children indexes first and then the
parent; but how do we indicate to the parent creation which are the
indexes that oughta be marked as children? ALTER INDEX ATTACH PARTITION
doesn't cut it, because it occurs after the index on parent is created,
which is too late. We would do something like

CREATE INDEX ON parent (columns) [other stuff]
  ATTACH idx_on_child1, idx_on_child2, idx_on_child3;

but this seems mighty ugly. Anyway if we do that, what is the point of
ALTER INDEX ATTACH PARTITION? We might as well leave it out and just
keep the door open for a later ALTER INDEX REPLACE.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Proposal: Local indexes for partitioned table
On Fri, Dec 15, 2017 at 4:02 PM, Alvaro Herrera wrote:
> 3. Robert's: use CREATE INDEX ON ONLY <parent>, which creates a shell
>    index on parent only (no recursion), followed by CREATE INDEX ON
>    <partition>. DETACH is not provided. If you ATTACH an index for a
>    partition that already has one index attached, then (1) the newly
>    attached one replaces the original (i.e. effectively REPLACE) or (2)
>    you get an error and we implement a separate ALTER INDEX REPLACE
>    command. It's not clear to me how or when the shell index becomes a
>    real index.

With this proposal, I think the index can be invalid initially, but once
you've attached an index for every child partition, it becomes
irrevocably valid. After that, the only supported operation is REPLACE,
which preserves validity.

> Robert, can you please clarify the terms of your proposal? How is it
> better than mine? Is David's concern about a "partial" index (i.e. an
> index that doesn't exist in some partition) solved by it?

I think the perceived advantage is that, once valid, the index can't
subsequently become not-valid. That seemed to be David's big concern
(which is not without foundation).

> I have code for proposals 1 and 2. I don't like proposal 2, and David &
> Ashutosh don't like (1). Maybe if we all understand (3) we can agree on
> using that one?

Yes, it would be nice to achieve some sort of consensus, and I think (3)
gives everyone a little of what they want.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
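To make that concrete, under proposal 3 a dump might emit a sequence
like the following (the syntax is the proposed form discussed above,
nothing committed; object names are illustrative):

CREATE INDEX p_idx ON ONLY p (a);            -- shell index, initially not valid
CREATE INDEX p1_idx ON p1 (a);
ALTER INDEX p_idx ATTACH PARTITION p1_idx;
CREATE INDEX p2_idx ON p2 (a);
ALTER INDEX p_idx ATTACH PARTITION p2_idx;   -- last partition attached: p_idx becomes valid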
Re: [HACKERS] Proposal: Local indexes for partitioned table
Robert Haas wrote: > On Thu, Nov 30, 2017 at 7:02 AM, Alvaro Herrera > wrote: > > Great question. So you're thinking that the planner might have an > > interest in knowing what indexes are defined at the parent table level > > for planning purposes; but for that to actually have any effect we would > > need to change the planner and executor also. And one more point, also > > related to something you said before: we currently (I mean after my > > patch) don't mark partitioned-table-level indexes as valid or not valid > > depending on whether all its children exist, so trying to use that in > > the planner without having a flag could cause invalid plans to be > > generated (i.e. ones that would cause nonexistent indexes to be > > referenced). > > Did you do it this way due to locking concerns? No -- just because since the index-on-parent is a different kind of object (RELKIND_PARTITIONED_INDEX) it is not considered for anything in the planner anyway, so there's no need for indisvalid to be "correct". Changing the flag in the parent index only needs to examine state on the children, not modify them, so I don't think there would be any serious locking problem. By the way, my implementation of ALTER INDEX ATTACH PARTITION was prone to deadlocks because it needs locks on parent table, index-on-parent, partition, and index-on-partition; and there was no consideration to the ordering in which these were being acquired. I wrote a callback for RangeVarGetRelidExtended to be called in ATExecAttachPartitionIdx() that tries to acquire locks in the right order -- but I recently realized that even that is not sufficient, because (unless I misread it) ALTER INDEX itself does not worry about locking the containing table before the index. I think some tweaking needs to be done in RangeVarCallbackForAlterRelation to lock the table if an index is being altered (conditionally, depending on the precise ALTER TABLE operation). It surprises me that we don't need to do that yet, but I haven't looked into it any further. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Proposal: Local indexes for partitioned table
Robert Haas wrote:
> On Fri, Dec 15, 2017 at 4:02 PM, Alvaro Herrera wrote:
> > 3. Robert's: use CREATE INDEX ON ONLY <parent>, which creates a shell
> >    index on parent only (no recursion), followed by CREATE INDEX ON
> >    <partition>. DETACH is not provided. If you ATTACH an index for a
> >    partition that already has one index attached, then (1) the newly
> >    attached one replaces the original (i.e. effectively REPLACE) or (2)
> >    you get an error and we implement a separate ALTER INDEX REPLACE
> >    command. It's not clear to me how or when the shell index becomes a
> >    real index.
>
> With this proposal, I think the index can be invalid initially, but
> once you've attached an index for every child partition, it becomes
> irrevocably valid. After that, the only supported operation is
> REPLACE, which preserves validity.

Sounds okay to me. (I had already deleted ALTER INDEX DETACH from my
patch earlier today.) I admit I had also deleted the ONLY clause from
CREATE INDEX, and I don't like having to put it back :-) But on the
whole, having it sounds better than the alternatives.

We have two options for marking valid:

1. after each ALTER INDEX ATTACH, verify whether the set of partitions
   that contain the index is complete; if so, mark it valid, otherwise
   do nothing. This sucks because we have to check that over and over
   for every index that we attach

2. We invent yet another command, say ALTER INDEX VALIDATE

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
"failed to find parent tuple for heap-only tuple" error as an ERRCODE_DATA_CORRUPTION ereport()
Commit d70cf811, from 2014, promoted an Assert() within IndexBuildHeapScan() to a "can't happen" elog() error, in order to detect when a parent tuple cannot be found for some heap-only tuple -- if this happens, then it indicates corruption. I think that we should make it a full ereport(), with an errcode of ERRCODE_DATA_CORRUPTION, to match what Andres just added to code that deals with freezing (he promoted Assert()s to errors, just like the 2014 commit, though he went as far as making them ereport()s to begin with). Attached patch does this. I propose a backpatch to 9.3, partially for the sake of tools like amcheck, where users may only be on the lookout for ERRCODE_DATA_CORRUPTION and ERRCODE_INDEX_CORRUPTED. FWIW, an old MultiXact/recovery bug, alluded to by the commit message of d70cf811 [1] (and fixed by 6bfa88acd) was the cause of some déjà vu for me while looking into the "freeze the dead" issues. Because the enhanced amcheck [2] actually raised this error when I went to verify the first "freeze the dead" bugfix [3], it's clearly effective as a test for certain types of corruption. If CREATE INDEX/IndexBuildHeapScan() didn't already perform this check, then it would probably be necessary for amcheck to implement it on its own. What heap_get_root_tuples() does for us here is ideally suited to finding inconsistencies in HOT chains, because it matches xmin against xmax, looks at line pointer bits/redirects, and consults pg_multixact if necessary. The only thing that it *doesn't* do is make sure that hint bits accurately reflect what it says in the CLOG -- we'll need to find another way to do that, by directly targeting heap relations with their own function. In short, it does an awful lot for tools like amcheck, and I want to make sure that we get the full benefit of that. [1] https://www.postgresql.org/message-id/CAM3SWZTMQiCi5PV5OWHb+bYkUcnCk=o67w0csswpvv7xfuc...@mail.gmail.com [2] https://github.com/petergeoghegan/amcheck#optional-heapallindexed-verification [3] https://postsgr.es/m/cah2-wznm4rcrhfaiwkpwtpew2bxdtgrozk7jwwgucxeh3d1...@mail.gmail.com -- Peter Geoghegan From 0a37ceea53736e39d0f1af72ec9bc07d98833370 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Fri, 15 Dec 2017 14:05:16 -0800 Subject: [PATCH] Promote "HOT parent tuple" elog to an ereport. Commit d70cf811, from 2014, promoted an assert within IndexBuildHeapScan() (and another similar assert) to a "can't happen" elog() error, in order to detect when a parent tuple cannot be found for some heap-only tuple during CREATE INDEX/REINDEX. When the error is raised, it indicates heap corruption. This has proven useful as a corruption smoke-test while investigating recent corruption-inducing bugs in pruning/freezing. This commit promotes those elog() errors to ereport() errors. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superfluous translations, mark messages as internal. This is done primarily for the benefit of tools like amcheck, that may want to piggy-back on IndexBuildHeapScan() to perform integrity checking. It's worth spelling out the nature of the problem for users of these tools. 
Author: Peter Geoghegan Reviewed-By: Andres Freund Backpatch: 9.3- --- src/backend/catalog/index.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 0125c18..e43482d 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -2595,9 +2595,12 @@ IndexBuildHeapRangeScan(Relation heapRelation, offnum = ItemPointerGetOffsetNumber(&heapTuple->t_self); if (!OffsetNumberIsValid(root_offsets[offnum - 1])) -elog(ERROR, "failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"", - ItemPointerGetBlockNumber(&heapTuple->t_self), - offnum, RelationGetRelationName(heapRelation)); +ereport(ERROR, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg_internal("failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"", + ItemPointerGetBlockNumber(&heapTuple->t_self), + offnum, + RelationGetRelationName(heapRelation; ItemPointerSetOffsetNumber(&rootTuple.t_self, root_offsets[offnum - 1]); @@ -3060,10 +3063,12 @@ validate_index_heapscan(Relation heapRelation, { root_offnum = root_offsets[root_offnum - 1]; if (!OffsetNumberIsValid(root_offnum)) -elog(ERROR, "failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"", - ItemPointerGetBlockNumber(heapcursor), - ItemPointerGetOffsetNumber(heapcursor), - RelationGetRelationName(heapRelation)); +ereport(ERROR, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg_internal("failed to find parent tuple for heap-only tuple at (%u,%u) in table \"%s\"", + ItemPointerGetBlockNumber(heapcursor), + ItemPointerGet
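For anyone wanting to exercise this code path, the verification
mentioned in [2] is invoked roughly as follows (the two-argument form of
bt_index_check() is from that enhanced external amcheck, and the index
name is illustrative; the second argument enables the heapallindexed
check, which piggy-backs on IndexBuildHeapScan()):

CREATE EXTENSION amcheck;
SELECT bt_index_check('pgbench_accounts_pkey'::regclass, true);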
Re: [HACKERS] replace GrantObjectType with ObjectType
On Sat, Dec 16, 2017 at 2:39 AM, Peter Eisentraut wrote: > On 12/13/17 02:35, Michael Paquier wrote: >> Patch 0001 is simply removing EventTriggerSupportsGrantObjectType(), >> but shouldn't we keep it and return an error for objects that have no >> GRANT support? Returning conditionally true looks like a trap waiting >> to take someone in. > > I don't understand the motivation for this. It would just be two lists > for the same thing. Not really. What grant supports is a subset of what event triggers do. > I think the potential for omission would be much greater that way. That's the whole point of not having "default" in the switches, no? Any object additions would be caught at compilation time. -- Michael
Re: genomic locus
On Sat, Dec 16, 2017 at 4:49 AM, Gene Selkov wrote:
> I noticed somebody took good care of seg while I was away for the last
> 20 years, and I am extremely grateful for that. I have been using it. In
> the meantime, things have changed, and now I am almost clueless about how
> you deal with contributed modules and what steps I should take once I get
> it to work. Also, is it a good idea to clone and fix seg for this purpose,
> or is there a more appropriate template? Or maybe just building it from
> scratch will be a better idea?

Wah. You are the author of the seg module, and your name is listed in a
set of commits from 2000:
https://www.postgresql.org/docs/devel/static/seg.html

seg is now shaped as what is called an extension (see
https://www.postgresql.org/docs/9.6/static/external-extensions.html),
which is, roughly, a library plus a set of SQL commands that the server
can load automatically after enabling them with the command CREATE
EXTENSION. If you wish to fix seg in some way, you could always patch
it. But I am not sure what you are trying to fix, so more details would
be welcome.

> I have seen a lot of bit rot in other extensions (never contributed)
> that I have not maintained since 2009, and I am now unable to fix some of
> them, so I wonder how much of the old knowledge is still applicable. In
> other words, is what I see in new code just a change of macros, or a
> change of principles?

APIs in Postgres are usually stable. You should be able to update your
own extensions. If you want to discuss a couple of things in particular,
don't hesitate!

--
Michael
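For a rough idea of the shape of a modern extension, the SQL-script side
looks something like this (all names here are illustrative, patterned on
seg; a matching genomic_locus.control file and a Makefile using PGXS
complete the picture):

-- genomic_locus--1.0.sql
CREATE TYPE locus;   -- shell type, so the I/O functions can reference it

CREATE FUNCTION locus_in(cstring) RETURNS locus
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION locus_out(locus) RETURNS cstring
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

CREATE TYPE locus (
    INPUT = locus_in,
    OUTPUT = locus_out,
    INTERNALLENGTH = VARIABLE
);

-- then, in any database:
CREATE EXTENSION genomic_locus;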
Re: Reproducible builds: genbki.pl vs schemapg.h
On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg wrote:
> Re: Tom Lane 2017-12-15 <9616.1513351...@sss.pgh.pa.us>
>> Christoph Berg writes:
>> > Debian's reproducible builds project has revealed that the full build
>> > path gets embedded into server/catalog/schemapg.h:
>>
>> genbki.pl is hardly our only script that prints its $0 ...
>
> As per
> https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html,
> that's the only place that makes it into the resulting binary.
> I wouldn't be sending a patch if it didn't fix the issue.

Why not fix that? Reproducible builds are a trend these days, and what's
proposed here is a really simple way to make PG more compliant with this
purpose in mind.

--
Michael
Re: Reproducible builds: genbki.pl vs schemapg.h
On 2017-12-16 07:52:41 +0900, Michael Paquier wrote: > On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg > wrote: > > Re: Tom Lane 2017-12-15 <9616.1513351...@sss.pgh.pa.us> > >> Christoph Berg writes: > >> > Debian's reproducible builds project has revealed that the full build > >> > path gets embedded into server/catalog/schemapg.h: > >> > >> genbki.pl is hardly our only script that prints its $0 ... > > > > As per > > https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, > > that's the only place that makes it into the resulting binary. > > I wouldn't be sending a patch if it didn't fix the issue. > > Why not fixing that? Reproducible builds are a trend of these days, > and what's proposed here is really simple to make PG more compliant > with this purpose in mind. It's not like $0 instead of a hardcoded name in the header actually buys us anything afaict. Greetings, Andres Freund
Re: Reproducible builds: genbki.pl vs schemapg.h
On Fri, Dec 15, 2017 at 3:21 PM, Andres Freund wrote: >> Why not fixing that? Reproducible builds are a trend of these days, >> and what's proposed here is really simple to make PG more compliant >> with this purpose in mind. > > It's not like $0 instead of a hardcoded name in the header actually buys > us anything afaict. +1. I think that reproducible builds are a worthwhile goal, and I welcome Christoph's continued work on them. -- Peter Geoghegan
Re: Reproducible builds: genbki.pl vs schemapg.h
Andres Freund writes:
> On 2017-12-16 07:52:41 +0900, Michael Paquier wrote:
>> On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg wrote:
>>> Re: Tom Lane 2017-12-15 <9616.1513351...@sss.pgh.pa.us>
>>>> genbki.pl is hardly our only script that prints its $0 ...
>>> As per
>>> https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html,
>>> that's the only place that makes it into the resulting binary.

I'm fairly confused by this claim. Since the string in question is in a
comment, it really shouldn't affect built binaries at all. I can believe
that it would affect the non-binary contents of the finished package,
because we include schemapg.h as one of the installed headers. However,
we also include fmgroids.h, which also has an expanded version of $0 in
it. So if schemapg.h is affected by build path, why isn't fmgroids.h?

In my build, neither one of these files contains any path information;
I speculate that you need to use a VPATH build to have an issue, or
maybe Debian's build environment does something even weirder. But I
feel confident in saying that if indeed fmgroids.h is invariant for you
today, that's a phase-of-the-moon behavior that will break someday if
we don't make a similar change in Gen_fmgrtab.pl. Or else we should
propagate what's preventing it back to genbki.pl.

plperl's text2macro.pl is also emitting $0, and there may be other
places I missed (I grepped for 'DO NOT EDIT', not for $0 per se). We
seem not to install the output file of that one, perlchunks.h, but
that's a decision that somebody might change too.

> It's not like $0 instead of a hardcoded name in the header actually buys
> us anything afaict.

Agreed so far as the script name goes. However, two out of three of
these scripts also print their input file names, and I'm suspicious that
that output is also gonna change in a VPATH build. I'm a little less
inclined to buy the claim that we're not losing anything if we suppress
that :-(

regards, tom lane
Re: Backfill bgworker Extension?
On Tue, Dec 12, 2017 at 2:26 PM Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 12/12/17 13:03, Jeremy Finzel wrote:
> > To be clear, what I mean is batch updating a large set of data in small
> > pieces so as to avoid things like lock contention and replication lags.
> > Sometimes these have a driving table that has the source data to update
> > in a destination table based on a key column, but sometimes it is
> > something like setting just a single specific value for a huge table.
> >
> > I would love instead to have a Postgres extension that uses postgres
> > background workers to accomplish this, especially if it were part of
> > core. Before I venture into exploring writing something like this as an
> > extension, would this ever be considered something appropriate as an
> > extension in Postgres core? Would that be appropriate?
>
> I don't see what the common ground between different variants of this
> use case would be. Aren't you basically just looking to execute a
> use-case-specific stored procedure in the background?

The common ground is that some column in some table needs to be bulk
updated. I may not be explaining this well, but in our environment we
have done hundreds of these using a generic framework to build a
backfill, so I'm not sure what you are questioning about the need. We
have had to build a worker to accomplish this because it can't be done
as a SQL script alone.

I'm not sure what you mean by a stored procedure in the background.
Since it would not be a single transaction, it doesn't fit as a stored
procedure, at least in Postgres, where a function runs in a single
transaction. Sorry if I'm misunderstanding.

Thanks,
Jeremy
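For what it's worth, a single batch in the kind of framework being
described usually boils down to a statement like the one below, with the
driver (here, the proposed background worker) advancing the key range
and committing between batches so that each batch is its own short
transaction (table and column names are hypothetical):

UPDATE dest d
   SET val = s.val
  FROM src s
 WHERE d.id = s.id
   AND d.id >= 0 AND d.id < 10000;  -- next batch covers 10000..20000, and so on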
Re: [HACKERS] Runtime Partition Pruning
On 13 December 2017 at 00:33, Beena Emerson wrote:
> PFA the updated patch, this can be applied over the v13 patches [1]
> over commit 487a0c1518af2f3ae2d05b7fd23d636d687f28f3

Hi Beena,

Thanks for posting an updated patch. I've been looking over this and I
think that the use of the PartitionDispatch in
set_append_subplan_indexes is not correct. What we need here is the
index of the Append's subnode, and that's not what
RelationGetPartitionDispatchInfo() gives you. Remember that some
partitions could have been pruned away already during planning.

This quick example shows that the partition selection is not correct:

create table p (a int, b int) partition by range (a);
create table p_a_neg partition of p for values from (minvalue) to (0) partition by range (b);
create table p_a_pos partition of p for values from (0) to (maxvalue) partition by range (b);
create table p_a_neg_b_neg partition of p_a_neg for values from (minvalue) to (0);
create table p_a_neg_b_pos partition of p_a_neg for values from (0) to (maxvalue);
create table p_a_pos_b_neg partition of p_a_pos for values from (minvalue) to (0);
create table p_a_pos_b_pos partition of p_a_pos for values from (0) to (maxvalue);

prepare q1 (int, int) as select * from p where a = $1 and b = $2;

explain analyze execute q1 (-1,-1); -- this works.

                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..175.60 rows=4 width=8) (actual time=1.099..1.099 rows=0 loops=1)
   Runtime Partition Pruning: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_neg_b_neg  (cost=0.00..43.90 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_neg_b_pos  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_pos_b_neg  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_pos_b_pos  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
(12 rows)

explain analyze execute q1 (-1,1); -- should scan p_a_neg_b_pos, but does not.

                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..175.60 rows=4 width=8) (actual time=758996.359..758996.359 rows=0 loops=1)
   Runtime Partition Pruning: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_neg_b_neg  (cost=0.00..43.90 rows=1 width=8) (actual time=0.056..0.056 rows=0 loops=1)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_neg_b_pos  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_pos_b_neg  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on p_a_pos_b_pos  (cost=0.00..43.90 rows=1 width=8) (never executed)
         Filter: ((a = $1) AND (b = $2))
(12 rows)

So, I started to look at what the best way to put this right might be.
I see that since Parallel Append was committed, the subnodes are now
sorted in cost order inside create_append_path(), so likely we'll want
to delay figuring out the subpath list indexes until after that's done,
since sorting would scramble our index arrays. We could simply look at
the subpaths at the end of create_append_path() and create some sort of
new matrix type that can accept the output of Amit's
get_partitions_from_clauses() and translate that Bitmapset into the
subpath indexes (another Bitmapset). This will need to work for
sub-partitions too, so this matrix must be some sort of tree that we can
descend into when we see that get_partitions_from_clauses returned a bit
for a sub-partition instead of a leaf-partition.

I bashed this idea around a bit and I came up with the attached.
It's very far from complete and in a very WIP state. I've not really done anything to make the correct clause list available in nodeAppend.c yet, but I think the code that's there is worthy of a look. I've not done that much work on the new choose_next_subplan* functions in nodeAppend.c. I just modified choose_next_subplan_locally to show how this set of functions need to take into account the subnode bitmap set of valid partitions to scan. Perhaps some special case is needed to have these functions ignore the Bitmapset when runtime pruning is disabled (perhaps a completely new set of the functions is needed to support the selection of the next non-pruned partition). Although, probably that can be debated a bit later as it's a fairly minor detail for now. My patch also lacks any means to extract the Params during match_clauses_to_partkey(), or at least most of the cases. I've just added 1 case there. I did this because I thought it was better to extract the ParamIds rather than a bool to
Re: [sqlsmith] Parallel worker executor crash on master
On Sat, Dec 16, 2017 at 12:57 AM, Andreas Seltenreich wrote:
> Hi,
>
> sqlsmith just crashed a parallel worker while testing master at
> 699bf7d05c. I can reproduce it with the following recipe on a fresh
> regression database. Backtrace and query plan below as well.

This seems to be another symptom of the problem related to es_query_dsa
for which Thomas has sent a patch on a different thread [1]. After
applying that patch, I am not able to see the problem. I think that,
due to the wrong usage of dsa across nodes, it can lead to sending some
wrong values for params to workers.

[1] - https://www.postgresql.org/message-id/CAEepm%3D0Mv9BigJPpribGQhnHqVGYo2%2BkmzekGUVJJc9Y_ZVaYA%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com