Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2018-04-01 Thread David Rowley
On 16 March 2018 at 04:01, Tom Lane  wrote:
> I hadn't been paying much attention to this thread, but I've now taken
> a quick look at the 2018-02-19 patch, and I've got to say I do not like
> it much.  The changes in createplan.c in particular seem like hack-and-
> slash rather than anything principled or maintainable.

Thanks for looking at this.  I didn't manage to discover any other
working solutions to when the Vars can be replaced. If we don't do
this in createplan.c then it's going to cause problems in places such
as apply_pathtarget_labeling_to_tlist, which is well before setrefs.c
gets hold of the plan.

> The core issue here is that Paths involving the appendrel and higher
> will contain Vars referencing the appendrel's varno, whereas the child
> is set up to emit Vars containing its own varno, and somewhere we've got
> to match those up.  I don't think though that this problem is exactly
> specific to single-member Appends, and so I would rather we not invent a
> solution that's specific to that.  A nearly identical issue is getting
> rid of no-op SubqueryScan nodes.  I've long wished we could simply not
> emit those in the first place, but it's really hard to do because of
> the fact that Vars inside the subquery have different varnos from those
> outside.  (I've toyed with the idea of globally flattening the rangetable
> before we start planning, not at the end, but haven't made it happen yet;
> and anyway that would be only one step towards such a goal.)

I'm not quite sure why you think the solution I came up with is
specific to single-member Appends. The solution merely uses
single-member Append paths as a proxy path for the solution which I've
tried to make generic to any node type. For example, the patch also
resolves the issue for MergeAppend, so certainly nothing in there is
specific to single-member Appends.  I could have made the proxy any
other path type, it's just that you had suggested Append would be
better than inventing ProxyPath, which is what I originally proposed.

> It might be worth looking at whether we couldn't fix the single-member-
> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
> get rid of them.  That's not the most beautiful solution perhaps, but
> it'd be very localized and low-risk.

It might be possible, but wouldn't that only solve 1 out of 2
problems? The problem of the planner not generating the most optimal
plan is ignored with this approach. For example, it does not make much
sense to bolt a Materialize node on top of an IndexScan node in order
to provide the IndexScan with mark/restore capabilities... IndexScans
already support that.

> In general setrefs.c is the right place to deal with variable-matching
> issues.  So even if you don't like that specific idea, it'd probably be
> worth thinking about handling this by recording instructions telling
> setrefs what to do, instead of actually doing it at earlier stages.

From what I can see, setrefs.c is too late. ERRORs are generated
before setrefs.c gets hold of the plan if we don't replace Vars.

I'm not opposed to finding a better way to do this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: csv format for psql

2018-04-01 Thread Pavel Stehule
2018-04-01 8:30 GMT+02:00 Fabien COELHO :

>
> Hello Isaac,
>
>>> Personally I do not have any problem with CSV defaulting to '|' separator,
>>> given that anyway people often use anything but a comma for the purpose,
>>> including '|'.
>>>
>>> However Pavel wants to block the patch on this point. Too bad.
>>>
>>
>> OK, mostly trying to avoid commenting because I doubt I have much to add.
>> But. If I ask for CSV and don't specify any overrides, I expect to get
>> "C"omma separated values, not some other character. More specifically, if I
>> say --csv I expect to get files that are identical with what I would get if
>> I used COPY ... CSV.
>>
>
> My summary was incomplete. The --csv option implementation by Daniel
> already does that.
>
> The issue Pavel is complaining about is that in interactive mode "\pset
> format csv" does not do the same: it triggers the csv-rule string-escaping
> mechanism, but does not reset the "fieldsep" variable (eh, it sets the
> "format" variable) so the default separator under this interactive use is
> "|" if the "fieldsep" variable is shared.
>
> I have suggested a "\csv" interactive command to set both as a convenient
> shorthand for "\pset format csv & \pset fieldsep ','", similarly to --csv,
> but Pavel still wants "\pset format csv" to trigger the full csv output.
>
> A consequence I forgot about adding "fieldsep_csv", is that it probably
> has to duplicate the "_zero" ugly hack to be consistent with existing
> "*sep" variables, or else be inconsistent. Sigh.
>

There is no relation between fieldsep_csv and fieldsep_zero.

The root of this issue is the flawed concept of fieldsep. It is based on
the idea that one value can be good enough for all formats. That worked
while the only format using this variable was ONE format - unaligned.

A similar issue exists with the tuples_only option - see Isaac's mail.

Different defaults for different formats are more consistent - the other
formats already do this (they don't use fieldsep) - but internal
consistency is lost, because it happens quietly.

We can have different ideas about users' expectations - but we should
share the opinion that a correct solution has to be consistent. If I
choose some format, then the behaviour should be the same regardless of
the activation mechanism.

So \csv should behave the same as \pset format csv or --csv.
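
To make the inconsistency concrete, something like this (a sketch of the
two activation paths as described above; exact output is approximate):

  $ psql --csv -c "select 1 as a, 2 as b"
  a,b
  1,2

  postgres=# \pset format csv
  postgres=# select 1 as a, 2 as b;
  a|b
  1|2

The command-line option sets both the format and the ',' separator, while
the interactive \pset format csv keeps the shared fieldsep default of '|'.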

I don't share the opinion that the CSV format should be exactly the same
as COPY ... CSV. COPY is designed for backups - and a header is not too
important there. When I have seen CSV used in practice, a header was
usually included.

If I look for a precedent, it is \pset linestyle and the \pset
unicode_*** options.

So we could have

\pset format xxx

and a set of per-format defaults that can be changed locally:

\pset csv_fieldsep ,
\pset csv_tuplesonly on
\pset unaligned_fieldsep |
\pset unaligned_tuplesonly off

The fieldsep and tuples_only variables can then just be links to the
currently used _fieldsep and _tuplesonly.

This is consistent - the defaults can be correct for Isaac, and I can
paste into my .psqlrc

\pset csv_tuplesonly off

and the semantics are clean: this option will be active just for csv and
doesn't depend on the current format - so it can be used in .psqlrc.

If it works well for linestyle, then it can work for the other formats too.

Regards

Pavel






>
> --
> Fabien.
>


Re: some last patches breaks plan cache

2018-04-01 Thread Pavel Stehule
2018-04-01 1:00 GMT+02:00 Tomas Vondra :

>
>
> On 03/31/2018 08:28 PM, Tomas Vondra wrote:
> >
> >
> > On 03/31/2018 07:56 PM, Tomas Vondra wrote:
> >> On 03/31/2018 07:38 PM, Pavel Stehule wrote:
> >>> Hi
> >>>
> >>> CREATE OR REPLACE PROCEDURE public.proc(a integer, INOUT b integer, c
> >>> integer)
> >>>  LANGUAGE plpgsql
> >>> AS $procedure$
> >>> begin
> >>>   b := a + c;
> >>> end;
> >>> $procedure$
> >>>
> >>> CREATE OR REPLACE PROCEDURE public.testproc()
> >>>  LANGUAGE plpgsql
> >>> AS $procedure$
> >>> declare r int;
> >>> begin
> >>>   call proc(10, r, 20);
> >>> end;
> >>> $procedure$
> >>>
> >>> postgres=# call testproc();
> >>> CALL
> >>> postgres=# call testproc();
> >>> ERROR:  SPI_execute_plan_with_paramlist failed executing query "CALL
> >>> proc(10, r, 20)": SPI_ERROR_ARGUMENT
> >>> CONTEXT:  PL/pgSQL function testproc() line 4 at CALL
> >>> postgres=#
> >>>
> >>> second call fails
> >>
> >> Yeah.
> >>
> >> d92bc83c48bdea9888e64cf1e2edbac9693099c9 seems to have broken this :-/
> >>
> >
> > FWIW it seems the issue is somewhere in exec_stmt_call, which does this:
> >
> > /*
> >  * Don't save the plan if not in atomic context.  Otherwise,
> >  * transaction ends would cause warnings about plan leaks.
> >  */
> > exec_prepare_plan(estate, expr, 0, estate->atomic);
> >
> > When executed outside transaction, CALL has estate->atomic=false, and so
> > calls exec_prepare_plan() with keepplan=false. And on the second call it
> > gets bogus Plan, of course (with the usual 0x7f7f7f7f7f7f7f7f patterns).
> >
> > When in a transaction, it sets keepplan=true, and everything works fine.
> >
> > So either estate->atomic is not sufficient on its own, or we need to
> > reset the expr->plan somewhere.
> >
>
> The attached patch fixes this, but I'm not really sure it's the right
> fix - I'd expect there to be a more principled way, e.g. resetting the
> plan pointer when 'plan->saved == false'.
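
[ The idea mentioned here - resetting the plan pointer when it was not
saved - would look roughly like this in exec_stmt_call; a sketch only,
not a tested fix, and it assumes the plan's "saved" flag is visible to
pl_exec.c:

    /*
     * Sketch: a transient (non-saved) SPI plan is released at transaction
     * end, so forget our pointer to it instead of reusing a dangling one
     * on the next CALL.
     */
    if (expr->plan != NULL && !expr->plan->saved)
        expr->plan = NULL;
]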
>
>
It fixes some issues, but not all.

I see changes in the plpgsql_check regression tests:

CREATE OR REPLACE PROCEDURE public.testproc()
 LANGUAGE plpgsql
AS $procedure$
declare r int;
begin
  call proc(10, r + 10, 20);
end;
$procedure$

postgres=# call testproc();
ERROR:  argument 2 is an output argument but is not writable
CONTEXT:  PL/pgSQL function testproc() line 4 at CALL
postgres=# call testproc();
ERROR:  SPI_execute_plan_with_paramlist failed executing query "CALL
proc(10, r + 10, 20)": SPI_ERROR_ARGUMENT
CONTEXT:  PL/pgSQL function testproc() line 4 at CALL




> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


tab complete for procedures for \sf and \ef commands

2018-04-01 Thread Pavel Stehule
Hi

small bugfix patch

Regards

Pavel
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6926ca132e..f6f7c52bb0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3685,7 +3685,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_relations, NULL);
 
 	else if (TailMatchesCS1("\\ef"))
-		COMPLETE_WITH_VERSIONED_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_routines, NULL);
 	else if (TailMatchesCS1("\\ev"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_views, NULL);
 
@@ -3794,7 +3794,7 @@ psql_completion(const char *text, int start, int end)
 			COMPLETE_WITH_LIST_CS3("default", "verbose", "terse");
 	}
 	else if (TailMatchesCS1("\\sf*"))
-		COMPLETE_WITH_VERSIONED_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_routines, NULL);
 	else if (TailMatchesCS1("\\sv*"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_views, NULL);
 	else if (TailMatchesCS1("\\cd|\\e|\\edit|\\g|\\i|\\include|"


Re: hot_standby_feedback vs excludeVacuum and snapshots

2018-04-01 Thread Simon Riggs
On 31 March 2018 at 14:21, Amit Kapila  wrote:
> On Thu, Mar 29, 2018 at 4:47 PM, Greg Stark  wrote:
>> I'm poking around to see debug a vacuuming problem and wondering if
>> I've found something more serious.
>>
>> As far as I can tell the snapshots on HOT standby are built using a
>> list of running xids that the primary builds and puts in the WAL and
>> that seems to include all xids from transactions running in all
>> databases. The HOT standby would then build a snapshot and eventually
>> send the xmin of that snapshot back to the primary in the hot standby
>> feedback and that would block vacuuming tuples that might be visible
>> to the standby.
>>
>> Many ages ago Alvaro sweated blood to ensure vacuums could run for
>> long periods of time without holding back the xmin horizon and
>> blocking other vacuums from cleaning up tuples. That's the purpose of
>> the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
>> because we know vacuums won't insert any tuples that queries might try
>> to view and also vacuums won't try to perform any sql queries on other
>> tables.
>>
>> I can't find anywhere that the standby snapshot building mechanism
>> gets this same information about which xids are actually vacuums that
>> can be ignored when building a snapshot.
>>
>
> I think the vacuum assigns xids only if it needs to truncate some of
> the pages in the relation which happens towards the end of vacuum.
> So, it shouldn't hold back the xmin horizon for long.

Yes, that's the reason. I recall VACUUMs giving lots of problems
during development of Hot Standby.

VACUUM FULL was the thing that needed to be excluded in the past
because it needed an xid to move rows.

Greg's concern is a good one and his noticing that we hadn't
specifically excluded VACUUMs is valid, so we should exclude them.
Well spotted, Greg.

So although this doesn't have the dramatic effect it might have had,
there is still the possibility of some effect and I think we should
treat it as a bug.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


ignore_lazy_vacuums_in_RunningTransactionData.v1.patch
Description: Binary data


Re: Online enabling of checksums

2018-04-01 Thread Magnus Hagander
On Sat, Mar 31, 2018 at 5:38 PM, Tomas Vondra 
wrote:

> On 03/31/2018 05:05 PM, Magnus Hagander wrote:
> > On Sat, Mar 31, 2018 at 4:21 PM, Tomas Vondra
> > mailto:tomas.von...@2ndquadrant.com>>
> wrote:
> >
> > ...
> >
> > I do think just waiting for all running transactions to complete is
> > fine, and it's not the first place where we use it - CREATE
> SUBSCRIPTION
> > does pretty much exactly the same thing (and CREATE INDEX
> CONCURRENTLY
> > too, to some extent). So we have a precedent / working code we can
> copy.
> >
> >
> > Thinking again, I don't think it should be done as part of
> > BuildRelationList(). We should just do it once in the launcher before
> > starting, that'll be both easier and cleaner. Anything started after
> > that will have checksums on it, so we should be fine.
> >
> > PFA one that does this.
> >
>
> Seems fine to me. I'd however log waitforxid, not the oldest one. If
> you're a DBA and you want to make the checksumming proceed, knowing
> the oldest running XID is useless for that. If we log waitforxid, it can
> be used to query pg_stat_activity and interrupt the sessions somehow.
>

Yeah, makes sense. Updated.
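
For reference, once waitforxid is in the log, a DBA could find and interrupt
the offending sessions with something along these lines (illustrative only;
the comparison against the logged xid is done by eye here):

  -- list sessions still holding an xid or xmin; compare backend_xid /
  -- backend_xmin against the logged waitforxid
  SELECT pid, backend_xid, backend_xmin, state, query
    FROM pg_stat_activity
   WHERE backend_xid IS NOT NULL OR backend_xmin IS NOT NULL;

  -- and, if appropriate, interrupt one of them (pid taken from above)
  SELECT pg_terminate_backend(12345);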



> > > And if you try this with a temporary table (not hidden in
> transaction,
> > > so the bgworker can see it), the worker will fail with this:
> > >
> > >   ERROR:  cannot access temporary tables of other sessions
> > >
> > > But of course, this is just another way how to crash without
> updating
> > > the result for the launcher, so checksums may end up being
> enabled
> > > anyway.
> > >
> > >
> > > Yeah, there will be plenty of side-effect issues from that
> > > crash-with-wrong-status case. Fixing that will at least make things
> > > safer -- in that checksums won't be enabled when not put on all
> pages.
> > >
> >
> > Sure, the outcome with checksums enabled incorrectly is a
> consequence of
> > bogus status, and fixing that will prevent that. But that wasn't my
> main
> > point here - not articulated very clearly, though.
> >
> > The bigger question is how to handle temporary tables gracefully, so
> > that it does not terminate the bgworker like this at all. This might
> be
> > even bigger issue than dropped relations, considering that temporary
> > tables are pretty common part of applications (and it also includes
> > CREATE/DROP).
> >
> > For some clusters it might mean the online checksum enabling would
> > crash+restart infinitely (well, until reaching MAX_ATTEMPTS).
> >
> > Unfortunately, try_relation_open() won't fix this, as the error comes
> > from ReadBufferExtended. And it's not a matter of simply creating a
> > ReadBuffer variant without that error check, because temporary tables
> > use local buffers.
> >
> > I wonder if we could just go and set the checksums anyway, ignoring
> the
> > local buffers. If the other session does some changes, it'll
> overwrite
> > our changes, this time with the correct checksums. But it seems
> pretty
> > dangerous (I mean, what if they're writing stuff while we're updating
> > the checksums? Considering the various short-cuts for temporary
> tables,
> > I suspect that would be a boon for race conditions.)
> >
> > Another option would be to do something similar to running
> transactions,
> > i.e. wait until all temporary tables (that we've seen at the
> beginning)
> > disappear. But we're starting to wait on more and more stuff.
> >
> > If we do this, we should clearly log which backends we're waiting
> for,
> > so that the admins can go and interrupt them manually.
> >
> >
> >
> > Yeah, waiting for all transactions at the beginning is pretty simple.
> >
> > Making the worker simply ignore temporary tables would also be easy.
> >
> > One of the bigger issues here is temporary tables are *session* scope
> > and not transaction, so we'd actually need the other session to finish,
> > not just the transaction.
> >
> > I guess what we could do is something like this:
> >
> > 1. Don't process temporary tables in the checksumworker, period.
> > Instead, build a list of any temporary tables that existed when the
> > worker started in this particular database (basically anything that we
> > got in our scan). Once we have processed the complete database, keep
> > re-scanning pg_class until those particular tables are gone (search by
> oid).
> >
> > That means that any temporary tables that are created *while* we are
> > processing a database are ignored, but they should already be receiving
> > checksums.
> >
> > It definitely leads to a potential issue with long running temp tables.
> > But as long as we look at the *actual tables* (by oid), we should be
> > able to handle long-running sessions once they have dropped their temp
> > tables.
> >
> > Does that sound workable to you?
> >
>
> Yes, that's pretty much what I meant b

Diagonal storage model

2018-04-01 Thread Konstantin Knizhnik

Hi hackers,

Vertical (columnar) storage is most optimal for analytics and this is why 
it is widely used in databases oriented towards OLAP, such as Vertica, HyPer, KDB,...
In Postgres we have the cstore extension, which is not able to provide all the benefits 
of the vertical model because of the lack of support for vector operations in the executor.
The situation can change once we have a pluggable storage API with support for 
vectorized execution.

But the vertical model is not so good for updates and data loading (because data is 
mostly imported in horizontal format).
This is why in most of the existing systems data is present in both formats (at 
least for some time).

I want to announce a new model, "diagonal storage", which combines the benefits of 
both approaches.
The idea is very simple: we first store column 1 of the first record, then column 2 
of the second record, ... and so on until we reach the last column.
After that we store the second column of the first record, the third column of the second 
record, ...

Profiling of TPC-H queries shows that most of the query execution time 
(about 17%) is spent in heap_deform_tuple.
The new format will allow us to significantly reduce the time spent deforming heap 
tuples, because there is just one column of the particular record in each tile.
Moreover, we can perform deforming of many tuples in parallel, which is 
especially efficient on quantum computers.

Attached please find a patch with the first prototype implementation. It provides about 
3.14 times improvement of performance on most of the TPC-H queries.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



diagonal.patch.gz
Description: GNU Zip compressed data


Re: new function for tsquery creartion

2018-04-01 Thread Aleksandr Parfenov

Hello hackers,

On 2018-03-28 12:21, Aleksander Alekseev wrote:

It doesn't sound right to me to accept any input as a general rule but
sometimes return errors nevertheless. That API would be complicated for
the users. Thus I suggest to accept any garbage and try our best to
interpret it.


I agree with Aleksander about silencing all errors in 
websearch_to_tsquery().


In the attachment is a revised patch with an attempt to introduce the 
ability to ignore syntax errors in gettoken_tsvector().
I've also read through the patch and all the code looks good to me except 
one thing.
The name of the enum ts_parsestate looks more like the name of a function 
than the name of a type.
In my version, I renamed it to QueryParserState, but you can change it if I'm 
wrong.


--
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index ea5947a3a8..bdf05236cf 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -390,7 +390,8 @@ add_to_tsvector(void *_state, char *elem_value, int elem_len)
  * and different variants are ORed together.
  */
 static void
-pushval_morph(Datum opaque, TSQueryParserState state, char *strval, int lenval, int16 weight, bool prefix)
+pushval_morph(Datum opaque, TSQueryParserState state, char *strval, int lenval,
+			  int16 weight, bool prefix, bool force_phrase)
 {
 	int32		count = 0;
 	ParsedText	prs;
@@ -423,7 +424,12 @@ pushval_morph(Datum opaque, TSQueryParserState state, char *strval, int lenval,
 	/* put placeholders for each missing stop word */
 	pushStop(state);
 	if (cntpos)
-		pushOperator(state, data->qoperator, 1);
+	{
+		if (force_phrase)
+			pushOperator(state, OP_PHRASE, 1);
+		else
+			pushOperator(state, data->qoperator, 1);
+	}
 	cntpos++;
 	pos++;
 }
@@ -464,7 +470,10 @@ pushval_morph(Datum opaque, TSQueryParserState state, char *strval, int lenval,
 			if (cntpos)
 			{
 /* distance may be useful */
-pushOperator(state, data->qoperator, 1);
+if (force_phrase)
+	pushOperator(state, OP_PHRASE, 1);
+else
+	pushOperator(state, data->qoperator, 1);
 			}
 
 			cntpos++;
@@ -490,6 +499,7 @@ to_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
+		  false,
 		  false);
 
 	PG_RETURN_TSQUERY(query);
@@ -520,7 +530,8 @@ plainto_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
-		  true);
+		  true,
+		  false);
 
 	PG_RETURN_POINTER(query);
 }
@@ -551,7 +562,8 @@ phraseto_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
-		  true);
+		  true,
+		  false);
 
 	PG_RETURN_TSQUERY(query);
 }
@@ -567,3 +579,36 @@ phraseto_tsquery(PG_FUNCTION_ARGS)
 		ObjectIdGetDatum(cfgId),
 		PointerGetDatum(in)));
 }
+
+Datum
+websearch_to_tsquery_byid(PG_FUNCTION_ARGS)
+{
+	text	   *in = PG_GETARG_TEXT_PP(1);
+	MorphOpaque	data;
+	TSQuery		query = NULL;
+
+	data.cfg_id = PG_GETARG_OID(0);
+
+	data.qoperator = OP_AND;
+
+	query = parse_tsquery(text_to_cstring(in),
+		  pushval_morph,
+		  PointerGetDatum(&data),
+		  false,
+		  true);
+
+	PG_RETURN_TSQUERY(query);
+}
+
+Datum
+websearch_to_tsquery(PG_FUNCTION_ARGS)
+{
+	text	   *in = PG_GETARG_TEXT_PP(0);
+	Oid			cfgId;
+
+	cfgId = getTSCurrentConfig(true);
+	PG_RETURN_DATUM(DirectFunctionCall2(websearch_to_tsquery_byid,
+		ObjectIdGetDatum(cfgId),
+		PointerGetDatum(in)));
+
+}
diff --git a/src/backend/utils/adt/tsquery.c b/src/backend/utils/adt/tsquery.c
index 1ccbf79030..00e6218691 100644
--- a/src/backend/utils/adt/tsquery.c
+++ b/src/backend/utils/adt/tsquery.c
@@ -32,12 +32,24 @@ const int	tsearch_op_priority[OP_COUNT] =
 	3			/* OP_PHRASE */
 };
 
+/*
+ * parser's states
+ */
+typedef enum
+{
+	WAITOPERAND = 1,
+	WAITOPERATOR = 2,
+	WAITFIRSTOPERAND = 3,
+	WAITSINGLEOPERAND = 4,
+	INQUOTES = 5 /* for quoted phrases in web search */
+} QueryParserState;
+
 struct TSQueryParserStateData
 {
 	/* State for gettoken_query */
 	char	   *buffer;			/* entire string we are scanning */
 	char	   *buf;			/* current scan point */
-	int			state;
+	QueryParserState state;
 	int			count;			/* nesting count, incremented by (,
  * decremented by ) */
 
@@ -57,12 +69,6 @@ struct TSQueryParserStateData
 	TSVectorParseState valstate;
 };
 
-/* parser's states */
-#define WAITOPERAND 1
-#define WAITOPERATOR	2
-#define WAITFIRSTOPERAND 3
-#define WAITSINGLEOPERAND 4
-
 /*
  * subroutine to parse the modifiers (weight and prefix flag currently)
  * part, like ':AB*' of a query.
@@ -197,6 +203,21 @@ err:
 	return buf;
 }
 
+/*
+ * Parse OR operator used in websearch_to_tsquery().
+ */
+static bool
+parse_or_operator(char 

Re: Add default role 'pg_access_server_files'

2018-04-01 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Sun, Mar 25, 2018 at 09:43:25PM -0400, Stephen Frost wrote:
> > * Michael Paquier (mich...@paquier.xyz) wrote:
> >> On Thu, Mar 08, 2018 at 10:15:11AM +0900, Michael Paquier wrote:
> >> > Other than that the patch looks in pretty good shape to me.
> >> 
> >> The regression tests of file_fdw are blowing up because of an error
> >> string patch 2 changes.
> > 
> > Fixed in the attached.
> 
> Thanks for the updated version.  This test is fixed.

Thanks for checking.  Attached is an updated version which also includes
the changes for adminpack, done in a similar manner to how pgstattuple
was updated, as discussed.  Regression tests updated and extended a bit,
doc updates also included.
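
Just to make the resulting model concrete, delegation then works with plain
GRANTs, e.g. (illustrative; "ops_role" is a made-up role name):

  -- patch 1: the hard-coded superuser() checks are gone, so the individual
  -- functions can simply be granted to a role
  GRANT EXECUTE ON FUNCTION pg_ls_dir(text) TO ops_role;
  GRANT EXECUTE ON FUNCTION pg_read_file(text) TO ops_role;

  -- patch 2: the new default roles cover server-side COPY and friends
  GRANT pg_read_server_files TO ops_role;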

If you get a chance to take a look, that'd be great.  I'll do my own
review of it again also after stepping away for a day or so.

Thanks!

Stephen
From 296b407863a7259a04e5e8cfc19f9b8ea124777c Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Wed, 7 Mar 2018 06:42:42 -0500
Subject: [PATCH 1/3] Remove explicit superuser checks in favor of ACLs

This removes the explicit superuser checks in the various file-access
functions in the backend, specifically pg_ls_dir(), pg_read_file(),
pg_read_binary_file(), and pg_stat_file().  Instead, EXECUTE is REVOKE'd
from public for these, meaning that only a superuser is able to run them
by default, but access to them can be GRANT'd to other roles.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
---
 src/backend/catalog/system_views.sql | 14 ++
 src/backend/utils/adt/genfile.c  | 20 
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5e6e8a64f6..559610b12f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1149,6 +1149,20 @@ REVOKE EXECUTE ON FUNCTION lo_export(oid, text) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_ls_logdir() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_ls_waldir() FROM public;
 
+REVOKE EXECUTE ON FUNCTION pg_read_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_file(text,boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_ls_dir(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_ls_dir(text,boolean,boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d9027fc688..a4c0f6d5ca 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -195,11 +195,6 @@ pg_read_file(PG_FUNCTION_ARGS)
 	char	   *filename;
 	text	   *result;
 
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- (errmsg("must be superuser to read files";
-
 	/* handle optional arguments */
 	if (PG_NARGS() >= 3)
 	{
@@ -236,11 +231,6 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
 	char	   *filename;
 	bytea	   *result;
 
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- (errmsg("must be superuser to read files";
-
 	/* handle optional arguments */
 	if (PG_NARGS() >= 3)
 	{
@@ -313,11 +303,6 @@ pg_stat_file(PG_FUNCTION_ARGS)
 	TupleDesc	tupdesc;
 	bool		missing_ok = false;
 
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- (errmsg("must be superuser to get file information";
-
 	/* check the optional argument */
 	if (PG_NARGS() == 2)
 		missing_ok = PG_GETARG_BOOL(1);
@@ -399,11 +384,6 @@ pg_ls_dir(PG_FUNCTION_ARGS)
 	directory_fctx *fctx;
 	MemoryContext oldcontext;
 
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- (errmsg("must be superuser to get directory listings";
-
 	if (SRF_IS_FIRSTCALL())
 	{
 		bool		missing_ok = false;
-- 
2.14.1


From 76ba6f1eef402070ca1ff37f74e5dcfc639f6837 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Sun, 31 Dec 2017 14:01:12 -0500
Subject: [PATCH 2/3] Add default roles for file/program access

This patch adds new default roles named 'pg_read_server_files',
'pg_write_server_files', 'pg_execute_server_program' which
allow an administrator to GRANT to a non-superuser role the ability to
access server-side files or run programs through PostgreSQL (as the user
the database is running as).  Having one of these roles allows a
non-superuser to use server-side COPY to read, write, or with a program,
a

Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2018-04-01 Thread David Rowley
On 3 February 2018 at 03:26, Tom Lane  wrote:
> Tomas Vondra  writes:
>> ISTM this patch got somewhat stuck as we're not quite sure the
>> transformation is correct in all cases. Is my impression correct?
>
> Yeah, that's the core issue.
>
>> If yes, how to we convince ourselves? Would some sort of automated testing
>> (generating data and queries) help? I'm willing to spend some cycles on
>> that, if considered helpful.
>
> I'm not sure if that would be enough to convince doubters.  On the other
> hand, if it found problems, that would definitely be useful.

I've read over this thread and as far as I can see there is concern
that the UNION on the ctids might not re-uniquify the rows again. Tom
mentioned a problem with FULL JOINs, but the concern appears to have
been invalidated due to wrong thinking about how join removals work.

As of now, I'm not quite sure who exactly is concerned. Tom thought
there was an issue but quickly corrected himself.

As far as I see it, there are a few cases where we'd need to disable the optimization:

1. Query contains target SRFs (the same row might get duplicated, we
don't want to UNION these duplicates out again, they're meant to be
there)
2. Query has setops (ditto)
3. Any base rels are not RELKIND_RELATION (we need the ctid to
uniquely identify rows)
4. Query has volatile functions (don't want to evaluate volatile
functions more times than requested)

As far as the DISTINCT clause doing the right thing for joins, I see
no issues, even with FULL JOINs. In both branches of the UNION the
join condition will be the same so each side of the join always has
the same candidate row to join to.  I don't think the optimization is
possible if there are OR clauses in the join condition, but that's not
being proposed.

FULL JOINs appear to be fine as the row is never duplicated on a
non-match, so there will only be one version of (t1.ctid, NULL::tid) or
(NULL::tid, t2.ctid), and the ctids in the distinctClauses cannot all be
NULL at once.

I used the following SQL to help my brain think through this. There
are two versions of each query, one with DISTINCT and one without. If
the DISTINCT returns fewer rows than the one without then we need to
disable this optimization for that case. I've written queries for 3 of
the above 4 cases. I saw from reading the thread that case #4 is
already disabled:

drop table if exists t1,t2;
create table t1 (a int);
create table t2 (a int);

insert into t1 values(1),(1),(2),(4);
insert into t2 values(1),(1),(3),(3),(4),(4);

select t1.ctid,t2.ctid from t1 full join t2 on t1.a = t2.a;

select distinct t1.ctid,t2.ctid from t1 full join t2 on t1.a = t2.a;

-- case 1: must disable in face of tSRFs

select ctid from (select ctid,generate_Series(1,2) from t1) t;

select distinct ctid from (select ctid,generate_Series(1,2) from t1) t;

-- case 2: must disable optimization with setops.

select ctid from (select ctid from t1 union all select ctid from t1) t;

select distinct ctid from (select ctid from t1 union all select ctid from t1) t;

-- case 3: must disable if we join to anything other than a RELKIND_RELATION (no ctid)

select ctid from (select t1.ctid from t1 inner join (values(1),(1))
x(x) on t1.a = x.x) t;

select distinct ctid from (select t1.ctid from t1 inner join
(values(1),(1)) x(x) on t1.a = x.x) t;

I've not read the patch yet, but I understand what it's trying to
achieve. My feelings about the patch are that it would be useful to
have. I think if someone needs this then they'll be very happy that
we've added it. I also think there should be a GUC to disable it, and
it should be enabled through the entire alpha/beta period, and we
should consider what the final value for it should be just before RC1.
It's a bit sad to exclude foreign tables, and I'm not too sure what
hurdles this leaves for pluggable storage. No doubt we'll need to
disable the optimization for those too unless they can provide us with
some row identifier.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2018-04-01 Thread David Rowley
On 30 March 2018 at 15:05, Andres Freund  wrote:
>> + * To allow join removal to happen, we can't reference the CTID column
>> + * of an otherwise-removable relation.
>
> A brief hint why wouldn't hurt.

Maybe something like:

/*
 * Join removal is only ever possible when no columns of the to-be-removed
 * relation are referenced.  If we added the CTID here then we could
 * inadvertently disable join removal.  We'll need to delay adding the CTID
 * until after join removal takes place.
 */

(I've not read the code, just the thread)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Diagonal storage model

2018-04-01 Thread Дмитрий Воронин
Hi, Konstantin!

Thank you for working on new pluggable storage API.

Your patch in the attachment is 505 bytes and contains only a diff from explain.c. Is 
that right?

01.04.2018, 15:48, "Konstantin Knizhnik" :
> Hi hackers,
>
> Vertical (columnar) storage is most optimal for analytics and this is why 
> it is widely used in databases oriented towards OLAP, such as Vertica, 
> HyPer, KDB,...
> In Postgres we have the cstore extension, which is not able to provide all 
> the benefits of the vertical model because of the lack of support for 
> vector operations in the executor.
> The situation can change once we have a pluggable storage API with support 
> for vectorized execution.
>
> But the vertical model is not so good for updates and data loading (because 
> data is mostly imported in horizontal format).
> This is why in most of the existing systems data is present in both formats 
> (at least for some time).
>
> I want to announce a new model, "diagonal storage", which combines the 
> benefits of both approaches.
> The idea is very simple: we first store column 1 of the first record, then 
> column 2 of the second record, ... and so on until we reach the last column.
> After that we store the second column of the first record, the third column 
> of the second record, ...
>
> Profiling of TPC-H queries shows that most of the query execution time 
> (about 17%) is spent in heap_deform_tuple.
> The new format will allow us to significantly reduce the time spent 
> deforming heap tuples, because there is just one column of the particular 
> record in each tile.
> Moreover, we can perform deforming of many tuples in parallel, which is 
> especially efficient on quantum computers.
>
> Attached please find a patch with the first prototype implementation. It 
> provides about 3.14 times improvement of performance on most of the TPC-H 
> queries.
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

-- 
Best regards, Dmitry Voronin




json(b)_to_tsvector with numeric values

2018-04-01 Thread Dmitry Dolgov
Hi,

We've just noticed that the current implementation of `json(b)_to_tsvector` can be
confusing sometimes if the target document contains numeric values. In this case
we just drop them, and only string values will contribute to the result:

select to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::jsonb);
   to_tsvector
-----------------
 'fat':2 'rat':3
(1 row)

The result would be less surprising if all values that can be converted to a
string representation (so, strings and numeric values; nothing to do for null &
boolean) took part in it:

select to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::jsonb);
   to_tsvector
---------------------------
 '123':5 'fat':2 'rat':3
(1 row)

The attached patch contains the small fix that's necessary to get the described
behavior. The patch doesn't touch `ts_headline` though, because following the
same approach there would require changing the type of the element in the
resulting json(b).

Any opinions about this suggestion? Can it be considered a bug fix and
included in this release?


jsonb_to_tsvector_numeric_v1.patch
Description: Binary data


Re: [PATCH] Verify Checksums during Basebackups

2018-04-01 Thread Magnus Hagander
On Sat, Mar 31, 2018 at 2:54 PM, Michael Banck 
wrote:

> Hi,
>
> On Fri, Mar 30, 2018 at 07:46:02AM -0400, Stephen Frost wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> > > On Fri, Mar 30, 2018 at 5:35 AM, David Steele 
> wrote:
> > >
> > > > On 3/24/18 10:32 AM, Michael Banck wrote:
> > > > > Am Freitag, den 23.03.2018, 17:43 +0100 schrieb Michael Banck:
> > > > >> Am Freitag, den 23.03.2018, 10:54 -0400 schrieb David Steele:
> > > > >>> In my experience actual block errors are relatively rare, so
> there
> > > > >>> aren't likely to be more than a few in a file.  More common are
> > > > >>> overwritten or transposed files, rogue files, etc.  These
> produce a lot
> > > > >>> of output.
> > > > >>>
> > > > >>> Maybe stop after five?
> > > > >
> > > > > The attached patch does that, and outputs the total number of
> > > > > verification failures of that file after it got sent.
> > > > >
> > > > >> I'm on board with this, but I have the feeling that this is not a
> very
> > > > >> common pattern in Postgres, or might not be project style at
> all.  I
> > > > >> can't remember even seen an error message like that.
> > > > >>
> > > > >> Anybody know whether we're doing this in a similar fashion
> elsewhere?
> > > > >
> > > > > I tried to have look around and couldn't find any examples, so I'm
> not
> > > > > sure that patch should go in. On the other hand, we abort on
> checksum
> > > > > failures usually (in pg_dump e.g.), so limiting the number of
> warnings
> > > > > does makes sense.
> > > > >
> > > > > I guess we need to see what others think.
> > > >
> > > > Well, at this point I would say silence more or less gives consent.
> > > >
> > > > Can you provide a rebased patch with the validation retry and warning
> > > > limiting logic added? I would like to take another pass through it
> but I
> > > > think this is getting close.
> > >
> > > I was meaning to mention it, but ran out of cycles.
> > >
> > > I think this is the right way to do it, except the 5 should be a
> #define
> > > and not an inline hardcoded value :) We could argue whether it should
> be "5
> > > total" or "5 per file". When I read the emails I thought it was going
> to be
> > > 5 total, but I see the implementation does 5 / file. In a
> super-damanged
> > > system that will still lead to horrible amounts of logging, but I think
> > > maybe if your system is in that bad shape, then it's a lost cause
> anyway.
> >
> > 5/file seems reasonable to me as well.
> >
> > > I also think the "total number of checksum errors" should be logged if
> > > they're >0, not >5. And I think *that* one should be logged at the end
> of
> > > the entire process, not per file. That'd be the kind of output that
> would
> > > be the most interesting, I think (e.g. if I have it spread out with 1
> block
> > > each across 4 files, I want that logged at the end because it's easy to
> > > otherwise miss one or two of them that may have happened a long time
> apart).
> >
> > I definitely like having a total # of checksum errors included at the
> > end, if there are any at all.  When someone is looking to see why the
> > process returned a non-zero exit code, they're likely to start looking
> > at the end of the log, so having that easily available and clear as to
> > why the backup failed is definitely valuable.
> >
> > > I don't think we have a good comparison elsewhere, and that is as David
> > > mention because other codepaths fail hard when they run into something
> like
> > > that. And we explicitly want to *not* fail hard, per previous
> discussion.
> >
> > Agreed.
>
> Attached is a new and rebased patch which does the above, plus
> integrates the suggested changes by David Steele. The output is now:
>
> $ initdb -k --pgdata=data1 1> /dev/null 2> /dev/null
> $ pg_ctl --pgdata=data1 --log=pg1.log start > /dev/null
> $ dd conv=notrunc oflag=seek_bytes seek=4000 bs=8 count=1 if=/dev/zero
> of=data1/base/12374/2610 2> /dev/null
> $ for i in 4000 13000 21000 29000 37000 43000; do dd conv=notrunc
> oflag=seek_bytes seek=$i bs=8 count=1 if=/dev/zero
> of=data1/base/12374/1259; done 2> /dev/null
> $ pg_basebackup -v -h /tmp --pgdata=data2
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/260 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created temporary replication slot "pg_basebackup_13882"
> WARNING:  checksum verification failed in file "./base/12374/2610", block
> 0: calculated C2C9 but expected EC78
> WARNING:  checksum verification failed in file "./base/12374/1259", block
> 0: calculated 8BAE but expected 46B8
> WARNING:  checksum verification failed in file "./base/12374/1259", block
> 1: calculated E413 but expected 7701
> WARNING:  checksum verification failed in file "./base/12374/1259", block
> 2: calculated 5DA9 but expected D5AA
> WARNING:  checksum verification failed in file "./base/12374/1

Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-01 Thread Magnus Hagander
On Fri, Mar 23, 2018 at 3:45 PM, Julian Markwort <
julian.markw...@uni-muenster.de> wrote:

> On Sat, 2018-03-17 at 18:24 +0100, Magnus Hagander wrote:
>
> The error message "certificate authentication failed for user XYZ:
> client certificate contains no user name" is the result of calling
> CheckCertAuth when the user presented a certificate without a CN in it.
>
>
> That is arguably wrong, since it's actually password authentication that
> fails. That is the authentication type that was picked in pg_hba.conf. It's
> more "certificate validation" that failed.
>
>
> I think we got confused about this; maybe I didn't grasp it fully before:
> CheckCertAuth is currently only called when auth method cert is used. So it
> actually makes sense to say that certificate authentication failed, I think.
>
> I agree that the log message is useful. Though it could be good to clearly
> indicate that it was caused specifically because of the verify-full, to
> differentiate it from other cases of the same message.
>
> I've modified my patch so it still uses CheckCertAuth, but now a different
> message is written to the log when clientcert=verify-full was used.
> For auth method cert, the function should behave as before.
>
> For example, what about the scenario where I use GSSAPI authentication and
> clientcert=verify-full. Then we suddenly have three usernames (gssapi,
> certificate and specified) -- how is the user supposed to know which one
> came from the cert and which one came from gssapi for example?
>
> The user will only see what's printed in the auth_failed() function in
> auth.c with the addition of the logdetail string, which I don't touch with
> this patch.
> As you said, it makes sense that more detailed information is only
> available in the server's log.
>
> I've attached an updated version of the patch.
>

I assume this is a patch that's intended to be applied on top of the
previous patch? If so, please submit the complete patch to make sure the
correct combination ends up actually being reviewed.



> I'm not sure if it is preferred to keep patches as short as possible
> (mostly with respect to the changed lines in the documentation) or to
> organize changes so that the text matches the surrounding column width and
> text flow? Also, I've omitted mentions of the current usage 'clientcert=1'
> - this is still supported in code, but I think telling new users only about
> 'clientcert=verify-ca' and 'clientcert=verify-full' is clearer. Or am I
> wrong on this one?
>
>
I have not had time to look at the updated verison of the patch yet, but I
wanted to get a response in for your last question here.

Keeping patches as short as possible is not a good thing itself. The
important part is that the resulting code and documentation is the best
possible. Sometimes you might want to turn it into two patches submitted at
the same time if one is clearly just reorganisation, but avoiding code
restructure just to keep the lines of patch down is not helpful.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-01 Thread Julian Markwort
On 1. of April 2018 17:46:38 MESZ wrote Magnus Hagander :

>I assume this is a patch that's intended to be applied on top of the
>previous patch? If so, please submit the complete pach to make sure the
>correct combination ends up actually being reviewed.

The v02.patch attached to my last mail contains both source and documentation 
changes.

>Keeping patches as short as possible is not a good thing itself. The
>important part is that the resulting code and documentation is the best
>possible. Sometimes you might want to turn it into two patches
>submitted at
>the same time if one is clearly just reorganisation, but avoiding code
>restructure just to keep the lines of patch down is not helpful.

I think I've made the right compromises regarding readability of the 
documentation in my patch then.

A happy Easter, passover, or Sunday to you
Julian



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-01 Thread Magnus Hagander
On Sun, Apr 1, 2018 at 6:01 PM, Julian Markwort <
julian.markw...@uni-muenster.de> wrote:

> On 1. of April 2018 17:46:38 MESZ wrote Magnus Hagander <
> mag...@hagander.net>:
>
> >I assume this is a patch that's intended to be applied on top of the
> >previous patch? If so, please submit the complete patch to make sure the
> >correct combination ends up actually being reviewed.
>
> The v02.patch attached to my last mail contains both source and
> documentation changes.
>

Hmm. I think I may have been looking at the wrong file. Sorry!


>Keeping patches as short as possible is not a good thing itself. The
> >important part is that the resulting code and documentation is the best
> >possible. Sometimes you might want to turn it into two patches
> >submitted at
> >the same time if one is clearly just reorganisation, but avoiding code
> >restructure just to keep the lines of patch down is not helpful.
>
> I think I've made the right compromises regarding readability of the
> documentation in my patch then.
>
> A happy Easter, passover, or Sunday to you
>

You, too!

(I shall return to reviewing it after the holidays are over)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-04-01 Thread Tomas Vondra
Hi,

The attached patch version modifies how the non-MCV selectivity is
computed, along the lines explained in the previous message.

The comments in statext_clauselist_selectivity() explain it in far more
detail, but in short we do this:

1) Compute selectivity using the MCV (s1).

2) To compute the non-MCV selectivity (s2) we do this:

2a) See how many top-level equalities are there (and compute ndistinct
estimate for those attributes).

2b) If there is an equality on each column, we know there can only be a
single matching item. If we found it in the MCV (i.e. s1 > 0) we're
done, and 's1' is the answer.

2c) If only some columns have equalities, we estimate the selectivity
for equalities as

s2 = ((1 - mcv_total_sel) / ndistinct)

If there are no remaining conditions, we're done.

2d) To estimate the non-equality clauses (on the non-MCV part only), we
either repeat the whole process by calling clauselist_selectivity() or
approximate s1 to the non-MCV part. This needs a bit of care to
prevent infinite loops.
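
To put some (invented) numbers on 2c, and assuming the final estimate is
simply the sum of the MCV and non-MCV parts as the step names suggest: if
the MCV list covers mcv_total_sel = 0.4 of the table, the ndistinct estimate
for the columns with equalities is 50, and the MCV lookup in step 1 gave
s1 = 0.02, then

    s2 = (1 - 0.4) / 50 = 0.012

and the combined estimate would be s1 + s2 = 0.032.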


Of course, with 0002 this changes slightly, because we may try using a
histogram to estimate the non-MCV part. But that's just an extra step
right before (2a).

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0001-multivariate-MCV-lists-20180401.patch.gz
Description: application/gzip


0002-multivariate-histograms-20180401.patch.gz
Description: application/gzip


Re: Planning counters in pg_stat_statements

2018-04-01 Thread legrand legrand
Hello,

When testing this patch on my WIN1252 database with my Java front end, using
an 11devel snapshot, I get

  org.postgresql.util.PSQLException: ERROR: character with byte sequence 0x90
  in encoding "WIN1252" has no equivalent in encoding "UTF8"

When using psql with client_encoding = WIN1252, query texts are truncated:

postgres=# select pg_stat_statements_reset();
 pg_stat_statements_reset
---------------------------

(1 row)


postgres=# show client_encoding;
 client_encoding
-----------------
 WIN1252
(1 row)


postgres=# select substr(query,1,20) from pg_stat_statements;
       substr
--------------------
 tatements_reset();
 ding;
(2 rows)

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Diagonal storage model

2018-04-01 Thread legrand legrand
Great idea!
Thank you, Konstantin



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Diagonal storage model

2018-04-01 Thread Alexander Korotkov
Hi!

On Sun, Apr 1, 2018 at 3:48 PM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

> I want to announce new model, "diagonal storage" which combines benefits
> of both approaches.
> The idea is very simple: we first store column 1 of first record, then
> column 2 of second record, ... and so on until we reach the last column.
> After it we store second column of first record, third column of the
> second record,...
>

Sounds interesting.  Could "diagonal storage" be applied twice?  That is,
could we apply the diagonal transformation to the result of another diagonal
transformation?  I expect we should get a "square diagonal" transformation...

Attached please find a patch with the first prototype implementation. It provides
> about 3.14 times improvement of performance on most of the TPC-H queries.


Great, but with the square diagonal transformation we should get a 3.14^2
times improvement, which is even better!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Optimize Arm64 crc32c implementation in Postgresql

2018-04-01 Thread Andres Freund
Hi,

On 2018-03-06 02:44:35 +0800, Heikki Linnakangas wrote:
> On 02/03/18 06:42, Andres Freund wrote:
> > On 2018-03-02 11:37:52 +1300, Thomas Munro wrote:
> > > So... that stuff probably needs either a configure check for the
> > > getauxval function and/or those headers, or an OS check?
> > 
> > It'd probably be better to not rely on OS-specific headers, and instead
> > directly access the capabilities.
> 
> Anyone got an idea on how to do that? I googled around a bit, but couldn't
> find any examples.

Similar...


> * Use compiler intrinsics instead of inline assembly.

+many


> * I tested this on Linux, with gcc and clang, on an ARM64 virtual machine
> that I had available (not an emulator, but a VM on a shared ARM64 server).

Have you seen actual postgres performance benefits with the patch?

- Andres



Rethinking -L switch handling and construction of LDFLAGS

2018-04-01 Thread Tom Lane
I noticed that if I build with --with-libxml on my Mac platforms,
"make installcheck" stops working for certain contrib modules such
as postgres_fdw.  I finally got around to diagnosing the reason why,
and it goes like this:

1. --with-libxml causes configure to include

-L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
in the LDFLAGS value put into Makefile.global.  That's because
"xml2-config --libs" emits that, and we do need it if we want to link
to the platform-supplied libxml2.

2. However, that directory also contains a symlink to the
platform-supplied libpq.

3. When we go to build postgres_fdw.so, the link command line looks like

ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv 
-Wno-unused-command-line-argument -g -O2  -bundle -multiply_defined suppress -o 
postgres_fdw.so postgres_fdw.o option.o deparse.o connection.o shippable.o  
-L../../src/port -L../../src/common 
-L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
  -L/usr/local/ssl/lib -Wl,-dead_strip_dylibs   -L../../src/interfaces/libpq 
-lpq -bundle_loader ../../src/backend/postgres

The details of this might vary depending on your configure options,
but the key point is that the -L/Applications/... switch is before the
-L../../src/interfaces/libpq one.  This means that the linker resolves
"-lpq" to the platform-supplied libpq, not the one in the build tree.
We can confirm that with

$ otool -L postgres_fdw.so 
postgres_fdw.so:
/usr/lib/libpq.5.dylib (compatibility version 5.0.0, current version 
5.6.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
version 1252.50.4)

So, quite aside from any problems stemming from using a 9.3-vintage libpq
with HEAD client code, we are stuck with a libpq that uses Apple's idea of
the default socket location, rather than what the rest of our build uses.
That explains the failures seen in "make installcheck", which look like

2018-04-01 13:09:48.744 EDT [10758] ERROR:  could not connect to server 
"loopback"
2018-04-01 13:09:48.744 EDT [10758] DETAIL:  could not connect to server: No 
such file or directory
Is the server running locally and accepting
connections on Unix domain socket 
"/var/pgsql_socket/.s.PGSQL.5432"?

Of course, /var/pgsql_socket is *not* where my postmaster is putting
its socket.

In short, we need to deal more honestly with the positioning of -L
switches in link commands.  Somebody's idea that we could embed
both -L and -l into $(libpq), and then pay basically no attention to
where that ends up in the final link command, is just too simplistic.

I think that we want to establish an ironclad rule that -L switches
referencing directories in our own build tree must appear before -L
switches referencing external libraries.

I don't have a concrete patch to propose yet, but the design idea
I have in mind is to split LDFLAGS into two or more parts, so that
-L switches for the build tree are supposed to be put in the first
part and external -L switches in the second.  It'd be sufficient
to have Makefile.global do something like

ifdef PGXS
  LDFLAGS_INTERNAL = -L$(libdir)
else
  LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
endif
LDFLAGS = $(LDFLAGS_INTERNAL) @LDFLAGS@

and then teach relevant places that they need to add $(libpq) to
LDFLAGS_INTERNAL not LDFLAGS.  (Perhaps "BUILD" would be a better keyword
than "INTERNAL" here?)  Not sure how that would play exactly with
Makefile.shlib's SHLIB_LINK, but maybe we need SHLIB_LINK_INTERNAL along
with SHLIB_LINK.  I'd also like to try to clean up the mess that is
$(libpq_pgport), though I'm not sure just how yet.
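
To illustrate with a module that links libpq, the per-module change might
end up looking something like this (just a sketch of the proposal, not a
worked-out patch):

  # e.g. contrib/postgres_fdw/Makefile: keep the build-tree library
  # reference in the _INTERNAL variant so it always precedes external
  # -L switches coming from @LDFLAGS@
  SHLIB_LINK_INTERNAL = $(libpq)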

Or we could try to create a full separation between -L and -l switches,
ending up with three or more parts for LDFLAGS not just two.  But I'm
not sure if that gains anything.

I have no idea whether the MSVC build infrastructure has comparable
problems, and would not be willing to fix it myself if it does.
But I am willing to try to fix this in the gmake infrastructure.

Comments, better ideas?

regards, tom lane



Re: Rethinking -L switch handling and construction of LDFLAGS

2018-04-01 Thread Andres Freund
Hi,

On 2018-04-01 13:38:15 -0400, Tom Lane wrote:
> In short, we need to deal more honestly with the positioning of -L
> switches in link commands.  Somebody's idea that we could embed
> both -L and -l into $(libpq), and then pay basically no attention to
> where that ends up in the final link command, is just too simplistic.

Sounds right.


> I don't have a concrete patch to propose yet, but the design idea
> I have in mind is to split LDFLAGS into two or more parts, so that
> -L switches for the build tree are supposed to be put in the first
> part and external -L switches in the second.  It'd be sufficient
> to have Makefile.global do something like
> 
> ifdef PGXS
>   LDFLAGS_INTERNAL = -L$(libdir)
> else
>   LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
> endif
> LDFLAGS = $(LDFLAGS_INTERNAL) @LDFLAGS@

I'm not sure I like doing this in Makefile.global. We've various files
that extend LDFLAGS in other places, and that's going to be hard if it's
already mushed together again. We end up re-building it from parts in
those files too.

Why don't we change the link commands to reference LDFLAGS_INTERNAL
explicitly?  That seems like it'd be cleaner.

Greetings,

Andres Freund



Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)

2018-04-01 Thread Yura Sokolov
17.03.2018 03:36, Tomas Vondra writes:
> 
> On 03/17/2018 12:03 AM, Yura Sokolov wrote:
>> 16.03.2018 04:23, Tomas Vondra writes:
>>>
>>> ...
>>>
>>> OK, a few more comments.
>>>
>>> 1) The code in ExtendXipSizeForHash seems somewhat redundant with
>>> my_log2 (that is, we could just call the existing function).
>>
>> Yes, I could call my_log2 from ExtendXipSizeForHash. But wouldn't one
>> more call be more expensive than loop itself?
>>
> 
> I very much doubt it there would be a measurable difference. Firstly,
> function calls are not that expensive, otherwise we'd be running around
> and removing function calls from the hot paths. Secondly, the call only
> happens with many XIDs, and in that case the cost should be out-weighted
> by faster lookups. And finally, the function does stuff that seems far
> more expensive than a single function call (for example allocation,
> which may easily trigger a malloc).
> 
> In fact, in the interesting cases it's pretty much guaranteed to hit a
> malloc, because the number of backend processes needs to be high enough
> (say, 256 or more), which means
> 
> GetMaxSnapshotSubxidCount()
> 
> will translate to something like
> 
> ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
> = (64 + 1) * 256
> = 16640
> 
> and because XIDs are 4B each, that's ~65kB of memory (even ignoring the
> ExtendXipSizeForHash business). And aset.c treats chunks larger than 8kB
> as separate blocks, that are always malloc-ed independently.
> 
> But I may be missing something, and the extra call actually makes a
> difference. But I think the onus of proving that is on you, and the
> default should be not to duplicate code.
> 
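(For reference, the loop in question is tiny.  A rough sketch of a
my_log2-style helper, only an illustration of its shape rather than the
actual dynahash.c source, looks like this:

static int
my_log2_sketch(long num)
{
	int			i = 0;
	long		limit = 1;

	/* smallest power-of-two exponent i such that 2^i >= num */
	while (limit < num)
	{
		i++;
		limit <<= 1;
	}
	return i;
}

So the cost being weighed is one extra function call versus inlining these
few instructions.)
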
>>> 2) Apparently xhlog/subxhlog fields are used for two purposes - to
>>> store log2 of the two XID counts, and to remember if the hash table
>>> is built. That's a bit confusing (at least it should be explained
>>> in a comment) but most importantly it seems a bit unnecessary.
>>
>> Ok, I'll add comment.
>>
>>> I assume it was done to save space, but I very much doubt that
>>> makes any difference. So we could easily keep a separate flag. I'm
>>> pretty sure there are holes in the SnapshotData struct, so we could
>>> squeeze it the flags in one of those.
>>
>> There's hole just right after xhlog. But it will be a bit strange to 
>> move fields around.
>>
> 
> Is SnapshotData really that sensitive to size increase? I have my doubts
> about that, TBH. The default stance should be to make the code easy to
> understand. That is, we should only move fields around if it actually
> makes a difference.
> 
>>> But do we even need a separate flag? We could just as easily keep 
>>> the log2 fields set to 0 and only set them after actually building 
>>> the hash table.
>>
>> There is a need to signal that there is space for hash. It is not
>> always allocated. iirc, I didn't cover the case where the snapshot was
>> restored from file, and some other place or two.
>> Only if all places where the snapshot is allocated are properly modified
>> to allocate space for hash, then the flag could be omitted, and log2
>> itself used as a flag.
>>
> 
> Hmmm, that makes it a bit inconsistent, I guess ... why not to do the
> same thing on all those places?
> 
>>> Or even better, why not to store the mask so that XidInXip does not
>>> need to compute it over and over (granted, that's uint32 instead
>>> of just uint8, but I don't think SnapshotData is particularly
>>> sensitive to this, especially considering how much larger the xid
>>> hash table is).
>>
>> I don't like unnecessary sizeof struct increase. And I doubt that 
>> computation matters. I could be mistaken though, because it is hot
>> place. Do you think it will significantly improve readability?
>>
> 
> IMHO the primary goal is to make the code easy to read and understand,
> and only optimize struct size if it actually makes a difference. We have
> no such proof here, and I very much doubt you'll be able to show any
> difference because even with separate flags pahole says this:
> 
> struct SnapshotData {
> SnapshotSatisfiesFunc  satisfies;/* 0 8 */
> TransactionId  xmin; /* 8 4 */
> TransactionId  xmax; /*12 4 */
> TransactionId *xip;  /*16 8 */
> uint32 xcnt; /*24 4 */
> uint8  xhlog;/*28 1 */
> 
> /* XXX 3 bytes hole, try to pack */
> 
> TransactionId *subxip;   /*32 8 */
> int32  subxcnt;  /*40 4 */
> uint8  subxhlog; /*44 1 */
> bool   suboverflowed;/*45 1 */
> bool   takenDuringRecovery;  /*46 1 */
> bool   copied;   /*47 1 */
> CommandId   

Re: Rethinking -L switch handling and construction of LDFLAGS

2018-04-01 Thread Tom Lane
Andres Freund  writes:
> On 2018-04-01 13:38:15 -0400, Tom Lane wrote:
>> I don't have a concrete patch to propose yet, but the design idea
>> I have in mind is to split LDFLAGS into two or more parts, so that
>> -L switches for the build tree are supposed to be put in the first
>> part and external -L switches in the second.

> I'm not sure I like doing this in Makefile.global. We've various files
> that extend LDFLAGS in other places, and that's going to be hard if it's
> already mushed together again. We end up re-building it from parts in
> those files too.

Yeah, one of the things I'd like to fix is that some of the makefiles,
eg psql's, do

override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)

which goes *directly* against this commandment in Makefile.global:

# We want -L for libpgport.a and libpgcommon.a to be first in LDFLAGS.  We
# also need LDFLAGS to be a "recursively expanded" variable, else adjustments
# to rpathdir don't work right.  So we must NOT do LDFLAGS := something,
# meaning this has to be done first and elsewhere we must only do LDFLAGS +=
# something.

It's a bit surprising that rpath works at all for these makefiles.
But with what I'm imagining here, I think we could replace that with

LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)

and thereby preserve the recursively-expanded virginity of both
LDFLAGS_INTERNAL and LDFLAGS.  But I've not tried to test anything yet.

> Why don't we change the link commands to reference LDFLAGS_INTERNAL
> explicitly?  That seems like it'd be cleaner.

I'm hesitant to do that because LDFLAGS is a name known to make's
default rules, and I don't want to bet that we're not relying on
those default rules anywhere.  I also disagree with the idea that using
"$(LDFLAGS_INTERNAL) $(LDFLAGS)" in every link command we have is better
or less error-prone than just "$(LDFLAGS)".  Especially not if we end up
with more than two parts.

regards, tom lane



Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)

2018-04-01 Thread Yura Sokolov
23.03.2018 17:59, Amit Kapila writes:
> On Sat, Mar 10, 2018 at 7:41 AM, Yura Sokolov  wrote:
>> 08.03.2018 03:42, Tomas Vondra writes:
>>> One reason against building the hash table in GetSnapshotData is that
>>> we'd build it even when the snapshot is never queried. Or when it is
>>> queried, but we only need to check xmin/xmax.
>>
>> Thank you for analyze, Tomas.
>>
>> Stephen is right about bug in snapmgr.c
>> Attached version fixes bug, and also simplifies XidInXip a bit.
>>
> 
> @@ -2167,8 +2175,7 @@ RestoreSnapshot(char *start_address)
>   /* Copy SubXIDs, if present. */
>   if (serialized_snapshot.subxcnt > 0)
>   {
> - snapshot->subxip = ((TransactionId *) (snapshot + 1)) +
> - serialized_snapshot.xcnt;
> + snapshot->subxip = ((TransactionId *) (snapshot + 1)) + xcnt;
>   memcpy(snapshot->subxip, serialized_xids + serialized_snapshot.xcnt,
> serialized_snapshot.subxcnt * sizeof(TransactionId));
>   }
> 
> 
> It is not clear why you want to change this in RestoreSnapshot when
> nothing related is changed in SerializeSnapshot?  Can you please add
> some comments to clarify it?
> 

I didn't change the serialized format, therefore there is no need to change
SerializeSnapshot.
But the in-memory representation was changed, so RestoreSnapshot is changed.

With regards,
Sokolov Yura.





Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-01 Thread Thomas Munro
On Fri, Mar 30, 2018 at 10:18 AM, Thomas Munro
 wrote:
> ... on Linux only.

Apparently I was too optimistic.  I had looked only at FreeBSD, which
keeps the page around and dirties it so we can retry, but the other
BSDs apparently don't (FreeBSD changed that in 1999).  From what I can
tell from the sources below, we have:

Linux, OpenBSD, NetBSD: retrying fsync() after EIO lies
FreeBSD, Illumos: retrying fsync() after EIO tells the truth

Maybe my drive-by assessment of those kernel routines is wrong and
someone will correct me, but I'm starting to think you might be better
to assume the worst on all systems.  Perhaps a GUC that defaults to
panicking, so that users on those rare OSes could turn that off?  Even
then I'm not sure if the failure mode will be that great anyway or if
it's worth having two behaviours.  Thoughts?

http://mail-index.netbsd.org/netbsd-users/2018/03/30/msg020576.html
https://github.com/NetBSD/src/blob/trunk/sys/kern/vfs_bio.c#L1059
https://github.com/openbsd/src/blob/master/sys/kern/vfs_bio.c#L867
https://github.com/freebsd/freebsd/blob/master/sys/kern/vfs_bio.c#L2631
https://github.com/freebsd/freebsd/commit/e4e8fec98ae986357cdc208b04557dba55a59266
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/bio.c#L441
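
To make the "assume the worst" option concrete, here is a minimal
standalone sketch (not a proposed patch, and ignoring the GUC question) of
treating any fsync() failure on a data file as fatal rather than
retryable:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch only: if fsync() reports a failure, give up instead of retrying,
 * because on several kernels a later retry can report success even though
 * the dirty data was already thrown away.  A real patch would use
 * ereport(PANIC, ...) here.
 */
static void
fsync_or_die(int fd, const char *path)
{
	if (fsync(fd) != 0)
	{
		fprintf(stderr, "PANIC: could not fsync file \"%s\": %s\n",
				path, strerror(errno));
		abort();
	}
}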

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Planning counters in pg_stat_statements

2018-04-01 Thread legrand legrand
I forgot to recompile core ...
now only utility statements (with 0 plans) seem truncated.

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: bulk typos

2018-04-01 Thread Tom Lane
Félix GERZAGUET  writes:
> On Sat, Mar 31, 2018 at 12:56 PM, Justin Pryzby 
> wrote:
>> I needed another distraction so bulk-checked for typos, limited to
>> comments in *.[ch].

> I think you introduced another one while changing "explcitly" to
> "expilcitly" instead of "explicitly" :-)

LGTM for the most part, except for this change:

- * Therefore, we do not whinge about no-such-process.
+ * Therefore, we do not whine about no-such-process.

I think that spelling is intentional, so I didn't change it.
Pushed the rest, with Felix's correction.

regards, tom lane



Re: bulk typos

2018-04-01 Thread Andres Freund
Hi,

On 2018-03-31 05:56:40 -0500, Justin Pryzby wrote:
> --- a/src/backend/jit/llvm/llvmjit_expr.c
> +++ b/src/backend/jit/llvm/llvmjit_expr.c
> @@ -1768,7 +1768,7 @@ llvm_compile_expr(ExprState *state)
>   
> b_compare_result,
>   b_null);
>  
> - /* build block analying the !NULL 
> comparator result */
> + /* build block analyzing the !NULL 
> comparator result */
>   LLVMPositionBuilderAtEnd(b, 
> b_compare_result);

Hah. I kinda like the previous way too ;)

Greetings,

Andres Freund




Re: Rethinking -L switch handling and construction of LDFLAGS

2018-04-01 Thread Andres Freund
Hi,

On 2018-04-01 13:55:05 -0400, Tom Lane wrote:
> > Why don't we change the link commands to reference LDFLAGS_INTERNAL
> > explicitly?  That seems like it'd be cleaner.
> 
> I'm hesitant to do that because LDFLAGS is a name known to make's
> default rules, and I don't want to bet that we're not relying on
> those default rules anywhere.

FWIW, postgres builds cleanly with -r -R in MAKEFLAGS.

Greetings,

Andres Freund



Re: Rethinking -L switch handling and construction of LDFLAGS

2018-04-01 Thread Tom Lane
Andres Freund  writes:
> On 2018-04-01 13:55:05 -0400, Tom Lane wrote:
>> I'm hesitant to do that because LDFLAGS is a name known to make's
>> default rules, and I don't want to bet that we're not relying on
>> those default rules anywhere.

> FWIW, postgres builds cleanly with -r -R in MAKEFLAGS.

That's pretty hard to believe.  Why would we bother to override every
default rule?  Even if it's true today, I would not accept it as project
policy that we must do so.  Perhaps more to the point, I would strongly
object to any design in which the standard Make variables don't mean
what the default rules expect them to mean.  That's just a recipe for
confusing people and creating hard-to-spot bugs.

regards, tom lane



Re: new function for tsquery creartion

2018-04-01 Thread Dmitry Ivanov

Hi Aleksandr,

I agree with Aleksander about silencing all errors in 
websearch_to_tsquery().


In the attachment is a revised patch with the attempt to introduce an
ability to ignore syntax errors in gettoken_tsvector().


Thanks for the further improvements! Yes, you're both right, the API has 
to be consistent. Unfortunately, I had to make some adjustments 
according to Oleg Bartunov's review. Here's a change log:


1. &, | and (), <-> are no longer considered operators in web search 
mode.
2. I've stumbled upon a bug: web search used to transform "pg_class" 
into 'pg <-> class', which is no longer the case.
3. I changed the behavior of gettoken_tsvector() as soon as I had heard 
from Aleksander Alekseev, so I decided to use my implementation in this 
revision of the patch. This is a good subject for discussion, though. 
Feel free to share your opinion.

4. As suggested by Theodor, I've replaced some bool args with bit flags.


The name of enum ts_parsestate looks more like a name of the function
than a name of a type.
In my version, it is renamed to QueryParserState, but you can fix it if 
I'm wrong.


True, but gettoken_query() returns ts_tokentype, so I decided to use 
this naming scheme.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index ea5947a3a8..6055fb6b4e 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -490,7 +490,7 @@ to_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
-		  false);
+		  0);
 
 	PG_RETURN_TSQUERY(query);
 }
@@ -520,7 +520,7 @@ plainto_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
-		  true);
+		  P_TSQ_PLAIN);
 
 	PG_RETURN_POINTER(query);
 }
@@ -551,7 +551,7 @@ phraseto_tsquery_byid(PG_FUNCTION_ARGS)
 	query = parse_tsquery(text_to_cstring(in),
 		  pushval_morph,
 		  PointerGetDatum(&data),
-		  true);
+		  P_TSQ_PLAIN);
 
 	PG_RETURN_TSQUERY(query);
 }
@@ -567,3 +567,35 @@ phraseto_tsquery(PG_FUNCTION_ARGS)
 		ObjectIdGetDatum(cfgId),
 		PointerGetDatum(in)));
 }
+
+Datum
+websearch_to_tsquery_byid(PG_FUNCTION_ARGS)
+{
+	text	   *in = PG_GETARG_TEXT_PP(1);
+	MorphOpaque	data;
+	TSQuery		query = NULL;
+
+	data.cfg_id = PG_GETARG_OID(0);
+
+	data.qoperator = OP_AND;
+
+	query = parse_tsquery(text_to_cstring(in),
+		  pushval_morph,
+		  PointerGetDatum(&data),
+		  P_TSQ_WEB);
+
+	PG_RETURN_TSQUERY(query);
+}
+
+Datum
+websearch_to_tsquery(PG_FUNCTION_ARGS)
+{
+	text	   *in = PG_GETARG_TEXT_PP(0);
+	Oid			cfgId;
+
+	cfgId = getTSCurrentConfig(true);
+	PG_RETURN_DATUM(DirectFunctionCall2(websearch_to_tsquery_byid,
+		ObjectIdGetDatum(cfgId),
+		PointerGetDatum(in)));
+
+}
diff --git a/src/backend/utils/adt/tsquery.c b/src/backend/utils/adt/tsquery.c
index 1ccbf79030..695bdb89e9 100644
--- a/src/backend/utils/adt/tsquery.c
+++ b/src/backend/utils/adt/tsquery.c
@@ -32,14 +32,27 @@ const int	tsearch_op_priority[OP_COUNT] =
 	3			/* OP_PHRASE */
 };
 
+/*
+ * parser's states
+ */
+typedef enum
+{
+	WAITOPERAND = 1,
+	WAITOPERATOR = 2,
+	WAITFIRSTOPERAND = 3,
+	WAITSINGLEOPERAND = 4
+} ts_parserstate;
+
 struct TSQueryParserStateData
 {
 	/* State for gettoken_query */
 	char	   *buffer;			/* entire string we are scanning */
 	char	   *buf;			/* current scan point */
-	int			state;
 	int			count;			/* nesting count, incremented by (,
  * decremented by ) */
+	bool		in_quotes;		/* phrase in quotes "" */
+	bool		is_web;			/* is it a web search? */
+	ts_parserstate state;
 
 	/* polish (prefix) notation in list, filled in by push* functions */
 	List	   *polstr;
@@ -57,12 +70,6 @@ struct TSQueryParserStateData
 	TSVectorParseState valstate;
 };
 
-/* parser's states */
-#define WAITOPERAND 1
-#define WAITOPERATOR	2
-#define WAITFIRSTOPERAND 3
-#define WAITSINGLEOPERAND 4
-
 /*
  * subroutine to parse the modifiers (weight and prefix flag currently)
  * part, like ':AB*' of a query.
@@ -197,6 +204,26 @@ err:
 	return buf;
 }
 
+/*
+ * Parse OR operator used in websearch_to_tsquery().
+ */
+static bool
+parse_or_operator(TSQueryParserState state)
+{
+	char *buf = state->buf;
+
+	if (state->in_quotes)
+		return false;
+
+	return (t_iseq(&buf[0], 'o') || t_iseq(&buf[0], 'O')) &&
+		   (t_iseq(&buf[1], 'r') || t_iseq(&buf[1], 'R')) &&
+		   (buf[2] != '\0' &&
+!t_iseq(&buf[2], '-') &&
+!t_iseq(&buf[2], '_') &&
+!t_isalpha(&buf[2]) &&
+!t_isdigit(&buf[2]));
+}
+
 /*
  * token types for parsing
  */
@@ -219,10 +246,12 @@ typedef enum
  *
  */
 static ts_tokentype
-gettoken_query(TSQueryParserState state,
-			   int8 *operator,
-			   int *lenval, char **strval, int16 *weight, bool *prefix)
+gettoken_query(TSQueryParserState state, int8 *operator,
+			   int *len

Re: [PATCH] Logical decoding of TRUNCATE

2018-04-01 Thread Andres Freund
On 2018-01-25 14:21:15 +0100, Marco Nenciarini wrote:
> + if (SessionReplicationRole != SESSION_REPLICATION_ROLE_REPLICA)
> + {
> + 
> + /*
> +  * Check foreign key references.  In CASCADE mode, this should 
> be
> +  * unnecessary since we just pulled in all the references; but 
> as a
> +  * cross-check, do it anyway if in an Assert-enabled build.
> +  */
>   #ifdef USE_ASSERT_CHECKING
>   heap_truncate_check_FKs(rels, false);
> + #else
> + if (stmt->behavior == DROP_RESTRICT)
> + heap_truncate_check_FKs(rels, false);
>   #endif
> + }

That *can't* be right.

> + case REORDER_BUFFER_CHANGE_TRUNCATE:
> + appendStringInfoString(ctx->out, " TRUNCATE:");
> + 
> + if (change->data.truncate_msg.restart_seqs
> + || change->data.truncate_msg.cascade)
> + {
> + if (change->data.truncate_msg.restart_seqs)
> + appendStringInfo(ctx->out, " 
> restart_seqs");
> + if (change->data.truncate_msg.cascade)
> + appendStringInfo(ctx->out, " cascade");
> + }
> + else
> + appendStringInfoString(ctx->out, " (no-flags)");
> + break;
>   default:
>   Assert(false);
>   }

I know this has been discussed in the thread already, but it really
strikes me as wrong to basically do some mini DDL replication feature
via per-command WAL records.
> ***
> *** 111,116  CREATE PUBLICATION  class="parameter">name
> --- 111,121 
> and so the default value for this option is
> 'insert, update, delete'.
>
> +  
> +TRUNCATE is treated as a form of
> +DELETE for the purpose of deciding whether
> +to publish, or not.
> +  
>   
>  
> 

Why is this a good idea?


Hm, it seems logicaldecoding.sgml hasn't been updated?

> + void
> + ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
> + DropBehavior behavior, 
> bool restart_seqs)
> + {
> + List   *rels = list_copy(explicit_rels);

Why is this copied?


> +  * Write a WAL record to allow this set of actions to be logically 
> decoded.
> +  * We could optimize this away when !RelationIsLogicallyLogged(rel)
> +  * but that doesn't save much space or time.

What you're saying isn't that you're not logging anything, but that
you're allocating the header regardless? Because this certainly sounds
like you unconditionally log a WAL record.

> +  * Assemble an array of relids, then an array of seqrelids so we can 
> write
> +  * a single WAL record for the whole action.
> +  */
> + logrelids = palloc(maxrelids * sizeof(Oid));
> + foreach (cell, relids_logged)
> + {
> + nrelids++;
> + if (nrelids > maxrelids)
> + {
> + maxrelids *= 2;
> + logrelids = repalloc(logrelids, maxrelids * 
> sizeof(Oid));
> + }
> + logrelids[nrelids - 1] = lfirst_oid(cell);
> + }
> + 
> + foreach (cell, seq_relids_logged)
> + {
> + nseqrelids++;
> + if ((nrelids + nseqrelids) > maxrelids)
> + {
> + maxrelids *= 2;
> + logrelids = repalloc(logrelids, maxrelids * 
> sizeof(Oid));
> + }
> + logrelids[nrelids + nseqrelids - 1] = lfirst_oid(cell);
> + }

I'm confused. Why do we need the resizing here, when we know the max
upfront?
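
I.e. something like the following untested sketch, which sizes the array
once from the two list lengths:

	/* sketch: both list lengths are known, so size the array upfront */
	int			nrelids = list_length(relids_logged);
	int			nseqrelids = list_length(seq_relids_logged);
	Oid		   *logrelids = palloc((nrelids + nseqrelids) * sizeof(Oid));
	int			i = 0;
	ListCell   *cell;

	foreach(cell, relids_logged)
		logrelids[i++] = lfirst_oid(cell);
	foreach(cell, seq_relids_logged)
		logrelids[i++] = lfirst_oid(cell);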

> + /*
> +  * For truncate we list all truncated relids in an array, followed by all
> +  * sequence relids that need to be restarted, if any.
> +  * All rels are always within the same database, so we just list dbid once.
> +  */
> + typedef struct xl_heap_truncate
> + {
> + Oid dbId;
> + uint32  nrelids;
> + uint32  nseqrelids;
> + uint8   flags;
> + Oid relids[FLEXIBLE_ARRAY_MEMBER];
> + } xl_heap_truncate;

Given that the space is used anyway due to padding, I'd just make flags
32bit.

Greetings,

Andres Freund



Re: Optimizing nested ConvertRowtypeExpr execution

2018-04-01 Thread Andres Freund
Hi,

On 2018-02-26 17:20:05 +0530, Ashutosh Bapat wrote:
> In a multi-level partitioned table, a parent whole-row reference gets
> translated into nested ConvertRowtypeExpr with child whole-row
> reference as the leaf. During the execution, the child whole-row
> reference gets translated into all all intermediate parents' whole-row
> references, ultimately represented as parent's whole-row reference.
> AFAIU, the intermediate translations are unnecessary. The leaf child
> whole-row can be directly translated into top parent's whole-row
> reference. Here's a WIP patch which does that by eliminating
> intermediate ConvertRowtypeExprs during ExecInitExprRec().

Why is this done appropriately at ExecInitExpr() time, rather than at
plan time? Seems like eval_const_expressions() would be a bit more
appropriate (being badly named aside...)?

- Andres



Re: bulk typos

2018-04-01 Thread Gavin Flower

On 02/04/18 07:03, Tom Lane wrote:

> Félix GERZAGUET  writes:
>> On Sat, Mar 31, 2018 at 12:56 PM, Justin Pryzby 
>> wrote:
>>> I needed another distraction so bulk-checked for typos, limited to
>>> comments in *.[ch].
>
>> I think you introduced another one while changing "explcitly" to
>> "expilcitly" instead of "explicitly" :-)
>
> LGTM for the most part, except for this change:
>
> - * Therefore, we do not whinge about no-such-process.
> + * Therefore, we do not whine about no-such-process.
>
> I think that spelling is intentional, so I didn't change it.
> Pushed the rest, with Felix's correction.
>
> regards, tom lane


Me thinks some people are Whinging far too much!

see: https://en.oxforddictionaries.com/definition/whinge


Cheers,
Gavin




Re: [HACKERS] Insert values() per-statement overhead

2018-04-01 Thread Vladimir Sitnikov
Andres>I think the biggest overhead here is that the executor startup includes
Andres>too many indirect (linked lists) datastructures, that are allocated each
Andres>round

The case is very common: batch inserts are popular in Java, and ORMs use
batch API automatically.
However, there's high per-backend-message overhead, and that overhead is
very noticeable.

What is the approach to handle this?

Folding multiple DML statements into one with the help of a CTE does not work
either (see https://github.com/pgjdbc/pgjdbc/issues/1165 ):

CTE doc>Trying to update the same row twice in a single statement is not
supported. Only one of the modifications takes place, but it is not easy
(and sometimes not possible) to reliably predict which one

Vladimir


Re: Diagonal storage model

2018-04-01 Thread David Fetter
On Sun, Apr 01, 2018 at 03:48:07PM +0300, Konstantin Knizhnik wrote:
> Hi hackers,
> 
> Vertical (columnar) storage mode is most optimal for analytics, which is why 
> it is widely used in databases oriented to OLAP, such as Vertica, 
> HyPer, KDB, ...
> In Postgres we have the cstore extension, which is not able to provide all 
> the benefits of the vertical model because of the lack of support for vector 
> operations in the executor.
> Situation can be changed if we will have pluggable storage API with support 
> of vectorized execution.
> 
> But the vertical model is not so good for updates and loading of data (because data 
> is mostly imported in horizontal format).
> This is why in most of the existing systems data is present in both formats (at 
> least for some time).
> 
> I want to announce new model, "diagonal storage" which combines benefits of 
> both approaches.
> The idea is very simple: we first store column 1 of first record, then column 
> 2 of second record, ... and so on until we reach the last column.
> After it we store second column of first record, third column of the second 
> record,...
> 
> Profiling of TPC-H queries shows that most of the query execution time 
> (about 17%) is spent in heap_deform_tuple.
> The new format will allow us to significantly reduce the time spent on heap 
> deforming, because there is just one column of the particular record in each tile.
> Moreover, we can perform deforming of many tuples in parallel, which is 
> especially efficient on quantum computers.
> 
> Attached please find a patch with a first prototype implementation. It provides 
> about a 3.14 times improvement of performance on most of the TPC-H queries.

You're sure it's not 3.14159265358979323...?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: WIP: Covering + unique indexes.

2018-04-01 Thread Peter Geoghegan
On Sun, Apr 1, 2018 at 10:09 AM, Alexander Korotkov
 wrote:
>> So? GIN doesn't have the same legacy at all. The GIN posting lists
>> *don't* have regular heap TID pointers at all. They started out
>> without them, and still don't have them.
>
>
> Yes, GIN never stored heap TID pointers in t_tid of index tuple.  But GIN
> assumes that heap TID pointer has at most 11 significant bits during
> posting list encoding.

I think that we should avoid assuming things, unless the cost of
representing them is too high, which I don't think applies here. The
more defensive general purpose code can be, the better.

I will admit to being paranoid here. But experience suggests that
paranoia is a good thing, if it isn't too expensive. Look at the
thread on XFS + fsync() for an example of things being wrong for a
very long time without anyone realizing, and despite the best efforts
of many smart people. As far as anyone can tell, PostgreSQL on Linux +
XFS is kinda, sorta broken, and has been forever. XFS was mature
before ext4 was, and is a popular choice, and yet this is the first
we're hearing about it being kind of broken. After many years.

Look at this check that made it into my amcheck patch, that was
committed yesterday:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/amcheck/verify_nbtree.c;h=a15fe21933b9a5b8baefedaa8f38e517d6c91877;hb=7f563c09f8901f6acd72cb8fba7b1bd3cf3aca8e#l745

As it says, nbtree is surprisingly tolerant of corrupt lp_len fields.
You may find it an interesting exercise to use pg_hexedit to corrupt
many lp_len fields in an index page. What's really interesting about
this is that it doesn't appear to break anything at all! We don't get
the length from there in most cases, so reads won't break at all. I
see that we use ItemIdGetLength() in a couple of rare cases (though
even those could be avoided) during a page split. You'd be lucky to
notice a problem if lp_len fields were regularly corrupt. When you
notice, it will probably have already caused big problems.

On a similar note, I've noticed that many of my experimental B-Tree
patches (that I never find time to finish) tend to almost work quite
early on, sometimes without my really understanding why. The whole L&Y
approach of recovering from problems that were detected (detecting
concurrent page splits, and moving right) makes the code *very*
forgiving. I hope that I don't sound trite, but everyone should try to
be modest about what they *don't* know when writing complex system
software with concurrency. It is not a platitude, even though it
probably seems that way. A tiny mistake can have big consequences, so
it's very important that we have a way to easily detect them after the
fact.

> I don't think we should use assertions, because they are typically disabled
> on production PostgreSQL builds.  But we can have some explicit check in
> some common path.  In the attached patch I've added such a check to
> _bt_compare().  Probably, together with amcheck, that would be sufficient.

Good idea -- a "can't happen" check in _bt_compare seems better, which
I see here:

> diff --git a/src/backend/access/nbtree/nbtsearch.c 
> b/src/backend/access/nbtree/nbtsearch.c
> index 51dca64e13..fcf9832147 100644
> --- a/src/backend/access/nbtree/nbtsearch.c
> +++ b/src/backend/access/nbtree/nbtsearch.c
> @@ -443,6 +443,17 @@ _bt_compare(Relation rel,
> if (!P_ISLEAF(opaque) && offnum == P_FIRSTDATAKEY(opaque))
> return 1;
>
> +   /*
> +* Check tuple has correct number of attributes.
> +*/
> +   if (!_bt_check_natts(rel, page, offnum))
> +   {
> +   ereport(ERROR,
> +   (errcode(ERRCODE_INTERNAL_ERROR),
> +errmsg("tuple has wrong number of attributes in index 
> \"%s\"",
> +   RelationGetRelationName(rel))));
> +   }
> +
> itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offnum));

It seems like it might be a good idea to make this accept an
IndexTuple, though, to possibly save some work. Also, perhaps this
should be an unlikely() condition, if only because it makes the intent
clearer (might actually matter in a tight loop like this too, though).
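
Roughly something like this, perhaps (untested, and the IndexTuple-taking
signature for _bt_check_natts is only hypothetical):

	itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offnum));

	/*
	 * Sketch only: reuse the already-fetched tuple and mark the corruption
	 * path as unlikely(), so the fast path stays obvious.
	 */
	if (unlikely(!_bt_check_natts(rel, itup, page, offnum)))
		ereport(ERROR,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("tuple has wrong number of attributes in index \"%s\"",
						RelationGetRelationName(rel))));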

Do you store an attribute number in the "minus infinity" item (the
leftmost one of internal pages)? I guess that that should be zero,
because it's totally truncated.

> OK, thank for the explanation.  I agree that check of offset is redundant
> here.

Cool.

>> The fact is that that information can go out of date almost
>> immediately, whereas high keys usually last forever. The only reason
>> that there is a heap TID in the high key is because we'd have to add
>> special code to remove it; not because it has any real value. I find
>> it very hard to imagine it being used in a forensic situation. If you
>> actually wanted to do this, the key itself is probably enough -- you
>> probably wouldn't need the TID.
>
>
> I don't know,  When I wrote my own implementation of B-tree and debug
> it, I found saving hikeys "as is" to 

Re: Diagonal storage model

2018-04-01 Thread Marko Tiikkaja
On Sun, Apr 1, 2018 at 3:48 PM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

> I want to announce new model, "diagonal storage" which combines benefits
> of both approaches.
> The idea is very simple: we first store column 1 of first record, then
> column 2 of second record, ... and so on until we reach the last column.
> After it we store second column of first record, third column of the
> second record,...
>

I'm a little worried about the fact that even with this model we're still
limited to only two dimensions.  That's bound to cause problems sooner or
later.


.m


Re: [HACKERS] path toward faster partition pruning

2018-04-01 Thread Amit Langote
On 2018/03/30 22:41, David Rowley wrote:
> On 31 March 2018 at 02:00, David Rowley  wrote:
>> On 31 March 2018 at 01:18, David Rowley  wrote:
>>> I've noticed that there are no outfuncs or readfuncs for all the new
>>> Step types you've added.
>>>
>>> Also, the copy func does not properly copy the step_id in the base
>>> node type. This will remain at 0 after a copyObject()
>>
>> Attaching it as it may save you some time from doing it yourself.
>> Please check it though.
> 
> The attached might be slightly easier to apply. The previous version
> was based on top of some other changes I'd been making.

Thanks David.  I have merged this.

Regards,
Amit




Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.

2018-04-01 Thread Jeevan Ladhe
Hi,

I noticed that there were no tests covering this case causing 4dba331cb3
> to not notice this failure in the first place.  I updated your patch to
> add a few tests.  Also, I revised the comment changed by your patch a bit.
>

1. A minor typo:

+-- check that violating rows are correctly reported when attching as the
s/attching/attaching


2. I think following part of the test is already covered:

+-- trying to add a partition for 2 should fail because the default
+-- partition contains a row that would violate its new constraint which
+-- prevents rows containing 2
+create table defpart_attach_test2 partition of defpart_attach_test for
values in (2);
+ERROR:  updated partition constraint for default partition
"defpart_attach_test_d" would be violated by some row
+drop table defpart_attach_test;

IIUC, the test in create_table covers the same scenario as of above:

-- check default partition overlap
INSERT INTO list_parted2 VALUES('X');
CREATE TABLE fail_part PARTITION OF list_parted2 FOR VALUES IN ('W', 'X',
'Y');
ERROR:  updated partition constraint for default partition
"list_parted2_def" would be violated by some row

Regards,
Jeevan Ladhe


Re: Add default role 'pg_access_server_files'

2018-04-01 Thread Michael Paquier
On Sun, Apr 01, 2018 at 09:39:02AM -0400, Stephen Frost wrote:
> Thanks for checking.  Attached is an updated version which also includes
> the changes for adminpack, done in a similar manner to how pgstattuple
> was updated, as discussed.  Regression tests updated and extended a bit,
> doc updates also included.
> 
> If you get a chance to take a look, that'd be great.  I'll do my own
> review of it again also after stepping away for a day or so.

I have spotted some issues mainly in patch 3.

I am not sure what has happened to your editor, but git diff --check is
throwing a dozen of warnings coming from adminpack.c.

c0cbe00 has stolen the OIDs your patch is using for the new roles, so
patch 2 needs a refresh.

@@ -68,6 +77,15 @@ convert_and_check_filename(text *arg, bool logAllowed)
[...]
+   /*
+* Members of the 'pg_read_server_files' role are allowed to access any
+* files on the server as the PG user, so no need to do any further checks
+* here.
+*/
+   if (is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_SERVER_FILES))
+   return filename;
So...  If a user has loaded adminpack v1.0 with Postgres v11, then
convert_and_check_filename would actually be able to read paths out of
the data folder for a user within pg_read_server_files, while with
Postgres v10 only paths within the data folder were allowed.  And
that's actually fine because a superuser check happens before entering
this code path.

 pg_file_rename(PG_FUNCTION_ARGS)
+{
+   text   *file1;
+   text   *file2;
+   text   *file3;
+   boolresult;
+
+   if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
+   PG_RETURN_NULL();
+
+   file1 = PG_GETARG_TEXT_PP(0);
+   file2 = PG_GETARG_TEXT_PP(1);
+
+   if (PG_ARGISNULL(2))
+   file3 = NULL;
+   else
+   file3 = PG_GETARG_TEXT_PP(2);
+
+   requireSuperuser();
Here requireSuperuser() should be called before looking at the
argument values.
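
For instance (sketch only, reusing the pg_file_rename_internal helper from
the patch):

Datum
pg_file_rename(PG_FUNCTION_ARGS)
{
	text	   *file1;
	text	   *file2;
	text	   *file3;
	bool		result;

	/* privilege check first, before looking at any arguments */
	requireSuperuser();

	if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
		PG_RETURN_NULL();

	file1 = PG_GETARG_TEXT_PP(0);
	file2 = PG_GETARG_TEXT_PP(1);
	file3 = PG_ARGISNULL(2) ? NULL : PG_GETARG_TEXT_PP(2);

	result = pg_file_rename_internal(file1, file2, file3);
	PG_RETURN_BOOL(result);
}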

No refactoring for pg_file_unlink and its v1.1?

The argument checks are exactly the same for pg_file_rename and
pg_file_rename_v1_1.  What about just passing fcinfo around to simplify
the patch?

+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_rename(text, text)
+RETURNS bool
+AS 'SELECT pg_catalog.pg_file_rename($1, $2, NULL::pg_catalog.text);'
+LANGUAGE SQL VOLATILE STRICT;
You forgot a REVOKE clause for pg_file_rename(text, text).

In adminpack.c, convert_and_check_filename is always called with false
as the second argument.  Why not drop it and use the version in
genfile.c instead?  As far as I can see, both functions are the same.

pg_read_file and pg_read_file_v2 could be refactored as well with an
internal routine.  Having to support v1 and v2 functions in the backend
code is not elegant.  Would it actually work to keep the v1 function in
adminpack.c and the v2 function in genfile.c even if adminpack 1.0 is
loaded?  This way you keep in core only one function.  What matters is
that the function name matches, right?

+int64 pg_file_write_internal(text *file, text *data, bool replace);
+bool pg_file_rename_internal(text *file1, text *file2, text *file3);
+Datum pg_logdir_ls_internal(FunctionCallInfo fcinfo)
Those three functions should be static.
--
Michael




Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.

2018-04-01 Thread Amit Langote
Thanks Jeevan for reviewing.

On 2018/04/02 13:10, Jeevan Ladhe wrote:
> Hi,
> 
> I noticed that there were no tests covering this case causing 4dba331cb3
>> to not notice this failure in the first place.  I updated your patch to
>> add a few tests.  Also, I revised the comment changed by your patch a bit.
>>
> 
> 1. A minor typo:
> 
> +-- check that violating rows are correctly reported when attching as the
> s/attching/attaching

Oops, fixed.

> 2. I think following part of the test is already covered:
> 
> +-- trying to add a partition for 2 should fail because the default
> +-- partition contains a row that would violate its new constraint which
> +-- prevents rows containing 2
> +create table defpart_attach_test2 partition of defpart_attach_test for
> values in (2);
> +ERROR:  updated partition constraint for default partition
> "defpart_attach_test_d" would be violated by some row
> +drop table defpart_attach_test;
> 
> IIUC, the test in create_table covers the same scenario as of above:
> 
> -- check default partition overlap
> INSERT INTO list_parted2 VALUES('X');
> CREATE TABLE fail_part PARTITION OF list_parted2 FOR VALUES IN ('W', 'X',
> 'Y');
> ERROR:  updated partition constraint for default partition
> "list_parted2_def" would be violated by some row

Sorry, didn't realize that it was already covered in create_table.sql.
Removed this one.

Attached updated patch.  Adding this to the v11 open items list.

Thanks,
Amit
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c8da82217d..9d474ad5e2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14114,11 +14114,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
}
 
/*
-* Check whether default partition has a row that would fit the 
partition
-* being attached.
+* Check if the default partition contains a row that would belong in 
the
+* partition being attached.
 */
-   defaultPartOid =
-   get_default_oid_from_partdesc(RelationGetPartitionDesc(rel));
if (OidIsValid(defaultPartOid))
{
Relationdefaultrel;
diff --git a/src/test/regress/expected/alter_table.out 
b/src/test/regress/expected/alter_table.out
index a80d16a394..4712fab540 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3891,3 +3891,19 @@ ALTER TABLE attmp ALTER COLUMN i RESET 
(n_distinct_inherited);
 ANALYZE attmp;
 DROP TABLE attmp;
 DROP USER regress_alter_table_user1;
+-- check that violating rows are correctly reported when attaching as the
+-- default partition
+create table defpart_attach_test (a int) partition by list (a);
+create table defpart_attach_test1 partition of defpart_attach_test for values 
in (1);
+create table defpart_attach_test_d (like defpart_attach_test);
+insert into defpart_attach_test_d values (1), (2);
+-- error because its constraint as the default partition would be violated
+-- by the row containing 1
+alter table defpart_attach_test attach partition defpart_attach_test_d default;
+ERROR:  partition constraint is violated by some row
+delete from defpart_attach_test_d where a = 1;
+alter table defpart_attach_test_d add check (a > 1);
+-- should be attached successfully and without needing to be scanned
+alter table defpart_attach_test attach partition defpart_attach_test_d default;
+INFO:  partition constraint for table "defpart_attach_test_d" is implied by 
existing constraints
+drop table defpart_attach_test;
diff --git a/src/test/regress/sql/alter_table.sql 
b/src/test/regress/sql/alter_table.sql
index 8198d1e930..c557b050af 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -2565,3 +2565,21 @@ ANALYZE attmp;
 DROP TABLE attmp;
 
 DROP USER regress_alter_table_user1;
+
+-- check that violating rows are correctly reported when attaching as the
+-- default partition
+create table defpart_attach_test (a int) partition by list (a);
+create table defpart_attach_test1 partition of defpart_attach_test for values 
in (1);
+create table defpart_attach_test_d (like defpart_attach_test);
+insert into defpart_attach_test_d values (1), (2);
+
+-- error because its constraint as the default partition would be violated
+-- by the row containing 1
+alter table defpart_attach_test attach partition defpart_attach_test_d default;
+delete from defpart_attach_test_d where a = 1;
+alter table defpart_attach_test_d add check (a > 1);
+
+-- should be attached successfully and without needing to be scanned
+alter table defpart_attach_test attach partition defpart_attach_test_d default;
+
+drop table defpart_attach_test;


Re: PATCH: Configurable file mode mask

2018-04-01 Thread Michael Paquier
On Fri, Mar 30, 2018 at 01:27:11PM -0400, David Steele wrote:
> I have replaced data_directory_group_access with data_directory_mode.

That looks way better.  Thanks for considering it.

> I decided this made sense to do.  It was only a few lines in initdb.c
> using a very well established pattern.  It would be surprising if log
> files did not follow the mode of the rest of PGDATA after initdb -g,
> even if it is standard practice to relocate them.

Okay for me.

@@ -285,7 +286,7 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size 
request_size,
 * returning.
 */
flags = O_RDWR | (op == DSM_OP_CREATE ? O_CREAT | O_EXCL : 0);
-   if ((fd = shm_open(name, flags, 0600)) == -1)
+   if ((fd = shm_open(name, flags, PG_FILE_MODE_DEFAULT)) == -1)

Hm.  Sorry for the extra noise.  This one is actually incorrect as
shm_open will use the path specified by glibc, which is out of
pg_dynshmem so it does not matter for base backups, and we can keep
0600.  pg_dynshmem is used for mmap; still, this would map with the umask
set up by the postmaster as OpenTransientFile & friends are used.  sysv
uses IPCProtection but there is no need to care about it as well.  No
need for a comment perhaps..

pg_basebackup.c creates recovery.conf with 0600 all the time ;)

Except for those two nits, I am fine to pass down to a committer patch
number 1.  This has value in my opinion per the refactoring it does and
the umask handling of pg_rewind and pg_resetwal added.

Now for patch 2...

+  
+If the data directory allows group read access then certificate files may
+need to be located outside of the data directory in order to conform to the
+security requirements outlined above.  Generally, group access is enabled
+to allow an unprivileged user to backup the database, and in that case the
+backup software will not be able to read the certificate files and will
+likely error.
+  



Re: [PATCH] Logical decoding of TRUNCATE

2018-04-01 Thread Simon Riggs
On 1 April 2018 at 21:01, Andres Freund  wrote:

>> ***
>> *** 111,116  CREATE PUBLICATION > class="parameter">name
>> --- 111,121 
>> and so the default value for this option is
>> 'insert, update, delete'.
>>
>> +  
>> +TRUNCATE is treated as a form of
>> +DELETE for the purpose of deciding whether
>> +to publish, or not.
>> +  
>>   
>>  
>> 
>
> Why is this a good idea?

TRUNCATE removes rows, just as DELETE does, so anybody that wants to
publish the removal of rows will be interested in this.

This avoids unnecessary overcomplication of the existing interface.

>> +  * Write a WAL record to allow this set of actions to be logically 
>> decoded.
>> +  * We could optimize this away when !RelationIsLogicallyLogged(rel)
>> +  * but that doesn't save much space or time.
>
> What you're saying isn't that you're not logging anything, but that
> you're allocating the header regardless? Because this certainly sounds
> like you unconditionally log a WAL record.

It says that, yes, my idea - as explained.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: json(b)_to_tsvector with numeric values

2018-04-01 Thread Arthur Zakirov
Hello Dmitry,

Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> Any opinions about this suggestion? Can it be considered as a bug fix and
> included into this release?
>

I think there is no chance to include it in v11. You can add the patch to
the 2018-09 commitfest.


-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


check_ssl_key_file_permissions should be in be-secure-common.c

2018-04-01 Thread Michael Paquier
Peter, Daniel,

The recent commit 8a3d9425 which has introduced SSL passphrase support
has also added be-secure-common.c, which works similarly to
fe-secure-common.c but for the backend.

I was just reading this code area, when I noticed that
check_ssl_key_file_permissions is called by be-secure-openssl.c but the
routine is defined in be-secure.c, causing some back-and-forth between
the two files.

It seems to me that this routine should logically be put into
be-secure-common.c so that future SSL implementations can use it.  This
makes the code more consistent with the frontend refactoring that
happened in f75a959.  I would not have bothered about this refactoring
if be-secure-openssl.c did not exist yet, but as it does I think that we
should bite the bullet, and do that for v11 so that a good base is in
place for the future.

A patch is attached.

Thanks,
--
Michael
From 907810a8b35362503f99876ad86599d66652582c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 2 Apr 2018 15:36:46 +0900
Subject: [PATCH] Make be-secure-common.c more consistent for future SSL
 implementations

Recent commit 8a3d9425 which introduced an implementation for SSL
passphrases has introduced at the same time be-secure-common.c which is
aimed at including backend-side APIs which can be used across any
OpenSSL implementations.  The logic behind this file is similar to
fe-secure-openssl.c for the frontend-side APIs.

However, this forgot to include check_ssl_key_file_permissions
in the move, which causes a double dependency between be-secure.c and
be-secure-openssl.c.

Refactor the code in a more logical way.  This brings to light an
API which is usable by future SSL implementations for permissions on SSL
key files.
---
 src/backend/libpq/be-secure-common.c | 74 
 src/backend/libpq/be-secure.c| 69 -
 src/include/libpq/libpq.h|  3 +-
 3 files changed, 76 insertions(+), 70 deletions(-)

diff --git a/src/backend/libpq/be-secure-common.c b/src/backend/libpq/be-secure-common.c
index d1740967f1..46f0b5f4a3 100644
--- a/src/backend/libpq/be-secure-common.c
+++ b/src/backend/libpq/be-secure-common.c
@@ -19,6 +19,9 @@
 
 #include "postgres.h"
 
+#include 
+#include 
+
 #include "libpq/libpq.h"
 #include "storage/fd.h"
 
@@ -118,3 +121,74 @@ error:
 	pfree(command.data);
 	return len;
 }
+
+
+/*
+ * Check permissions for SSL key files.
+ */
+bool
+check_ssl_key_file_permissions(const char *ssl_key_file, bool isServerStart)
+{
+	int			loglevel = isServerStart ? FATAL : LOG;
+	struct stat	buf;
+
+	if (stat(ssl_key_file, &buf) != 0)
+	{
+		ereport(loglevel,
+(errcode_for_file_access(),
+ errmsg("could not access private key file \"%s\": %m",
+		ssl_key_file)));
+		return false;
+	}
+
+	if (!S_ISREG(buf.st_mode))
+	{
+		ereport(loglevel,
+(errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("private key file \"%s\" is not a regular file",
+		ssl_key_file)));
+		return false;
+	}
+
+	/*
+	 * Refuse to load key files owned by users other than us or root.
+	 *
+	 * XXX surely we can check this on Windows somehow, too.
+	 */
+#if !defined(WIN32) && !defined(__CYGWIN__)
+	if (buf.st_uid != geteuid() && buf.st_uid != 0)
+	{
+		ereport(loglevel,
+(errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("private key file \"%s\" must be owned by the database user or root",
+		ssl_key_file)));
+		return false;
+	}
+#endif
+
+	/*
+	 * Require no public access to key file. If the file is owned by us,
+	 * require mode 0600 or less. If owned by root, require 0640 or less to
+	 * allow read access through our gid, or a supplementary gid that allows
+	 * to read system-wide certificates.
+	 *
+	 * XXX temporarily suppress check when on Windows, because there may not
+	 * be proper support for Unix-y file permissions.  Need to think of a
+	 * reasonable check to apply on Windows.  (See also the data directory
+	 * permission check in postmaster.c)
+	 */
+#if !defined(WIN32) && !defined(__CYGWIN__)
+	if ((buf.st_uid == geteuid() && buf.st_mode & (S_IRWXG | S_IRWXO)) ||
+		(buf.st_uid == 0 && buf.st_mode & (S_IWGRP | S_IXGRP | S_IRWXO)))
+	{
+		ereport(loglevel,
+(errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("private key file \"%s\" has group or world access",
+		ssl_key_file),
+ errdetail("File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.")));
+		return false;
+	}
+#endif
+
+	return true;
+}
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index fb1f6b5bbe..edfe2c0751 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -18,12 +18,10 @@
 
 #include "postgres.h"
 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #ifdef HAVE_NETINET_TCP_H
@@ -320,70 +318,3 @@ secure_raw_write(Port *port, const void *ptr, size_t len)
 
 	return n;
 }
-
-bool
-check_ss