Re: [HACKERS] Speedup twophase transactions

2017-03-11 Thread Nikhil Sontakke
Hi David and Michael,


> It would be great to get this thread closed out after 14 months and many
> commits.
>
>
PFA, latest patch which addresses Michael's comments.

> twophase.c: In function ‘PrepareRedoAdd’:
> twophase.c:2539:20: warning: variable ‘gxact’ set but not used
> [-Wunused-but-set-variable]
>   GlobalTransaction gxact;
> There is a warning at compilation.
>
>
Fixed.


> The comment at the top of PrescanPreparedTransactions() needs to be
> updated. Not only does this routine look for the contents of
> pg_twophase, but it does also look at the shared memory contents, at
> least those ones marked with inredo and not on_disk.
>
>
Changed the comments at the top of PrescanPreparedTransactions(),
StandbyRecoverPreparedTransactions(), and RecoverPreparedTransactions().


> +   ereport(WARNING,
> +   (errmsg("removing future two-phase state data from
> memory \"%u\"",
> +   xid)));
> +   PrepareRedoRemove(xid);
> +   continue
> Those are not normal (partially because unlink is atomic, but not
> durable)... But they match the correct coding pattern regarding
> incorrect 2PC entries... I'd really like to see those switched to a
> FATAL with unlink() made durable for those calls.
>
>
Hmm, not sure exactly what we need to do here. If you look at the prior
checks, we already skip on-disk entries there. So, typically, the entries
that we encounter here will be in shmem only.


> +   /* Deconstruct header */
> +   hdr = (TwoPhaseFileHeader *) buf;
> +   Assert(TransactionIdEquals(hdr->xid, xid));
> +
> +   if (TransactionIdPrecedes(xid, result))
> +   result = xid;
> This portion is repeated three times and could be easily refactored.
> You could just have a routine that returns the oldest transaction ID
> used, and ignore the result for StandbyRecoverPreparedTransactions()
> by casting a (void) to let any kind of static analyzer understand that
> we don't care about the result for example. Handling for subxids is
> necessary as well depending on the code path. Splitting things into a
> couple of sub-routines may be more readable as well. There are really a
> couple of small parts that can be gathered and strengthened.
>
>
I have added a new function to do this now. It reads from either disk or
shared memory and produces the appropriate error/log messages as well.
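To illustrate the shape of that helper, a minimal sketch only
(read_twophase_state() is a hypothetical stand-in for the
read-from-pg_twophase-or-shmem logic, and both names here are made up):

	static TransactionId
	AdvanceOldestPreparedXid(TransactionId xid, TransactionId result)
	{
		char	   *buf = read_twophase_state(xid);	/* hypothetical */
		TwoPhaseFileHeader *hdr;

		/* Deconstruct header */
		hdr = (TwoPhaseFileHeader *) buf;
		Assert(TransactionIdEquals(hdr->xid, xid));

		/* track the oldest prepared transaction ID seen so far */
		if (TransactionIdPrecedes(xid, result))
			result = xid;

		pfree(buf);
		return result;
	}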


> +   /*
> +* Recreate its GXACT and dummy PGPROC
> +*/
> +   gxactnew = MarkAsPreparing(xid, gid,
> +   hdr->prepared_at,
> +   hdr->owner, hdr->database,
> +   gxact->prepare_start_lsn,
> +   gxact->prepare_end_lsn);
> MarkAsPreparing() does not need to be extended with two new arguments.
> RecoverPreparedTransactions() is used only at the end of recovery,
> where it is not necessary to look at the 2PC state files from the
> records. In this code path inredo is also set to false :)
>
>
That's not true. We will have entries with inredo set at the end of
recovery as well. In fact, the MarkAsPreparing() call
from RecoverPreparedTransactions() is the one that removes these
inredo entries and converts them into regular entries. We have optimized the
recovery code path as well.



> +   /* Delete TwoPhaseState gxact entry and/or 2PC file. */
> +   PrepareRedoRemove(parsed.twophase_xid);
> Both things should not be present, no? If the file is pushed to disk
> it means that the checkpoint horizon has already moved.
>
>
Consider a PREPARE replayed in redo (which adds a shmem entry with inredo
set), followed by a checkpoint (which flushes the entry to a pg_twophase
file, setting on_disk), followed by a COMMIT/ROLLBACK. We can have both
the bits set in this case.



> -   ereport(ERROR,
> +   /* It's ok to find an entry in the redo/recovery case */
> +   if (!gxact->inredo)
> +   ereport(ERROR,
> (errcode(ERRCODE_DUPLICATE_OBJECT),
>  errmsg("transaction identifier \"%s\" is already in
> use",
> gid)));
> +   else
> +   {
> +   found = true;
> +   break;
> +   }
> I would not have thought so.
>
Since we are using the TwoPhaseState structure to track redo entries, at
the end of recovery we will find existing entries. Please see my comments
above for RecoverPreparedTransactions().


> MarkAsPreparing and MarkAsPreparingInRedo really share the same code.
> What if the setup of the dummy PGPROC entry is made conditional?
>

I realized that MarkAsPreparingInRedo() does not need to do all the sanity
checking since it's going to be invoked during redo and everything that
comes in is kosher already. So its contents are much simplified in this
latest patch.

Tests pass with this latest patch.

Regards,
Nikhils


twophase_recovery_shmem_110317.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explicit subtransactions for PL/Tcl

2017-03-11 Thread Pavel Stehule
2017-03-10 20:31 GMT+01:00 Victor Wagner :

> On Thu, 9 Mar 2017 12:04:31 +0100
> Pavel Stehule  wrote:
>
>
> > > Now the test demonstrates how errors uncaught at the Tcl level interact
> > > with the postgresql error system.
> > >
> >
> > you can catch the exception outside and write your own message
>
> OK, here is a patch with tests which don't depend on function OIDs
>
> They ignore the stack trace (the $::errorinfo global variable) completely,
> and analyze just the error message.
>
>
All tests passed.

I have no other objections.

I'll mark this patch as ready for committer.

Regards

Pavel


>
> --
>Victor Wagner 
>


Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-03-11 Thread Pavel Stehule
2017-03-10 15:45 GMT+01:00 Alexander Korotkov :

> On Fri, Mar 10, 2017 at 5:10 PM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
>
>> On 2/24/17 16:32, Pavel Stehule wrote:
>> > set EXTENDED_DESCRIBE_SORT size_desc
>> > \dt+
>> > \l+
>> > \di+
>> >
>> > Possible variants: schema_table, table_schema, size_desc, size_asc
>>
>> I can see this being useful, but I think it needs to be organized a
>> little better.
>>
>> Sort key and sort direction should be separate settings.
>>
>
> I agree.
>
> I'm not sure why we need to have separate settings to sort by schema
>> name and table name.
>
>
> I think sorting by schema name, object name makes sense for people who
> have objects of the same name in different schemas.
>

I am sending an updated version with the sort direction separated out into
a special variable.

There is a question: does a descending direction make sense for columns
like the schema or table name?

Using desc/asc for size is natural. But for table names?

Regards

Pavel



>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 2a9c412020..747db58dd8 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3507,6 +3507,27 @@ bar
   
 
   
+VERBOSE_SORT_COLUMNS
+
+
+This variable can be set to the values schema_name,
+name_schema or size to control the
+order of the content of describe commands.
+
+
+  
+
+  
+VERBOSE_SORT_DIRECTION
+
+
+This variable can be set to the values asc
+or desc to control the order of the content of describe commands.
+
+
+  
+
+  
 VERBOSITY
 
 
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 61a3e2a848..7ba24ea883 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -197,6 +197,9 @@ describeAccessMethods(const char *pattern, bool verbose)
 	return true;
 }
 
+#define SORT_DIRECTION_STR(v)		((v) == PSQL_SORT_ASC ? "ASC" : "DESC")
+
+
 /* \db
  * Takes an optional regexp to select particular tablespaces
  */
@@ -264,7 +267,18 @@ describeTablespaces(const char *pattern, bool verbose)
 		  NULL, "spcname", NULL,
 		  NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+
+	if (verbose && pset.sversion >= 90200)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SIZE)
+			appendPQExpBuffer(&buf,
+			  "ORDER BY pg_catalog.pg_tablespace_size(oid) %s, 1;",
+			  SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		else
+			appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
@@ -824,7 +838,19 @@ listAllDbs(const char *pattern, bool verbose)
 		processSQLNamePattern(pset.db, &buf, pattern, false, false,
 			  NULL, "d.datname", NULL, NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (verbose && pset.sversion >= 80200)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SIZE)
+			appendPQExpBuffer(&buf,
+		  "ORDER BY CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')\n"
+		  "  THEN pg_catalog.pg_database_size(d.datname) END %s, 1;\n",
+		  SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		else
+			appendPQExpBufferStr(&buf, "ORDER BY 1");
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1");
+
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
 	if (!res)
@@ -3295,7 +3321,26 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 		  "n.nspname", "c.relname", NULL,
 		  "pg_catalog.pg_table_is_visible(c.oid)");
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
+	if (verbose && pset.sversion >= 80100)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SCHEMA_NAME)
+			appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
+		else if (pset.verbose_sort_columns == PSQL_SORT_NAME_SCHEMA)
+			appendPQExpBufferStr(&buf, "ORDER BY 2,1;");
+		else
+		{
+			if (pset.sversion >= 90000)
+				appendPQExpBuffer(&buf,
+					"ORDER BY pg_catalog.pg_table_size(c.oid) %s, 1,2",
+					SORT_DIRECTION_STR(pset.verbose_sort_direction));
+			else
+				appendPQExpBuffer(&buf,
+					"ORDER BY pg_catalog.pg_relation_size(c.oid) %s, 1,2",
+					SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		}
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index ba14df0344..1ebe397a85 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -327,7 +327,7 @@ helpVariables(unsigned short int pager)
 	 * Windows builds currently print one more line than non-Windows builds.
 	 * Using the larger number is fine.
 	 */
-	output = PageOutput(88, pager ? &(pset.popt.topt) : NULL);
+	output = PageOutput(92, pager ? &(pset.popt.topt) : NULL);

Re: [HACKERS] make check-world output

2017-03-11 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Jeff Janes wrote:
> >> There was some recent discussion about making "make check-world" faster.
> >> I'm all for that, but how about making it quieter?  On both machines I've
> >> run it on (CentOS6.8 and Ubuntu 16.04.2), it dumps some gibberish to
> >> stderr, example attached.
> 
> > I think you're complaining about the test code added by commit
> > fcd15f13581f.  Together with behavior introduced by 2f227656076a, it is
> > certainly annoying.  I would vote for redirecting that output to a log
> > file which can be ignored/removed unless there is a failure.
> 
> What about just reverting 2f227656076a?

That works for me too, if we think we no longer need that level of
detail.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] background sessions

2017-03-11 Thread Pavel Stehule
2017-03-09 14:52 GMT+01:00 Peter Eisentraut :

> On 3/8/17 14:22, Pavel Stehule wrote:
> > 1. will be background session process closed automatically when parent
> > process is closed?
>
> If the communications queue goes away the process will eventually die.
> This is similar to how a backend process will eventually die if the
> client goes away.  Some more testing would be good here.
>

What does "eventually die" mean?

I called pg_sleep() in the called subprocess.

Cancelling or terminating the parent process had no effect. Maybe it is an
artificial test.

A little more realistic: waiting on a table lock in the background worker
was successful - when the parent was cancelled, the worker process was
destroyed too.

But when the parent was terminated, the background worker process continued.

What is worse, the background worker was at 100% CPU and I had to restart
my notebook.

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS void
 LANGUAGE plpythonu
AS $function$
with plpy.BackgroundSession() as a:
  a.execute('update foo2 set a = 30')
  a.execute('insert into foo2 values(10)')
$function$
postgres=#


I blocked foo2 in another session.

Regards

Pavel


>
> > 2. what timeouts are valid for this process - statement timeout, idle in
> > transaction timeout
>
> Those should work the same way.  It's the same code that runs the
> queries, starts/stops transactions, etc.
>
> > I see a significant risk of leaking sessions.
>
> Yeah, that's a valid concern.  But I think it works ok.
>
> > There can be more doc and examples in plpython doc. It will be main
> > interface for this feature. Mainly about session processing.
>
> OK, I'll look into that again.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] make check-world output

2017-03-11 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> What about just reverting 2f227656076a?

> That works for me too, if we think we no longer need that level of
> detail.

A general issue with this sort of messaging is that when things are
working more or less normally, you'd just as soon not see it ... but
when you need to debug problems with the test scaffolding itself,
you want verbosity.

For the basic build process, we've largely solved that through the
use of "make -s".  But we don't really have a comparable "be quiet"
option for test runs, especially not the TAP tests.  Maybe we need
to think a bit more globally about what it is we're trying to
accomplish.  I could definitely see that "set -x" in pg_upgrade's
test.sh might be useful as part of a verbose testing option; what
it lacks is a way to turn it off.

An entirely different big-picture point is "what the heck are we
doing using a shell script here at all?  It is useless for testing
on Windows".  Somebody should look into rewriting it in Perl,
perhaps using the facilities of PostgresNode.pm.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Tom Lane
I wrote:
> I believe the core problem is that contrib/test_decoding's regresscheck
> and isolationcheck targets each want to use ./tmp_check as their
> --temp-instance.  make has no reason to believe it shouldn't run those
> two sub-jobs in parallel, but when it does, we get two postmasters trying
> to share the same directory.  This looks reasonably straightforward to
> solve, but I'm not entirely familiar with the code here, and am not
> sure what is the least ugly way to fix it.

Enlarging on that: if I cd into contrib/test_decoding and do
"make check -j4" or so, it reliably fails.  With the attached patch,
it passes.  This is a localized patch that only fixes things for
contrib/test_decoding; what I'm wondering is if it would be better to
establish a more widespread convention that $(pg_isolation_regress_check)
should use a different --temp-instance directory than $(pg_regress_check)
does.

regards, tom lane

diff --git a/contrib/test_decoding/.gitignore b/contrib/test_decoding/.gitignore
index 1f95503..09d60bf 100644
*** a/contrib/test_decoding/.gitignore
--- b/contrib/test_decoding/.gitignore
***
*** 3,5 
--- 3,6 
  /isolation_output/
  /regression_output/
  /tmp_check/
+ /tmp_check_iso/
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index d2bc8b8..ea31d88 100644
*** a/contrib/test_decoding/Makefile
--- b/contrib/test_decoding/Makefile
*** PGFILEDESC = "test_decoding - example of
*** 5,11 
  
  # Note: because we don't tell the Makefile there are any regression tests,
  # we have to clean those result files explicitly
! EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
--- 5,12 
  
  # Note: because we don't tell the Makefile there are any regression tests,
  # we have to clean those result files explicitly
! EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output \
! 	./isolation_output ./tmp_check_iso
  
  ifdef USE_PGXS
  PG_CONFIG = pg_config
*** isolationcheck: | submake-isolation subm
*** 59,64 
--- 60,66 
  	$(MKDIR_P) isolation_output
  	$(pg_isolation_regress_check) \
  	--temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
+ 	--temp-instance=./tmp_check_iso \
  	--outputdir=./isolation_output \
  	$(ISOLATIONCHECKS)
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-11 Thread Oleg Bartunov
On Fri, Mar 10, 2017 at 7:07 AM, Petr Jelinek 
wrote:

> On 09/03/17 19:50, Peter van Hardenberg wrote:
> > Anecdotally, we just stored dates as strings and used a convention (key
> > ends in "_at", I believe) to interpret them. The lack of support for
> > dates in JSON is well-known, universally decried... and not a problem
> > the PostgreSQL community can fix.
> >
>
> The original complaint was about JSON_VALUE extracting a date, but I don't
> understand why there is a problem with that; SQL/JSON defines that
> behavior. The RETURNING clause there is more or less just a shorthand for
> casting with some advanced options.
>

There is no problem with serializing dates, and SQL/JSON describes it rather
well. But there is no correct procedure to deserialize a date from a valid
JSON string, and the standard keeps silent about this. We now understand
that date[time] is actually virtual, and its only use is in jsonpath
(filter) expressions.



>
> --
>   Petr Jelinek  http://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Jim Nasby

On 3/10/17 6:06 PM, Peter Eisentraut wrote:
> On 3/10/17 19:00, Jim Nasby wrote:
>> Maybe instead of having the commitfest app try and divine patches from
>> the list it should be able to send patches to the list from a specified
>> git repo/branch. Anyone that provides that info would have tests run
>> automagically, patches sent, etc. Anyone who doesn't can just keep using
>> the old process.
>
> Those people who know what they're doing will presumably run all those
> checks before they submit a patch.  It's those people who send in
> patches that don't apply cleanly or fail the tests that would benefit
> from this system.  But if they're that careless, then they also won't
> take care to use this particular system correctly.


It's actually a lot harder to mess up providing a git repo link than 
manually submitting patches to the mailing list. For most patches, it's 
also a hell of a lot faster to just submit a repo URL rather than 
dealing with patch files. Having this also means that reviewers can 
focus more on what the patch is actually doing instead of mechanical 
crap best left to a machine.


Of course, *you* work on changes that are far more complex than any 
newbie will, and it wouldn't surprise me if such a feature wouldn't help 
you or other senior hackers at all. But AFAICT it wouldn't get in your 
way either. It would remove yet another burden for new hackers.


Anyway, this is well off topic for the original thread...
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-03-11 Thread Pavel Stehule
Hi

This proposal is a follow-up to the implementation of XMLTABLE.

Many XML documents have an assigned document namespace:

<rows xmlns="http://x.y"><row><a>10</a></row></rows>

For such an XML document, every search path must use the namespace
"http://x.y". This is not very intuitive, and for XMLTABLE it is not very
user friendly, because the default column path (the same as the column
name) cannot be used. The solution to this issue is the default namespace,
as defined in SQL/XML.

Example, related to the previous XML document.

Without a default namespace:

XMLTABLE(XMLNAMESPACES('http://x.y' AS aux),
'/aux:rows/aux:row' PASSING ...
COLUMNS a int PATH 'aux:a')

With a default namespace:

XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' PASSING ...
COLUMNS a int);


Unfortunately, libxml2 does not support default namespaces in XPath
expressions. Because libxml2's functionality is frozen, there is little
chance of this being supported in the near future. An implementation is
not too hard, although it requires a simple XPath expression state
translator.
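For illustration, the translator in the attached patch is invoked roughly
like this (the result string in the comment is my reading of the intended
rewrite, not verified output):

	StringInfoData str;
	char	   *path;

	/* prefix unqualified names with the artificial default-namespace
	 * alias, since XPath in libxml2 has no default-namespace notion */
	transformXPath(&str, "/rows/row", DEFAULT_NAMESPACE_NAME);
	path = str.data;	/* expected: "/pgdefnamespace:rows/pgdefnamespace:row" */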

The other databases that implement XMLTABLE support a default namespace in
XPath expressions.

A patch with an initial implementation is attached.

Regards

Pavel
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 583b3b241a..c2558a33ef 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10465,8 +10465,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>'
  The optional XMLNAMESPACES clause is a comma-separated
  list of namespaces.  It specifies the XML namespaces used in
- the document and their aliases. A default namespace specification
- is not currently supported.
+ the document and their aliases.
 
 
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 0f512753e4..5a3715cc84 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index f81cf489d2..d59a76f0b4 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -91,7 +91,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -184,6 +184,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -4180,6 +4181,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4272,6 +4274,8 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif   /* not USE_LIBXML */
 }
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace"
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4282,12 +4286,14 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name == NULL)
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 		   pg_xmlCharStrndup(name, strlen(name)),
 		   pg_xmlCharStrndup(uri, strlen(uri
@@ -4316,6 +4322,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 (errcode(ERRCODE_DATA_EXCEPTION),
  errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4347,6 +4361,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 (errcode(ERRCODE_DATA_EXCEPTION),
  errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 00..7ec2d584c6
--- /dev/null
+++ b/src

Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Andres Freund
On 2017-03-11 12:05:23 -0500, Tom Lane wrote:
> I wrote:
> > I believe the core problem is that contrib/test_decoding's regresscheck
> > and isolationcheck targets each want to use ./tmp_check as their
> > --temp-instance.  make has no reason to believe it shouldn't run those
> > two sub-jobs in parallel, but when it does, we get two postmasters trying
> > to share the same directory.  This looks reasonably straightforward to
> > solve, but I'm not entirely familiar with the code here, and am not
> > sure what is the least ugly way to fix it.
> 
> Enlarging on that: if I cd into contrib/test_decoding and do
> "make check -j4" or so, it reliably fails.

Yep, can reproduce here as well. Interesting that, with -j16, I could
survive several dozen runs from the toplevel locally.


> This is a localized patch that only fixes things for
> contrib/test_decoding; what I'm wondering is if it would be better to
> establish a more widespread convention that
> $(pg_isolation_regress_check) should use a different --temp-instance
> directory than $(pg_regress_check) does.

I think that'd be a good plan.  We probably should also keep --outputdir
separate (which test_decoding/Makefile does, but
pg_isolation_regress_check doesn't)?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit subtransactions for PL/Tcl

2017-03-11 Thread Tom Lane
Pavel Stehule  writes:
> I'll mark this patch as ready for committer

I've pushed this after mostly-cosmetic cleanup.  One thing I changed
that's not cosmetic is to put

MemoryContextSwitchTo(oldcontext);

after the BeginInternalSubTransaction call.  I see there was some
discussion upthread about what to do there, but I think this is the
correct answer because it corresponds to what pltcl_subtrans_begin
does.  That means that the execution environment of the called Tcl
fragment will be the same as, eg, the loop_body in spi_exec.  I do not
think we want it to be randomly different from those pre-existing cases.

Now, I believe that the MemoryContextSwitchTo in pltcl_subtrans_begin
was probably just cargo-culted in there from similar code in plpgsql.
Since pltcl doesn't rely on CurrentMemoryContext to anywhere near the
same degree that plpgsql does, it's possible that we could drop the
MemoryContextSwitchTo in both places, and just let the called code
fragments run in the subtransaction's memory context.  But I'm not
convinced of that, and in any case it ought to be undertaken as a
separate patch.

Some other comments:

* Whitespace still wasn't very much per project conventions.
I fixed that by running it through pgindent.

* The golden rule for code style and placement is "make your patch
look like it was always there".  Dropping new code at the tail end
of the file is seldom a good avenue to that.  I stuck it after
pltcl_SPI_lastoid, on the grounds that it should be with the other
Tcl command functions and should respect the (mostly) alphabetical
order of those functions.  Likewise I adopted a header comment
format in keeping with the existing nearby functions.

* I whacked the SGML docs around pretty thoroughly.  That addition
wasn't respecting the style of the surrounding text either as to
markup or indentation, and it had some other issues like syntax
errors in the example functions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Tom Lane
Jim Nasby  writes:
> It's actually a lot harder to mess up providing a git repo link than 
> manually submitting patches to the mailing list.

Yeah, we've heard that proposal before.  We're still not doing it though.
Insisting on patches being actually submitted to the mailing list is
important for archival and possibly legal reasons.  If someone sends
in a link to $random-repo, once that site goes away there's no way to
determine exactly what was submitted.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Jim Nasby

On 3/11/17 2:06 PM, Tom Lane wrote:
> Jim Nasby  writes:
>> It's actually a lot harder to mess up providing a git repo link than
>> manually submitting patches to the mailing list.
>
> Yeah, we've heard that proposal before.  We're still not doing it though.
> Insisting on patches being actually submitted to the mailing list is
> important for archival and possibly legal reasons.  If someone sends
> in a link to $random-repo, once that site goes away there's no way to
> determine exactly what was submitted.


The full proposal was that the commitfest app have the ability to 
generate and post the patch for you, assuming that the smoke-test passes.

--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-11 Thread Kevin Grittner
On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu
 wrote:

> My name is Mengxing Liu. I am interested in the project "Eliminate
> O(N^2) scaling from rw-conflict tracking in serializable
> transactions". After discussing with Kevin off-list, I think it's
> time to post the discussion here. I am afraid there are two things
> that I need your help with. Thank you very much.

Welcome to the -hackers list!  This is a key part of how development
happens in the community.  Don't be shy about posting questions and
ideas, but also expect fairly blunt discussion to occur at times;
don't let that put you off.

>>> So the task is clear. We can use a tree-like or hash-like data
>>> structure to speed up this function.
>>
>> Right -- especially with a large number of connections holding a
>> large number of conflicts.  In one paper with high concurrency, they
>> found over 50% of the CPU time for PostgreSQL was going to these
>> functions (including functions called by them).  This seems to me to
>> be due to the O(N^2) (or possibly worse) performance from the number
>> of connections.
>
> Anyone knows the title of this paper? I want to reproduce its
> workloads.

I seem to remember there being a couple other papers or talks, but
this is probably the most informative:

http://sydney.edu.au/engineering/it/research/tr/tr693.pdf
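For context, the check in question is a linear walk over the reader's
out-conflict list (simplified here from RWConflictExists() in
src/backend/storage/lmgr/predicate.c); this walk is what a tree- or
hash-based structure would replace:

	static bool
	RWConflictExists(const SERIALIZABLEXACT *reader,
					 const SERIALIZABLEXACT *writer)
	{
		RWConflict	conflict;

		/* walk every recorded out-conflict of the reader ... */
		conflict = (RWConflict)
			SHMQueueNext(&reader->outConflicts,
						 &reader->outConflicts,
						 offsetof(RWConflictData, outLink));
		while (conflict)
		{
			/* ... looking for one that points at this writer */
			if (conflict->sxactIn == writer)
				return true;
			conflict = (RWConflict)
				SHMQueueNext(&reader->outConflicts,
							 &conflict->outLink,
							 offsetof(RWConflictData, outLink));
		}
		return false;
	}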

>> Remember, I think most of the work here is going to be in
>> benchmarking.  We not only need to show improvements in simple test
>> cases using readily available tools like pgbench, but think about
>> what types of cases might be worst for the approach taken and show
>> that it still does well -- or at least not horribly.  It can be OK
>> to have some slight regression in an unusual case if the common
>> cases improve a lot, but any large regression needs to be addressed
>> before the patch can be accepted.  There are some community members
>> who are truly diabolical in their ability to devise "worst case"
>> tests, and we don't want to be blind-sided by a bad result from one
>> of them late in the process.
>>
>
> Are there any documents or links introducing how to test and
> benchmark PostgreSQL? I may need some time to learn about it.

There is pgbench:

https://www.postgresql.org/docs/devel/static/pgbench.html

A read-only load and a read/write mix should both be tested,
probably with a few different scales and client counts to force the
bottleneck to be in different places.  The worst problems have been
seen with 32 or more cores on 4 or more sockets with a large number
of active connections.  I don't know whether you have access to a
machine capable of putting this kind of stress on it (perhaps at
your university?), but if not, the community has access to various
resources we should be able to schedule time on.

It may pay for you to search through the archives of the last year
or two to look for other benchmarks and see what people have
previously done.

--
Kevin Grittner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index usage for elem-contained-by-const-range clauses

2017-03-11 Thread Jim Nasby

On 3/10/17 8:29 AM, Alexander Korotkov wrote:
> That's a cool idea.  But I would say more.  Sometimes it's useful to
> transform "intcol between x and y" into "intcol <@ 'x,y'::int4range".
> btree_gin supports "intcol between x and y" as the overlap of "intcol >= x"
> and "intcol <= y".  That is very inefficient.  But if this clause were
> transformed into "intcol <@ 'x,y'::int4range", btree_gin could handle
> it very efficiently.


That would certainly be nice as well, but IMHO it's outside the scope of
this patch to accomplish that.


BTW, while we're wishing for things... Something else that would be nice
is if there were a way to do these kinds of transforms without hacking the
backend...



> Also, I noticed that the patch doesn't have regression tests.


BTW, those tests need to pay special attention to inclusive vs exclusive 
bounds.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] scram and \password

2017-03-11 Thread Joe Conway
On 03/10/2017 02:43 PM, Michael Paquier wrote:
> On Sat, Mar 11, 2017 at 2:53 AM, Jeff Janes  wrote:
>> Should the \password tool in psql inspect password_encryption and act on it
>> being 'scram'?
> 
> Not sure if it is wise to change the default for this release.
> 
>> I didn't see this issue discussed, but the ability to search the archives
>> for backslashes is rather limited.
> 
> I'll save you some time: it has not been discussed. Nor has
> PQencryptPassword been mentioned. Changing to SCRAM is not that
> complicated, just call scram_build_verifier() and you are good to go.
> 
> Instead of changing the default, I think that we should take this
> occasion to rename PQencryptPassword to something like
> PQhashPassword(), and extend it with a method argument to support both
> md5 and scram. PQencryptPassword could also be marked as deprecated,
> or let as-is for some time. For \password, we could have another
> meta-command but that sounds grotty, or just extend \password with a
> --method argument. Switching the default could happen in another
> release.
> 
> A patch along those lines would be simple; do people feel that this
> should be part of PG 10?

Seems like it should work in an analogous way to CREATE/ALTER ROLE.
According to the docs:

8<
ENCRYPTED
UNENCRYPTED

These key words control whether the password is stored encrypted in
the system catalogs. (If neither is specified, the default behavior is
determined by the configuration parameter password_encryption.) If the
presented password string is already in MD5-encrypted or SCRAM-encrypted
format, then it is stored encrypted as-is, regardless of whether
ENCRYPTED or UNENCRYPTED is specified (since the system cannot decrypt
the specified encrypted password string). This allows reloading of
encrypted passwords during dump/restore.
8<

So if the password is not already set, \password uses
password_encryption to determine which format to use, and if the
password is already set, then the current method is assumed.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


[HACKERS] INSERT INTO arr2(array[1].d, array[2].d)

2017-03-11 Thread Jim Nasby
Over in [1], I was very surprised to discover $SUBJECT[2]. I looked in 
the docs, and they clearly indicate that INSERT accepts "column names".


What's the best way to describe this? "column expression"? "field 
expression"?


1: 
https://www.postgresql.org/message-id/20170311005810.kuccp7t5t5jhe...@alap3.anarazel.de


2:
CREATE TABLE arr(d int[]);
CREATE TABLE arr2(arr arr);
INSERT INTO arr2(arr[1].d, arr[2].d) VALUES(ARRAY[1,2],ARRAY[3,4]) 
RETURNING *

┌───────────────────────────────┐
│              arr              │
├───────────────────────────────┤
│ {"(\"{1,2}\")","(\"{3,4}\")"} │
└───────────────────────────────┘
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Jim Nasby

On 3/10/17 10:31 PM, Eric Ridge wrote:
> What about this?  Is the tuple currently being evaluated (I suppose in
> the case of a sequential scan) available in the context of a function call?


AFAIK that *very* specific case would work, because the executor would
be handing you the raw tuple. Not a great bet to make though. Also,
there should be a macro somewhere that will tell you whether you have a
full tuple or not. You'd want to make sure to check that and throw an
error if you weren't handed a full tuple.
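Something like this, perhaps (whether t_self is the right thing to test is
a guess on my part):

	/* t_self is only set when the executor handed us a real heap tuple */
	if (!ItemPointerIsValid(&tuple->t_self))
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("a physical heap tuple is required here")));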

--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT INTO arr2(array[1].d, array[2].d)

2017-03-11 Thread Andres Freund
On 2017-03-11 14:43:55 -0600, Jim Nasby wrote:
> Over in [1], I was very surprised to discover $SUBJECT[2]. I looked in the
> docs, and they clearly indicate that INSERT accepts "column names".

They also say "The column name can be qualified with a subfield name or
array subscript, if needed."


> What's the best way to describe this? "column expression"? "field
> expression"?

"Field expression" is the better of the two, but I'm not really convinced
about changing it.


For reference:
> 1: 
> https://www.postgresql.org/message-id/20170311005810.kuccp7t5t5jhe...@alap3.anarazel.de

> 
> 2:
> CREATE TABLE arr(d int[]);
> CREATE TABLE arr2(arr arr);
> INSERT INTO arr2(arr[1].d, arr[2].d) VALUES(ARRAY[1,2],ARRAY[3,4]) RETURNING
> *
> ┌───────────────────────────────┐
> │              arr              │
> ├───────────────────────────────┤
> │ {"(\"{1,2}\")","(\"{3,4}\")"} │
> └───────────────────────────────┘

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Andres Freund
On 2017-03-11 04:31:16 +, Eric Ridge wrote:
> Well shoot.  That kinda spoils my plans.

I think you should elaborate on what you're trying to achieve -
otherwise our advice will be affected by the recent, widely reported,
crystal ball scarcity.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Tom Lane
Corey Huinker  writes:
> [ 0001.if_endif.v21.diff ]

Starting to poke at this... the proposal to add prove checks for psql
just to see whether \if respects ON_ERROR_STOP seems like an incredibly
expensive way to test a rather minor point.  On my machine, "make check"
in bin/psql goes from zero time to close to 8 seconds.  I'm not really
on board with adding that kind of time to every buildfarm run for the
foreseeable future just for this.

Couldn't we get close to the same coverage by adding a single-purpose
test script to the main regression tests?  Along the lines of

 \set ON_ERROR_STOP 1
 \if invalid
 \echo should not get here
 \endif
 \echo should not get here either

You could imagine just dropping that at the end of psql.sql, but I
think probably a separate script is worth the trouble.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] dblink leaks unnamed connections

2017-03-11 Thread Joe Conway
On 03/09/2017 08:31 AM, Joe Conway wrote:
> On 03/09/2017 07:54 AM, Tom Lane wrote:
>> Fujii Masao  writes:
>>> On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki
>>>  wrote:
 dblink fails to close the unnamed connection as follows when a new unnamed 
 connection is requested.  The attached patch fixes this.
>> 
>>> This issue was reported about ten years ago and added as TODO item.
>>> http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php
>> 
>>> I agree that this is a bug, and am tempted to back-patch to all the 
>>> supported
>>> versions. But it had not been fixed in many years since the first report of
>>> the issue. So I'm not sure if it's ok to just treat this as a bug right now 
>>> and
>>> back-patch. Or we should fix this only in HEAD? Anyway I'd like to hear
>>> more opinions about this.
>> 
>> It looks to me like the issue simply fell through the cracks because Joe
>> wasn't excited about fixing it.  Now that we have a second complaint,
>> I think it's worth treating as a bug and back-patching.
>> 
>> (I've not read this particular patch and am not expressing an opinion
>> whether it's correct.)
> 
> Ok, will take another look.

I pushed a fix to all supported branches.

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] How to get the 'ctid' from a record type?

2017-03-11 Thread Eric Ridge
On Sat, Mar 11, 2017 at 2:14 PM Andres Freund  wrote:

> On 2017-03-11 04:31:16 +, Eric Ridge wrote:
> > Well shoot.  That kinda spoils my plans.
>
> I think you should elaborate on what you're trying to achieve -
> otherwise our advice will be affected by the recent, widely reported,
> crystal ball scarcity.
>

What I'm trying to do is port https://github.com/zombodb/zombodb to
Postgres 9.6+.  It's an Access Method that stores full rows, encoded as
JSON, in Elasticsearch instead of in local storage.  It was fairly
straightforward to do the mechanical work to convert it to use 9.6's new AM
API (which is very nice, btw!), but the fact that 9.6 also disallows
including system columns (specifically ctid) has me turned upside down.

With <9.6, I was able to cook-up a scheme where it was able to answer
queries from the remote Elasticsearch index even when Postgres decided to
plan a sequential scan.  That hinged, mostly, on being able to create a
multi-column index where the first column was a function call that included
as an argument (among other things) the ctid system column.

The ability to answer sequential scans (and filters) using the remote ES
index is pretty important as the knowledge of how to do that exists in
Elasticsearch, not my custom operator function in Postgres.

Anyways, I've been trying to find a way to intuit the ctid system column
value with 9.6 and it's clear now that that just isn't possible.  The
closest I got was digging through
ActivePortal->queryDesc->estate->es_tuple, but that only works when it's a
real tuple, not one that's virtual or minimal.

I'm pretty sure that I need to be implementing a Custom Scan Provider
instead, and I've been spending time with that API too.  There's a pretty
steep learning curve for me, but I'll eventually get over that hump.

I could probably bore you with greater detail but basically, I want to take:
   CREATE INDEX idxfoo ON table USING zombodb (zdb(table),
zdb_to_json(table)) WITH (url='http://remote.ip.addr:9200/');
   SELECT * FROM table WHERE zdb(table) ==> 'some full text query' OR id =
42;

And have the "zdb(table) ==> 'some full text query'" bit be answered by my
extension, regardless of how PG wants to plan the query.  While I was able
to hack something together for <9.6, I think that means a Custom Scan
Provider now?
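To make that concrete, the hook registration I have in mind looks roughly
like this (a minimal sketch only; the actual qual inspection is elided):

	#include "postgres.h"
	#include "fmgr.h"
	#include "optimizer/paths.h"

	PG_MODULE_MAGIC;

	static set_rel_pathlist_hook_type prev_pathlist_hook = NULL;

	static void
	zdb_pathlist_hook(PlannerInfo *root, RelOptInfo *rel,
					  Index rti, RangeTblEntry *rte)
	{
		if (prev_pathlist_hook)
			prev_pathlist_hook(root, rel, rti, rte);

		/*
		 * Look through rel->baserestrictinfo for the ==> operator; when
		 * it is present, add_path() a CustomPath whose callbacks answer
		 * the qual from the remote Elasticsearch index.
		 */
	}

	void
	_PG_init(void)
	{
		prev_pathlist_hook = set_rel_pathlist_hook;
		set_rel_pathlist_hook = zdb_pathlist_hook;
	}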

eric


[HACKERS] Preserving param location

2017-03-11 Thread Julien Rouhaud
Hello,

When a query contains parameters, the original Param node contains the token
location.  However, this information is lost when the Const node is
generated; the Const will only contain position -1 (unknown).

FWIW, we do have a use case for this (a custom extension that tracks qual
statistics, which among other things is used to regenerate the query string
from a pgss-normalized query, and thus needs the original param location).
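To make the use case concrete: with the patch below, an extension walking
the planned tree could do something like this (report_location() is a
hypothetical callback):

	if (IsA(node, Const))
	{
		Const	   *con = (Const *) node;

		/* a Const folded from an external Param now keeps the Param's
		 * token location instead of -1 */
		if (con->location >= 0)
			report_location(con->location);	/* hypothetical */
	}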

Is this something we want to get fixed? If yes, attached is a simple patch for
that.

Regards.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index b19380e1b1..beb0f99144 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2444,6 +2444,7 @@ eval_const_expressions_mutator(Node *node,
 					int16		typLen;
 					bool		typByVal;
 					Datum		pval;
+					Const	   *con;
 
 					Assert(prm->ptype == param->paramtype);
 					get_typlenbyval(param->paramtype,
@@ -2452,13 +2453,17 @@ eval_const_expressions_mutator(Node *node,
 						pval = prm->value;
 					else
 						pval = datumCopy(prm->value, typByVal, typLen);
-					return (Node *) makeConst(param->paramtype,
+
+					con = makeConst(param->paramtype,
 											  param->paramtypmod,
 											  param->paramcollid,
 											  (int) typLen,
 											  pval,
 											  prm->isnull,
 											  typByVal);
+					con->location = param->location;
+
+					return (Node *) con;
 				}
 			}
 		}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Andres Freund
On 2017-03-07 09:36:51 -0300, Alvaro Herrera wrote:
> FWIW, +1 on improving matters here.
>
> Andres Freund wrote:
>
> > The best I can come up so far is a toplevel target that creates the temp
> > install, starts a cluster and then runs the 'installcheck-or-check'
> > target on all the subdirectories via recursion. Individual makefiles can
> > either use the pre-existing cluster (most of of contrib for example), or
> > use the temporary install and run their pre-existing check target using
> > that (the tap tests, test_decoding, ...).
>
> I think a toplevel installcheck-or-check target is a good first step
> (though definitely lets find a better name).  Just being able to run all
> tests without the need for 95% of pointless initdb's would be helpful
> enough.

Here's a hacky proof-of-concept patch that implements a contrib/
'fastcheck' target.
Timing of:

make -s -j01 -Otarget check                      2m49.286s
make -s -j16 -Otarget check                      0m44.120s
make -s -j01 -Otarget fastcheck                  1m1.533s
make -s -j16 -Otarget fastcheck                  0m24.385s
make -s -j01 -Otarget USE_MODULE_DB=1 \
	installcheck check-test_decoding-recurse     0m56.016s
make -s -j16 -Otarget USE_MODULE_DB=1 \
	installcheck check-test_decoding-recurse     0m23.608s

All of these should, excepting rerunning initdb/server start, have
equivalent test coverage.

'fastcheck' is obviously, if you look at it, a POC. We'd need:
- abstract this away somehow, it's not nice to copy the logic into a
  makefile
- tie contrib/'fastcheck' into check-world by overriding that target
- use the temp install we've created previously?
- more bulletproof server stop mechanism?

- Andres
From 77520424a7b8a45e629eada864baac49db7fceb1 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sat, 11 Mar 2017 14:03:26 -0800
Subject: [PATCH] WIP; 'fastcheck' target in contrib

---
 contrib/Makefile | 17 +
 1 file changed, 17 insertions(+)

diff --git a/contrib/Makefile b/contrib/Makefile
index e84eb67008..ca08fc9b74 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -90,6 +90,23 @@ endif
 # Missing:
 #		start-scripts	\ (does not have a makefile)
 
+NONSHARED_DATADIR_TARGETS=test_decoding
+SHARED_DATADIR_TARGETS=$(filter-out $(NONSHARED_DATADIR_TARGETS), $(SUBDIRS))
+SERVOPTIONS="-c log_line_prefix='%m [%p] %q%a ' -c unix_socket_directories=$(shell pwd)/tmp_fastcheck/ -clisten_addresses= -c port=5432"
+REGRESSPORT=5432
+REGRESSHOST=$(shell pwd)/tmp_fastcheck/
+
+fastcheck:
+	@mkdir -p tmp_fastcheck
+	@rm -rf tmp_fastcheck/data
+	$(bindir)/initdb --no-clean --no-sync -D tmp_fastcheck/data > tmp_fastcheck/initdb.log
+	PGPORT=$(REGRESSPORT) PGHOST=$(REGRESSHOST) \
+		$(bindir)/pg_ctl -D tmp_fastcheck/data -o$(SERVOPTIONS) --log tmp_fastcheck/postmaster.log start
+	$(MAKE) PGPORT=$(REGRESSPORT) PGHOST=$(REGRESSHOST) USE_MODULE_DB=1 -Otarget \
+		$(foreach t,$(SHARED_DATADIR_TARGETS),installcheck-$(t)-recurse) \
+		$(foreach t,$(NONSHARED_DATADIR_TARGETS),check-$(t)-recurse) || \
+		$(bindir)/pg_ctl -D tmp_fastcheck/data stop
+	$(bindir)/pg_ctl -D tmp_fastcheck/data stop
 
 $(recurse)
 $(recurse_always)
-- 
2.11.0.22.g8d7a455.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] scram and \password

2017-03-11 Thread Michael Paquier
On Sun, Mar 12, 2017 at 5:35 AM, Joe Conway  wrote:
> On 03/10/2017 02:43 PM, Michael Paquier wrote:
>> Instead of changing the default, I think that we should take this
>> occasion to rename PQencryptPassword to something like
>> PQhashPassword(), and extend it with a method argument to support both
>> md5 and scram. PQencryptPassword could also be marked as deprecated,
>> or let as-is for some time. For \password, we could have another
>> meta-command but that sounds grotty, or just extend \password with a
>> --method argument. Switching the default could happen in another
>> release.
>>
>> A patch among those lines would be a simple, do people feel that this
>> should be part of PG 10?
>
> Seems like it should work in an analogous way to CREATE/ALTER ROLE.
> According to the docs:
>
> 8<
> ENCRYPTED
> UNENCRYPTED
>
> These key words control whether the password is stored encrypted in
> the system catalogs. (If neither is specified, the default behavior is
> determined by the configuration parameter password_encryption.) If the
> presented password string is already in MD5-encrypted or SCRAM-encrypted
> format, then it is stored encrypted as-is, regardless of whether
> ENCRYPTED or UNENCRYPTED is specified (since the system cannot decrypt
> the specified encrypted password string). This allows reloading of
> encrypted passwords during dump/restore.
> 8<
>
> So if the password is not already set, \password uses
> password_encryption to determine which format to use, and if the
> password is already set, then the current method is assumed.

Yeah, the problem here is that this routine does not need a live
connection to work, and we surely don't want to make that mandatory;
that's why I am suggesting something like the above. Another approach
would be to switch to SCRAM once password_encryption makes this switch
as well... There is no perfect scenario here.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Andres Freund
On 2017-03-11 14:17:42 -0800, Andres Freund wrote:
> On 2017-03-07 09:36:51 -0300, Alvaro Herrera wrote:
> > FWIW, +1 on improving matters here.
> >
> > Andres Freund wrote:
> >
> > > The best I can come up so far is a toplevel target that creates the temp
> > > install, starts a cluster and then runs the 'installcheck-or-check'
> > > target on all the subdirectories via recursion. Individual makefiles can
> > > either use the pre-existing cluster (most of of contrib for example), or
> > > use the temporary install and run their pre-existing check target using
> > > that (the tap tests, test_decoding, ...).
> >
> > I think a toplevel installcheck-or-check target is a good first step
> > (though definitely lets find a better name).  Just being able to run all
> > tests without the need for 95% of pointless initdb's would be helpful
> > enough.
> 
> Here's a hacky proof-of-concept patch that implements a contrib/
> 'fastcheck' target.
> timing of:
> 
> make -s -j01 -Otarget check                      2m49.286s
> make -s -j16 -Otarget check                      0m44.120s
> make -s -j01 -Otarget fastcheck                  1m1.533s
> make -s -j16 -Otarget fastcheck                  0m24.385s
> make -s -j01 -Otarget USE_MODULE_DB=1 \
> 	installcheck check-test_decoding-recurse     0m56.016s
> make -s -j16 -Otarget USE_MODULE_DB=1 \
> 	installcheck check-test_decoding-recurse     0m23.608s

Oops - all these timings are from a coverage-enabled build - the times
are overall a bit smaller without that.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing references of password encryption to hashing

2017-03-11 Thread Joe Conway
On 03/09/2017 06:16 PM, Michael Paquier wrote:
> As discussed here:
> https://www.postgresql.org/message-id/98cafcd0-5557-0bdf-4837-0f2b7782d...@joeconway.com
> We are using in documentation and code comments "encryption" to define
> what actually is hashing, which is confusing.
> 
> Attached is a patch for HEAD to change the documentation to match hashing.
> 
> There are a couple of things I noticed on the way:
> 1) There is the user-visible PQencryptPassword in libpq, which
> actually hashes the password, and not encrypts it :)
> 2) There is as well pg_md5_encrypt() in the code, as well as there is
> pg_md5_hash(). Those may be better if renamed, at least I would think
> that pg_md5_encrypt should be pg_md5_password_hash, because a salt is
> used as well, and the routine is dedicated to work on passwords.
> Thoughts?
> 3) createuser also has --encrypt and --unencrypted, perhaps those
> should be renamed? Honestly I don't really think that this is worth a
> breakage and the option names match with the SQL commands.

My opinion is that the user-visible aspects of this should be deprecated
and correct syntax provided. But perhaps that is overkill.

8<
key and server key, respectively, in hexadecimal format. A password that
-   does not follow either of those formats is assumed to be unencrypted.
+   does not follow either of those formats is assumed to be in plain
format,
+   non-hashed.
   
  
8<
I think here, instead of "in plain format, non-hashed" it is ok to just
say "cleartext" or maybe "plaintext". Whichever is picked, it should be
used consistently.

8<
  
   
-   To prevent unencrypted passwords from being sent across the network,
+   To prevent non-hashed passwords from being sent across the network,
8<
same here "non-hashed" ==> "cleartext"

8<
   
-   Caution must be exercised when specifying an unencrypted password
+   Caution must be exercised when specifying a non-hashed password
8<
and here

...etc...

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


[HACKERS] src/test/regress's check-prepared-txns target

2017-03-11 Thread Andres Freund
Hi,

In 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba
src/test/regress grew a check-prepared-txns (and an accompanying
installcheck-prepared-txns) target.

Is that still sensible given that pg_regress actually enables prepared
transactions?  I bet these tests are run just about never.

I'd suggest removing at least check-prepared-txns and adding the test to
the normal check target (which is guaranteed to have prepared xacts enabled).

Greetings,

Andres Freund




Re: [HACKERS] scram and \password

2017-03-11 Thread Joe Conway
On 03/11/2017 02:21 PM, Michael Paquier wrote:
> On Sun, Mar 12, 2017 at 5:35 AM, Joe Conway  wrote:
>> So if the password is not already set, \password uses
>> password_encryption to determine which format to use, and if the
>> password is already set, then the current method is assumed.
> 
> Yeah, the problem here being that this routine does not need a live
> connection to work, and we surely don't want to make that mandatory,
> that's why I am suggesting something like the above. Another approach
> would be to switch to SCRAM once password_encryption does this switch
> as well... There is no perfect scenario here.

You might extend PQencryptPassword() to take a method. Or create a new
function that does? Certainly psql has a connection available to run the
ALTER ROLE command that it crafts.

I guess a related problem might be, do we have a SQL visible way to
determine what method is used by the current password for a given role?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] scram and \password

2017-03-11 Thread Michael Paquier
On Sun, Mar 12, 2017 at 8:04 AM, Joe Conway  wrote:
> On 03/11/2017 02:21 PM, Michael Paquier wrote:
>> On Sun, Mar 12, 2017 at 5:35 AM, Joe Conway  wrote:
>>> So if the password is not already set, \password uses
>>> password_encryption to determine which format to use, and if the
>>> password is already set, then the current method is assumed.
>>
>> Yeah, the problem here being that this routine does not need a live
>> connection to work, and we surely don't want to make that mandatory,
>> that's why I am suggesting something like the above. Another approach
>> would be to switch to SCRAM once password_encryption does this switch
>> as well... There is no perfect scenario here.
>
> You might extend PQencryptPassword() to take a method. Or create a new
> function that does? Certainly psql has a connection available to run the
> ALTER ROLE command that it crafts.

Yeah, but it can also be called while the application is calling
PQgetResult() and still looping until it gets a NULL result. Not sure
if this is a use case to worry about, but sending a query to the
server from inside PQencryptPassword() could also break some
applications.

PQencryptPassword() is used for CREATE/ALTER ROLE commands, so
actually wouldn't it make sense to just switch PQencryptPassword to
handle SCRAM if at some point we decide to switch the default from md5
to scram? So many questions.

> I guess a related problem might be, do we have a SQL visible way to
> determine what method is used by the current password for a given role?

Nope. We would simply be looking at a function doing a lookup of
pg_authid and then using get_password_type() to check which type of
verifier is used... Or we could have the type of verifier as a new
column of pg_authid, information that could be made visible to any
user with column privileges.
-- 
Michael




Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Andres Freund
On 2017-03-11 11:48:31 -0800, Andres Freund wrote:
> On 2017-03-11 12:05:23 -0500, Tom Lane wrote:
> > I wrote:
> > > I believe the core problem is that contrib/test_decoding's regresscheck
> > > and isolationcheck targets each want to use ./tmp_check as their
> > > --temp-instance.  make has no reason to believe it shouldn't run those
> > > two sub-jobs in parallel, but when it does, we get two postmasters trying
> > > to share the same directory.  This looks reasonably straightforward to
> > > solve, but I'm not entirely familiar with the code here, and am not
> > > sure what is the least ugly way to fix it.
> > 
> > Enlarging on that: if I cd into contrib/test_decoding and do
> > "make check -j4" or so, it reliably fails.
> 
> Yep, can reproduce here as well. Interesting that, with -j16, I could
> survive several dozen runs from the toplevel locally.
> 
> 
> > This is a localized patch that only fixes things for
> > contrib/test_decoding; what I'm wondering is if it would be better to
> > establish a more widespread convention that
> > $(pg_isolation_regress_check) should use a different --temp-instance
> > directory than $(pg_regress_check) does.
> 
> I think that'd be a good plan.  We probably should also keep --outputdir
> separate (which test_decoding/Makefile does, but
> pg_isolation_regress_check doesn't)?

Here's a patch doing that (based on yours).  I'd be kind of inclined to
set --outputdir for !isolation tests too; possibly even move tmp_check
below output_iso/ output_regress/ or such - but that seems like it
potentially could cause some disagreement...

- Andres
>From fb89f431d120e9123dca367d23f330b7d31b3f01 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sat, 11 Mar 2017 15:03:18 -0800
Subject: [PATCH] Improve isolation regression tests infrastructure.

Previously if a directory had both isolationtester and plain
regression tests they couldn't be run in parallel, because they'd
access the same files/directories.  That, so far, only affected
contrib/test_decoding.

Rather than fix that locally in contrib/test_decoding improve
pg_regress_isolation_[install]check to use separate resources from
plain regression tests.

Use the improved helpers everywhere, even where previously not used.

Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20170311194831.vm5ikpczq52c2...@alap3.anarazel.de
---
 contrib/test_decoding/.gitignore |  5 +++--
 contrib/test_decoding/Makefile   |  7 +--
 src/Makefile.global.in   | 29 ++--
 src/test/isolation/.gitignore|  5 ++---
 src/test/isolation/Makefile  |  8 
 src/test/modules/snapshot_too_old/.gitignore |  2 +-
 src/test/modules/snapshot_too_old/Makefile   |  6 +++---
 7 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/contrib/test_decoding/.gitignore b/contrib/test_decoding/.gitignore
index 1f95503494..b4903eba65 100644
--- a/contrib/test_decoding/.gitignore
+++ b/contrib/test_decoding/.gitignore
@@ -1,5 +1,6 @@
 # Generated subdirectories
 /log/
-/isolation_output/
-/regression_output/
+/results/
+/output_iso/
 /tmp_check/
+/tmp_check_iso/
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index d2bc8b8350..6c18189d9d 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -5,7 +5,7 @@ PGFILEDESC = "test_decoding - example of a logical decoding output plugin"
 
 # Note: because we don't tell the Makefile there are any regression tests,
 # we have to clean those result files explicitly
-EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output
+EXTRA_CLEAN = $(pg_regress_clean_files)
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
@@ -42,11 +42,8 @@ REGRESSCHECKS=ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot
 
 regresscheck: | submake-regress submake-test_decoding temp-install
-	$(MKDIR_P) regression_output
 	$(pg_regress_check) \
 	--temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	--temp-instance=./tmp_check \
-	--outputdir=./regression_output \
 	$(REGRESSCHECKS)
 
 regresscheck-install-force: | submake-regress submake-test_decoding temp-install
@@ -56,10 +53,8 @@ regresscheck-install-force: | submake-regress submake-test_decoding temp-install
 ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml
 
 isolationcheck: | submake-isolation submake-test_decoding temp-install
-	$(MKDIR_P) isolation_output
 	$(pg_isolation_regress_check) \
 	--temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	--outputdir=./isolation_output \
 	$(ISOLATIONCHECKS)
 
 isolationcheck-install-force: all | submake-isolation submake-test_decoding temp-install
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 831d39a9d1..9796ffd753 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -545,14 +545,31 @@ TEMP_CONF += --temp-config=$(TEMP_CONFIG)
 endif
 
 pg

Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Corey Huinker
On Sat, Mar 11, 2017 at 4:17 PM, Tom Lane  wrote:

> Corey Huinker  writes:
> > [ 0001.if_endif.v21.diff ]
>
> Starting to poke at this... the proposal to add prove checks for psql
> just to see whether \if respects ON_ERROR_STOP seems like an incredibly
> expensive way to test a rather minor point.  On my machine, "make check"
> in bin/psql goes from zero time to close to 8 seconds.  I'm not really
> on board with adding that kind of time to every buildfarm run for the
> foreseeable future just for this.
>
> Couldn't we get close to the same coverage by adding a single-purpose
> test script to the main regression tests?  Along the lines of
>
>  \set ON_ERROR_STOP 1
>  \if invalid
>  \echo should not get here
>  \endif
>  \echo should not get here either
>
> You could imagine just dropping that at the end of psql.sql, but I
> think probably a separate script is worth the trouble.
>
> regards, tom lane
>


I think I can manage that. Just to be clear, you're asking me to replace
the Perl script with one new SQL script? If so, there are probably a few
non-ON_ERROR_STOP tests in there that might be worth preserving in
regression form.


Re: [HACKERS] possible encoding issues with libxml2 functions

2017-03-11 Thread Noah Misch
On Mon, Feb 20, 2017 at 07:48:18PM +0100, Pavel Stehule wrote:
> Today I played with xml_recv function and with xml processing functions.
> 
> The xml_recv function ensures correct conversion from the document
> encoding to the server encoding. But the decl section still holds the
> original encoding info, which is obsolete after the conversion. Sometimes
> we solve this issue by removing the decl section - see the xml_out
> function.
> 
> Sometimes we don't do it - a lot of functions use direct conversion from
> xmltype to xmlChar.

> There are two possible fixes:
> 
> a) clean the decl on input - the encoding info can be removed from the
> decl part
> 
> b) use xml_out_internal everywhere before the transformation to
> xmlChar. pg_xmlCharStrndup could be a good candidate.

I'd prefer (a) if the xml type were a new feature, because no good can come of
storing an encoding in each xml field when we know the actual encoding is the
database encoding.  However, if you implemented (a), we'd still see untreated
values brought over via pg_upgrade.  Therefore, I would try (b) first.  I
suspect the intent of xml_parse() was to implement (b); it will be interesting
to see your test case that malfunctions.
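
To illustrate (b), here is a minimal sketch of what such a helper could
look like inside src/backend/utils/adt/xml.c, where xml_out_internal()
and pg_xmlCharStrndup() both live as static functions; the helper name
xml_to_xmlChar() is invented for this example:

static xmlChar *
xml_to_xmlChar(xmltype *x)
{
	/*
	 * Printing through xml_out_internal() regenerates (or drops) the
	 * decl section instead of copying a possibly stale encoding
	 * declaration left over from xml_recv().
	 */
	char	   *str = xml_out_internal(x, 0);

	return pg_xmlCharStrndup(str, strlen(str));
}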




Re: [HACKERS] scram and \password

2017-03-11 Thread Joe Conway
On 03/11/2017 03:15 PM, Michael Paquier wrote:
> Yeah but it can be called as well while the application is calling
> PQgetResult() and still looping until it gets a NULL result. Not sure
> if this is a use-case to worry about, but sending a query to the
> server in PQencryptPassword() could as well break some applications.

I was suggesting sending the query outside of PQencryptPassword() in
order to determine what method should be passed as a new argument to
PQencryptPassword().

> PQencryptPassword() is used for CREATE/ALTER ROLE commands, so
> actually wouldn't it make sense to just switch PQencryptPassword to
> handle SCRAM if at some point we decide to switch the default from md5
> to scram? So many questions.

As long as we support more than one method it would seem to me we need a
way to determine which one we want to use and not only default it, don't
we? Apologies if this has already been discussed -- I was not able to
follow the lengthy threads on SCRAM in any detail.

>> I guess a related problem might be, do we have a SQL visible way to
>> determine what method is used by the current password for a given role?
> 
> Nope. We would simply be looking at a function doing a lookup of
> pg_authid and then using get_password_type() to check which type of
> verifier is used... Or we could have the type of verifier as a new
> column of pg_authid, information that could be made visible to any
> user with column privileges.

What happens if the user does not have privs for pg_authid? E.g. if I
want to change my own password what happens if the default is one
method, and my password uses the other -- now I cannot change my own
password using \password?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Tom Lane
Corey Huinker  writes:
> [ 0001.if_endif.v21.diff ]

I had thought that this patch was pretty close to committable, but
the more I poke at it the less I think so.  The technology it uses
for skipping unexecuted script sections has got too many bugs.

* Daniel Verite previously pointed out the desirability of disabling
variable expansion while skipping script.  That doesn't seem to be here,
though there's an apparently-vestigial comment in psql/common.c claiming
that it is.  IIRC, I objected to putting knowledge of ConditionalStack
into the shared psqlscan.l lexer, and I still think that would be a bad
idea; but we need some way to get the lexer to shut that off.  Probably
the best way is to add a passthrough "void *" argument that would let the
get_variable callback function mechanize the rule about not expanding
in a false branch.
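
Concretely, that could look something like this; the existing callback
struct is paraphrased from memory, and only the trailing passthrough
argument is the new part:

/* psqlscan.h: give the application a passthrough pointer, so psql's
 * get_variable can consult its ConditionalStack without the shared
 * lexer knowing anything about conditionals. */
typedef struct PsqlScanCallbacks
{
	char	   *(*get_variable) (const char *varname, bool escape,
								 bool as_ident, void *passthrough);
	void		(*write_error) (const char *fmt,...) pg_attribute_printf(1, 2);
} PsqlScanCallbacks;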

* Whether or not you think it's important not to expand skipped variables,
I think that it's critical that skipped backtick expressions not be
executed.  The specific use-case that I'm concerned about is backtick
evals in \if expressions, which are going to be all over the place as
long as we haven't got any native expression eval capability, and will
doubtless remain important even when/if we do.  So in a case like

\if something
\elif `expr :var1 + :var2 = :var3`
\endif

I think it's essential that expr not be called if the first if-condition
succeeded.  (That first condition might be checking whether the vars
contain valid integers, for example.)  The current patch gets this totally
wrong --- not only does it perform the backtick, but \elif complains if
the result isn't a valid bool.  I do not think that a skipped \if or \elif
should evaluate its argument at all.

* The documentation says that an \if or \elif expression extends to the
end of the line, but actually the code is just eating one OT_NORMAL
argument.  That means it's OK to do this:

regression=# \if 1 \echo foo \echo bar \endif
foo
bar
regression=# 

which doesn't seem insane, except that the inverse case is insane:

regression=# \if 0 \echo foo \echo bar \endif
regression@# 

(notice we're not out of the conditional).  Even if we change it to
eat the whole line as argument, this inconsistency will remain:

regression=# \if 1
regression=# \echo foo \endif
foo
regression=# 

(notice we're out of the conditional)

regression=# \if 0
regression@# \echo foo \endif
command ignored, use \endif or Ctrl-C to exit current branch.
regression@# 

(notice we're not out of the conditional)

This inconsistency seems to have to do with the code in HandleSlashCmds
that discards arguments until EOL after a failed backslash command.
You've modified that to also discard arguments after a non-executed
command, and I think that's broken.

* More generally, I do not think that the approach of having exec_command
simply fall out immediately when in a false branch is going to work,
because it ignores the fact that different backslash commands have
different argument parsing rules.  Some will eat the rest of the line and
some won't.  I'm afraid that it might be necessary to remove that code
block and add a test to every single backslash command that decides
whether to actually perform its action after it's consumed its arguments.
That would be tedious :-(.  But as it stands, backslash commands will get
parsed differently (ie with potentially-different ending points) depending
on whether they're in a live branch or not, and that seems just way too
error-prone to be allowed to stand.

* I think it's completely wrong to do either resetPQExpBuffer(query_buf)
or psql_scan_reset(scan_state) when deciding a branch is not to be
executed.  Compare these results:

regression=# select (1 +
regression(# \if 1
regression-# \echo foo
foo
regression-# \endif
regression-# 2);
 ?column? 
----------
        3
(1 row)

regression=# select (1 +
regression(# \if 0
regression-# \echo foo
command ignored, use \endif or Ctrl-C to exit current branch.
regression@# \endif
regression=# 2);
ERROR:  syntax error at or near "2"
LINE 1: 2);
        ^
regression=# 

If the first \if doesn't throw away an incomplete query buffer (which it
should not), then surely the second should not either.  Somebody who
actually wants to toss the query buffer can put \r into the appropriate
branch of their \if; it's not for us to force that on them.

* Also, the documentation for psql_scan_reset is pretty clear that it's to
be called when and only when the query buffer is reset, which makes your
calls in the bodies of the conditional commands wrong.  As an example:

regression=# select (1 +
regression(# 2;
regression(# 

(notice we've not sent the known-incomplete command to the server) vs

regression(# \r
Query buffer reset (cleared).
regression=# select (1 +
regression(# \if 1
regression-# \endif
regression-# 2;
ERROR:  syntax error at or near ";"
LINE 2: 2;
         ^
regression=# 

That happens because the \if code gratuitously resets the lexer,
as we can 

Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Robert Haas
On Fri, Mar 3, 2017 at 3:18 AM, Fabien COELHO  wrote:
> I'm ok with this patch. I think that the very simple automaton code
> structure achieved is worth the very few code duplications. It is also
> significantly shorter than the nested if/switch variants, and it does
> exactly what Tom and Robert wished with respect to errors, so I think that
> this is a good compromise.

I think that I have not taken a firm position on what the behavior
should be with respect to errors.  I took the position that the
messages being printed saying what happened were too detailed, because
they not only described what had happened but also tried to
prognosticate what would happen next, which was dissimilar to what we
do elsewhere and likely to be hard to maintain - or even get right for
v1.  But I have not taken a position on what should happen if the
condition for \if or \elsif evaluates to a baffling value.  Corey's
prior proposal was to treat it, essentially, as neither true nor
false, skipping both arms of the if.  Tom seems to want an invalid
value treated as false.  You could also imagine pretending that the
command never happened at all, likely leading to complete chaos.
Other positions are also possible.  I suggest that doing it the way
Tom likes may be the path of least resistance, but this isn't really
something I'm very animated about personally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-11 Thread Robert Haas
On Fri, Mar 10, 2017 at 7:42 PM, Bruce Momjian  wrote:
> On Fri, Mar 10, 2017 at 07:15:59PM -0500, Peter Eisentraut wrote:
>> On 3/10/17 14:40, Andres Freund wrote:
>> > I'd really like to get it in.  The performance improvements on its own
>> > are significant, and it provides the basis for significantly larger
>> > improvements again (JIT) - those followup improvements are large patches
>> > again though, so I'd rather not do all of that next cycle.
>> >
>> > My next step (over the weekend) is to add tests to execQual.c to get it
>> > a good chunk closer to 100% test coverage, and then do the same for the
>> > new implementation (there's probably very little additional tests needed
>> > after the conversion).  Given all tests pass before/after, and there's a
>> > lot of them, I think we can have a reasonable confidence of a low bug
>> > density.
>>
>> That seems like a plan, but now would probably be a good time for some
>> other hackers to take note of this proposal.
>
> Well, the executor is long overdue for improvement, so I would love to
> have this in 10.0.  I am not sure what additional polishing will happen
> if we punt it for 11.0.  I think the only downside of having it in 10.0
> is that it will not have lived in the source tree for as long a time
> between commit and PG release, but if it is tested, that seems fine.

Yeah.  I think we can't rule out the possibility that this patch is
full of bugs, but (1) a lot of the change is pretty mechanical and (2)
none of this touches the on-disk format or the catalog contents, so it
doesn't break anybody's existing install if we have to patch it.  On
the other hand, (3) it's a big patch and (4) if expression evaluation
doesn't work, PostgreSQL is pretty much unusable.

So my feeling is:

1. If Andres commits this and it turns out to be seriously buggy, he
should be prepared to revert it without a big fight.  We can try again
for v11.

2. If Andres commits this and it turns out to be mildly buggy, he
should be prepared to address bugs very proactively and very promptly.

3. If Andres wants to commit this for v10, it should be done RSN.
Feature freeze isn't technically until the end of March, but giant
potentially destabilizing patches that land on March 31st are likely to
lead to outcomes that nobody wants.

Like, I think, everyone else, I'd really like to have these speedups
and I think they will benefit everyone who uses PostgreSQL.  However,
I think that we cannot afford to have a repeat of the simplehash thing
where serious bugs sat around for months without really getting
properly addressed.  If that happens with this, I will be extremely
unhappy, and I'm fairly sure I won't be alone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-11 Thread Robert Haas
On Sat, Mar 11, 2017 at 12:20 AM, Amit Kapila  wrote:
>>  /*
>> + * Change the shared buffer state in critical section,
>> + * otherwise any error could make it unrecoverable after
>> + * recovery.
>> + */
>> +START_CRIT_SECTION();
>> +
>> +/*
>>   * Insert tuple on new page, using _hash_pgaddtup to ensure
>>   * correct ordering by hashkey.  This is a tad inefficient
>>   * since we may have to shuffle itempointers repeatedly.
>>   * Possible future improvement: accumulate all the items for
>>   * the new page and qsort them before insertion.
>>   */
>>  (void) _hash_pgaddtup(rel, nbuf, itemsz, new_itup);
>>
>> +END_CRIT_SECTION();
>>
>> No way.  You have to start the critical section before making any page
>> modifications and keep it alive until all changes have been logged.
>>
>
> I think what we need to do here is to accumulate all the tuples that
> need to be added to new bucket page till either that page has no more
> space or there are no more tuples remaining in an old bucket.  Then in
> a critical section, add them to the page using _hash_pgaddmultitup and
> log the entire new bucket page contents as is currently done in patch
> log_split_page().

I agree.

> Now, here we can choose to log the individual
> tuples as well instead of a complete page, however not sure if there
> is any benefit for doing the same because XLogRecordAssemble() will
> anyway remove the empty space from the page.  Let me know if you have
> something else in mind.

Well, if you have two pages that are 75% full, and you move a third of
the tuples from one of them into the other, it's going to be about
four times more efficient to log only the moved tuples than the whole
page.  But logging the whole filled page wouldn't be wrong, just
somewhat inefficient.
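
In code, the accumulate-then-log shape would be roughly the following;
_hash_pgaddmultitup() and log_split_page() are the names from the patch
under discussion, and the itups[] / itup_offsets[] arrays are assumed to
have been built up beforehand, outside any critical section:

START_CRIT_SECTION();

_hash_pgaddmultitup(rel, nbuf, itups, itup_offsets, nitups);
MarkBufferDirty(nbuf);

if (RelationNeedsWAL(rel))
	log_split_page(rel, nbuf);	/* WAL-logs the whole new bucket page */

END_CRIT_SECTION();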

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-11 Thread Mengxing Liu



> -Original Messages-
> From: "Kevin Grittner" 
> Sent Time: 2017-03-12 04:24:29 (Sunday)
> To: "Mengxing Liu" 
> Cc: "pgsql-hackers@postgresql.org" 
> Subject: Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in 
> serializable transactions
> 
> On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu
>  wrote:
> 
> > My name is Mengxing Liu. I am interested in the project "Eliminate
> > O(N^2) scaling from rw-conflict tracking in serializable
> > transactions". After discussing with Kevin off-list, I think it's
> > time to post the discussion here. I am afraid there are two things
> > I need your help with. Thank you very much.
> 
> Welcome to the -hackers list!  This is a key part of how development
> happens in the community.  Don't be shy about posting questions and
> ideas, but also expect fairly blunt discussion to occur at times;
> don't let that put you off.
> 

Thanks, Kevin. I will keep that in mind. 

> >>> So the task is clear. We can use a tree-like or hash-like data
> >>> structure to speed up this function.
> >>
> >> Right -- especially with a large number of connections holding a
> >> large number of conflicts.  In one paper with high concurrency, they
> >> found over 50% of the CPU time for PostgreSQL was going to these
> >> functions (including functions called by them).  This seems to me to
> >> be due to the O(N^2) (or possibly worse) performance from the number
> >> of connections.
> >
> > Does anyone know the title of this paper? I want to reproduce its
> > workloads.
> 
> I seem to remember there being a couple other papers or talks, but
> this is probably the most informative:
> 
> http://sydney.edu.au/engineering/it/research/tr/tr693.pdf
> 

Thanks, it's exactly what I need.

> >> Remember, I think most of the work here is going to be in
> >> benchmarking.  We not only need to show improvements in simple test
> >> cases using readily available tools like pgbench, but think about
> >> what types of cases might be worst for the approach taken and show
> >> that it still does well -- or at least not horribly.  It can be OK
> >> to have some slight regression in an unusual case if the common
> >> cases improve a lot, but any large regression needs to be addressed
> >> before the patch can be accepted.  There are some community members
> >> who are truly diabolical in their ability to devise "worst case"
> >> tests, and we don't want to be blind-sided by a bad result from one
> >> of them late in the process.
> >>
> >
> > Are there any documents or links introducing how to test and
> > benchmark PostgreSQL? I may need some time to learn about it.
> 
> There is pgbench:
> 
> https://www.postgresql.org/docs/devel/static/pgbench.html
> 
> A read-only load and a read/write mix should both be tested,
> probably with a few different scales and client counts to force the
> bottleneck to be in different places.  The worst problems have been
> seen with 32 or more cores on 4 or more sockets with a large number
> of active connections.  I don't know whether you have access to a
> machine capable of putting this kind of stress on it (perhaps at
> your university?), but if not, the community has access to various
> resources we should be able to schedule time on.
> 

There is a NUMA machine (120 cores, 8 sockets) in my lab. I think it's
enough to put this kind of stress on it.
Anyway, I will ask for help if necessary.

> It may pay for you to search through the archives of the last year
> or two to look for other benchmarks and see what people have
> previously done.
> 

I will try that.



--
Mengxing Liu



Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Tom Lane
Robert Haas  writes:
> I think that I have not taken a firm position on what the behavior
> should be with respect to errors.  I took the position that the
> messages being printed saying what happened were too detailed, because
> they not only described what had happened but also tried to
> prognosticate what would happen next, which was dissimilar to what we
> do elsewhere and likely to be hard to maintain - or even get right for
> v1.

I thought the same of the version you were complaining about, but
the current patch seems to have dialed it back a good deal.  Do you
still find the current error messages unmaintainable?

> But I have not taken a position on what should happen if the
> condition for \if or \elsif evaluates to a baffling value.  Corey's
> prior proposal was to treat it, essentially, as neither true nor
> false, skipping both arms of the if.  Tom seems to want an invalid
> value treated as false.  You could also imagine pretending that the
> command never happened at all, likely leading to complete chaos.

Hmm, if that "prior proposal" was indeed on the table, I missed it.
The current patch, AFAICS, implements your third choice, which I quite
agree would lead to complete chaos; there would be no way to write a
script that did anything useful with that.

It is interesting to think about what would happen if "expr is neither
true nor false" were defined as "skip immediately to \endif" (which
I think is the natural generalization of what you said to apply to an
intermediate \elif).  I believe that it'd be possible to work with it,
but it's not very clear if it'd be easier or harder to work with than
the rule of treating bogus results as false.  What is clear is that
it'd be unlike any other conditional construct I ever worked with.
As was pointed out upthread, "treat error results as false" is what
you get from "if" in a POSIX shell.  I think it's fair also to draw
an analogy to what SQL does with null boolean values, which is to
treat them as false when a decision is required (in, eg, WHERE or
CASE).  So I think "treat bogus results as false" is the most
conservative, least likely to cause unhappy surprises, solution here.

> Other positions are also possible.

If you've got concrete ideas about that, let's hear them.  I'm not
trying to foreclose discussion.

regards, tom lane




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-11 Thread Robert Haas
On Fri, Mar 10, 2017 at 7:39 PM, Amit Kapila  wrote:
> I agree that more analysis can help us to decide if we can use subxids
> from PGPROC and if so under what conditions.  Have you considered the
> another patch I have posted to fix the issue which is to do this
> optimization only when subxids are not present?  In that patch, it
> will remove the dependency of relying on subxids in PGPROC.

Well, that's an option, but it narrows the scope of the optimization
quite a bit.  I think Simon previously opposed handling only the
no-subxid cases (although I may be misremembering) and I'm not that
keen about it either.

I was wondering about doing an explicit test: if the XID being
committed matches the one in the PGPROC, and nsubxids matches, and the
actual list of XIDs matches, then apply the optimization.  That could
replace the logic that you've proposed to exclude non-commit cases,
gxact cases, etc. and it seems fundamentally safer.  But it might be a
more expensive test, too, so I'm not sure.
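
As a sketch, the explicit test could look like the following; the split
of members between PGPROC and the separate PGXACT struct is an
assumption here and would need checking against the tree:

static bool
XidMatchesProcCache(PGPROC *proc, PGXACT *pgxact, TransactionId xid,
					int nsubxids, TransactionId *subxids)
{
	/* the XID being committed must be the backend's own top-level XID */
	if (!TransactionIdEquals(pgxact->xid, xid))
		return false;

	/* and the subxid cache must match the commit record exactly */
	if (pgxact->nxids != nsubxids)
		return false;

	return memcmp(proc->subxids.xids, subxids,
				  nsubxids * sizeof(TransactionId)) == 0;
}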

It would be nice to get some other opinions on how (and whether) to
proceed with this.  I'm feeling really nervous about this right at the
moment, because it seems like everybody including me missed some
fairly critical points relating to the safety (or lack thereof) of
this patch, and I want to make sure that if it gets committed again,
we've really got everything nailed down tight.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] src/test/regress's check-prepared-txns target

2017-03-11 Thread Andrew Dunstan


On 03/11/2017 06:00 PM, Andres Freund wrote:
> Hi,
>
> In 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba
> src/test/regress grew a check-prepared-txns (and an accompanying
> installcheck-prepared-txns) target.
>
> Is that still sensible given that pg_regress actually enables prepared
> transactions?  I bet these tests are run just about never.
>
> I'd suggest removing at least check-prepared-txns and adding the test to
> the normal check target (which is guaranteed to have prepared xacts enabled).
>

Sounds reasonable.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-11 Thread Tom Lane
Andres Freund  writes:
> On 2017-03-11 11:48:31 -0800, Andres Freund wrote:
>> I think that'd be a good plan.  We probably should also keep --outputdir
>> separate (which test_decoding/Makefile does, but
>> pg_isolation_regress_check doesn't)?

> Here's a patch doing that (based on yours).  I'd be kind of inclined to
> set --outputdir for !isolation tests too; possibly even move tmp_check
> below output_iso/ output_regress/ or such - but that seems like it
> potentially could cause some disagreement...

This looks generally sane to me, although I'm not very happy about folding
the "$(MKDIR_P) output_iso" call into pg_isolation_regress_check --- that
seems weird and unlike the way it's done for the regular regression test
case.  But probably Peter has a better-informed opinion.

regards, tom lane




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Robert Haas
On Sat, Mar 11, 2017 at 9:40 PM, Tom Lane  wrote:
> I thought the same of the version you were complaining about, but
> the current patch seems to have dialed it back a good deal.  Do you
> still find the current error messages unmaintainable?

I haven't looked, but I had the impression this had been much improved.

>> But I have not taken a position on what should happen if the
>> condition for \if or \elsif evaluates to a baffling value.  Corey's
>> prior proposal was to treat it, essentially, as neither true nor
>> false, skipping both arms of the if.  Tom seems to want an invalid
>> value treated as false.  You could also imagine pretending that the
>> command never happened at all, likely leading to complete chaos.
>
> Hmm, if that "prior proposal" was indeed on the table, I missed it.
> The current patch, AFAICS, implements your third choice, which I quite
> agree would lead to complete chaos; there would be no way to write a
> script that did anything useful with that.

Well, other than: don't write a script with invalid commands in it.

But I'm not seriously advocating for that position.

> It is interesting to think about what would happen if "expr is neither
> true nor false" were defined as "skip immediately to \endif" (which
> I think is the natural generalization of what you said to apply to an
> intermediate \elif).  I believe that it'd be possible to work with it,
> but it's not very clear if it'd be easier or harder to work with than
> the rule of treating bogus results as false.  What is clear is that
> it'd be unlike any other conditional construct I ever worked with.

True.

> As was pointed out upthread, "treat error results as false" is what
> you get from "if" in a POSIX shell.  I think it's fair also to draw
> an analogy to what SQL does with null boolean values, which is to
> treat them as false when a decision is required (in, eg, WHERE or
> CASE).  So I think "treat bogus results as false" is the most
> conservative, least likely to cause unhappy surprises, solution here.

I don't mind that.  I was simply stating that I hadn't advocated for
anything in particular.

>> Other positions are also possible.
>
> If you've got concrete ideas about that, let's hear them.  I'm not
> trying to foreclose discussion.

I personally don't, but others may.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Append implementation

2017-03-11 Thread Robert Haas
On Fri, Mar 10, 2017 at 8:12 AM, Amit Khandekar  wrote:
>> +static inline void
>> +exec_append_scan_first(AppendState *appendstate)
>> +{
>> +appendstate->as_whichplan = 0;
>> +}
>>
>> I don't think this is buying you anything, and suggest backing it out.
>
> This is required for sequential Append, so that we can start executing
> from the first subplan.

My point is that there's really no point in defining a static inline
function containing one line of code.  You could just put that line of
code in whatever places need it, which would probably be more clear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Append implementation

2017-03-11 Thread Robert Haas
On Fri, Mar 10, 2017 at 6:01 AM, Tels  wrote:
> Just a question for me to understand the implementation details vs. the
> strategy:
>
> Have you considered how the scheduling decision might impact performance
> due to "inter-plan parallelism vs. in-plan parallelism"?
>
> So what would be the scheduling strategy? And should there be a fixed one,
> or should it be user-influenceable? And what could be good ones?
>
> A simple example:
>
> E.g. if we have 5 subplans, and each can have at most 5 workers and we
> have 5 workers overall.
>
> So, do we:
>
>   Assign 5 workers to plan 1. Let it finish.
>   Then assign 5 workers to plan 2. Let it finish.
>   and so on
>
> or:
>
>   Assign 1 workers to each plan until no workers are left?

Currently, we do the first of those, but I'm pretty sure the second is
way better.  For example, suppose each subplan has a startup cost.  If
you have all the workers pile on each plan in turn, every worker pays
the startup cost for every subplan.  If you spread them out, then
subplans can get finished without being visited by all workers, and
then the other workers never pay those costs.  Moreover, you reduce
contention for spinlocks, condition variables, etc.  It's not
impossible to imagine a scenario where having all workers pile on one
subplan at a time works out better: for example, suppose you have a
table with lots of partitions all of which are on the same disk, and
it's actually one physical spinning disk, not an SSD or a disk array
or anything, and the query is completely I/O-bound.  Well, it could
be, in that scenario, that spreading out the workers is going to turn
sequential I/O into random I/O and that might be terrible.  In most
cases, though, I think you're going to be better off.  If the
partitions are on different spindles or if there's some slack I/O
capacity for prefetching, you're going to come out ahead, maybe way
ahead.  If you come out behind, then you're evidently totally I/O
bound and have no capacity for I/O parallelism; in that scenario, you
should probably just turn parallel query off altogether, because
you're not going to benefit from it.
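
As a sketch of the second strategy, each worker could pick the next
unfinished subplan from a shared round-robin cursor; the
ParallelAppendInfo structure and its members here are invented for
illustration, not taken from the patch:

static int
choose_next_subplan(ParallelAppendInfo *painfo, int nplans)
{
	int			result = -1;
	int			i;

	SpinLockAcquire(&painfo->pa_mutex);
	for (i = 0; i < nplans; i++)
	{
		int			plan = (painfo->pa_next_plan + i) % nplans;

		if (!painfo->pa_finished[plan])
		{
			/* advance the shared cursor so workers spread out */
			painfo->pa_next_plan = (plan + 1) % nplans;
			result = plan;
			break;
		}
	}
	SpinLockRelease(&painfo->pa_mutex);

	return result;				/* -1 means all subplans are done */
}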

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-11 Thread Tomas Vondra

Hi,

On 03/07/2017 04:01 AM, Andres Freund wrote:

> Hi,
> 
> Attached is an updated version of the patchset, but more importantly
> some benchmark numbers.



I wanted to do a bit of testing and benchmarking on this, but 0004 seems 
to be a bit broken. The patch does not apply anymore - there are some 
conflicts in execQual.c, but I think I fixed those. But then I ran into 
a bunch of compile-time errors, because some of the executor nodes still 
reference bits that were moved elsewhere.


E.g. there's no targetlist in PlanState anymore, yet nodeGatherMerge and 
nodeTableFuncscan do this:


gm_state->ps.targetlist = (List *)
    ExecInitExpr((Expr *) node->plan.targetlist,
                 (PlanState *) gm_state);

Some of the nodes also assign to ps.qual values that are (List *), but 
that field was switched to ExprState. That needs fixing, I guess.
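
Presumably the fix in each such node is mechanical, along these lines
(ExecInitQual being the patchset's replacement; the exact spelling may
differ):

/* nodeGatherMerge.c: drop the ps.targetlist assignment entirely, and
 * build the qual as a single ExprState rather than a List: */
gm_state->ps.qual =
	ExecInitQual(node->plan.qual, (PlanState *) gm_state);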


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread David G. Johnston
On Sat, Mar 11, 2017 at 5:45 PM, Tom Lane  wrote:

>
> * Whether or not you think it's important not to expand skipped variables,
> I think that it's critical that skipped backtick expressions not be
> executed.
> [...]
> I do not think that a skipped \if or \elif
> should evaluate its argument at all.
>
>
[...]

> * I'm not on board with having a bad expression result in failing
> the \if or \elif altogether.  It was stated several times upthread
> that that should be processed as though the result were "false",
> and I agree with that.


+1

Oddly, Corey was using you as support for this position... though without
an actual quote:

"""
Tom was pretty adamant that invalid commands are not executed. So in a case
like this, with ON_ERROR_STOP off:

\if false
\echo 'a'
\elif true
\echo 'b'
\elif invalid
\echo 'c'
\endif

Both 'b' and 'c' should print, because "\elif invalid" should not execute.
The code I had before was simpler, but it missed that.
"""
https://www.postgresql.org/message-id/CADkLM%3De9BY_-PT96mcs4qqiLtt8t-Fp%3De_AdycW-aS0OQvbC9Q%40mail.gmail.com

Also,

Robert made a comment somewhere along the line about users wanting to
simply re-type the intended line if the "invalid" was interactive and due
to a typo.  That concern is pretty much limited to the "\if" situation:
if you typo an "\elif", you can just type "\elif" again and begin yet
another "\elif" block.  I say we live with it and focus on the UX: if you
type \if, then no matter what happens after you hit enter, you are in a
conditional block and will need an \endif at some point.  Re-typing the
corrected \if command will just make you need another one of them.

David J.


Re: [HACKERS] scram and \password

2017-03-11 Thread Michael Paquier
On Sun, Mar 12, 2017 at 9:10 AM, Joe Conway  wrote:
> On 03/11/2017 03:15 PM, Michael Paquier wrote:
>> Yeah but it can be called as well while the application is calling
>> PQgetResult() and still looping until it gets a NULL result. Not sure
>> if this is a use-case to worry about, but sending a query to the
>> server in PQencryptPassword() could as well break some applications.
>
> I was suggesting sending the query outside of PQencryptPassword() in
> order to determine what method should be passed as a new argument to
> PQencryptPassword().

Why not. Our thoughts don't quite overlap: I was thinking of having
PQencryptPassword() itself query the server for the value of
password_encryption, and force the type depending on what the server
answers.
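
Roughly, the client-side flow would look like this; only
PQencryptPassword() as it exists today is used, and the SCRAM branch is
deliberately left open, since that is exactly the missing piece:

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static char *
encrypt_password_per_server(PGconn *conn, const char *passwd,
							const char *user)
{
	PGresult   *res = PQexec(conn, "SHOW password_encryption");
	char		method[32] = "md5";		/* fallback if the query fails */

	if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
	{
		strncpy(method, PQgetvalue(res, 0, 0), sizeof(method) - 1);
		method[sizeof(method) - 1] = '\0';
	}
	PQclear(res);

	/* "on" and "md5" both mean an MD5 verifier today */
	if (strcmp(method, "md5") == 0 || strcmp(method, "on") == 0)
		return PQencryptPassword(passwd, user);

	/* a SCRAM verifier would need new libpq support, e.g. a method arg */
	fprintf(stderr, "no client-side support for \"%s\"\n", method);
	return NULL;
}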

>> PQencryptPassword() is used for CREATE/ALTER ROLE commands, so
>> actually wouldn't it make sense to just switch PQencryptPassword to
>> handle SCRAM if at some point we decide to switch the default from md5
>> to scram? So many questions.
>
> As long as we support more than one method it would seem to me we need a
> way to determine which one we want to use and not only default it, don't
> we? Apologies if this has already been discussed -- I was not able to
> follow the lengthy threads on SCRAM in any detail.

Definitely, the simplest solution would be just to extend
PQencryptPassword() with a method argument, to allow users to do what
they want...

>>> I guess a related problem might be, do we have a SQL visible way to
>>> determine what method is used by the current password for a given role?
>>
>> Nope. We would simply be looking at a function doing a lookup of
>> pg_authid and then using get_password_type() to check which type of
>> verifier is used... Or we could have the type of verifier as a new
>> column of pg_authid, information that could be made visible to any
>> user with column privileges.
>
> What happens if the user does not have privs for pg_authid? E.g. if I
> want to change my own password what happens if the default is one
> method, and my password uses the other -- now I cannot change my own
> password using \password?

You can, today. However, it would be a problem for a user who has a
SCRAM verifier and uses an application that changes the password with
PQencryptPassword(), as that would change it back to MD5 on an update.
Having RLS on pg_authid to allow users to look at their own password
type is an idea. With multiple verifier types per role, this class of
bugs could be eliminated completely.
-- 
Michael




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-11 Thread Fabien COELHO


> Starting to poke at this... the proposal to add prove checks for psql
> just to see whether \if respects ON_ERROR_STOP seems like an incredibly
> expensive way to test a rather minor point. On my machine, "make check"
> in bin/psql goes from zero time to close to 8 seconds.  I'm not really
> on board with adding that kind of time to every buildfarm run for the
> foreseeable future just for this.


ISTM that these tests allowed finding bugs in the implementation, so they
were useful at some point. They are still useful in the short term if the
implementation is to be changed significantly to respond to your various
requirements. The underlying issue with TAP tests is that they install a
new cluster for each script, which is quite costly.


In this case, the same result could be achieved with a number of small
failing tests which only launch "psql". Would that be acceptable? What
you suggest is to keep only *one* failing test, which I find is kind of
a regression from a testing-coverage perspective, although obviously it
is possible.


--
Fabien.

