Re: [HACKERS] Speedup twophase transactions
Hi David and Michael,

> It would be great to get this thread closed out after 14 months and many
> commits.

PFA, latest patch which addresses Michael's comments.

> twophase.c: In function ‘PrepareRedoAdd’:
> twophase.c:2539:20: warning: variable ‘gxact’ set but not used
> [-Wunused-but-set-variable]
>     GlobalTransaction gxact;
> There is a warning at compilation.

Fixed.

> The comment at the top of PrescanPreparedTransactions() needs to be
> updated. Not only does this routine look for the contents of
> pg_twophase, but it does also look at the shared memory contents, at
> least those ones marked with inredo and not on_disk.

Changed comments at top of PrescanPreparedTransactions(),
StandbyRecoverPreparedTransactions(), and RecoverPreparedTransactions().

> +        ereport(WARNING,
> +                (errmsg("removing future two-phase state data from
>                  memory \"%u\"", xid)));
> +        PrepareRedoRemove(xid);
> +        continue
> Those are not normal (partially because unlink is atomic, but not
> durable)... But they match the correct coding pattern regarding
> incorrect 2PC entries... I'd really like to see those switched to a
> FATAL with unlink() made durable for those calls.

Hmm, not sure what exactly we need to do here. If you look at the prior
checks, there we already skip on-disk entries. So, typically, the entries
that we encounter here will be in shmem only.

> +    /* Deconstruct header */
> +    hdr = (TwoPhaseFileHeader *) buf;
> +    Assert(TransactionIdEquals(hdr->xid, xid));
> +
> +    if (TransactionIdPrecedes(xid, result))
> +        result = xid;
> This portion is repeated three times and could be easily refactored.
> You could just have a routine that returns the oldest transaction ID
> used, and ignore the result for StandbyRecoverPreparedTransactions()
> by casting a (void) to let any kind of static analyzer understand that
> we don't care about the result for example. Handling for subxids is
> necessary as well depending on the code path. Splitting things into a
> couple of sub-routines may be more readable as well. There are really a
> couple of small parts that can be gathered and strengthened.

Have added a new function to do this now. It reads either from disk or
shared memory and produces error/log messages accordingly as well now.

> +        /*
> +         * Recreate its GXACT and dummy PGPROC
> +         */
> +        gxactnew = MarkAsPreparing(xid, gid,
> +                                   hdr->prepared_at,
> +                                   hdr->owner, hdr->database,
> +                                   gxact->prepare_start_lsn,
> +                                   gxact->prepare_end_lsn);
> MarkAsPreparing() does not need to be extended with two new arguments.
> RecoverPreparedTransactions() is used only at the end of recovery,
> where it is not necessary to look at the 2PC state files from the
> records. In this code path inredo is also set to false :)

That's not true. We will have entries with inredo set at the end of
recovery as well. In fact the MarkAsPreparing() call from
RecoverPreparedTransactions() is the one which will remove these inredo
entries and convert them into regular entries. We have optimized the
recovery code path as well.

> +            /* Delete TwoPhaseState gxact entry and/or 2PC file. */
> +            PrepareRedoRemove(parsed.twophase_xid);
> Both things should not be present, no? If the file is pushed to disk
> it means that the checkpoint horizon has already moved.

PREPARE in redo, followed by a checkpoint, followed by a COMMIT/ROLLBACK.
We can have both the bits set in this case.
> -            ereport(ERROR,
> +            /* It's ok to find an entry in the redo/recovery case */
> +            if (!gxact->inredo)
> +                ereport(ERROR,
>                      (errcode(ERRCODE_DUPLICATE_OBJECT),
>                       errmsg("transaction identifier \"%s\" is already in
>                              use", gid)));
> +            else
> +            {
> +                found = true;
> +                break;
> +            }
> I would not have thought so.

Since we are using the TwoPhaseState structure to track redo entries, at
end of recovery, we will find existing entries. Please see my comments
above for RecoverPreparedTransactions().

> MarkAsPreparing and MarkAsPreparingInRedo really share the same code.
> What if the setup of the dummy PGPROC entry is made conditional?

I realized that MarkAsPreparingInRedo() does not need to do all the
sanity checking since it's going to be invoked during redo and everything
that comes in is kosher already. So its contents are much simplified in
this latest patch.

Tests pass with this latest patch.

Regards,
Nikhils

twophase_recovery_shmem_110317.patch
Description: Binary data
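For readers following the review, the duplicated fragment Michael points
at boils down to something like the following. This is only a minimal
sketch with an invented name; the patch's real helper also handles
subxids and reading the state from disk versus shared memory:

    /*
     * Sketch only: factor out the repeated 2PC header validation.
     * The function name is invented for illustration.
     */
    static TransactionId
    ProcessTwoPhaseHeader(TransactionId xid, char *buf, TransactionId result)
    {
        TwoPhaseFileHeader *hdr = (TwoPhaseFileHeader *) buf;

        /* Deconstruct header and sanity-check the XID */
        Assert(TransactionIdEquals(hdr->xid, xid));

        /* Track the oldest prepared transaction seen so far */
        if (TransactionIdPrecedes(xid, result))
            result = xid;

        return result;
    }

Callers that care, such as PrescanPreparedTransactions(), would do
"result = ProcessTwoPhaseHeader(xid, buf, result);", while
StandbyRecoverPreparedTransactions() could cast the call to (void), per
Michael's suggestion.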
Re: [HACKERS] Explicit subtransactions for PL/Tcl
2017-03-10 20:31 GMT+01:00 Victor Wagner :

> On Thu, 9 Mar 2017 12:04:31 +0100
> Pavel Stehule wrote:
>
> > > Now the test demonstrates how errors uncaught on the Tcl level
> > > interact with the postgresql error system.
> >
> > you can catch the exception outside and write your own message
>
> OK, here is a patch with tests which don't depend on function OIDs.
> They ignore the stack trace (the $::errorinfo global variable)
> completely, and analyze just the error message.

All tests passed. I have no other objections; I'll mark this patch as
ready for committer.

Regards

Pavel

> --
> Victor Wagner
Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed
2017-03-10 15:45 GMT+01:00 Alexander Korotkov :

> On Fri, Mar 10, 2017 at 5:10 PM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
>
>> On 2/24/17 16:32, Pavel Stehule wrote:
>> > set EXTENDED_DESCRIBE_SORT size_desc
>> > \dt+
>> > \l+
>> > \di+
>> >
>> > Possible variants: schema_table, table_schema, size_desc, size_asc
>>
>> I can see this being useful, but I think it needs to be organized a
>> little better.
>>
>> Sort key and sort direction should be separate settings.
>
> I agree.
>
>> I'm not sure why we need to have separate settings to sort by schema
>> name and table name.
>
> I think sorting by schema name, object name makes sense for people who
> have objects of the same name in different schemas.

I am sending an updated version with the sort direction separated into a
special variable.

There is a question: does the desc direction make sense for columns like
schema or table name? Using desc/asc for size is natural, but for table
names?

Regards

Pavel

> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 2a9c412020..747db58dd8 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -3507,6 +3507,27 @@ bar +VERBOSE_SORT_COLUMNS + + +This variable can be set to the values schema_name, +name_schema or size to control the +order of content of decrible command. + + + + + +VERBOSE_SORT_DIRECTION + + +This variable can be set to the values asc, +or desc to control the order of content of decrible command. + + + + + VERBOSITY
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 61a3e2a848..7ba24ea883 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -197,6 +197,9 @@ describeAccessMethods(const char *pattern, bool verbose) return true; } +#define SORT_DIRECTION_STR(v) ((v) == PSQL_SORT_ASC ?
"ASC" : "DESC") + + /* \db * Takes an optional regexp to select particular tablespaces */ @@ -264,7 +267,18 @@ describeTablespaces(const char *pattern, bool verbose) NULL, "spcname", NULL, NULL); - appendPQExpBufferStr(&buf, "ORDER BY 1;"); + + if (verbose && pset.sversion >= 90200) + { + if (pset.verbose_sort_columns == PSQL_SORT_SIZE) + appendPQExpBuffer(&buf, + "ORDER BY pg_catalog.pg_tablespace_size(oid) %s, 1;", + SORT_DIRECTION_STR(pset.verbose_sort_direction)); + else + appendPQExpBufferStr(&buf, "ORDER BY 1;"); + } + else + appendPQExpBufferStr(&buf, "ORDER BY 1;"); res = PSQLexec(buf.data); termPQExpBuffer(&buf); @@ -824,7 +838,19 @@ listAllDbs(const char *pattern, bool verbose) processSQLNamePattern(pset.db, &buf, pattern, false, false, NULL, "d.datname", NULL, NULL); - appendPQExpBufferStr(&buf, "ORDER BY 1;"); + if (verbose && pset.sversion >= 80200) + { + if (pset.verbose_sort_columns == PSQL_SORT_SIZE) + appendPQExpBuffer(&buf, + "ORDER BY CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')\n" + " THEN pg_catalog.pg_database_size(d.datname) END %s, 1;\n", + SORT_DIRECTION_STR(pset.verbose_sort_direction)); + else + appendPQExpBufferStr(&buf, "ORDER BY 1"); + } + else + appendPQExpBufferStr(&buf, "ORDER BY 1"); + res = PSQLexec(buf.data); termPQExpBuffer(&buf); if (!res) @@ -3295,7 +3321,26 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys "n.nspname", "c.relname", NULL, "pg_catalog.pg_table_is_visible(c.oid)"); - appendPQExpBufferStr(&buf, "ORDER BY 1,2;"); + if (verbose && pset.sversion >= 80100) + { + if (pset.verbose_sort_columns == PSQL_SORT_SCHEMA_NAME) + appendPQExpBufferStr(&buf, "ORDER BY 1,2;"); + else if (pset.verbose_sort_columns == PSQL_SORT_NAME_SCHEMA) + appendPQExpBufferStr(&buf, "ORDER BY 2,1;"); + else + { + if (pset.sversion >= 9) +appendPQExpBuffer(&buf, + "ORDER BY pg_catalog.pg_table_size(c.oid) %s, 1,2", + SORT_DIRECTION_STR(pset.verbose_sort_direction)); + else +appendPQExpBuffer(&buf, + "ORDER BY pg_catalog.pg_relation_size(c.oid) %s, 1,2", + SORT_DIRECTION_STR(pset.verbose_sort_direction)); + } + } + else + appendPQExpBufferStr(&buf, "ORDER BY 1,2;"); res = PSQLexec(buf.data); termPQExpBuffer(&buf); diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index ba14df0344..1ebe397a85 100644 --- a/src/bin/psql/help.c +++ b/src/bin/psql/help.c @@ -327,7 +327,7 @@ helpVariables(unsigned short int pager) * Windows builds currently print one more line than non-Windows builds. * Using the larger number is fine. */ - output = PageOutput(88, pager ? &(pset.popt.topt) : NULL); + output = PageOutput(92, pager ? &(pset.popt.topt)
Re: [HACKERS] make check-world output
Tom Lane wrote:

> Alvaro Herrera writes:
> > Jeff Janes wrote:
> >> There was some recent discussion about making "make check-world"
> >> faster. I'm all for that, but how about making it quieter? On both
> >> machines I've run it on (CentOS 6.8 and Ubuntu 16.04.2), it dumps some
> >> gibberish to stderr, example attached.
>
> > I think you're complaining about the test code added by commit
> > fcd15f13581f. Together with behavior introduced by 2f227656076a, it is
> > certainly annoying. I would vote for redirecting that output to a log
> > file which can be ignored/removed unless there is a failure.
>
> What about just reverting 2f227656076a?

That works for me too, if we think we no longer need that level of detail.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] background sessions
2017-03-09 14:52 GMT+01:00 Peter Eisentraut :

> On 3/8/17 14:22, Pavel Stehule wrote:
> > 1. will be background session process closed automatically when parent
> > process is closed?
>
> If the communications queue goes away the process will eventually die.
> This is similar to how a backend process will eventually die if the
> client goes away. Some more testing would be good here.

What does "eventually die" mean? I called pg_sleep() in the called
subprocess. Cancelling or terminating the parent process had no effect.
It is maybe an artificial test.

A little more realistic: waiting on a table lock in the background worker
was successful - when the parent was cancelled, the worker process was
destroyed too. But when the parent was terminated, the background worker
process continued. What is worse, the background worker was at 100% CPU
and I had to restart my notebook.

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS void
 LANGUAGE plpythonu
AS $function$
with plpy.BackgroundSession() as a:
    a.execute('update foo2 set a = 30')
    a.execute('insert into foo2 values(10)')
$function$

I blocked foo2 in another session.

Regards

Pavel

> > 2. what timeouts are valid for this process - statement timeout, idle
> > in transaction timeout
>
> Those should work the same way. It's the same code that runs the
> queries, starts/stops transactions, etc.
>
> > I see significant risk on leaking sessions.
>
> Yeah, that's a valid concern. But I think it works ok.
>
> > There can be more doc and examples in plpython doc. It will be main
> > interface for this feature. Mainly about session processing.
>
> OK, I'll look into that again.
>
> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] make check-world output
Alvaro Herrera writes:
> Tom Lane wrote:
>> What about just reverting 2f227656076a?

> That works for me too, if we think we no longer need that level of
> detail.

A general issue with this sort of messaging is that when things are
working more or less normally, you'd just as soon not see it ... but when
you need to debug problems with the test scaffolding itself, you want
verbosity. For the basic build process, we've largely solved that through
the use of "make -s". But we don't really have a comparable "be quiet"
option for test runs, especially not the TAP tests. Maybe we need to
think a bit more globally about what it is we're trying to accomplish.

I could definitely see that "set -x" in pg_upgrade's test.sh might be
useful as part of a verbose testing option; what it lacks is a way to
turn it off.

An entirely different big-picture point is "what the heck are we doing
using a shell script here at all? It is useless for testing on Windows".
Somebody should look into rewriting it in Perl, perhaps using the
facilities of PostgresNode.pm.

regards, tom lane
Re: [HACKERS] Need a builtin way to run all tests faster manner
I wrote:
> I believe the core problem is that contrib/test_decoding's regresscheck
> and isolationcheck targets each want to use ./tmp_check as their
> --temp-instance. make has no reason to believe it shouldn't run those
> two sub-jobs in parallel, but when it does, we get two postmasters trying
> to share the same directory. This looks reasonably straightforward to
> solve, but I'm not entirely familiar with the code here, and am not
> sure what is the least ugly way to fix it.

Enlarging on that: if I cd into contrib/test_decoding and do
"make check -j4" or so, it reliably fails. With the attached patch, it
passes.

This is a localized patch that only fixes things for
contrib/test_decoding; what I'm wondering is if it would be better to
establish a more widespread convention that $(pg_isolation_regress_check)
should use a different --temp-instance directory than $(pg_regress_check)
does.

regards, tom lane

diff --git a/contrib/test_decoding/.gitignore b/contrib/test_decoding/.gitignore index 1f95503..09d60bf 100644 *** a/contrib/test_decoding/.gitignore --- b/contrib/test_decoding/.gitignore *** *** 3,5 --- 3,6 /isolation_output/ /regression_output/ /tmp_check/ + /tmp_check_iso/
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile index d2bc8b8..ea31d88 100644 *** a/contrib/test_decoding/Makefile --- b/contrib/test_decoding/Makefile *** PGFILEDESC = "test_decoding - example of *** 5,11 # Note: because we don't tell the Makefile there are any regression tests, # we have to clean those result files explicitly ! EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output ifdef USE_PGXS PG_CONFIG = pg_config --- 5,12 # Note: because we don't tell the Makefile there are any regression tests, # we have to clean those result files explicitly ! EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output \ ! ./isolation_output ./tmp_check_iso ifdef USE_PGXS PG_CONFIG = pg_config *** isolationcheck: | submake-isolation subm *** 59,64 --- 60,66 $(MKDIR_P) isolation_output $(pg_isolation_regress_check) \ --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \ + --temp-instance=./tmp_check_iso \ --outputdir=./isolation_output \ $(ISOLATIONCHECKS)
Re: [HACKERS] SQL/JSON in PostgreSQL
On Fri, Mar 10, 2017 at 7:07 AM, Petr Jelinek wrote:

> On 09/03/17 19:50, Peter van Hardenberg wrote:
> > Anecdotally, we just stored dates as strings and used a convention (key
> > ends in "_at", I believe) to interpret them. The lack of support for
> > dates in JSON is well-known, universally decried... and not a problem
> > the PostgreSQL community can fix.
>
> The original complaint was about JSON_VALUE extracting a date, but I
> don't understand why there is a problem with that; the SQL/JSON spec
> defines that behavior. The RETURNING clause there is more or less just
> shorthand for casting, with some advanced options.

There is no problem with serializing a date, and SQL/JSON describes it
rather well. But there is no correct procedure to deserialize a date from
an otherwise-correct json string, and the standard is silent about this.
So now we understand that date[time] support is actually virtual, and its
only use is in jsonpath (filter) expressions.

> --
> Petr Jelinek http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/10/17 6:06 PM, Peter Eisentraut wrote:
> On 3/10/17 19:00, Jim Nasby wrote:
>> Maybe instead of having the commitfest app try and divine patches from
>> the list it should be able to send patches to the list from a specified
>> git repo/branch. Anyone that provides that info would have tests run
>> automagically, patches sent, etc. Anyone who doesn't can just keep
>> using the old process.
>
> Those people who know what they're doing will presumably run all those
> checks before they submit a patch. It's those people who send in patches
> that don't apply cleanly or fail the tests that would benefit from this
> system. But if they're that careless, then they also won't take care to
> use this particular system correctly.

It's actually a lot harder to mess up providing a git repo link than
manually submitting patches to the mailing list. For most patches, it's
also a hell of a lot faster to just submit a repo URL rather than dealing
with patch files. Having this also means that reviewers can focus more on
what the patch is actually doing instead of mechanical crap best left to
a machine.

Of course, *you* work on changes that are far more complex than any
newbie will, and it wouldn't surprise me if such a feature wouldn't help
you or other senior hackers at all. But AFAICT it wouldn't get in your
way either. It would remove yet another burden for new hackers.

Anyway, this is well off topic for the original thread...

--
Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
[HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Hi

This proposal is a followup to the implementation of XMLTABLE.

Lots of XML documents have a document namespace assigned:

<rows xmlns="http://x.y"><row><a>10</a></row></rows>

For these XML documents, any search path must use the schema "http://x.y".
This is not very intuitive, and from the XMLTABLE side it is not user
friendly, because the default column path (same as the column name)
cannot be used.

A solution to this issue is the default namespace, as defined in SQL/XML.

Example, related to the previous xml, without a default namespace:

XMLTABLE(NAMESPACES('http://x.y' AS aux),
         '/aux:rows/aux:row' PASSING ...
         COLUMNS a int PATH 'aux:a')

with a default namespace:

XMLTABLE(NAMESPACES(DEFAULT 'http://x.y'),
         '/rows/row' PASSING ...
         COLUMNS a int);

Unfortunately libxml2 doesn't support default namespaces in XPath
expressions. Because the libxml2 functionality is frozen, there is not a
big chance of support in the near future. An implementation is not too
hard, although it requires a simple XPath expression state translator.
The databases with an XMLTABLE implementation support default namespaces
for XPath expressions.

The patch for the initial implementation is attached.

Regards

Pavel

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 583b3b241a..c2558a33ef 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -10465,8 +10465,7 @@ SELECT xpath_exists('/my:a/text()', 'http://example.com";>test The optional XMLNAMESPACES clause is a comma-separated list of namespaces. It specifies the XML namespaces used in - the document and their aliases. A default namespace specification - is not currently supported. + the document and their aliases.
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index 0f512753e4..5a3715cc84 100644 --- a/src/backend/utils/adt/Makefile +++ b/src/backend/utils/adt/Makefile @@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \ tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \ tsvector.o tsvector_op.o tsvector_parser.o \ txid.o uuid.o varbit.o varchar.o varlena.o version.o \ - windowfuncs.o xid.o xml.o + windowfuncs.o xid.o xml.o xpath_parser.o like.o: like.c like_match.c
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index f81cf489d2..d59a76f0b4 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -91,7 +91,7 @@ #include "utils/rel.h" #include "utils/syscache.h" #include "utils/xml.h" - +#include "utils/xpath_parser.h" /* GUC variables */ int xmlbinary; @@ -184,6 +184,7 @@ typedef struct XmlTableBuilderData xmlXPathCompExprPtr xpathcomp; xmlXPathObjectPtr xpathobj; xmlXPathCompExprPtr *xpathscomp; + bool with_default_ns; } XmlTableBuilderData; #endif @@ -4180,6 +4181,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts) xtCxt->magic = XMLTABLE_CONTEXT_MAGIC; xtCxt->natts = natts; xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts); + xtCxt->with_default_ns = false; xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL); @@ -4272,6 +4274,8 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value) #endif /* not USE_LIBXML */ } +#define DEFAULT_NAMESPACE_NAME "pgdefnamespace" + /* * XmlTableSetNamespace * Add a namespace declaration */ @@ -4282,12 +4286,14 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri) #ifdef USE_LIBXML XmlTableBuilderData *xtCxt; - if (name == NULL) - ereport(ERROR, -(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("DEFAULT namespace is not supported"))); xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace"); + if (name == NULL) + {
xtCxt->with_default_ns = true; + name = DEFAULT_NAMESPACE_NAME; + } + if (xmlXPathRegisterNs(xtCxt->xpathcxt, pg_xmlCharStrndup(name, strlen(name)), pg_xmlCharStrndup(uri, strlen(uri @@ -4316,6 +4322,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path) (errcode(ERRCODE_DATA_EXCEPTION), errmsg("row path filter must not be empty string"))); + if (xtCxt->with_default_ns) + { + StringInfoData str; + + transformXPath(&str, path, DEFAULT_NAMESPACE_NAME); + path = str.data; + } + xstr = pg_xmlCharStrndup(path, strlen(path)); xtCxt->xpathcomp = xmlXPathCompile(xstr); @@ -4347,6 +4361,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum) (errcode(ERRCODE_DATA_EXCEPTION), errmsg("column path filter must not be empty string"))); + if (xtCxt->with_default_ns) + { + StringInfoData str; + + transformXPath(&str, path, DEFAULT_NAMESPACE_NAME); + path = str.data; + } + xstr = pg_xmlCharStrndup(path, strlen(path)); xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr); diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c new file mode 100644 index 00..7ec2d584c6 --- /dev/null +++ b/src
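To give a feel for what that "simple XPath expression state translator"
has to do, here is a heavily simplified sketch. This is not the patch's
transformXPath(), which additionally has to leave string literals,
attribute tests (@foo), axes, and function names alone; it just prepends
the registered default-namespace alias to every unprefixed element name,
so '/rows/row' becomes '/pgdefnamespace:rows/pgdefnamespace:row':

    #include "postgres.h"
    #include <ctype.h>
    #include "lib/stringinfo.h"

    /*
     * Simplified sketch: qualify bare names in an XPath expression with
     * "defns", the alias registered for the default namespace.
     */
    static void
    prepend_default_ns(StringInfo dst, const char *src, const char *defns)
    {
        const char *p = src;

        while (*p)
        {
            if (isalpha((unsigned char) *p) || *p == '_')
            {
                const char *start = p;

                while (isalnum((unsigned char) *p) ||
                       *p == '_' || *p == '-')
                    p++;

                /* already qualified, or a function call? copy unchanged */
                if (*p == ':' || *p == '(')
                    appendBinaryStringInfo(dst, start, p - start);
                else
                    appendStringInfo(dst, "%s:%.*s", defns,
                                     (int) (p - start), start);
            }
            else
                appendStringInfoChar(dst, *p++);
        }
    }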
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 2017-03-11 12:05:23 -0500, Tom Lane wrote:
> I wrote:
> > I believe the core problem is that contrib/test_decoding's regresscheck
> > and isolationcheck targets each want to use ./tmp_check as their
> > --temp-instance. make has no reason to believe it shouldn't run those
> > two sub-jobs in parallel, but when it does, we get two postmasters
> > trying to share the same directory. This looks reasonably
> > straightforward to solve, but I'm not entirely familiar with the code
> > here, and am not sure what is the least ugly way to fix it.
>
> Enlarging on that: if I cd into contrib/test_decoding and do
> "make check -j4" or so, it reliably fails.

Yep, can reproduce here as well. Interesting that, with -j16, I could
survive several dozen runs from the toplevel locally.

> This is a localized patch that only fixes things for
> contrib/test_decoding; what I'm wondering is if it would be better to
> establish a more widespread convention that
> $(pg_isolation_regress_check) should use a different --temp-instance
> directory than $(pg_regress_check) does.

I think that'd be a good plan. We probably should also keep --outputdir
separate (which test_decoding/Makefile does, but
pg_isolation_regress_check doesn't)?

- Andres
Re: [HACKERS] Explicit subtransactions for PL/Tcl
Pavel Stehule writes:
> I'll mark this patch as ready for committer

I've pushed this after mostly-cosmetic cleanup.

One thing I changed that's not cosmetic is to put

    MemoryContextSwitchTo(oldcontext);

after the BeginInternalSubTransaction call. I see there was some
discussion upthread about what to do there, but I think this is the
correct answer because it corresponds to what pltcl_subtrans_begin does.
That means that the execution environment of the called Tcl fragment will
be the same as, eg, the loop_body in spi_exec. I do not think we want it
to be randomly different from those pre-existing cases.

Now, I believe that the MemoryContextSwitchTo in pltcl_subtrans_begin was
probably just cargo-culted in there from similar code in plpgsql. Since
pltcl doesn't rely on CurrentMemoryContext to anywhere near the same
degree that plpgsql does, it's possible that we could drop the
MemoryContextSwitchTo in both places, and just let the called code
fragments run in the subtransaction's memory context. But I'm not
convinced of that, and in any case it ought to be undertaken as a
separate patch.

Some other comments:

* Whitespace still wasn't very much per project conventions. I fixed
that by running it through pgindent.

* The golden rule for code style and placement is "make your patch look
like it was always there". Dropping new code at the tail end of the file
is seldom a good avenue to that. I stuck it after pltcl_SPI_lastoid, on
the grounds that it should be with the other Tcl command functions and
should respect the (mostly) alphabetical order of those functions.
Likewise I adopted a header comment format in keeping with the existing
nearby functions.

* I whacked the SGML docs around pretty thoroughly. That addition wasn't
respecting the style of the surrounding text either as to markup or
indentation, and it had some other issues like syntax errors in the
example functions.

regards, tom lane
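For reference, the call ordering being discussed looks roughly like this
in pltcl_subtrans_begin and its commit/abort counterparts (a sketch, with
the error paths condensed relative to the real code):

    MemoryContext oldcontext = CurrentMemoryContext;
    ResourceOwner oldowner = CurrentResourceOwner;

    BeginInternalSubTransaction(NULL);
    /* run the called code fragment in the caller's memory context,
     * not the subtransaction's -- this is the line in question */
    MemoryContextSwitchTo(oldcontext);

    PG_TRY();
    {
        /* ... evaluate the Tcl code fragment here ... */

        ReleaseCurrentSubTransaction();
        MemoryContextSwitchTo(oldcontext);
        CurrentResourceOwner = oldowner;
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);
        RollbackAndReleaseCurrentSubTransaction();
        CurrentResourceOwner = oldowner;
        PG_RE_THROW();
    }
    PG_END_TRY();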
Re: [HACKERS] Need a builtin way to run all tests faster manner
Jim Nasby writes:
> It's actually a lot harder to mess up providing a git repo link than
> manually submitting patches to the mailing list.

Yeah, we've heard that proposal before. We're still not doing it though.
Insisting on patches being actually submitted to the mailing list is
important for archival and possibly legal reasons. If someone sends in a
link to $random-repo, once that site goes away there's no way to
determine exactly what was submitted.

regards, tom lane
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 3/11/17 2:06 PM, Tom Lane wrote:
> Jim Nasby writes:
>> It's actually a lot harder to mess up providing a git repo link than
>> manually submitting patches to the mailing list.
>
> Yeah, we've heard that proposal before. We're still not doing it though.
> Insisting on patches being actually submitted to the mailing list is
> important for archival and possibly legal reasons. If someone sends in
> a link to $random-repo, once that site goes away there's no way to
> determine exactly what was submitted.

The full proposal was that the commitfest app have the ability to
generate and post the patch for you, assuming that the smoke-test passes.

--
Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu wrote:
> My name is Mengxing Liu. I am interested in the project "Eliminate
> O(N^2) scaling from rw-conflict tracking in serializable
> transactions". After discussing with Kevin off-list, I think it's
> time to post discussion here. I am afraid that there are two things
> that I need your help with. Thank you very much.

Welcome to the -hackers list! This is a key part of how development
happens in the community. Don't be shy about posting questions and
ideas, but also expect fairly blunt discussion to occur at times; don't
let that put you off.

>>> So the task is clear. We can use a tree-like or hash-like data
>>> structure to speed up this function.
>>
>> Right -- especially with a large number of connections holding a
>> large number of conflicts. In one paper with high concurrency, they
>> found over 50% of the CPU time for PostgreSQL was going to these
>> functions (including functions called by them). This seems to me to
>> be due to the O(N^2) (or possibly worse) performance from the number
>> of connections.
>
> Does anyone know the title of this paper? I want to reproduce its
> workloads.

I seem to remember there being a couple of other papers or talks, but
this is probably the most informative:

http://sydney.edu.au/engineering/it/research/tr/tr693.pdf

>> Remember, I think most of the work here is going to be in
>> benchmarking. We not only need to show improvements in simple test
>> cases using readily available tools like pgbench, but think about
>> what types of cases might be worst for the approach taken and show
>> that it still does well -- or at least not horribly. It can be OK
>> to have some slight regression in an unusual case if the common
>> cases improve a lot, but any large regression needs to be addressed
>> before the patch can be accepted. There are some community members
>> who are truly diabolical in their ability to devise "worst case"
>> tests, and we don't want to be blind-sided by a bad result from one
>> of them late in the process.
>
> Are there any documents or links introducing how to test and
> benchmark PostgreSQL? I may need some time to learn about it.

There is pgbench:

https://www.postgresql.org/docs/devel/static/pgbench.html

A read-only load and a read/write mix should both be tested, probably
with a few different scales and client counts to force the bottleneck to
be in different places. The worst problems have been seen with 32 or
more cores on 4 or more sockets with a large number of active
connections. I don't know whether you have access to a machine capable
of putting this kind of stress on it (perhaps at your university?), but
if not, the community has access to various resources we should be able
to schedule time on. It may pay for you to search through the archives
of the last year or two to look for other benchmarks and see what people
have previously done.

--
Kevin Grittner
Re: [HACKERS] Index usage for elem-contained-by-const-range clauses
On 3/10/17 8:29 AM, Alexander Korotkov wrote:
> That's a cool idea. But I would say more. Sometimes it's useful to
> transform "intcol between x and y" into "intcol <@ 'x,y'::int4range".
> btree_gin supports "intcol between x and y" as the overlap of
> "intcol >= x" and "intcol <= y". That is very inefficient. But if this
> clause were transformed into "intcol <@ 'x,y'::int4range", btree_gin
> could handle it very efficiently.

That would certainly be nice as well, but IMHO it's outside the scope of
this patch to accomplish that.

BTW, while we're wishing for things... Something else that would be nice
is if there was a way to do these kinds of transforms without hacking the
backend...

> Also, I noticed that the patch doesn't have regression tests.

BTW, those tests need to pay special attention to inclusive vs exclusive
bounds.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Re: [HACKERS] scram and \password
On 03/10/2017 02:43 PM, Michael Paquier wrote:
> On Sat, Mar 11, 2017 at 2:53 AM, Jeff Janes wrote:
>> Should the \password tool in psql inspect password_encryption and act
>> on it being 'scram'?
>
> Not sure if it is wise to change the default for this release.
>
>> I didn't see this issue discussed, but the ability to search the
>> archives for backslashes is rather limited.
>
> I'll save you some time: it has not been discussed. Nor has
> PQencryptPassword been mentioned. Changing to SCRAM is not that
> complicated, just call scram_build_verifier() and you are good to go.
>
> Instead of changing the default, I think that we should take this
> occasion to rename PQencryptPassword to something like
> PQhashPassword(), and extend it with a method argument to support both
> md5 and scram. PQencryptPassword could also be marked as deprecated,
> or left as-is for some time. For \password, we could have another
> meta-command, but that sounds grotty, or just extend \password with a
> --method argument. Switching the default could happen in another
> release.
>
> A patch along those lines would be simple; do people feel that this
> should be part of PG 10?

Seems like it should work in an analogous way to CREATE/ALTER ROLE.
According to the docs:

8<
ENCRYPTED
UNENCRYPTED

These key words control whether the password is stored encrypted in
the system catalogs. (If neither is specified, the default behavior is
determined by the configuration parameter password_encryption.) If the
presented password string is already in MD5-encrypted or SCRAM-encrypted
format, then it is stored encrypted as-is, regardless of whether
ENCRYPTED or UNENCRYPTED is specified (since the system cannot decrypt
the specified encrypted password string). This allows reloading of
encrypted passwords during dump/restore.
8<

So if the password is not already set, \password uses
password_encryption to determine which format to use, and if the
password is already set, then the current method is assumed.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
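To make the shape of that proposal concrete, here is a sketch of what the
renamed libpq entry point could look like. The function name, the method
strings, and a frontend-visible scram_build_verifier() are all
assumptions taken from this thread, not an existing API:

    #include <string.h>
    #include "libpq-fe.h"

    /* Sketch of the proposed API; not in libpq today. */
    char *
    PQhashPassword(const char *passwd, const char *user, const char *method)
    {
        if (method == NULL || strcmp(method, "md5") == 0)
            return PQencryptPassword(passwd, user);   /* existing md5 path */

        if (strcmp(method, "scram") == 0)
            /* assumed to be exported to frontend code; signature assumed */
            return scram_build_verifier(user, passwd, 0);

        return NULL;                                  /* unknown method */
    }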
[HACKERS] INSERT INTO arr2(array[1].d, array[2].d)
Over in [1], I was very surprised to discover $SUBJECT[2]. I looked in
the docs, and they clearly indicate that INSERT accepts "column names".

What's the best way to describe this? "column expression"? "field
expression"?

1: https://www.postgresql.org/message-id/20170311005810.kuccp7t5t5jhe...@alap3.anarazel.de

2:
CREATE TABLE arr(d int[]);
CREATE TABLE arr2(arr arr);
INSERT INTO arr2(arr[1].d, arr[2].d) VALUES(ARRAY[1,2],ARRAY[3,4]) RETURNING *;
┌───────────────────────────────┐
│              arr              │
├───────────────────────────────┤
│ {"(\"{1,2}\")","(\"{3,4}\")"} │
└───────────────────────────────┘

--
Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
Re: [HACKERS] How to get the 'ctid' from a record type?
On 3/10/17 10:31 PM, Eric Ridge wrote:
> What about this? Is the tuple currently being evaluated (I suppose in
> the case of a sequential scan) available in the context of a function
> call?

AFAIK that *very* specific case would work, because the executor would be
handing you the raw tuple. Not a great bet to make though. Also, there
should be a macro somewhere that will tell you whether you have a full
tuple or not. You'd want to make sure to check that and throw an error if
you weren't handed a full tuple.

--
Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
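A sketch of that check, assuming code that can reach the executor's
TupleTableSlot (pre-v12 slot layout, where tts_tuple is NULL or lacks a
valid t_self for virtual/minimal tuples):

    /* Sketch: only a slot backed by a physical heap tuple has a
     * usable ctid; reject virtual and minimal tuples. */
    ItemPointerData ctid;

    if (slot->tts_tuple != NULL &&
        ItemPointerIsValid(&slot->tts_tuple->t_self))
        ctid = slot->tts_tuple->t_self;
    else
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("no physical tuple (and thus no ctid) is available here")));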
Re: [HACKERS] INSERT INTO arr2(array[1].d, array[2].d)
On 2017-03-11 14:43:55 -0600, Jim Nasby wrote:
> Over in [1], I was very surprised to discover $SUBJECT[2]. I looked in
> the docs, and they clearly indicate that INSERT accepts "column names".

They also say "The column name can be qualified with a subfield name or
array subscript, if needed."

> What's the best way to describe this? "column expression"? "field
> expression"?

Field expression is the better of the two, but I'm not really convinced
about changing it.

For reference:

> 1: https://www.postgresql.org/message-id/20170311005810.kuccp7t5t5jhe...@alap3.anarazel.de
>
> 2:
> CREATE TABLE arr(d int[]);
> CREATE TABLE arr2(arr arr);
> INSERT INTO arr2(arr[1].d, arr[2].d) VALUES(ARRAY[1,2],ARRAY[3,4]) RETURNING *;
> ┌───────────────────────────────┐
> │              arr              │
> ├───────────────────────────────┤
> │ {"(\"{1,2}\")","(\"{3,4}\")"} │
> └───────────────────────────────┘

- Andres
Re: [HACKERS] How to get the 'ctid' from a record type?
On 2017-03-11 04:31:16 +0000, Eric Ridge wrote:
> Well shoot. That kinda spoils my plans.

I think you should elaborate on what you're trying to achieve -
otherwise our advice will be affected by the recent, widely reported,
crystal ball scarcity.

- Andres
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Corey Huinker writes:
> [ 0001.if_endif.v21.diff ]

Starting to poke at this... the proposal to add prove checks for psql
just to see whether \if respects ON_ERROR_STOP seems like an incredibly
expensive way to test a rather minor point. On my machine, "make check"
in bin/psql goes from zero time to close to 8 seconds. I'm not really on
board with adding that kind of time to every buildfarm run for the
foreseeable future just for this.

Couldn't we get close to the same coverage by adding a single-purpose
test script to the main regression tests? Along the lines of

\set ON_ERROR_STOP 1
\if invalid
\echo should not get here
\endif
\echo should not get here either

You could imagine just dropping that at the end of psql.sql, but I think
probably a separate script is worth the trouble.

regards, tom lane
Re: [HACKERS] [bug fix] dblink leaks unnamed connections
On 03/09/2017 08:31 AM, Joe Conway wrote:
> On 03/09/2017 07:54 AM, Tom Lane wrote:
>> Fujii Masao writes:
>>> On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki wrote:
>>>> dblink fails to close the unnamed connection as follows when a new
>>>> unnamed connection is requested. The attached patch fixes this.
>>
>>> This issue was reported about ten years ago and added as a TODO item.
>>> http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php
>>
>>> I agree that this is a bug, and am tempted to back-patch to all the
>>> supported versions. But it had not been fixed in many years since the
>>> first report of the issue. So I'm not sure if it's ok to just treat
>>> this as a bug right now and back-patch. Or we should fix this only in
>>> HEAD? Anyway I'd like to hear more opinions about this.
>>
>> It looks to me like the issue simply fell through the cracks because
>> Joe wasn't excited about fixing it. Now that we have a second
>> complaint, I think it's worth treating as a bug and back-patching.
>>
>> (I've not read this particular patch and am not expressing an opinion
>> whether it's correct.)
>
> Ok, will take another look.

I pushed a fix to all supported branches.

Thanks,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
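For the archives, the shape of the fix is roughly the following sketch
(pconn being dblink's module-level struct for the unnamed connection;
see the committed patch for the real code):

    /* Sketch: before storing a newly established unnamed connection,
     * close any existing one instead of silently leaking it. */
    if (connname == NULL)
    {
        if (pconn->conn != NULL)
        {
            PQfinish(pconn->conn);
            pconn->conn = NULL;
        }
        pconn->conn = conn;
    }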
Re: [HACKERS] How to get the 'ctid' from a record type?
On Sat, Mar 11, 2017 at 2:14 PM Andres Freund wrote:

> On 2017-03-11 04:31:16 +0000, Eric Ridge wrote:
> > Well shoot. That kinda spoils my plans.
>
> I think you should elaborate on what you're trying to achieve -
> otherwise our advice will be affected by the recent, widely reported,
> crystal ball scarcity.

What I'm trying to do is port https://github.com/zombodb/zombodb to
Postgres 9.6+. It's an Access Method that stores full rows, encoded as
JSON, in Elasticsearch instead of in local storage.

It was fairly straightforward to do the mechanical work to convert it to
use 9.6's new AM API (which is very nice, btw!), but the fact that 9.6
also disallows including system columns (specifically ctid) has me turned
upside down.

With <9.6, I was able to cook up a scheme where it was able to answer
queries from the remote Elasticsearch index even when Postgres decided to
plan a sequential scan. That hinged, mostly, on being able to create a
multi-column index where the first column was a function call that
included as an argument (among other things) the ctid system column. The
ability to answer sequential scans (and filters) using the remote ES
index is pretty important, as the knowledge of how to do that exists in
Elasticsearch, not my custom operator function in Postgres.

Anyways, I've been trying to find a way to intuit the ctid system column
value with 9.6 and it's clear now that that just isn't possible. The
closest I got was digging through
ActivePortal->queryDesc->estate->es_tuple, but that only works when it's
a real tuple, not one that's virtual or minimal.

I'm pretty sure that I need to be implementing a Custom Scan Provider
instead, and I've been spending time with that API too. There's a pretty
steep learning curve for me, but I'll eventually get over that hump.

I could probably bore you with greater detail but basically, I want to
take:

CREATE INDEX idxfoo ON table USING zombodb (zdb(table), zdb_to_json(table))
  WITH (url='http://remote.ip.addr:9200/');

SELECT * FROM table WHERE zdb(table) ==> 'some full text query' OR id = 42;

And have the "zdb(table) ==> 'some full text query'" bit be answered by
my extension, regardless of how PG wants to plan the query. While I was
able to hack something together for <9.6, I think that means a Custom
Scan Provider now?

eric
[HACKERS] Preserving param location
Hello,

When a query contains parameters, the original Param node contains the
token location. However, this information is lost when the Const node is
generated: that one will only contain position -1 (unknown).

FWIW, we do have a use case for this (a custom extension that tracks
quals statistics, which among other things is used to regenerate a query
string from a pgss normalized query, thus needing the original param
location).

Is this something we want to get fixed? If yes, attached is a simple
patch for that.

Regards.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index b19380e1b1..beb0f99144 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -2444,6 +2444,7 @@ eval_const_expressions_mutator(Node *node, int16 typLen; bool typByVal; Datum pval; + Const *con; Assert(prm->ptype == param->paramtype); get_typlenbyval(param->paramtype, @@ -2452,13 +2453,17 @@ eval_const_expressions_mutator(Node *node, pval = prm->value; else pval = datumCopy(prm->value, typByVal, typLen); - return (Node *) makeConst(param->paramtype, + + con = makeConst(param->paramtype, param->paramtypmod, param->paramcollid, (int) typLen, pval, prm->isnull, typByVal); + con->location = param->location; + + return (Node *) con; } } }
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 2017-03-07 09:36:51 -0300, Alvaro Herrera wrote:
> FWIW, +1 on improving matters here.
>
> Andres Freund wrote:
>
> > The best I can come up with so far is a toplevel target that creates
> > the temp install, starts a cluster and then runs the
> > 'installcheck-or-check' target on all the subdirectories via
> > recursion. Individual makefiles can either use the pre-existing
> > cluster (most of contrib for example), or use the temporary install
> > and run their pre-existing check target using that (the tap tests,
> > test_decoding, ...).
>
> I think a toplevel installcheck-or-check target is a good first step
> (though definitely let's find a better name). Just being able to run
> all tests without the need for 95% of the pointless initdb's would be
> helpful enough.

Here's a hacky proof-of-concept patch that implements a contrib/
'fastcheck' target. Timings:

make -s -j01 -Otarget check                                2m49.286s
make -s -j16 -Otarget check                                0m44.120s
make -s -j01 -Otarget fastcheck                            1m1.533s
make -s -j16 -Otarget fastcheck                            0m24.385s
make -s -j01 -Otarget USE_MODULE_DB=1 \
  installcheck check-test_decoding-recurse                 0m56.016s
make -s -j16 -Otarget USE_MODULE_DB=1 \
  installcheck check-test_decoding-recurse                 0m23.608s

All of these should, excepting rerunning initdb/server start, have
equivalent test coverage.

'fastcheck' is obviously, if you look at it, a POC. We'd need to:

- abstract this away somehow; it's not nice to copy the logic into a
  makefile
- tie contrib/'fastcheck' into check-world by overriding that target
- use the temp install we've created previously?
- a more bulletproof server stop mechanism?

- Andres

>From 77520424a7b8a45e629eada864baac49db7fceb1 Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Sat, 11 Mar 2017 14:03:26 -0800
Subject: [PATCH] WIP; 'fastcheck' target in contrib

---
 contrib/Makefile | 17 +
 1 file changed, 17 insertions(+)

diff --git a/contrib/Makefile b/contrib/Makefile index e84eb67008..ca08fc9b74 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -90,6 +90,23 @@ endif # Missing: # start-scripts \ (does not have a makefile) +NONSHARED_DATADIR_TARGETS=test_decoding +SHARED_DATADIR_TARGETS=$(filter-out $(NONSHARED_DATADIR_TARGETS), $(SUBDIRS)) +SERVOPTIONS="-c log_line_prefix='%m [%p] %q%a ' -c unix_socket_directories=$(shell pwd)/tmp_fastcheck/ -clisten_addresses= -c port=5432" +REGRESSPORT=5432 +REGRESSHOST=$(shell pwd)/tmp_fastcheck/ + +fastcheck: + @mkdir -p tmp_fastcheck + @rm -rf tmp_fastcheck/data + $(bindir)/initdb --no-clean --no-sync -D tmp_fastcheck/data > tmp_fastcheck/initdb.log + PGPORT=$(REGRESSPORT) PGHOST=$(REGRESSHOST) \ + $(bindir)/pg_ctl -D tmp_fastcheck/data -o$(SERVOPTIONS) --log tmp_fastcheck/postmaster.log start + $(MAKE) PGPORT=$(REGRESSPORT) PGHOST=$(REGRESSHOST) USE_MODULE_DB=1 -Otarget \ + $(foreach t,$(SHARED_DATADIR_TARGETS),installcheck-$(t)-recurse) \ + $(foreach t,$(NONSHARED_DATADIR_TARGETS),check-$(t)-recurse) || \ + $(bindir)/pg_ctl -D tmp_fastcheck/data stop + $(bindir)/pg_ctl -D tmp_fastcheck/data stop $(recurse) $(recurse_always) -- 2.11.0.22.g8d7a455.dirty
Re: [HACKERS] scram and \password
On Sun, Mar 12, 2017 at 5:35 AM, Joe Conway wrote:
> On 03/10/2017 02:43 PM, Michael Paquier wrote:
>> Instead of changing the default, I think that we should take this
>> occasion to rename PQencryptPassword to something like
>> PQhashPassword(), and extend it with a method argument to support both
>> md5 and scram. PQencryptPassword could also be marked as deprecated,
>> or left as-is for some time. For \password, we could have another
>> meta-command, but that sounds grotty, or just extend \password with a
>> --method argument. Switching the default could happen in another
>> release.
>>
>> A patch along those lines would be simple; do people feel that this
>> should be part of PG 10?
>
> Seems like it should work in an analogous way to CREATE/ALTER ROLE.
> According to the docs:
>
> 8<
> ENCRYPTED
> UNENCRYPTED
>
> These key words control whether the password is stored encrypted in
> the system catalogs. (If neither is specified, the default behavior is
> determined by the configuration parameter password_encryption.) If the
> presented password string is already in MD5-encrypted or SCRAM-encrypted
> format, then it is stored encrypted as-is, regardless of whether
> ENCRYPTED or UNENCRYPTED is specified (since the system cannot decrypt
> the specified encrypted password string). This allows reloading of
> encrypted passwords during dump/restore.
> 8<
>
> So if the password is not already set, \password uses
> password_encryption to determine which format to use, and if the
> password is already set, then the current method is assumed.

Yeah, the problem here being that this routine does not need a live
connection to work, and we surely don't want to make that mandatory;
that's why I am suggesting something like the above. Another approach
would be to switch to SCRAM once password_encryption does this switch as
well... There is no perfect scenario here.

--
Michael
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 2017-03-11 14:17:42 -0800, Andres Freund wrote:
> On 2017-03-07 09:36:51 -0300, Alvaro Herrera wrote:
> > FWIW, +1 on improving matters here.
> >
> > I think a toplevel installcheck-or-check target is a good first step
> > (though definitely let's find a better name). Just being able to run
> > all tests without the need for 95% of the pointless initdb's would be
> > helpful enough.
>
> Here's a hacky proof-of-concept patch that implements a contrib/
> 'fastcheck' target. Timings:
>
> make -s -j01 -Otarget check                                2m49.286s
> make -s -j16 -Otarget check                                0m44.120s
> make -s -j01 -Otarget fastcheck                            1m1.533s
> make -s -j16 -Otarget fastcheck                            0m24.385s
> make -s -j01 -Otarget USE_MODULE_DB=1 \
>   installcheck check-test_decoding-recurse                 0m56.016s
> make -s -j16 -Otarget USE_MODULE_DB=1 \
>   installcheck check-test_decoding-recurse                 0m23.608s

Ooops - all these timings are from a coverage-enabled build; the times
are overall a bit smaller without that.

- Andres
Re: [HACKERS] Changing references of password encryption to hashing
On 03/09/2017 06:16 PM, Michael Paquier wrote:
> As discussed here:
> https://www.postgresql.org/message-id/98cafcd0-5557-0bdf-4837-0f2b7782d...@joeconway.com
> we are using "encryption" in documentation and code comments to
> describe what actually is hashing, which is confusing.
>
> Attached is a patch for HEAD to change the documentation to match
> hashing.
>
> There are a couple of things I noticed on the way:
> 1) There is the user-visible PQencryptPassword in libpq, which
> actually hashes the password, and does not encrypt it :)
> 2) There is as well pg_md5_encrypt() in the code, as well as
> pg_md5_hash(). Those may be better if renamed; at least I would think
> that pg_md5_encrypt should be pg_md5_password_hash, because a salt is
> used as well, and the routine is dedicated to working on passwords.
> Thoughts?
> 3) createuser also has --encrypt and --unencrypted; perhaps those
> should be renamed? Honestly I don't really think that this is worth a
> breakage, and the option names match the SQL commands.

My opinion is that the user-visible aspects of this should be deprecated
and correct syntax provided. But perhaps that is overkill.

8<
 key and server key, respectively, in hexadecimal format. A password that
- does not follow either of those formats is assumed to be unencrypted.
+ does not follow either of those formats is assumed to be in plain format,
+ non-hashed.
8<

I think here, instead of "in plain format, non-hashed", it is ok to just
say "cleartext" or maybe "plaintext". Whichever is picked, it should be
used consistently.

8<
- To prevent unencrypted passwords from being sent across the network,
+ To prevent non-hashed passwords from being sent across the network,
8<

same here: "non-hashed" ==> "cleartext"

8<
- Caution must be exercised when specifying an unencrypted password
+ Caution must be exercised when specifying a non-hashed password
8<

and here ...etc...

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
[HACKERS] src/test/regress's check-prepared-txns target
Hi,

In
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba
src/test/regress grew a check-prepared-txns target (and an accompanying
installcheck-prepared-txns).

Is that still sensible given that pg_regress actually enables prepared
transactions? I bet these tests are run just about never.

I'd suggest removing at least check-prepared-txns and adding the test to
the normal check target (which is guaranteed to have prepared xacts
enabled).

Greetings,

Andres Freund
Re: [HACKERS] scram and \password
On 03/11/2017 02:21 PM, Michael Paquier wrote:
> On Sun, Mar 12, 2017 at 5:35 AM, Joe Conway wrote:
>> So if the password is not already set, \password uses
>> password_encryption to determine which format to use, and if the
>> password is already set, then the current method is assumed.
>
> Yeah, the problem here being that this routine does not need a live
> connection to work, and we surely don't want to make that mandatory;
> that's why I am suggesting something like the above. Another approach
> would be to switch to SCRAM once password_encryption does this switch
> as well... There is no perfect scenario here.

You might extend PQencryptPassword() to take a method. Or create a new
function that does? Certainly psql has a connection available to run the
ALTER ROLE command that it crafts.

I guess a related problem might be: do we have a SQL-visible way to
determine what method is used by the current password for a given role?

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
Re: [HACKERS] scram and \password
On Sun, Mar 12, 2017 at 8:04 AM, Joe Conway wrote:
> On 03/11/2017 02:21 PM, Michael Paquier wrote:
>> Yeah, the problem here being that this routine does not need a live
>> connection to work, and we surely don't want to make that mandatory;
>> that's why I am suggesting something like the above. Another approach
>> would be to switch to SCRAM once password_encryption does this switch
>> as well... There is no perfect scenario here.
>
> You might extend PQencryptPassword() to take a method. Or create a new
> function that does? Certainly psql has a connection available to run
> the ALTER ROLE command that it crafts.

Yeah, but it can be called as well while the application is calling
PQgetResult() and still looping until it gets a NULL result. Not sure if
this is a use case to worry about, but sending a query to the server in
PQencryptPassword() could as well break some applications.
PQencryptPassword() is used for CREATE/ALTER ROLE commands, so actually
wouldn't it make sense to just switch PQencryptPassword to handle SCRAM
if at some point we decide to switch the default from md5 to scram? So
many questions.

> I guess a related problem might be: do we have a SQL-visible way to
> determine what method is used by the current password for a given role?

Nope. We would simply need a function doing a lookup of pg_authid and
then using get_password_type() to check which type of verifier is
used... Or have the type of verifier as a new column of pg_authid,
information that could be made visible to any users with column
privileges.

--
Michael
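A sketch of what such a lookup function could look like on the backend
side. The function name and the result labels are invented, and the
exact get_password_type() enum values are assumed from the SCRAM patch
set, so treat this strictly as an illustration:

    /* Sketch: SQL-callable function reporting a role's verifier type. */
    Datum
    pg_role_password_type(PG_FUNCTION_ARGS)
    {
        Oid         roleid = PG_GETARG_OID(0);
        HeapTuple   tup;
        Datum       datum;
        bool        isnull;
        const char *result = "none";

        tup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
        if (!HeapTupleIsValid(tup))
            ereport(ERROR,
                    (errmsg("role with OID %u does not exist", roleid)));

        datum = SysCacheGetAttr(AUTHOID, tup,
                                Anum_pg_authid_rolpassword, &isnull);
        if (!isnull)
        {
            /* classify the stored verifier */
            switch (get_password_type(TextDatumGetCString(datum)))
            {
                case PASSWORD_TYPE_MD5:
                    result = "md5";
                    break;
                case PASSWORD_TYPE_SCRAM:   /* enum name assumed */
                    result = "scram";
                    break;
                default:
                    result = "plain";
                    break;
            }
        }
        ReleaseSysCache(tup);

        PG_RETURN_TEXT_P(cstring_to_text(result));
    }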
Re: [HACKERS] Need a builtin way to run all tests faster manner
On 2017-03-11 11:48:31 -0800, Andres Freund wrote: > On 2017-03-11 12:05:23 -0500, Tom Lane wrote: > > I wrote: > > > I believe the core problem is that contrib/test_decoding's regresscheck > > > and isolationcheck targets each want to use ./tmp_check as their > > > --temp-instance. make has no reason to believe it shouldn't run those > > > two sub-jobs in parallel, but when it does, we get two postmasters trying > > > to share the same directory. This looks reasonably straightforward to > > > solve, but I'm not entirely familiar with the code here, and am not > > > sure what is the least ugly way to fix it. > > > > Enlarging on that: if I cd into contrib/test_decoding and do > > "make check -j4" or so, it reliably fails. > > Yep, can reproduce here as well. Interesting that, with -j16, I could > survive several dozen runs from the toplevel locally. > > > > This is a localized patch that only fixes things for > > contrib/test_decoding; what I'm wondering is if it would be better to > > establish a more widespread convention that > > $(pg_isolation_regress_check) should use a different --temp-instance > > directory than $(pg_regress_check) does. > > I think that'd be a good plan. We probably should also keep --outputdir > seperate (which test_decoding/Makefile does, but > pg_isolation_regress_check doesn't)? Here's a patch doing that (based on yours). I'd be kind of inclined to set --outputdir for !isolation tests too; possibly even move tmp_check below output_iso/ output_regress/ or such - but that seems like it potentially could cause some disagreement... - Andres >From fb89f431d120e9123dca367d23f330b7d31b3f01 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Sat, 11 Mar 2017 15:03:18 -0800 Subject: [PATCH] Improve isolation regression tests infrastructure. Previously if a directory had both isolationtester and plain regression tests they couldn't be run in parallel, because they'd access the same files/directories. That, so far, only affected contrib/test_decoding. Rather than fix that locally in contrib/test_decoding improve pg_regress_isolation_[install]check to use separate resources from plain regression tests. Use the improved helpers everywhere, even where previously not used. 
Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20170311194831.vm5ikpczq52c2...@alap3.anarazel.de --- contrib/test_decoding/.gitignore | 5 +++-- contrib/test_decoding/Makefile | 7 +-- src/Makefile.global.in | 29 ++-- src/test/isolation/.gitignore| 5 ++--- src/test/isolation/Makefile | 8 src/test/modules/snapshot_too_old/.gitignore | 2 +- src/test/modules/snapshot_too_old/Makefile | 6 +++--- 7 files changed, 37 insertions(+), 25 deletions(-) diff --git a/contrib/test_decoding/.gitignore b/contrib/test_decoding/.gitignore index 1f95503494..b4903eba65 100644 --- a/contrib/test_decoding/.gitignore +++ b/contrib/test_decoding/.gitignore @@ -1,5 +1,6 @@ # Generated subdirectories /log/ -/isolation_output/ -/regression_output/ +/results/ +/output_iso/ /tmp_check/ +/tmp_check_iso/ diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile index d2bc8b8350..6c18189d9d 100644 --- a/contrib/test_decoding/Makefile +++ b/contrib/test_decoding/Makefile @@ -5,7 +5,7 @@ PGFILEDESC = "test_decoding - example of a logical decoding output plugin" # Note: because we don't tell the Makefile there are any regression tests, # we have to clean those result files explicitly -EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output +EXTRA_CLEAN = $(pg_regress_clean_files) ifdef USE_PGXS PG_CONFIG = pg_config @@ -42,11 +42,8 @@ REGRESSCHECKS=ddl xact rewrite toast permissions decoding_in_xact \ spill slot regresscheck: | submake-regress submake-test_decoding temp-install - $(MKDIR_P) regression_output $(pg_regress_check) \ --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \ - --temp-instance=./tmp_check \ - --outputdir=./regression_output \ $(REGRESSCHECKS) regresscheck-install-force: | submake-regress submake-test_decoding temp-install @@ -56,10 +53,8 @@ regresscheck-install-force: | submake-regress submake-test_decoding temp-install ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml isolationcheck: | submake-isolation submake-test_decoding temp-install - $(MKDIR_P) isolation_output $(pg_isolation_regress_check) \ --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \ - --outputdir=./isolation_output \ $(ISOLATIONCHECKS) isolationcheck-install-force: all | submake-isolation submake-test_decoding temp-install diff --git a/src/Makefile.global.in b/src/Makefile.global.in index 831d39a9d1..9796ffd753 100644 --- a/src/Makefile.global.in +++ b/src/Makefile.global.in @@ -545,14 +545,31 @@ TEMP_CONF += --temp-config=$(TEMP_CONFIG) endif pg
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
On Sat, Mar 11, 2017 at 4:17 PM, Tom Lane wrote:
> Corey Huinker writes:
> > [ 0001.if_endif.v21.diff ]
>
> Starting to poke at this... the proposal to add prove checks for psql just to see whether \if respects ON_ERROR_STOP seems like an incredibly expensive way to test a rather minor point. On my machine, "make check" in bin/psql goes from zero time to close to 8 seconds. I'm not really on board with adding that kind of time to every buildfarm run for the foreseeable future just for this.
>
> Couldn't we get close to the same coverage by adding a single-purpose test script to the main regression tests? Along the lines of
>
> \set ON_ERROR_STOP 1
> \if invalid
> \echo should not get here
> \endif
> \echo should not get here either
>
> You could imagine just dropping that at the end of psql.sql, but I think probably a separate script is worth the trouble.
>
> regards, tom lane

I think I can manage that. Just to be clear, you're asking me to replace the perl script with one new sql script? If so, there are probably a few non-on-stop tests in there that might be worth preserving in regression form.
Re: [HACKERS] possible encoding issues with libxml2 functions
On Mon, Feb 20, 2017 at 07:48:18PM +0100, Pavel Stehule wrote: > Today I played with xml_recv function and with xml processing functions. > > xml_recv function ensures correct encoding from document encoding to server > encoding. But the decl section holds original encoding info - that should > be obsolete after encoding. Sometimes we solve this issue by removing decl > section - see the xml_out function. > > Sometimes we don't do it - lot of functions uses direct conversion from > xmltype to xmlChar. > There are possible two fixes > > a) clean decl on input - the encoding info can be removed from decl part > > b) use xml_out_internal everywhere before transformation to > xmlChar. pg_xmlCharStrndup can be good candidate. I'd prefer (a) if the xml type were a new feature, because no good can come of storing an encoding in each xml field when we know the actual encoding is the database encoding. However, if you implemented (a), we'd still see untreated values brought over via pg_upgrade. Therefore, I would try (b) first. I suspect the intent of xml_parse() was to implement (b); it will be interesting to see your test case that malfunctions. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
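Concretely, (b) would mean routing each conversion site through xml_out_internal() before the xmlChar cast, roughly as follows (a sketch; both helpers already live in xml.c, and the signatures here are quoted from memory):

    /*
     * Sketch of approach (b), for a value "x" of type xmltype *:
     * normalize the decl section the same way xml_out() does before
     * handing the string to libxml2; target_encoding = 0 requests no
     * encoding conversion.
     */
    char       *str = xml_out_internal(x, 0);
    xmlChar    *xstr = pg_xmlCharStrndup(str, strlen(str));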
Re: [HACKERS] scram and \password
On 03/11/2017 03:15 PM, Michael Paquier wrote: > Yeah but it can be called as well while the application is calling > PQgetResult() and still looping until it gets a NULL result. Not sure > if this is a use-case to worry about, but sending a query to the > server in PQencryptPassword() could as well break some applications. I was suggesting sending the query outside of PQencryptPassword() in order to determine what method should be passed as a new argument to PQencryptPassword(). > PQencryptPassword() is used for CREATE/ALTER ROLE commands, so > actually wouldn't it make sense to just switch PQencryptPassword to > handle SCRAM if at some point we decide to switch the default from md5 > to scram? So many questions. As long as we support more than one method it would seem to me we need a way to determine which one we want to use and not only default it, don't we? Apologies if this has already been discussed -- I was not able to follow the lengthy threads on SCRAM in any detail. >> I guess a related problem might be, do we have a SQL visible way to >> determine what method is used by the current password for a given role? > > Nope. We are simply looking at a function doing a lookup at pg_authid > and then use get_password_type() to check which type of verifier is > used... Or have the type of verifier as a new column of pg_authid, > information that could be made visible to any users with column > privileges. What happens if the user does not have privs for pg_authid? E.g. if I want to change my own password what happens if the default is one method, and my password uses the other -- now I cannot change my own password using \password? Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Corey Huinker writes:
> [ 0001.if_endif.v21.diff ]

I had thought that this patch was pretty close to committable, but the more I poke at it the less I think so. The technology it uses for skipping unexecuted script sections has got too many bugs.

* Daniel Verite previously pointed out the desirability of disabling variable expansion while skipping script. That doesn't seem to be here, though there's an apparently-vestigial comment in psql/common.c claiming that it is. IIRC, I objected to putting knowledge of ConditionalStack into the shared psqlscan.l lexer, and I still think that would be a bad idea; but we need some way to get the lexer to shut that off. Probably the best way is to add a passthrough "void *" argument that would let the get_variable callback function mechanize the rule about not expanding in a false branch.

* Whether or not you think it's important not to expand skipped variables, I think that it's critical that skipped backtick expressions not be executed. The specific use-case that I'm concerned about is backtick evals in \if expressions, which are going to be all over the place as long as we haven't got any native expression eval capability, and will doubtless remain important even when/if we do. So in a case like

\if something
\elif `expr :var1 + :var2 = :var3`
\endif

I think it's essential that expr not be called if the first if-condition succeeded. (That first condition might be checking whether the vars contain valid integers, for example.) The current patch gets this totally wrong --- not only does it perform the backtick, but \elif complains if the result isn't a valid bool. I do not think that a skipped \if or \elif should evaluate its argument at all.

* The documentation says that an \if or \elif expression extends to the end of the line, but actually the code is just eating one OT_NORMAL argument. That means it's OK to do this:

regression=# \if 1 \echo foo \echo bar \endif
foo
bar
regression=#

which doesn't seem insane, except that the inverse case is insane:

regression=# \if 0 \echo foo \echo bar \endif
regression@#

(notice we're not out of the conditional). Even if we change it to eat the whole line as argument, this inconsistency will remain:

regression=# \if 1
regression=# \echo foo \endif
foo
regression=#

(notice we're out of the conditional)

regression=# \if 0
regression@# \echo foo \endif
command ignored, use \endif or Ctrl-C to exit current branch.
regression@#

(notice we're not out of the conditional)

This inconsistency seems to have to do with the code in HandleSlashCmds that discards arguments until EOL after a failed backslash command. You've modified that to also discard arguments after a non-executed command, and I think that's broken.

* More generally, I do not think that the approach of having exec_command simply fall out immediately when in a false branch is going to work, because it ignores the fact that different backslash commands have different argument parsing rules. Some will eat the rest of the line and some won't. I'm afraid that it might be necessary to remove that code block and add a test to every single backslash command that decides whether to actually perform its action after it's consumed its arguments. That would be tedious :-(. But as it stands, backslash commands will get parsed differently (ie with potentially-different ending points) depending on whether they're in a live branch or not, and that seems just way too error-prone to be allowed to stand.
* I think it's completely wrong to do either resetPQExpBuffer(query_buf) or psql_scan_reset(scan_state) when deciding a branch is not to be executed. Compare these results:

regression=# select (1 +
regression(# \if 1
regression-# \echo foo
foo
regression-# \endif
regression-# 2);
 ?column?
----------
        3
(1 row)

regression=# select (1 +
regression(# \if 0
regression-# \echo foo
command ignored, use \endif or Ctrl-C to exit current branch.
regression@# \endif
regression=# 2);
ERROR: syntax error at or near "2"
LINE 1: 2);
        ^
regression=#

If the first \if doesn't throw away an incomplete query buffer (which it should not), then surely the second should not either. Somebody who actually wants to toss the query buffer can put \r into the appropriate branch of their \if; it's not for us to force that on them.

* Also, the documentation for psql_scan_reset is pretty clear that it's to be called when and only when the query buffer is reset, which makes your calls in the bodies of the conditional commands wrong. As an example:

regression=# select (1 +
regression(# 2;
regression(#

(notice we've not sent the known-incomplete command to the server) vs

regression(# \r
Query buffer reset (cleared).
regression=# select (1 +
regression(# \if 1
regression-# \endif
regression-# 2;
ERROR: syntax error at or near ";"
LINE 2: 2;
        ^
regression=#

That happens because the \if code gratuitously resets the lexer, as we can
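As for the passthrough "void *" idea in the first point above, one possible shape (names illustrative; none of this is existing API):

    /*
     * Sketch: the application registers an opaque pointer with the shared
     * lexer, which hands it back in the get_variable callback; psql would
     * pass its ConditionalStack so the callback can decline to expand
     * variables while in a false branch.
     */
    extern void psql_scan_set_passthrough(PsqlScanState state, void *passthrough);

    /* and get_variable() in the callbacks struct grows a final argument: */
    char *(*get_variable) (const char *varname, bool escape, bool as_ident,
                           void *passthrough);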
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
On Fri, Mar 3, 2017 at 3:18 AM, Fabien COELHO wrote: > I'm ok with this patch. I think that the very simple automaton code > structure achieved is worth the very few code duplications. It is also > significantly shorter than the nested if/switch variants, and it does > exactly what Tom and Robert wished with respect to errors, so I think that > this is a good compromise. I think that I have not taken a firm position on what the behavior should be with respect to errors. I took the position that the messages being printed saying what happened were too detailed, because they not only described what had happened but also tried to prognosticate what would happen next, which was dissimilar to what we do elsewhere and likely to be hard to maintain - or even get right for v1. But I have not taken a position on what should happen if the condition for \if or \elsif evaluates to a baffling value. Corey's prior proposal was to treat it, essentially, as neither true nor false, skipping both arms of the if. Tom seems to want an invalid value treated as false. You could also imagine pretending that the command never happened at all, likely leading to complete chaos. Other positions are also possible. I suggest that doing it the way Tom likes may be the path of least resistance, but this isn't really something I'm very animated about personally. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Faster Expression Processing v4
On Fri, Mar 10, 2017 at 7:42 PM, Bruce Momjian wrote: > On Fri, Mar 10, 2017 at 07:15:59PM -0500, Peter Eisentraut wrote: >> On 3/10/17 14:40, Andres Freund wrote: >> > I'd really like to get it in. The performance improvements on its own >> > are significant, and it provides the basis for significantly larger >> > improvements again (JIT) - those followup improvements are large patches >> > again though, so I'd rather not do all of that next cycle. >> > >> > My next step (over the weekend) is to add tests to execQual.c to get it >> > a good chunk closer to 100% test coverage, and then do the same for the >> > new implementation (there's probably very little additional tests needed >> > after the conversion). Given all tests pass before/after, and there's a >> > lot of them, I think we can have a reasonable confidence of a low bug >> > density. >> >> That seems like a plan, but now would probably be a good time for some >> other hackers to take note of this proposal. > > Well, the executor is long overdue for improvement, so I would love to > have this in 10.0. I am not sure what additional polishing will happen > if we punt it for 11.0. I think the only downside of having it in 10.0 > is that it will not have lived in the source tree for as long a time > between commit and PG release, but if it is tested, that seems fine. Yeah. I think we can't rule out the possibility that this patch is full of bugs, but (1) a lot of the change is pretty mechanical and (2) none of this touches the on-disk format or the catalog contents, so it doesn't break anybody's existing install if we have to patch it. On the other hand, (3) it's a big patch and (4) if expression evaluation doesn't work, PostgreSQL is pretty much unusable. So my feeling is: 1. If Andres commits this and it turns out to be seriously buggy, he should be prepared to revert it without a big fight. We can try again for v11. 2. If Andres commits this and it turns out to be mildly buggy, he should be prepared to address bugs very proactively and very promptly. 3. If Andres wants to commit this for v10, it should be done RSN. Feature freeze isn't technically until the end of March, but giant potentially destabilizing patches that land on March 31st are likely to lead to outcomes that nobody wants. Like, I think, everyone else, I'd really like to have these speedups and I think they will benefit everyone who uses PostgreSQL. However, I think that we cannot afford to have a repeat of the simplehash thing where serious bugs sat around for months without really getting properly addressed. If that happens with this, I will be extremely unhappy, and I'm fairly sure I won't be alone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Sat, Mar 11, 2017 at 12:20 AM, Amit Kapila wrote: >> /* >> + * Change the shared buffer state in critical section, >> + * otherwise any error could make it unrecoverable after >> + * recovery. >> + */ >> +START_CRIT_SECTION(); >> + >> +/* >> * Insert tuple on new page, using _hash_pgaddtup to ensure >> * correct ordering by hashkey. This is a tad inefficient >> * since we may have to shuffle itempointers repeatedly. >> * Possible future improvement: accumulate all the items for >> * the new page and qsort them before insertion. >> */ >> (void) _hash_pgaddtup(rel, nbuf, itemsz, new_itup); >> >> +END_CRIT_SECTION(); >> >> No way. You have to start the critical section before making any page >> modifications and keep it alive until all changes have been logged. >> > > I think what we need to do here is to accumulate all the tuples that > need to be added to new bucket page till either that page has no more > space or there are no more tuples remaining in an old bucket. Then in > a critical section, add them to the page using _hash_pgaddmultitup and > log the entire new bucket page contents as is currently done in patch > log_split_page(). I agree. > Now, here we can choose to log the individual > tuples as well instead of a complete page, however not sure if there > is any benefit for doing the same because XLogRecordAssemble() will > anyway remove the empty space from the page. Let me know if you have > something else in mind. Well, if you have two pages that are 75% full, and you move a third of the tuples from one of them into the other, it's going to be about four times more efficient to log only the moved tuples than the whole page. But logging the whole filled page wouldn't be wrong, just somewhat inefficient. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
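The agreed shape, then, is roughly the following (function names are the patch's own; the argument lists are approximate):

    /*
     * Accumulate itups[]/itup_offsets[] destined for the new bucket page
     * until the page has no more space or the old bucket is exhausted,
     * then apply and WAL-log them atomically:
     */
    START_CRIT_SECTION();

    _hash_pgaddmultitup(rel, nbuf, itups, itup_offsets, nitups);
    MarkBufferDirty(nbuf);

    if (RelationNeedsWAL(rel))
        log_split_page(rel, nbuf);  /* logs the whole new bucket page */

    END_CRIT_SECTION();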
[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
> -Original Messages- > From: "Kevin Grittner" > Sent Time: 2017-03-12 04:24:29 (Sunday) > To: "Mengxing Liu" > Cc: "pgsql-hackers@postgresql.org" > Subject: Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in > serializable transactions > > On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu > wrote: > > > My name is Mengxing Liu. I am interested in the project "Eliminate > > O(N^2) scaling from rw-conflict tracking in serializable > > transactions”. After discussing with Kevin off-list, I think it's > > time to post discussion here. I am afraid that there are two things > > that I need your help. Thank you very much. > > Welcome to the -hackers list! This is a key part of how development > happens in the community. Don't be shy about posting questions and > ideas, but also expect fairly blunt discussion to occur at times; > don't let that put you off. > Thanks, Kevin. I will keep that in mind. > >>> So the task is clear. We can use a tree-like or hash-like data > >>> structure to speed up this function. > >> > >> Right -- especially with a large number of connections holding a > >> large number of conflicts. In one paper with high concurrency, they > >> found over 50% of the CPU time for PostgreSQL was going to these > >> functions (including functions called by them). This seems to me to > >> be due to the O(N^2) (or possibly worse) performance from the number > >> of connections. > > > > Anyone knows the title of this paper? I want to reproduce its > > workloads. > > I seem to remember there being a couple other papers or talks, but > this is probably the most informative: > > http://sydney.edu.au/engineering/it/research/tr/tr693.pdf > Thanks, It's exactly what I need. > >> Remember, I think most of the work here is going to be in > >> benchmarking. We not only need to show improvements in simple test > >> cases using readily available tools like pgbench, but think about > >> what types of cases might be worst for the approach taken and show > >> that it still does well -- or at least not horribly. It can be OK > >> to have some slight regression in an unusual case if the common > >> cases improve a lot, but any large regression needs to be addressed > >> before the patch can be accepted. There are some community members > >> who are truly diabolical in their ability to devise "worst case" > >> tests, and we don't want to be blind-sided by a bad result from one > >> of them late in the process. > >> > > > > Are there any documents or links introducing how to test and > > benchmark PostgreSQL? I may need some time to learn about it. > > There is pgbench: > > https://www.postgresql.org/docs/devel/static/pgbench.html > > A read-only load and a read/write mix should both be tested, > probably with a few different scales and client counts to force the > bottleneck to be in different places. The worst problems have been > seen with 32 or more cores on 4 or more sockets with a large number > of active connections. I don't know whether you have access to a > machine capable of putting this kind of stress on it (perhaps at > your university?), but if not, the community has access to various > resources we should be able to schedule time on. > There is a NUMA machine ( 120 cores, 8 sockets) in my lab. I think it's enough to put this kind of stress. Anyway, I would ask for help if necessary. > It may pay for you to search through the archives of the last year > or two to look for other benchmarks and see what people have > previously done. > I would try. 
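For a concrete starting point, I plan to try runs along these lines (scale and client counts are placeholders to be tuned to the machine):

    # initialize at a scale that does not trivially fit in memory
    pgbench -i -s 300 bench

    # read/write TPC-B-like load, every session serializable
    PGOPTIONS='-c default_transaction_isolation=serializable' \
        pgbench -c 64 -j 16 -T 300 -P 10 bench

    # read-only variant
    PGOPTIONS='-c default_transaction_isolation=serializable' \
        pgbench -S -c 64 -j 16 -T 300 -P 10 bench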
-- Mengxing Liu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Robert Haas writes: > I think that I have not taken a firm position on what the behavior > should be with respect to errors.I took the position that the > messages being printed saying what happened were too detailed, because > they not only described what had happened but also tried to > prognosticate what would happen next, which was dissimilar to what we > do elsewhere and likely to be hard to maintain - or even get right for > v1. I thought the same of the version you were complaining about, but the current patch seems to have dialed it back a good deal. Do you still find the current error messages unmaintainable? > But I have not taken a position on what should happen if the > condition for \if or \elsif evaluates to a baffling value. Corey's > prior proposal was to treat it, essentially, as neither true nor > false, skipping both arms of the if. Tom seems to want an invalid > value treated as false. You could also imagine pretending that the > command never happened at all, likely leading to complete chaos. Hmm, if that "prior proposal" was indeed on the table, I missed it. The current patch, AFAICS, implements your third choice, which I quite agree would lead to complete chaos; there would be no way to write a script that did anything useful with that. It is interesting to think about what would happen if "expr is neither true nor false" were defined as "skip immediately to \endif" (which I think is the natural generalization of what you said to apply to an intermediate \elif). I believe that it'd be possible to work with it, but it's not very clear if it'd be easier or harder to work with than the rule of treating bogus results as false. What is clear is that it'd be unlike any other conditional construct I ever worked with. As was pointed out upthread, "treat error results as false" is what you get from "if" in a POSIX shell. I think it's fair also to draw an analogy to what SQL does with null boolean values, which is to treat them as false when a decision is required (in, eg, WHERE or CASE). So I think "treat bogus results as false" is the most conservative, least likely to cause unhappy surprises, solution here. > Other positions are also possible. If you've got concrete ideas about that, let's hear them. I'm not trying to foreclose discussion. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
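The SQL behavior is easy to demonstrate:

    -- a null condition acts as false wherever a decision is required
    SELECT 'row' WHERE NULL::boolean;                            -- zero rows
    SELECT CASE WHEN NULL::boolean THEN 'then' ELSE 'else' END;  -- 'else'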
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Fri, Mar 10, 2017 at 7:39 PM, Amit Kapila wrote: > I agree that more analysis can help us to decide if we can use subxids > from PGPROC and if so under what conditions. Have you considered the > other patch I have posted to fix the issue which is to do this > optimization only when subxids are not present? In that patch, it > will remove the dependency of relying on subxids in PGPROC. Well, that's an option, but it narrows the scope of the optimization quite a bit. I think Simon previously opposed handling only the no-subxid cases (although I may be misremembering) and I'm not that keen about it either. I was wondering about doing an explicit test: if the XID being committed matches the one in the PGPROC, and nsubxids matches, and the actual list of XIDs matches, then apply the optimization. That could replace the logic that you've proposed to exclude non-commit cases, gxact cases, etc. and it seems fundamentally safer. But it might be a more expensive test, too, so I'm not sure. It would be nice to get some other opinions on how (and whether) to proceed with this. I'm feeling really nervous about this right at the moment, because it seems like everybody including me missed some fairly critical points relating to the safety (or lack thereof) of this patch, and I want to make sure that if it gets committed again, we've really got everything nailed down tight. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
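That explicit test might look roughly like this (a sketch, not tested; the field names assume the 9.6-era PGPROC/PGXACT split, and xid/nsubxids/subxids come from the commit being recorded):

    /*
     * Apply the clog group-update optimization only if the commit being
     * recorded matches exactly what this backend's proc entry advertises.
     */
    if (TransactionIdEquals(xid, pgxact->xid) &&
        nsubxids == pgxact->nxids &&
        memcmp(subxids, proc->subxids.xids,
               nsubxids * sizeof(TransactionId)) == 0)
    {
        /* safe to use the fast path */
    }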
Re: [HACKERS] src/test/regress's check-prepared-txns target
On 03/11/2017 06:00 PM, Andres Freund wrote: > Hi, > > In > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ae55d9fbe3871a5e6309d9b91629f1b0ff2b8cba > src/test/regress grew a check-prepared-txns (and an accompanying > installcheck-prepared-txns) target. > > Is that still sensible given that pg_regress actually enables prepared > transactions? I bet these tests are run just about never. > > I'd suggest removing at least check-prepared-txns and adding the test to > the normal check target (which is guaranteed to have prepared xacts enabled). > Sounds reasonable. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need a builtin way to run all tests faster manner
Andres Freund writes: > On 2017-03-11 11:48:31 -0800, Andres Freund wrote: >> I think that'd be a good plan. We probably should also keep --outputdir >> separate (which test_decoding/Makefile does, but >> pg_isolation_regress_check doesn't)? > Here's a patch doing that (based on yours). I'd be kind of inclined to > set --outputdir for !isolation tests too; possibly even move tmp_check > below output_iso/ output_regress/ or such - but that seems like it > potentially could cause some disagreement... This looks generally sane to me, although I'm not very happy about folding the "$(MKDIR_P) output_iso" call into pg_isolation_regress_check --- that seems weird and unlike the way it's done for the regular regression test case. But probably Peter has a better-informed opinion. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
On Sat, Mar 11, 2017 at 9:40 PM, Tom Lane wrote: > I thought the same of the version you were complaining about, but > the current patch seems to have dialed it back a good deal. Do you > still find the current error messages unmaintainable? I haven't looked, but I had the impression this had been much improved. >> But I have not taken a position on what should happen if the >> condition for \if or \elsif evaluates to a baffling value. Corey's >> prior proposal was to treat it, essentially, as neither true nor >> false, skipping both arms of the if. Tom seems to want an invalid >> value treated as false. You could also imagine pretending that the >> command never happened at all, likely leading to complete chaos. > > Hmm, if that "prior proposal" was indeed on the table, I missed it. > The current patch, AFAICS, implements your third choice, which I quite > agree would lead to complete chaos; there would be no way to write a > script that did anything useful with that. Well, other than: don't write a script with invalid commands in it. But I'm not seriously advocating for that position. > It is interesting to think about what would happen if "expr is neither > true nor false" were defined as "skip immediately to \endif" (which > I think is the natural generalization of what you said to apply to an > intermediate \elif). I believe that it'd be possible to work with it, > but it's not very clear if it'd be easier or harder to work with than > the rule of treating bogus results as false. What is clear is that > it'd be unlike any other conditional construct I ever worked with. True. > As was pointed out upthread, "treat error results as false" is what > you get from "if" in a POSIX shell. I think it's fair also to draw > an analogy to what SQL does with null boolean values, which is to > treat them as false when a decision is required (in, eg, WHERE or > CASE). So I think "treat bogus results as false" is the most > conservative, least likely to cause unhappy surprises, solution here. I don't mind that. I was simply stating that I hadn't advocated for anything in particular. >> Other positions are also possible. > > If you've got concrete ideas about that, let's hear them. I'm not > trying to foreclose discussion. I personally don't, but others may. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Append implementation
On Fri, Mar 10, 2017 at 8:12 AM, Amit Khandekar wrote: >> +static inline void >> +exec_append_scan_first(AppendState *appendstate) >> +{ >> +appendstate->as_whichplan = 0; >> +} >> >> I don't think this is buying you anything, and suggest backing it out. > > This is required for sequential Append, so that we can start executing > from the first subplan. My point is that there's really no point in defining a static inline function containing one line of code. You could just put that line of code in whatever places need it, which would probably be more clear. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Append implementation
On Fri, Mar 10, 2017 at 6:01 AM, Tels wrote: > Just a question for me to understand the implementation details vs. the > strategy: > > Have you considered how the scheduling decision might impact performance > due to "inter-plan parallelism vs. in-plan parallelism"? > > So what would be the scheduling strategy? And should there be a fixed one > or user-influencable? And what could be good ones? > > A simple example: > > E.g. if we have 5 subplans, and each can have at most 5 workers and we > have 5 workers overall. > > So, do we: > > Assign 5 workers to plan 1. Let it finish. > Then assign 5 workers to plan 2. Let it finish. > and so on > > or: > > Assign 1 workers to each plan until no workers are left? Currently, we do the first of those, but I'm pretty sure the second is way better. For example, suppose each subplan has a startup cost. If you have all the workers pile on each plan in turn, every worker pays the startup cost for every subplan. If you spread them out, then subplans can get finished without being visited by all workers, and then the other workers never pay those costs. Moreover, you reduce contention for spinlocks, condition variables, etc. It's not impossible to imagine a scenario where having all workers pile on one subplan at a time works out better: for example, suppose you have a table with lots of partitions all of which are on the same disk, and it's actually one physical spinning disk, not an SSD or a disk array or anything, and the query is completely I/O-bound. Well, it could be, in that scenario, that spreading out the workers is going to turn sequential I/O into random I/O and that might be terrible. In most cases, though, I think you're going to be better off. If the partitions are on different spindles or if there's some slack I/O capacity for prefetching, you're going to come out ahead, maybe way ahead. If you come out behind, then you're evidently totally I/O bound and have no capacity for I/O parallelism; in that scenario, you should probably just turn parallel query off altogether, because you're not going to benefit from it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Faster Expression Processing v4
Hi,

On 03/07/2017 04:01 AM, Andres Freund wrote:
> Hi,
>
> Attached is an updated version of the patchset, but more importantly some benchmark numbers.

I wanted to do a bit of testing and benchmarking on this, but 0004 seems to be a bit broken. The patch does not apply anymore - there are some conflicts in execQual.c, but I think I fixed those. But then I ran into a bunch of compile-time errors, because some of the executor nodes still reference bits that were moved elsewhere. E.g. there's no targetlist in PlanState anymore, yet nodeGatherMerge and nodeTableFuncscan do this:

gm_state->ps.targetlist = (List *)
    ExecInitExpr((Expr *) node->plan.targetlist,
                 (PlanState *) gm_state);

Some of the nodes also assign to ps.qual values that are (List *), but that field was switched to ExprState. That needs fixing, I guess.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
On Sat, Mar 11, 2017 at 5:45 PM, Tom Lane wrote: > > * Whether or not you think it's important not to expand skipped variables, > I think that it's critical that skipped backtick expressions not be > executed. > [...] > I do not think that a skipped \if or \elif > should evaluate its argument at all. > > [...] > * I'm not on board with having a bad expression result in failing > the \if or \elif altogether. It was stated several times upthread > that that should be processed as though the result were "false", > and I agree with that.

+1

Oddly, Corey was using you as support for this position...though without an actual quote:

"""
Tom was pretty adamant that invalid commands are not executed. So in a case like this, with ON_ERROR_STOP off:

\if false
\echo 'a'
\elif true
\echo 'b'
\elif invalid
\echo 'c'
\endif

Both 'b' and 'c' should print, because "\elif invalid" should not execute. The code I had before was simpler, but it missed that.
"""
https://www.postgresql.org/message-id/CADkLM%3De9BY_-PT96mcs4qqiLtt8t-Fp%3De_AdycW-aS0OQvbC9Q%40mail.gmail.com

Also, Robert made a comment somewhere along the line about users wanting to simply re-type the intended line if the "invalid" was interactive and due to a typo. That concern is pretty much limited to just the "\if" situation - if you typo an "\elif" block you can just type "\elif" again and begin yet another "\elif" block. I say we live with it and focus on the UX - if you type \if no matter what happens after you hit enter you are in a conditional block and will need to \endif at some point. Re-typing the correct \if command will just make you need another one of them.

David J.
Re: [HACKERS] scram and \password
On Sun, Mar 12, 2017 at 9:10 AM, Joe Conway wrote: > On 03/11/2017 03:15 PM, Michael Paquier wrote: >> Yeah but it can be called as well while the application is calling >> PQgetResult() and still looping until it gets a NULL result. Not sure >> if this is a use-case to worry about, but sending a query to the >> server in PQencryptPassword() could as well break some applications. > > I was suggesting sending the query outside of PQencryptPassword() in > order to determine what method should be passed as a new argument to > PQencryptPassword(). Why not. Our thoughts don't overlap, I thought about having PQencryptPassword() itself call the server for the value of password_encryption, and force the type depending on what the server answers. >> PQencryptPassword() is used for CREATE/ALTER ROLE commands, so >> actually wouldn't it make sense to just switch PQencryptPassword to >> handle SCRAM if at some point we decide to switch the default from md5 >> to scram? So many questions. > > As long as we support more than one method it would seem to me we need a > way to determine which one we want to use and not only default it, don't > we? Apologies if this has already been discussed -- I was not able to > follow the lengthy threads on SCRAM in any detail. Definitely, the simplest solution would be just to extend PQencryptPassword() with a method value, to allow a user to do what he wants... >>> I guess a related problem might be, do we have a SQL visible way to >>> determine what method is used by the current password for a given role? >> >> Nope. We are simply looking at a function doing a lookup at pg_authid >> and then use get_password_type() to check which type of verifier is >> used... Or have the type of verifier as a new column of pg_authid, >> information that could be made visible to any users with column >> privileges. > > What happens if the user does not have privs for pg_authid? E.g. if I > want to change my own password what happens if the default is one > method, and my password uses the other -- now I cannot change my own > password using \password? You can now. However it would be a problem for a user having a SCRAM verifier using an application that changes the password with PQencryptPassword() as it would change it back to MD5 on an update. Having RLS on pg_authid to allow a user to look at its own password type is an idea. With multiple verifier types per role, such a class of bugs could also be avoided entirely. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
> Starting to poke at this... the proposal to add prove checks for psql just to see whether \if respects ON_ERROR_STOP seems like an incredibly expensive way to test a rather minor point. On my machine, "make check" in bin/psql goes from zero time to close to 8 seconds. I'm not really on board with adding that kind of time to every buildfarm run for the foreseeable future just for this.

ISTM that these tests allowed us to find bugs in the implementation, so they were useful at some point. They are still useful in the short term if the implementation is to be changed significantly to respond to your various requirements.

The underlying issue with TAP tests is that they install a new cluster for each script, which is quite costly. In this case, the same result could be achieved with a number of small failing tests, which only launch "psql". Could that be acceptable?

What you suggest is to keep only *one* failing test, which I find to be kind of a regression from a testing-coverage perspective, although obviously it is possible.

-- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers