DDL deparsing is a feature that allows collection of DDL commands as they are executed in a server, in some flexible, complete, and fully-contained format that allows manipulation, storage, and transmission. This feature has several use cases; the two best known ones are DDL replication and DDL auditing.
We have came up with a design that uses a JSON structure to store commands. It is similar to the C sprintf() call in spirit: there is a base format string, which is a generic template for each command type, and contains placeholders that represent the variable parts of the command. The values for the placeholders in each specific command are members of the JSON object. A helper function is provided that expands the format string and replaces the placeholders with the values, and returns the SQL command as text. This design lets the user operate on the JSON structure in either a read-only fashion (for example to block table creation if the names don't satisfy a certain condition), or by modifying it (for example, to change the schema name so that tables are created in different schemas when they are replicated to some remote server). This design is mostly accepted by the community. The one sticking point is testing: how do we ensure that the JSON representation we have created correctly deparses back into a command that has the same effect as the original command. This was expressed by Robert Haas: http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com The problem cannot be solved by a standard regression test module which runs a bunch of previously-defined commands and verifies the output. We need not only check the output for the commands as they exist today, but also we need to ensure that this does not get broken as future patches modify the existing commands as well as create completely new commands. The challenge here is to create a new testing framework that ensures the DDL deparsing module will be maintained by future hackers as the DDL grammar is modified. What and How to Test -------------------- Our goal should be that patch authors run "make check-world" in their patched copies and notice that the DDL deparse test is failing; they can then modify deparse_utility.c to add support for the new commands, which should in general be pretty straightforward. This way, maintaining deparsing code would be part of new patches just like we require pg_dump support and documentation for new features. It would not work to require patch authors to add their new commands to a new pg_regress test file, because most would not be aware of the need, or they would just forget to do it, and patches would be submitted and possibly even committed without any realization of the breakage caused. There are two things we can rely on: standard regression tests, and pg_dump. Standard regression tests are helpful because patch authors include new test cases in the patches that stress their new options or commands. This new test framework needs to be something that internally runs the regression tests and exercises DDL deparsing as the regression tests execute DDL. That would mean that new commands and options would automatically be deparse-tested by our new framework as soon as patch authors add standard regress support. One thing is first-grade testing, that is ensure that the deparsed version of a DDL command can be executed at all, for if the deparsed version throws an error, it's immediately obvious that the deparse code is bogus. This is because we know the original command did not throw an error: otherwise, the deparse code would not have run at all, because ddl_command_end triggers are only executed once the original command has completed execution. So first-grade testing ensures that no trivial bugs are present. But there's second-grade verification as well: is the object produced by the deparsed version identical to the one produced by the original command? One trivial but incomplete approach is to run the command, then save the deparsed version; run the deparsed version, and deparse that one too; compare both. The problem with this approach is that if the deparse code is omitting some clause (say it omits IN TABLESPACE in a CREATE TABLE command), then both deparsed versions would contain the same bug yet they would compare equal. Therefore this approach is not good enough. The best idea we have so far to attack second-grade testing is to trust pg_dump to expose differences: accumulate commands as they run in the regression database, the run the deparsed versions in a different database; then pg_dump both databases and compare the dumped outputs. Proof-of-concept ---------------- We have now implemented this as a proof-of-concept; the code is available in the deparse branch at: http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git a diff is attached for reference, but relies on the deparsing functionality available in the deparse branch. To implement the DDL deparsing, a pseudo-test is executed first, which creates an event trigger and a table in which to store queries captured by the event trigger. Following conclusion of all regression tests, a further test is executed which exports the query table, imports it into the second database and runs pg_dump; the output of this is then compared against the expected output. This test can fail either at the import stage, if the deparsed commands are syntactically incorrect, or at the comparison stage, if the a deparsed command is valid but syntactically different to the original. To facilitate this, some minimal changes to pg_regress itself have been necessary. In the current proof-of-concept it automatically creates (and where appropriate drops) the "shadow" database used to load the deparsed commands; and also provides a couple of additional tokens to the .source files to provide information otherwise unavailable to the SQL scripts such as the location of pg_dump and the name of the "shadow" database. A simple schedule to demonstrate this is available; execute from the src/test/regress/ directory like this: ./pg_regress \ --temp-install=./tmp_check \ --top-builddir=../../.. \ --dlpath=. \ --schedule=./schedule_ddl_deparse_demo Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/test/regress/expected/.gitignore b/src/test/regress/expected/.gitignore index 93c56c8..2eeaf57 100644 --- a/src/test/regress/expected/.gitignore +++ b/src/test/regress/expected/.gitignore @@ -7,3 +7,7 @@ /misc.out /security_label.out /tablespace.out +/create_function_ddl_demo.out +/deparse_init.out +/deparse_test.out + diff --git a/src/test/regress/expected/create_table_ddl_demo.out b/src/test/regress/expected/create_table_ddl_demo.out new file mode 100644 index 0000000..2fb3f9c --- /dev/null +++ b/src/test/regress/expected/create_table_ddl_demo.out @@ -0,0 +1,5 @@ +CREATE TABLE hobbies_r ( + name text, + person text +); +DROP TABLE hobbies_r; diff --git a/src/test/regress/input/create_function_ddl_demo.source b/src/test/regress/input/create_function_ddl_demo.source new file mode 100644 index 0000000..9982a73 --- /dev/null +++ b/src/test/regress/input/create_function_ddl_demo.source @@ -0,0 +1,4 @@ +CREATE FUNCTION check_foreign_key () + RETURNS trigger + AS '@libdir@/refint@DLSUFFIX@' + LANGUAGE C; \ No newline at end of file diff --git a/src/test/regress/input/deparse_init.source b/src/test/regress/input/deparse_init.source new file mode 100644 index 0000000..2a53cc3 --- /dev/null +++ b/src/test/regress/input/deparse_init.source @@ -0,0 +1,18 @@ +-- +-- DEPARSE_INIT +-- + +CREATE TABLE deparse_test_commands ( + id SERIAL PRIMARY KEY, + command TEXT +); + +CREATE FUNCTION deparse_test_ddl_command_end() + RETURNS event_trigger + LANGUAGE C +AS '@libdir@/regress@DLSUFFIX@', 'deparse_test_ddl_command_end'; + +/* This should come last - we don't want to log anything defined here */ +CREATE EVENT TRIGGER deparse_test_trg_ddl_command_end + ON ddl_command_end + EXECUTE PROCEDURE deparse_test_ddl_command_end(); \ No newline at end of file diff --git a/src/test/regress/input/deparse_test.source b/src/test/regress/input/deparse_test.source new file mode 100644 index 0000000..049924f --- /dev/null +++ b/src/test/regress/input/deparse_test.source @@ -0,0 +1,8 @@ +--- +--- DEPARSE_TEST +--- + +\copy (SELECT command || ';' FROM deparse_test_commands ORDER BY id) TO './sql/deparse_dump.sql' +\! @psqldir@/psql --dbname=@deparse_test_db@ < ./sql/deparse_dump.sql > /dev/null + +\! @psqldir@/pg_dump --schema-only --no-owner --no-privileges -Fp @deparse_test_db@ diff --git a/src/test/regress/output/create_function_ddl_demo.source b/src/test/regress/output/create_function_ddl_demo.source new file mode 100644 index 0000000..58dba52 --- /dev/null +++ b/src/test/regress/output/create_function_ddl_demo.source @@ -0,0 +1,4 @@ +CREATE FUNCTION check_foreign_key () + RETURNS trigger + AS '@libdir@/refint@DLSUFFIX@' + LANGUAGE C; diff --git a/src/test/regress/output/deparse_init.source b/src/test/regress/output/deparse_init.source new file mode 100644 index 0000000..10cc234 --- /dev/null +++ b/src/test/regress/output/deparse_init.source @@ -0,0 +1,15 @@ +-- +-- DEPARSE_INIT +-- +CREATE TABLE deparse_test_commands ( + id SERIAL PRIMARY KEY, + command TEXT +); +CREATE FUNCTION deparse_test_ddl_command_end() + RETURNS event_trigger + LANGUAGE C +AS '@libdir@/regress@DLSUFFIX@', 'deparse_test_ddl_command_end'; +/* This should come last - we don't want to log anything defined here */ +CREATE EVENT TRIGGER deparse_test_trg_ddl_command_end + ON ddl_command_end + EXECUTE PROCEDURE deparse_test_ddl_command_end(); diff --git a/src/test/regress/output/deparse_test.source b/src/test/regress/output/deparse_test.source new file mode 100644 index 0000000..2d5b072 --- /dev/null +++ b/src/test/regress/output/deparse_test.source @@ -0,0 +1,64 @@ +--- +--- DEPARSE_TEST +--- +\copy (SELECT command || ';' FROM deparse_test_commands ORDER BY id) TO './sql/deparse_dump.sql' +\! @psqldir@/psql --dbname=@deparse_test_db@ < ./sql/deparse_dump.sql > /dev/null +\! @psqldir@/pg_dump --schema-only --no-owner --no-privileges -Fp @deparse_test_db@ +-- +-- PostgreSQL database dump +-- + +-- Dumped from database version 9.5devel +-- Dumped by pg_dump version 9.5devel + +SET row_security = off; +SET statement_timeout = 0; +SET lock_timeout = 0; +SET client_encoding = 'UTF8'; +SET standard_conforming_strings = on; +SET check_function_bodies = false; +SET client_min_messages = warning; + +-- +-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: - +-- + +CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog; + + +-- +-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: - +-- + +COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language'; + + +SET search_path = public, pg_catalog; + +-- +-- Name: check_foreign_key(); Type: FUNCTION; Schema: public; Owner: - +-- + +CREATE FUNCTION check_foreign_key() RETURNS trigger + LANGUAGE c + AS '@libdir@/refint@DLSUFFIX@', 'check_foreign_key'; + + +SET default_tablespace = ''; + +SET default_with_oids = false; + +-- +-- Name: hobbies_r; Type: TABLE; Schema: public; Owner: -; Tablespace: +-- + +CREATE TABLE hobbies_r ( + name text, + person text +); + + +-- +-- PostgreSQL database dump complete +-- + diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 27c46ab..965f561 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -81,6 +81,7 @@ const char *pretty_diff_opts = "-w -C3"; /* options settable from command line */ _stringlist *dblist = NULL; +char *deparse_test_db = NULL; bool debug = false; char *inputdir = "."; char *outputdir = "."; @@ -587,6 +588,8 @@ convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c replace_string(line, "@testtablespace@", testtablespace); replace_string(line, "@libdir@", dlpath); replace_string(line, "@DLSUFFIX@", DLSUFFIX); + replace_string(line, "@psqldir@", psqldir); + replace_string(line, "@deparse_test_db@", deparse_test_db); fputs(line, outfile); } fclose(infile); @@ -1959,6 +1962,8 @@ help(void) printf(_("Options:\n")); printf(_(" --create-role=ROLE create the specified role before testing\n")); printf(_(" --dbname=DB use database DB (default \"regression\")\n")); + printf(_(" --dbname-deparse=DB use database DB for DDL deparse test (default\n")); + printf(_(" \"regression_deparse\")\n")); printf(_(" --debug turn on debug mode in programs that are run\n")); printf(_(" --dlpath=DIR look for dynamic libraries in DIR\n")); printf(_(" --encoding=ENCODING use ENCODING as the encoding\n")); @@ -2023,6 +2028,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc {"launcher", required_argument, NULL, 21}, {"load-extension", required_argument, NULL, 22}, {"extra-install", required_argument, NULL, 23}, + {"dbname-deparse", required_argument, NULL, 24}, {NULL, 0, NULL, 0} }; @@ -2137,6 +2143,9 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc case 23: add_stringlist_item(&extra_install, optarg); break; + case 24: + deparse_test_db = strdup(optarg); + break; default: /* getopt_long already emitted a complaint */ fprintf(stderr, _("\nTry \"%s -h\" for more information.\n"), @@ -2419,6 +2428,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc { for (sl = dblist; sl; sl = sl->next) drop_database_if_exists(sl->str); + drop_database_if_exists(deparse_test_db); for (sl = extraroles; sl; sl = sl->next) drop_role_if_exists(sl->str); } @@ -2431,6 +2441,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc { for (sl = dblist; sl; sl = sl->next) create_database(sl->str); + create_database(deparse_test_db); for (sl = extraroles; sl; sl = sl->next) create_role(sl->str, dblist); } diff --git a/src/test/regress/pg_regress.h b/src/test/regress/pg_regress.h index 942d761..c1f38ba 100644 --- a/src/test/regress/pg_regress.h +++ b/src/test/regress/pg_regress.h @@ -38,6 +38,7 @@ extern char *datadir; extern char *host_platform; extern _stringlist *dblist; +extern char *deparse_test_db; extern bool debug; extern char *inputdir; extern char *outputdir; diff --git a/src/test/regress/pg_regress_main.c b/src/test/regress/pg_regress_main.c index 22197aa..fb5d00c 100644 --- a/src/test/regress/pg_regress_main.c +++ b/src/test/regress/pg_regress_main.c @@ -88,6 +88,7 @@ psql_init(int argc, char **argv) { /* set default regression database name */ add_stringlist_item(&dblist, "regression"); + deparse_test_db = "regression_deparse"; } int diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c index 1487171..84bd06d 100644 --- a/src/test/regress/regress.c +++ b/src/test/regress/regress.c @@ -13,6 +13,7 @@ #include "access/tuptoaster.h" #include "access/xact.h" #include "catalog/pg_type.h" +#include "commands/event_trigger.h" #include "commands/sequence.h" #include "commands/trigger.h" #include "executor/executor.h" @@ -1104,3 +1105,81 @@ test_atomic_ops(PG_FUNCTION_ARGS) PG_RETURN_BOOL(true); } + + +/* + * deparse_test_ddl_command_end() + * + * Event trigger function to assist with DDL deparse regression testing. + * + * At the end of each DDL command, this function retrieves the JSON + * representation of the command, converts it back to a query, and + * stores it in a table for later use. + */ + +PG_FUNCTION_INFO_V1(deparse_test_ddl_command_end); +Datum +deparse_test_ddl_command_end(PG_FUNCTION_ARGS) +{ + int ret, row; + TupleDesc spi_tupdesc; + const char *get_creation_commands; + const char *save_command_text; + + MemoryContext tmpcontext; + MemoryContext oldcontext; + + if (!CALLED_AS_EVENT_TRIGGER(fcinfo)) + elog(ERROR, "not fired by event trigger manager"); + + get_creation_commands = + "SELECT command FROM pg_event_trigger_get_creation_commands()"; + + save_command_text = + "INSERT INTO deparse_test_commands (command) VALUES ($1)"; + + tmpcontext = AllocSetContextCreate(CurrentMemoryContext, + "deparse_test_ddl_command_end temporary context", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + oldcontext = MemoryContextSwitchTo(tmpcontext); + + ret = SPI_connect(); + if (ret < 0) + elog(ERROR, "deparse_test_ddl_command_end: SPI_connect returned %d", ret); + + ret = SPI_execute(get_creation_commands, true, 0); + if (ret != SPI_OK_SELECT) + elog(ERROR, "deparse_test_ddl_command_end: SPI_execute returned %d", ret); + + spi_tupdesc = SPI_tuptable->tupdesc; + + for (row = 0; row < SPI_processed; row++) + { + HeapTuple spi_tuple; + Datum json; + Datum command; + bool isnull; + Oid argtypes[1]; + Datum values[1]; + + spi_tuple = SPI_tuptable->vals[row]; + + json = SPI_getbinval(spi_tuple, spi_tupdesc, 1, &isnull); + command = DirectFunctionCall1(pg_event_trigger_expand_command, json); + + argtypes[0] = TEXTOID; + values[0] = command; + + ret = SPI_execute_with_args(save_command_text, 1, argtypes, + values, NULL, false, 0); + } + + SPI_finish(); + MemoryContextSwitchTo(oldcontext); + MemoryContextDelete(tmpcontext); + + PG_RETURN_NULL(); +} + diff --git a/src/test/regress/schedule_ddl_deparse_demo b/src/test/regress/schedule_ddl_deparse_demo new file mode 100644 index 0000000..43574ad --- /dev/null +++ b/src/test/regress/schedule_ddl_deparse_demo @@ -0,0 +1,13 @@ +# schedule + +# Initialise DDL event trigger and table to store expanded DDL deparse +# items; this is not a regression-test per se, but is required by the final +# DDL deparse test +test: deparse_init + +# Normal tests... +test: create_table_ddl_demo +test: create_function_ddl_demo + +# Final DDL deparse test +test: deparse_test \ No newline at end of file diff --git a/src/test/regress/sql/.gitignore b/src/test/regress/sql/.gitignore index 46c8112..374a58d 100644 --- a/src/test/regress/sql/.gitignore +++ b/src/test/regress/sql/.gitignore @@ -6,3 +6,7 @@ /misc.sql /security_label.sql /tablespace.sql +/create_function_ddl_demo.sql +/deparse_init.sql +/deparse_test.sql +/deparse_dump.sql diff --git a/src/test/regress/sql/create_table_ddl_demo.sql b/src/test/regress/sql/create_table_ddl_demo.sql new file mode 100644 index 0000000..b672b26 --- /dev/null +++ b/src/test/regress/sql/create_table_ddl_demo.sql @@ -0,0 +1,6 @@ +CREATE TABLE hobbies_r ( + name text, + person text +); + +DROP TABLE hobbies_r;
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers