Re: proposal: possibility to read dumped table's name from file

Pavel Stehule Mon, 13 Jul 2020 01:22:07 -0700

On Sun, Jul 12, 2020 at 3:43, vignesh C <vignes...@gmail.com> wrote:


> On Mon, Jul 6, 2020 at 10:05 AM Pavel Stehule <pavel.steh...@gmail.com>
> wrote:
> >
> > here is support for comment lines - the first char should be #
> >
>
> Few comments:
> +               str = fgets(*lineptr + total_chars,
> +                                       *n - total_chars,
> +                                       fp);
> +
> +               if (ferror(fp))
> +                       return -1;
>
> Should we include an error message in the above case?
>
> +               else
> +                       break;
> +       }
> +
> +       if (ferror(fp))
> +               return -1;
>
> Similar to above.
>

It should be ok; both variants end with

	if (ferror(fp))
		fatal("could not read from file \"%s\": %m", filename);

%m should print the related error message.
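
For readers outside the PostgreSQL tree, here is a minimal sketch of the same idea in portable C. It assumes that %m in the frontend logging functions expands to strerror(errno), and it spells that out explicitly. The helper name count_lines_or_report and its parameters are hypothetical and not part of the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: count newline-terminated
 * lines in fp. On a read error, format a message into errbuf using
 * strerror(errno), which is what "%m" is assumed to expand to.
 * Returns the number of lines, or -1 on a read error.
 */
static long
count_lines_or_report(FILE *fp, const char *filename,
					  char *errbuf, size_t errlen)
{
	char		buf[256];
	long		lines = 0;

	errno = 0;
	while (fgets(buf, sizeof(buf), fp) != NULL)
	{
		size_t		len = strlen(buf);

		/* a long line arrives in several chunks; count it only once */
		if (len > 0 && buf[len - 1] == '\n')
			lines++;
	}

	/* fgets() returns NULL on both EOF and error; ferror() tells them apart */
	if (ferror(fp))
	{
		snprintf(errbuf, errlen, "could not read from file \"%s\": %s",
				 filename, strerror(errno));
		return -1;
	}

	return lines;
}
```

The point is only that a single ferror() check after the loop is enough to report the failure, so the inner "return -1" paths do not need their own messages.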


>
> +                       /* check, if there is good enough space for next content */
> +                       if (*n - total_chars < 2)
> +                       {
> +                               *n += 1024;
> +                               *lineptr = pg_realloc(*lineptr, *n);
> +                       }
> We could use a macro in place of 1024.
>

done


> +               if (objecttype == 't')
> +               {
> +                       if (is_include)
> +                       {
> +                               simple_string_list_append(&table_include_patterns,
> +                                                         objectname);
> +                               dopt.include_everything = false;
> +                       }
> +                       else
> +                               simple_string_list_append(&table_exclude_patterns,
> +                                                         objectname);
> +               }
> +               else if (objecttype == 'n')
> +               {
> +                       if (is_include)
> +                       {
> +                               simple_string_list_append(&schema_include_patterns,
> +                                                         objectname);
> +                               dopt.include_everything = false;
> +                       }
> +                       else
> +                               simple_string_list_append(&schema_exclude_patterns,
> +                                                         objectname);
> +               }
> Some of the above code is repetitive; can the common code be
> made into a macro and called?
>

There are two identical fragments and two different fragments. In this case
I don't think a macro or an auxiliary function would help readability.
The current code is well structured and readable.


>
>         printf(_("  --extra-float-digits=NUM     override default setting for extra_float_digits\n"));
> +       printf(_("  --filter=FILENAME            read object name filter expressions from file\n"));
>         printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
> Can this be changed to "dump objects and data based on the filter
> expressions from the filter file"?
>

I am sorry, I don't understand. This should already work for the data of
the objects specified by the filter, without any modification.

An updated patch is attached.

Regards

Pavel


> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
>

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 7a37fd8045..2f2bfb4dbf 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -755,6 +755,99 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--filter=<replaceable class="parameter">filename</replaceable></option></term>
+      <listitem>
+       <para>
+        Read object filters from the specified file.
+        If you use "-" as the filename, the filters are read from stdin. The file
+        must have the following line format:
+<synopsis>
+(+|-)[tnfd] <replaceable class="parameter">objectname</replaceable>
+</synopsis>
+        Only one object name can be specified per line:
+<programlisting>
++t mytable1
++t mytable2
++f some_foreign_table
+-d mytable3
+</programlisting>
+        With this file, the tables <literal>mytable1</literal> and
+        <literal>mytable2</literal> are dumped. The data of the foreign table
+        <literal>some_foreign_table</literal> is dumped too, and the data
+        of <literal>mytable3</literal> is not dumped.
+       </para>
+
+       <para>
+        The first character, <literal>+</literal> or <literal>-</literal>, specifies
+        whether the object name is used as an include or an exclude filter.
+       </para>
+
+       <para>
+        The second character
+        <literal>t</literal>,
+        <literal>n</literal>,
+        <literal>f</literal>,
+        <literal>d</literal>
+        specifies the object type.
+
+        <variablelist>
+         <varlistentry>
+          <term><literal>t</literal></term>
+          <listitem>
+           <para>
+            In inclusive form (<literal>+</literal>) it works like
+            <option>--table</option>. In exclusive form (<literal>-</literal>)
+            it is the same as <option>--exclude-table</option>.
+           </para>
+          </listitem>
+         </varlistentry>
+
+         <varlistentry>
+          <term><literal>n</literal></term>
+          <listitem>
+           <para>
+            In inclusive form (<literal>+</literal>) it works like
+            <option>--schema</option>. In exclusive form (<literal>-</literal>)
+            it is the same as <option>--exclude-schema</option>.
+           </para>
+          </listitem>
+         </varlistentry>
+
+         <varlistentry>
+          <term><literal>f</literal></term>
+          <listitem>
+           <para>
+            In inclusive form (<literal>+</literal>) it works like
+            <option>--include-foreign-data</option>. The exclusive form
+            (<literal>-</literal>) is not allowed.
+           </para>
+          </listitem>
+         </varlistentry>
+
+         <varlistentry>
+          <term><literal>d</literal></term>
+          <listitem>
+           <para>
+            The inclusive form (<literal>+</literal>) is not allowed.
+            In exclusive form (<literal>-</literal>) it is the same as
+            <option>--exclude-table-data</option>.
+           </para>
+          </listitem>
+         </varlistentry>
+        </variablelist>
+       </para>
+
+       <para>
+        The option <option>--filter</option> can be used together with options
+        <option>--table</option>, <option>--exclude-table</option>,
+        <option>--schema</option>, <option>--exclude-schema</option>,
+        <option>--include-foreign-data</option> and
+        <option>--exclude-table-data</option>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--if-exists</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e758b5c50a..fd6b7a174a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -290,6 +290,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(TableInfo *tbinfo);
+static void read_patterns_from_file(char *filename, DumpOptions *dopt);
 
 
 int
@@ -364,6 +365,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -603,6 +605,10 @@ main(int argc, char **argv)
 										  optarg);
 				break;
 
+			case 12:			/* filter implementation */
+				read_patterns_from_file(optarg, &dopt);
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -1022,6 +1028,7 @@ help(const char *progname)
 			 "                               access to)\n"));
 	printf(_("  --exclude-table-data=PATTERN do NOT dump data for the specified table(s)\n"));
 	printf(_("  --extra-float-digits=NUM     override default setting for extra_float_digits\n"));
+	printf(_("  --filter=FILENAME            read object name filter expressions from file\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --include-foreign-data=PATTERN\n"
 			 "                               include data of foreign tables on foreign\n"
@@ -18597,3 +18604,211 @@ appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 	if (!res)
 		pg_log_warning("could not parse reloptions array");
 }
+
+#define		FILTER_INITIAL_LINE_SIZE		1024
+#define		PG_GETLINE_EXTEND_LINE_SIZE		1024
+
+/*
+ * getline is originally a GNU function and may not be available everywhere.
+ * Use our own reduced implementation.
+ */
+static ssize_t
+pg_getline(char **lineptr, size_t *n, FILE *fp)
+{
+	size_t		total_chars = 0;
+
+	while (!feof(fp) && !ferror(fp))
+	{
+		char	   *str;
+		size_t		chars;
+
+		str = fgets(*lineptr + total_chars,
+					*n - total_chars,
+					fp);
+
+		if (ferror(fp))
+			return -1;
+
+		if (str)
+		{
+			chars = strlen(str);
+			total_chars += chars;
+
+			if (chars > 0 && str[chars - 1] == '\n')
+				return total_chars;
+
+			/* make sure there is enough space for the next chunk */
+			if (*n - total_chars < 2)
+			{
+				*n += PG_GETLINE_EXTEND_LINE_SIZE;
+				*lineptr = pg_realloc(*lineptr, *n);
+			}
+		}
+		else
+			break;
+	}
+
+	if (ferror(fp))
+		return -1;
+
+	return total_chars > 0 ? total_chars : -1;
+}
+
+/*
+ * Print error message and exit.
+ */
+static void
+exit_invalid_filter_format(FILE *fp, char *filename, char *message, char *line, int lineno)
+{
+	pg_log_error("invalid format of filter file \"%s\": %s",
+				 filename,
+				 message);
+
+	fprintf(stderr, "%d: %s\n", lineno, line);
+
+	if (fp != stdin)
+		fclose(fp);
+
+	exit_nicely(1);
+}
+
+/*
+ * Read dumped object specification from file
+ */
+static void
+read_patterns_from_file(char *filename, DumpOptions *dopt)
+{
+	FILE	   *fp;
+	char	   *line;
+	ssize_t		chars;
+	size_t		line_size = FILTER_INITIAL_LINE_SIZE;
+	int			lineno = 0;
+
+	/* use "-" as symbol for stdin */
+	if (strcmp(filename, "-") != 0)
+	{
+		fp = fopen(filename, "r");
+		if (!fp)
+			fatal("could not open the input file \"%s\": %m",
+				  filename);
+	}
+	else
+		fp = stdin;
+
+	line = pg_malloc(line_size);
+
+	while ((chars = pg_getline(&line, &line_size, fp)) != -1)
+	{
+		bool		is_include;
+		char		objecttype;
+		char	   *objectname;
+
+		lineno += 1;
+
+		if (line[chars - 1] == '\n')
+			line[chars - 1] = '\0';
+
+		/* ignore empty rows */
+		if (*line == '\0')
+			continue;
+
+		/* when first char is hash, ignore whole line */
+		if (*line == '#')
+			continue;
+
+		if (chars < 2)
+			exit_invalid_filter_format(fp,
+									   filename,
+									   "too short line",
+									   line,
+									   lineno);
+
+		if (line[0] == '+')
+			is_include = true;
+		else if (line[0] == '-')
+			is_include = false;
+		else
+			exit_invalid_filter_format(fp,
+									   filename,
+									   "invalid option type (use [+-])",
+									   line,
+									   lineno);
+
+		objecttype = line[1];
+		objectname = &line[2];
+
+		/* skip initial spaces */
+		while (isspace((unsigned char) *objectname))
+			objectname++;
+
+		if (*objectname == '\0')
+			exit_invalid_filter_format(fp,
+									   filename,
+									   "missing object name",
+									   line,
+									   lineno);
+
+		if (objecttype == 't')
+		{
+			if (is_include)
+			{
+				simple_string_list_append(&table_include_patterns,
+										  objectname);
+				dopt->include_everything = false;
+			}
+			else
+				simple_string_list_append(&table_exclude_patterns,
+										  objectname);
+		}
+		else if (objecttype == 'n')
+		{
+			if (is_include)
+			{
+				simple_string_list_append(&schema_include_patterns,
+										  objectname);
+				dopt->include_everything = false;
+			}
+			else
+				simple_string_list_append(&schema_exclude_patterns,
+										  objectname);
+		}
+		else if (objecttype == 'd')
+		{
+			if (is_include)
+				exit_invalid_filter_format(fp,
+										   filename,
+										   "include filter is not supported for this type of object",
+										   line,
+										   lineno);
+			else
+				simple_string_list_append(&tabledata_exclude_patterns,
+										  objectname);
+		}
+		else if (objecttype == 'f')
+		{
+			if (is_include)
+				simple_string_list_append(&foreign_servers_include_patterns,
+										  objectname);
+			else
+				exit_invalid_filter_format(fp,
+										   filename,
+										   "exclude filter is not supported for this type of object",
+										   line,
+										   lineno);
+		}
+		else
+			exit_invalid_filter_format(fp,
+									   filename,
+									   "invalid object type (use [tndf])",
+									   line,
+									   lineno);
+	}
+
+	if (ferror(fp))
+		fatal("could not read from file \"%s\": %m", filename);
+
+	if (fp != stdin)
+		fclose(fp);
+
+	pg_free(line);
+}
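
To make the line format easier to check in isolation, here is a minimal standalone sketch of parsing a single filter line of the form (+|-)[tnfd] objectname. It follows the checks in read_patterns_from_file above, but returns a status code instead of exiting. The names FilterStatus and parse_filter_line are hypothetical and do not appear in the patch, and the line is assumed to have its trailing newline already removed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes, for illustration only */
typedef enum
{
	FILTER_OK = 0,
	FILTER_SKIP,				/* empty line or '#' comment */
	FILTER_TOO_SHORT,
	FILTER_BAD_SIGN,			/* first char is not '+' or '-' */
	FILTER_MISSING_NAME,
	FILTER_BAD_TYPE,			/* second char is not one of t, n, f, d */
	FILTER_UNSUPPORTED			/* "+d" or "-f" */
} FilterStatus;

/*
 * Parse one filter line. On FILTER_OK the output parameters are set;
 * *objectname points into the caller's line buffer.
 */
static FilterStatus
parse_filter_line(const char *line, bool *is_include,
				  char *objecttype, const char **objectname)
{
	const char *name;
	bool		include;

	if (line[0] == '\0' || line[0] == '#')
		return FILTER_SKIP;
	if (line[1] == '\0')
		return FILTER_TOO_SHORT;

	if (line[0] == '+')
		include = true;
	else if (line[0] == '-')
		include = false;
	else
		return FILTER_BAD_SIGN;

	/* skip spaces between the type character and the object name */
	name = line + 2;
	while (isspace((unsigned char) *name))
		name++;
	if (*name == '\0')
		return FILTER_MISSING_NAME;

	if (strchr("tnfd", line[1]) == NULL)
		return FILTER_BAD_TYPE;

	/* table data can only be excluded, foreign data only included */
	if ((line[1] == 'd' && include) || (line[1] == 'f' && !include))
		return FILTER_UNSUPPORTED;

	*is_include = include;
	*objecttype = line[1];
	*objectname = name;
	return FILTER_OK;
}
```

With this sketch, "+t mytable1" yields an include filter of type 't' for "mytable1", while "+d mytable3" and "-f some_foreign_table" are rejected, matching the restrictions described in the documentation part of the patch.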
