Hello Amit,

Thanks for the feedback.

+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;

Is there a reason why we treat "partitions = 0" as a valid value?

Yes. It is an explicit "do not create partitioned tables", which differ from 1 which says "create a partitionned table with just one partition".

Also, shouldn't we keep some max limit for this parameter as well?

I do not think so. If someone wants to test how terrible it is to use 100000 partitions, we should not prevent it.

Forex. how realistic it will be if the user gives the value of
partitions the same or greater than the number of rows in
pgbench_accounts table?

Although I agree that it does not make much sense, for testing purposes why not, to test overheads in critical cases for instance.

I understand it is not sensible to give such a value, but I guess the API should behave sanely in such cases as well.

Yep, it should work.

I am not sure what will be the good max value for it, but I
think there should be one.

I disagree. Pgbench is a tool for testing performance for given parameters. If postgres accepts a parameter there is no reason why pgbench should reject it.

@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
 const char *bigcols; /* column decls if accountIDs are 64 bits */
 int declare_fillfactor;
 };
+
 static const struct ddlinfo DDLs[] = {

Spurious line change.

Indeed.

*
+    "  --partitions=NUM         partition account table in NUM parts
(defaults: 0)\n"
+    "  --partition-method=(range|hash)\n"
+    "                           partition account table with this
method (default: range)\n"

Refer complete table name like pgbench_accounts instead of just account. It will be clear and in sync with what we display in some other options like --skip-some-updates.

Ok.

*
+    "  --partitions=NUM         partition account table in NUM parts
(defaults: 0)\n"

/defaults/default.

Ok.

I think we should print the information about partitions in
printResults.  It can help users while analyzing results.

Hmmm. Why not, with some hocus-pocus to get the information out of pg_catalog, and trying to fail gracefully so that if pgbench is run against a no partitioning-support version.

*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+

I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.

Hmmm. Why not. This means inventing a used-once type name for partition_method. My great creativity lead to partition_method_t.

*
- int i;

 fprintf(stderr, "creating tables...\n");

- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)

This is unnecessary change as far as this patch is concerned.  I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.

The reason I did that is that I had a stupid bug in a development version which was due to an accidental reuse of this index, which would have been prevented by this declaration style. Removed anyway.

+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {

part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here.

I just wanted to avoid recomputing the value in the loop, but indeed it may be computed needlessly. Moved.

In general, this part of the code can use some comments.

Ok.

Attached an updated version.

--
Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c857aa3cba..e3a0abb4c7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -306,6 +306,31 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--partitions=<replaceable>NUM</replaceable></option></term>
+      <listitem>
+       <para>
+        Create a partitioned <literal>pgbench_accounts</literal> table with
+        <replaceable>NUM</replaceable> partitions of nearly equal size for
+        the scaled number of accounts.
+        Default is <literal>0</literal>, meaning no partitioning.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>--partition-method=<replaceable>NAME</replaceable></option></term>
+      <listitem>
+       <para>
+        Create a partitioned <literal>pgbench_accounts</literal> table with
+        <replaceable>NAME</replaceable> method.
+        Expected values are <literal>range</literal> or <literal>hash</literal>.
+        This option requires that <option>--partitions</option> is set to non-zero.
+        If unspecified, default is <literal>range</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--tablespace=<replaceable>tablespace</replaceable></option></term>
       <listitem>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..673b175522 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -186,6 +186,14 @@ int64		latency_limit = 0;
 char	   *tablespace = NULL;
 char	   *index_tablespace = NULL;
 
+/* partitioning for pgbench_accounts table, 0 for no partitioning */
+int 		partitions = 0;
+
+typedef enum { PART_NONE, PART_RANGE, PART_HASH, PART_UNKNOWN }
+  partition_method_t;
+
+partition_method_t partition_method = PART_NONE;
+
 /* random seed used to initialize base_random_sequence */
 int64		random_seed = -1;
 
@@ -617,6 +625,9 @@ usage(void)
 		   "  --foreign-keys           create foreign key constraints between tables\n"
 		   "  --index-tablespace=TABLESPACE\n"
 		   "                           create indexes in the specified tablespace\n"
+		   "  --partitions=NUM         partition pgbench_accounts in NUM parts (default: 0)\n"
+		   "  --partition-method=(range|hash)\n"
+		   "                           partition pgbench_accounts with this method (default: range)\n"
 		   "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
 		   "  --unlogged-tables        create tables as unlogged tables\n"
 		   "\nOptions to select what to run:\n"
@@ -3601,6 +3612,17 @@ initDropTables(PGconn *con)
 					 "pgbench_tellers");
 }
 
+/*
+ * add fillfactor percent option if not 100.
+ */
+static void
+append_fillfactor(char *opts, int len)
+{
+	if (fillfactor < 100)
+		snprintf(opts + strlen(opts), len - strlen(opts),
+				 " with (fillfactor=%d)", fillfactor);
+}
+
 /*
  * Create pgbench's standard tables
  */
@@ -3664,9 +3686,18 @@ initCreateTables(PGconn *con)
 
 		/* Construct new create table statement. */
 		opts[0] = '\0';
-		if (ddl->declare_fillfactor)
+
+		/* Partition pgbench_accounts table */
+		if (partitions >= 1 && strcmp(ddl->table, "pgbench_accounts") == 0)
+		{
 			snprintf(opts + strlen(opts), sizeof(opts) - strlen(opts),
-					 " with (fillfactor=%d)", fillfactor);
+					 " partition by %s (aid)",
+					 partition_method == PART_RANGE ? "range" : "hash");
+		}
+		else if (ddl->declare_fillfactor)
+			/* fillfactor is only expected on actual tables */
+			append_fillfactor(opts, sizeof(opts));
+
 		if (tablespace != NULL)
 		{
 			char	   *escape_tablespace;
@@ -3686,6 +3717,56 @@ initCreateTables(PGconn *con)
 
 		executeStatement(con, buffer);
 	}
+
+	/* if needed, pgbench_accounts partitions must be created manually */
+	if (partitions >= 1)
+	{
+		char		ff[64];
+		ff[0] = '\0';
+		append_fillfactor(ff, sizeof(ff));
+
+		fprintf(stderr, "creating %d partitions...\n", partitions);
+
+		for (int p = 1; p <= partitions; p++)
+		{
+			char		query[256];
+
+			if (partition_method == PART_RANGE)
+			{
+				int64		part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+				char		minvalue[32], maxvalue[32];
+
+				/* For RANGE, we use open-ended partitions at the beginning and end */
+				if (p == 1)
+					sprintf(minvalue, "minvalue");
+				else
+					sprintf(minvalue, INT64_FORMAT, (p-1) * part_size + 1);
+
+				if (p < partitions)
+					sprintf(maxvalue, INT64_FORMAT, p * part_size + 1);
+				else
+					sprintf(maxvalue, "maxvalue");
+
+				snprintf(query, sizeof(query),
+						 "create%s table pgbench_accounts_%d\n"
+						 "  partition of pgbench_accounts\n"
+						 "  for values from (%s) to (%s)%s\n",
+						 unlogged_tables ? " unlogged" : "", p,
+						 minvalue, maxvalue, ff);
+			}
+			else if (partition_method == PART_HASH)
+				snprintf(query, sizeof(query),
+						 "create%s table pgbench_accounts_%d\n"
+						 "  partition of pgbench_accounts\n"
+						 "  for values with (modulus %d, remainder %d)%s\n",
+						 unlogged_tables ? " unlogged" : "", p,
+						 partitions, p-1, ff);
+			else /* cannot get there */
+				Assert(0);
+
+			executeStatement(con, query);
+		}
+	}
 }
 
 /*
@@ -4919,6 +5000,13 @@ printResults(StatsData *total, instr_time total_time,
 	printf("transaction type: %s\n",
 		   num_scripts == 1 ? sql_script[0].desc : "multiple scripts");
 	printf("scaling factor: %d\n", scale);
+	/* only print partitioning information if some partitioning was detected */
+	if (partition_method != PART_NONE && partition_method != PART_UNKNOWN)
+		printf("partition method: %s\n"
+			   "partitions: %d\n",
+			   partition_method == PART_RANGE ? "range" :
+			   partition_method == PART_HASH ? "hash" : "unexpected",
+			   partitions);
 	printf("query mode: %s\n", QUERYMODE[querymode]);
 	printf("number of clients: %d\n", nclients);
 	printf("number of threads: %d\n", nthreads);
@@ -5126,6 +5214,8 @@ main(int argc, char **argv)
 		{"foreign-keys", no_argument, NULL, 8},
 		{"random-seed", required_argument, NULL, 9},
 		{"show-script", required_argument, NULL, 10},
+		{"partitions", required_argument, NULL, 11},
+		{"partition-method", required_argument, NULL, 12},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -5486,6 +5576,29 @@ main(int argc, char **argv)
 					exit(0);
 				}
 				break;
+			case 11:			/* partitions */
+				initialization_option_set = true;
+				partitions = atoi(optarg);
+				if (partitions < 0)
+				{
+					fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+							optarg);
+					exit(1);
+				}
+				break;
+			case 12:			/* partition-method */
+				initialization_option_set = true;
+				if (pg_strcasecmp(optarg, "range") == 0)
+					partition_method = PART_RANGE;
+				else if (pg_strcasecmp(optarg, "hash") == 0)
+					partition_method = PART_HASH;
+				else
+				{
+					fprintf(stderr, "invalid partition type, expecting \"range\" or \"hash\","
+							" got: \"%s\"\n", optarg);
+					exit(1);
+				}
+				break;
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -5567,6 +5680,16 @@ main(int argc, char **argv)
 			exit(1);
 		}
 
+		if (partitions == 0 && partition_method != PART_NONE)
+		{
+			fprintf(stderr, "--partition-method requires actual partitioning with --partitions\n");
+			exit(1);
+		}
+
+		/* set default method */
+		if (partitions > 0 && partition_method == PART_NONE)
+			partition_method = PART_RANGE;
+
 		if (initialize_steps == NULL)
 			initialize_steps = pg_strdup(DEFAULT_INIT_STEPS);
 
@@ -5756,6 +5879,48 @@ main(int argc, char **argv)
 			fprintf(stderr,
 					"scale option ignored, using count from pgbench_branches table (%d)\n",
 					scale);
+
+		/*
+		 * Partition information. Assume no partitioning on any failure, so as
+		 * to avoid failing on an older version. We hope that there is only
+		 * one pgbench_accounts table, otherwise which one is used would depend
+		 * on search_path settings.
+		 */
+		res = PQexec(con,
+					 "select p.partstrat, count(*) "
+					 "from pg_catalog.pg_class as c "
+					 "left join pg_catalog.pg_partitioned_table as p on (p.partrelid = c.oid) "
+					 "left join pg_catalog.pg_inherits as i on (c.oid = i.inhparent) "
+					 "where c.relname = 'pgbench_accounts' "
+					 "group by 1, c.oid");
+		if (PQresultStatus(res) != PGRES_TUPLES_OK)
+		{
+			/* probably an older version, coldly assume no partitioning */
+			partition_method = PART_NONE;
+			partitions = 0;
+		}
+		else if (PQntuples(res) != 1)
+		{
+			/* unsure because multiple pgbench_accounts found */
+			partition_method = PART_UNKNOWN;
+			partitions = 0;
+		}
+		else
+		{
+			char *ps = PQgetvalue(res, 0, 0);
+
+			if (ps == NULL)
+				partition_method = PART_NONE;
+			else if (strcmp(ps, "r") == 0)
+				partition_method = PART_RANGE;
+			else if (strcmp(ps, "h") == 0)
+				partition_method = PART_HASH;
+			else /* whatever */
+				partition_method = PART_NONE;
+
+			partitions = atoi(PQgetvalue(res, 0, 1));
+		}
+		PQclear(res);
 	}
 
 	/*
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index b82d3f65c4..4028525118 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -58,6 +58,18 @@ sub pgbench
 	return;
 }
 
+# tablespace for testing
+my $ts = $node->basedir . '/regress_pgbench_tap_1_ts_dir';
+mkdir $ts or die "cannot create directory $ts";
+
+# escape
+my $ets = $ts;
+$ets =~ s/'/''/;
+
+$node->safe_psql('postgres',
+	"CREATE TABLESPACE regress_pgbench_tap_1_ts LOCATION '$ets';"
+);
+
 # Test concurrent OID generation via pg_enum_oid_index.  This indirectly
 # exercises LWLock and spinlock concurrency.
 my $labels = join ',', map { "'l$_'" } 1 .. 1000;
@@ -100,28 +112,30 @@ pgbench(
 
 # Again, with all possible options
 pgbench(
-	'--initialize --init-steps=dtpvg --scale=1 --unlogged-tables --fillfactor=98 --foreign-keys --quiet --tablespace=pg_default --index-tablespace=pg_default',
+	'--initialize --init-steps=dtpvg --scale=1 --unlogged-tables --fillfactor=98 --foreign-keys --quiet --tablespace=regress_pgbench_tap_1_ts --index-tablespace=regress_pgbench_tap_1_ts --partitions=2 --partition-method=hash',
 	0,
 	[qr{^$}i],
 	[
 		qr{dropping old tables},
 		qr{creating tables},
+		qr{creating 2 partitions},
 		qr{vacuuming},
 		qr{creating primary keys},
 		qr{creating foreign keys},
 		qr{(?!vacuuming)}, # no vacuum
 		qr{done in \d+\.\d\d s }
 	],
-	'pgbench scale 1 initialization');
+	'pgbench scale 1 initialization with options');
 
 # Test interaction of --init-steps with legacy step-selection options
 pgbench(
-	'--initialize --init-steps=dtpvgvv --no-vacuum --foreign-keys --unlogged-tables',
+	'--initialize --init-steps=dtpvgvv --no-vacuum --foreign-keys --unlogged-tables --partitions=3',
 	0,
 	[qr{^$}],
 	[
 		qr{dropping old tables},
 		qr{creating tables},
+		qr{creating 3 partitions},
 		qr{creating primary keys},
 		qr{.* of .* tuples \(.*\) done},
 		qr{creating foreign keys},
@@ -833,7 +847,6 @@ pgbench(
 	'pgbench throttling');
 
 pgbench(
-
 	# given the expected rate and the 2 ms tx duration, at most one is executed
 	'-t 10 --rate=100000 --latency-limit=1 -n -r',
 	0,
@@ -909,6 +922,8 @@ pgbench(
 check_pgbench_logs($bdir, '001_pgbench_log_3', 1, 10, 10,
 	qr{^\d \d{1,2} \d+ \d \d+ \d+$});
 
+$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
+
 # done
 $node->stop;
 done_testing();
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index f7fa18418b..998d814232 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -157,6 +157,13 @@ my @options = (
 			qr{error while setting random seed from --random-seed option}
 		]
 	],
+	[ 'bad partition type', '-i --partition-method=BAD', [qr{"range"}, qr{"hash"}, qr{"BAD"}] ],
+	[ 'bad partition number', '-i --partitions -1', [ qr{invalid number of partitions: "-1"} ] ],
+	[
+		'partition method without partitioning',
+		'-i --partition-method=hash',
+		[ qr{partition-method requires actual partitioning} ]
+	],
 
 	# logging sub-options
 	[

Reply via email to