Re: [HACKERS] Prepared statements and generic plans

2016-06-10 Thread 'br...@momjian.us'
On Tue, Jun  7, 2016 at 06:52:15AM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> >> !distinct column values, a generic plan assumes a column equality
> >> !comparison will match 33% of processed rows.  Column statistics
> >>
> >> ... assumes *that* a column equality comparison will match 33% of *the* 
> >> processed rows.
> > 
> > Uh, that seems overly wordy.  I think the rule is that if the sentence
> > makes sense without the words, you should not use them, but it is
> > clearly a judgement call in this case.  Do you agree?
> 
> My gut feeling is that at least the "the" should be retained, but mine
> are the guts of a German speaker.
> It is clearly a judgement call, so follow your instincts.

I think "that/the" would make sense if this sentence was referencing a
specific result.  The sentence is referencing a hypothetical, so I don't
think "that/the" is needed.

> > One more thing --- there was talk of moving some of this into chapter
> > 66, but as someone already mentioned, there are no subsections there
> > because it is a dedicated topic:
> > 
> > 66. How the Planner Uses Statistics.
> > 
> > I am not inclined to add a prepare-only section to that chapter.  On the
> > other hand, the issues described apply to PREPARE and to protocol-level
> > prepare, so having it in PREPARE also seems illogical.  However, I am
> > inclined to leave it in PREPARE until we are ready to move all of this
> > to chapter 66.
> 
> I think it would be ok to leave it where it is in your patch; while the
> paragraph goes into technical detail, it is still alright in the general
> documentation (but only just).

I researched moving some of this text into chapter 66, but found that
only some of it related to the optimizer.  I also realized that the text
applies to the libpq/wire protocol prepare cases too, so rather than
bounce readers to the PREPARE manual page, and then to chapter 66, I
just kept it all in PREPARE, with a reference from the wire protocol
prepare section.

Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE. 
However, I don't see any way to inject EXPLAIN into the libpq/wire
prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
EXPLAIN SELECT throws a syntax error.)

Looking at how the code behaves, it seems custom plans that are _more_
expensive (plus planning cost) than the generic plan switch to the
generic plan after five executions, as now documented.  Custom plans
that are significantly _cheaper_ than the generic plan _never_ use the
generic plan.

Here is an example --- first load this SQL:

DROP TABLE IF EXISTS test;
CREATE TABLE test (c1 INT, c2 INT);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(1, 1) AS a(c1);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(10001, 15000) AS a(c1);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(15001, 2) AS a(c1);
-- add non-uniformly-distributed values to 'c2'
INSERT INTO test SELECT 20001, 3;
INSERT INTO test SELECT 20002, 4;
CREATE INDEX i_test_c1 ON test (c1);
CREATE INDEX i_test_c2 ON test (c2);
ANALYZE test;
PREPARE prep_c1 AS SELECT * FROM test WHERE c1 = $1;
PREPARE prep_c2 AS SELECT * FROM test WHERE c2 = $1;

prep_c1 references 'c1', which is a unique column.  Any value used in
the EXECUTE, e.g. EXPLAIN EXECUTE prep_c1(1), existent or non-existent,
generates an index scan, and after five executions a generic index
scan is used.

For prep_c2, if you use the 50% common value '1', the first five
executions use a sequential scan, then the sixth is a generic Bitmap
Heap Scan. For the 25% value of '0' or '2',  the first five runs
generate a Bitmap Heap Scan, and a generic Bitmap Heap Scan on the sixth
and after.

For a prep_c2 value of 3 or any non-existent value, an Index Scan is
used, and a generic plan is never chosen, because the Index Scan is
significantly cheaper than the generic plan.

Updated patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
new file mode 100644
index 3829a14..6285dd0
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*** PGresult *PQprepare(PGconn *conn,
*** 2303,2310 
 
  PQprepare creates a prepared statement for later
  execution with PQexecPrepared.  This feature allows
! commands that will be used repeatedly to be parsed and planned just
! once, rather than each time they are executed.
  PQprepare is supported only in protocol 3.0 and later
  connections; it will fail when using protocol 2.

Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread 'br...@momjian.us'
On Mon, Jun 13, 2016 at 01:26:04PM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> > Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
> > protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE.
> > However, I don't see any way to inject EXPLAIN into the libpq/wire
> > prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
> > EXPLAIN SELECT throws a syntax error.)
> 
> I am not sure what you mean:
> EXPLAIN PREPARE to get EXPLAIN for PREPARE, or PREPARE ... FOR EXPLAIN
> to get an EXPLAIN statement with parameters.
> What should EXPLAIN PREPARE show that EXPLAIN SELECT wouldn't?
> Why the need for EXPLAIN statements with parameters?

Well, you can't use EXPLAIN with SQL PREPARE:

test=> EXPLAIN PREPARE SELECT * FROM pg_class;
ERROR:  syntax error at or near "PREPARE"
LINE 1: EXPLAIN PREPARE SELECT * FROM pg_class;
^
test=> PREPARE EXPLAIN SELECT * FROM pg_class;
ERROR:  syntax error at or near "SELECT"
LINE 1: PREPARE EXPLAIN SELECT * FROM pg_class;
^
You can only do EXPLAIN EXECUTE ..., which works fine, e.g.:

EXPLAIN EXECUTE prep_c1(0);

However, for the wire protocol prepare/execute, how do you do EXPLAIN?
The only way I can see doing it is to put the EXPLAIN in the prepare
query, but I wasn't sure that works.  So, I just wrote and tested the
attached C program and it properly output the explain information, e.g.

res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, 
NULL);
---
generated:

QUERY PLAN

Seq Scan on pg_language  (cost=0.00..1.04 rows=4 width=114)

so that works --- good.

> > Looking at how the code behaves, it seems custom plans that are _more_
> > expensive (plus planning cost) than the generic plan switch to the
> > generic plan after five executions, as now documented.  Custom plans
> > that are significantly _cheaper_ than the generic plan _never_ use the
> > generic plan.
> 
> Yes, that's what the suggested documentation improvement says as well,
> right?

Yes.  What is odd is that it isn't the plan of the actual supplied
parameters that is cheaper, just the generic plan that assumes each
distinct value in the query is equally likely to be used.  So, when we
say the generic plan is cheaper, it is just comparing the custom plan
with the supplied parameters vs. the generic plan --- it is not saying
that running the supplied constants with the generic plan will execute
faster, because in fact we might be using a sub-optimial generic plan.

For example, giving my test table that I posted earlier, if you ran the
most common constant (50% common) the first five time, the custom plan
would use a sequential scan.  On the sixth run of that same constant, a
bitmap scan generic plan would be used.  Now, that does have a lower
cost, but only for the _average_ distinct value, not for the 50%
constant that is being used.  A bitmap scan on a constant that would
normally use a sequential scan will take longer than even a sequential
scan, because if it didn't, the custom plan would have chosen the bitmap
scan.

I am not sure how we can improve things, but I wanted to clarify exactly
what is happening.

> > Updated patch attached.
> 
> Upon re-read, one tiny question:
> 
> !Prepared statements can optionally use generic plans rather than
> !re-planning with each set of supplied EXECUTE values.
> 
> Maybe the "optionally" should be omitted, since the user has no choice.
> 
> It is true that there is a cursor option CURSOR_OPT_CUSTOM_PLAN, but that
> cannot be used on the SQL level.

Right.  Updated patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
/*
 * src/test/isolation/isolationtester.c
 *
 * isolationtester.c
 *		Runs an isolation test specified by a spec file.
 */

#include 
#include 
#include 

static void printResultSet(PGresult *res);


static void
exit_nicely(PGconn *conn)
{
PQfinish(conn);
exit(1);
}


int
main(int argc, char **argv)
{
const char *conninfo;
PGconn *conn;
	PGresult   *res;

if (argc > 2)
{
	fprintf(stderr, "Usage:  %s connection-string\n", argv[0]);
	exit(1);
}
	else if (argc == 2)
conninfo = argv[1];
else
conninfo = "dbname = postgres";

/* Make a connection to the database */
conn = PQconnectdb(conninfo);

/* Check to see that the backend connection was successfully made */
if (PQstatus(conn) != CONNECTION_OK)
{
fprintf(stderr, "Connection to database failed: %s",
PQerrorMessage(conn));
exit_nicely(conn);
}

	res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, NULL);
	if (PQresultStatus(res) != PGRES_CO

Re: [HACKERS] Prepared statements and generic plans

2016-06-13 Thread br...@momjian.us
On Mon, Jun 13, 2016 at 04:29:26PM -0400, David G. Johnston wrote:
> On Mon, Jun 13, 2016 at 3:40 PM, br...@momjian.us  wrote:
> I am not sure how we can improve things, but I wanted to clarify exactly
> what is happening.
> 
> 
> ​"""
> Comparisons on non-uniformly-distributed
> columns and specification of non-existent values affects the average
> plan cost, and hence if and when a generic plan is chosen
> ​"""
> 
> If we are going to be more precise lets do so here as well.  I have, just
> reading this, no clue whether having non-uniformity and often searching for
> non-existent value will increase or decrease the average plan cost.

Well, we can't be more precise here as the average plan cost could go up
or down, depending on the constants used, becuase the values are not
uniformly-distributed.

> I'm still not certain how this is particularly useful.  If we are willing to
> draw a conclusion here in what circumstances would I, as an end-user, want to
> forgo using a prepared statement and instead dynamically construct an SQL
> statement?  Because at this point while this seems like good detail often 
> times
> my choice of parameters is influenced by what I consider data external to the
> query proper and not any kind of inherent performance aspect.  I'd consider
> this advanced usage which doesn't neatly fit into the SQL Command section of
> the docs.

True, but we have lumped all the "prepared" information into that
section, and I don't see a more logical location for this tidbit of
information.

The big point is that the constants don't affect the generic plan, they
just choose if and when the generic plan is chosen.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prepared statements and generic plans

2016-06-14 Thread ''br...@momjian.us' *EXTERN*'
On Tue, Jun 14, 2016 at 08:37:12AM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> > However, for the wire protocol prepare/execute, how do you do EXPLAIN?
> > The only way I can see doing it is to put the EXPLAIN in the prepare
> > query, but I wasn't sure that works.  So, I just wrote and tested the
> > attached C program and it properly output the explain information, e.g.
> > 
> > res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, 
> > NULL);
> > ---
> > generated:
> > 
> > QUERY PLAN
> > 
> > Seq Scan on pg_language  (cost=0.00..1.04 rows=4 width=114)
> > 
> > so that works --- good.
> 
> Hm, yes.
> 
> Were you just curious or is it relevant for the documentation update?

I was curious because if there was no way to do it, I should document
that.

> >>> Looking at how the code behaves, it seems custom plans that are _more_
> >>> expensive (plus planning cost) than the generic plan switch to the
> >>> generic plan after five executions, as now documented.  Custom plans
> >>> that are significantly _cheaper_ than the generic plan _never_ use the
> >>> generic plan.
> >>
> >> Yes, that's what the suggested documentation improvement says as well,
> >> right?
> > 
> > Yes.  What is odd is that it isn't the plan of the actual supplied
> > parameters that is cheaper, just the generic plan that assumes each
> > distinct value in the query is equally likely to be used.  So, when we
> > say the generic plan is cheaper, it is just comparing the custom plan
> > with the supplied parameters vs. the generic plan --- it is not saying
> > that running the supplied constants with the generic plan will execute
> > faster, because in fact we might be using a sub-optimial generic plan.
> 
> Right, that's why it is important to document that it is estimates that are
> compared, not actual costs.
> 
> This has caused confussion in the past, see
> https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com
> 
> > Right.  Updated patch attached.
> 
> I am happy with the patch as it is.

Good.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers