Hi,
please see a revised version yesterday's mail. The patch attached now
provides the following:
EXPLAIN(ANALYZE,SERIALIZE)
and
EXPLAIN(ANALYZE,SERIALIZEBINARY)
and timing output.
Both options perform the serialization during analyze and provide an
additional output in the plan like this:
template1=# explain (analyze,serialize) select * from t12 limit 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
...
Serialized Bytes: 36 bytes
Execution Time: 0.035 ms
(5 rows)
or also this
template1=# explain (analyze,serialize) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------
Seq Scan on t1 (cost=0.00..1.02 rows=2 width=19) (actual
time=0.101..0.111 rows=5 loops=1)
Planning Time: 0.850 ms
Serialized Bytes: 85777978 bytes
Execution Time: 354.284 ms
(4 rows)
Its tempting to divide Serialized-Bytes by Execution-Time to get an idea
of the serialization bandwidth. This is /dev/null serialization though.
The results are length-counted and then discarded.
Since detoasting happens implicitly during serialization, the number of
bytes becomes huge in this case and accounts for the detoasted lengths
as well. I tried to get the number of bytes send for the protocol's
messages and the attribute headers correctly. For the actual values I am
quite sure I get the correct measures, as one can really tell by sending
more values across. Null is 4 bytes on the wire interestingly. I didn't
know that, but it makes sense, since its using the same prefix
length-field as all values do.
I have checked the JBDC driver and it uses binary and text formats
depending on an attribute's type oid. So having the SERIALIZEBINARY
option is not accurate, as in reality both formats can be occur for the
same tuple.
Please provide some feedback on the new patch and let me know if this
makes sense. In general this kind of option for EXPLAIN is a good thing
for sure.
Greetings,
Stepan
On 14.09.23 21:27, stepan rutz wrote:
Hi Tom, Hi Matthias,
you are right of course. I have looked at the code from printtup.c and
made a new version of the patch.
Thanks for the MemoryContextReset hint too (@Matthias)
This time is called EXPLAIN(ANALYZE,SERIALIZE) (hey, it also sounds
nicer phonetically)
If the option SERIALIZE is active, the output functions are called and
they perform the detoasting, which I have even checked.
So things are better this way, however I hardcoded the output option
"Text" (format=0). In printtup.c there is an incoming array which
applies Text (format=0) or Binary (format=1) for each column
individually. I am not sure whether this is even needed. I left in the
if-statement from printtup.c which calls the binary output method of a
given type. The result of the output is ignored and apparently free'd
because of the memory-context-reset at the end.
Please also note, that I added a call to DestReceiver's rDestroy hook,
which was missing from explain.c before altogether.
Feedback is appreciated.
/Stepan
On 12.09.23 17:26, Tom Lane wrote:
Matthias van de Meent <boekewurm+postg...@gmail.com> writes:
Hmm, maybe we should measure the overhead of serializing the tuples
instead.
The difference between your patch and "serializing the tuples, but not
sending them" is that serializing also does the detoasting, but also
includes any time spent in the serialization functions of the type. So
an option "SERIALIZE" which measures all the time the server spent on
the query (except the final step of sending the bytes to the client)
would likely be more useful than "just" detoasting.
+1, that was my immediate reaction to the proposal as well. Some
output functions are far from cheap. Doing only the detoast part
seems like it's still misleading.
Do we need to go as far as offering both text-output and binary-output
options?
regards, tom lane
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8570b14f62..9090e0f114 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -26,6 +26,7 @@
#include "nodes/nodeFuncs.h"
#include "parser/analyze.h"
#include "parser/parsetree.h"
+#include "mb/pg_wchar.h"
#include "rewrite/rewriteHandler.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
@@ -47,7 +48,6 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL;
/* Hook for plugins to get control in explain_get_index_name() */
explain_get_index_name_hook_type explain_get_index_name_hook = NULL;
-
/* OR-able flags for ExplainXMLTag() */
#define X_OPENING 0
#define X_CLOSING 1
@@ -154,7 +154,8 @@ static void ExplainJSONLineEnding(ExplainState *es);
static void ExplainYAMLLineStarting(ExplainState *es);
static void escape_yaml(StringInfo buf, const char *str);
-
+static DestReceiver *CreateSerializeDestReceiver(int16 format);
+static uint64 GetNumberOfSerializedBytes(DestReceiver *dest);
/*
* ExplainQuery -
@@ -192,6 +193,12 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
es->settings = defGetBoolean(opt);
else if (strcmp(opt->defname, "generic_plan") == 0)
es->generic = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "serialize") == 0)
+ es->serialize = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "serializebinary") == 0)
+ {
+ es->serialize = es->serializebinary = defGetBoolean(opt);
+ }
else if (strcmp(opt->defname, "timing") == 0)
{
timing_set = true;
@@ -244,6 +251,17 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("EXPLAIN option TIMING requires ANALYZE")));
+ /* check that serializebinary is used with EXPLAIN ANALYZE */
+ if (es->serializebinary && !es->analyze)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN option SERIALIZEBINARY requires ANALYZE")));
+ /* check that serialize is used with EXPLAIN ANALYZE */
+ else if (es->serialize && !es->analyze)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN option SERIALIZE requires ANALYZE")));
+
/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
if (es->generic && es->analyze)
ereport(ERROR,
@@ -535,6 +553,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
double totaltime = 0;
int eflags;
int instrument_option = 0;
+ uint64 bytesSend = 0;
Assert(plannedstmt->commandType != CMD_UTILITY);
@@ -565,9 +584,15 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
/*
* Normally we discard the query's output, but if explaining CREATE TABLE
* AS, we'd better use the appropriate tuple receiver.
+ * Also if we choose to serialize during explain, we need to send the
+ * tuples to a dest-receiver that performs serializing
+ * into the protocol receiving format. this implicitly also detoasts
+ * values.
*/
if (into)
dest = CreateIntoRelDestReceiver(into);
+ else if (es->analyze && es->serialize)
+ dest = CreateSerializeDestReceiver(es->serializebinary ? 1 : 0);
else
dest = None_Receiver;
@@ -606,6 +631,15 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
/* run cleanup too */
ExecutorFinish(queryDesc);
+ if (es->serialize)
+ {
+ bytesSend = GetNumberOfSerializedBytes(dest);
+ }
+
+
+ /* call the dest receiver's destroy method even during explain */
+ dest->rDestroy(dest);
+
/* We can't run ExecutorEnd 'till we're done printing the stats... */
totaltime += elapsed_time(&starttime);
}
@@ -643,6 +677,11 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
if (es->costs)
ExplainPrintJITSummary(es, queryDesc);
+ if (es->analyze && es->serialize)
+ {
+ ExplainPropertyUInteger("Serialized Bytes", "bytes", bytesSend, es);
+ }
+
/*
* Close down the query and free resources. Include time for this in the
* total execution time (although it should be pretty minimal).
@@ -5052,3 +5091,239 @@ escape_yaml(StringInfo buf, const char *str)
{
escape_json(buf, str);
}
+
+
+/* Serializing DestReceiver functions
+ *
+ * EXPLAIN (ANALYZE) sometimes gives results that unrealistic.
+ * During EXPLAIN(ANALYZE) there is no detoasting of values.
+ * That means you get no idea what your actual query time is going
+ * to be, nor what the actual overhead of detoasting is.
+ *
+ * EXPLAIN (ANAYLZE, SERIALIZE) will perform detoasting implicitly and
+ * will also do count the number of bytes being send to the client but
+ * will not actually send them. SERIALIZE uses text format for all
+ * attributes.
+ *
+ * EXPLAIN (ANAYLZE, SERIALIZEBINARY) serializes using binary format.
+ */
+
+/* an attribute info cached for each column */
+typedef struct
+{ /* Per-attribute information */
+ Oid typoutput; /* Oid for the type's text output fn */
+ Oid typsend; /* Oid for the type's binary output fn */
+ bool typisvarlena; /* is it varlena (ie possibly toastable)? */
+ int16 format; /* format code for this column */
+ FmgrInfo finfo; /* Precomputed call info for output fn */
+} SerializeAttrInfo;
+
+typedef struct
+{
+ /* receiver for the tuples, that just serializes */
+ DestReceiver destRecevier;
+ MemoryContext memoryContext;
+ MemoryContext oldMemoryContext; /* restore to old when done */
+ int16 format; /* text of binary */
+ TupleDesc attrinfo;
+ int nattrs;
+ uint64 bytesSend; /* Num of bytes over the wire */
+ SerializeAttrInfo *infos; /* Cached info about each attr */
+} SerializeDestReceiver;
+
+/*
+ * Get the lookup info that the row-callback of the receiver needs. this code
+ * is similar to the code from printup.c except that it doesn't do any actual
+ * output.
+ */
+static void
+serialize_prepare_info(SerializeDestReceiver *receiver, TupleDesc typeinfo,
+ int nattrs)
+{
+ int i;
+
+ /* get rid of any old data */
+ if (receiver->infos)
+ pfree(receiver->infos);
+ receiver->infos = NULL;
+
+ receiver->attrinfo = typeinfo;
+ receiver->nattrs = nattrs;
+ if (nattrs <= 0)
+ return;
+
+ receiver->infos = (SerializeAttrInfo *) palloc0(
+ nattrs * sizeof(SerializeAttrInfo));
+ for (i = 0; i < nattrs; i++)
+ {
+ SerializeAttrInfo *info = &(receiver->infos[i]);
+ Form_pg_attribute attr = TupleDescAttr(typeinfo, i);
+ info->format = receiver->format;
+ if (info->format == 0)
+ {
+ getTypeOutputInfo(attr->atttypid,
+ &info->typoutput,
+ &info->typisvarlena);
+ fmgr_info(info->typoutput, &info->finfo);
+ }
+ else if (info->format == 1)
+ {
+ getTypeBinaryOutputInfo(attr->atttypid,
+ &info->typsend,
+ &info->typisvarlena);
+ fmgr_info(info->typsend, &info->finfo);
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("unsupported format code: %d", info->format)));
+ }
+}
+
+
+
+/* this method receives the tuples/records during explain (analyze, serialize)
+ * and serializes them.
+ * this method is otherwise not called and its purpose is to serialize to
+ * make analyze optionally more realistic. serialization includes detoasting
+ * and invoking the type's output (or send) function, which is what happens
+ * when you actually run the query. both steps cost time and thus
+ * analyze can be very misleading if those two steps are omitted.
+ */
+static bool
+serializeAnalyzeReceive(TupleTableSlot *slot, DestReceiver *self)
+{
+ int i;
+ TupleDesc tupdesc;
+ SerializeDestReceiver *receiver = (SerializeDestReceiver*) self;
+
+ tupdesc = slot->tts_tupleDescriptor;
+ /* Cache attribute infos and function oid if outdated */
+ if (receiver->attrinfo != tupdesc || receiver->nattrs != tupdesc->natts)
+ {
+ serialize_prepare_info(receiver, tupdesc, tupdesc->natts);
+ }
+
+ /* Fill all of the slot's attributes, we can now use slot->tts_values
+ * and its tts_isnull array which should be long enough even if added
+ * a null-column to the table */
+ slot_getallattrs(slot);
+
+ receiver->oldMemoryContext = MemoryContextSwitchTo(
+ receiver->memoryContext);
+
+ /* Iterate over all attributes of the tuple and invoke the output func
+ * (or send function in case of a binary format). completely ignore the
+ * result. MemoryContext is free'd via reset at the end of the this
+ * per-tuple callback anyhow.
+ * NOTE: when invoking the output function detoasting is implicitly
+ * performed when needed. This adds major overhead to plain
+ * explain-analyze and this is what is desired here.
+ */
+
+ receiver->bytesSend += 3; /* protocol message type and attribute-count */
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ SerializeAttrInfo *info = &(receiver->infos[i]);
+ Datum attr = slot->tts_values[i];
+ if (slot->tts_isnull[i])
+ {
+ receiver->bytesSend += 4; /* null is send as -1 length */
+ }
+ else
+ {
+ if (info->format == 0)
+ {
+ /* Text output, ignore the result but count it */
+ char* output = OutputFunctionCall(&info->finfo, attr);
+ size_t outputlen = strlen(output);
+ /* convert from server encoding to client encoding if needed */
+ char *converted = pg_server_to_client(output, outputlen);
+ if (converted != output)
+ {
+ /* converted encoding, get new length */
+ outputlen = strlen(converted);
+ pfree(converted);
+ }
+ /* append length that would have gone over the wire, 4
+ * the number of preceding bytes that contain the length */
+ receiver->bytesSend += 4 + outputlen;
+ }
+ else
+ {
+ /* Binary output, ignore the result but count it
+ * length is included in varsize */
+ bytea* bytes = SendFunctionCall(&info->finfo, attr);
+ uint64 n = VARSIZE(bytes);
+ receiver->bytesSend += n;
+ }
+ }
+ }
+ MemoryContextSwitchTo(receiver->oldMemoryContext);
+ MemoryContextReset(receiver->memoryContext);
+ return true;
+}
+
+static void
+serializeAnalyzeStartup(DestReceiver *self, int operation, TupleDesc typeinfo)
+{
+ SerializeDestReceiver *receiver = (SerializeDestReceiver*) self;
+ int i;
+ /* memory context for our work */
+ receiver->memoryContext = AllocSetContextCreate(CurrentMemoryContext,
+ "SerializeTupleReceive", ALLOCSET_DEFAULT_SIZES);
+
+ /* account for the attribute headers for send bytes */
+ receiver->bytesSend += 3; /* protocol message type and attribute-count */
+ for (i = 0; i < typeinfo->natts; ++i)
+ {
+ Form_pg_attribute att = TupleDescAttr(typeinfo, i);
+ char* name = NameStr(att->attname);
+ size_t namelen = strlen(name);
+ /* convert from server encoding to client encoding if needed */
+ char *converted = pg_server_to_client(name, namelen);
+ if (converted != name)
+ {
+ namelen = strlen(converted);
+ pfree(converted);
+ }
+ /* see printtup.h why we add 18 bytes here. These are the infos
+ * needed for each attribute plus the attributes's name */
+ receiver->bytesSend += namelen + 1 + 18;
+ }
+}
+
+static void
+serializeAnalyzeCleanup(DestReceiver *self)
+{
+ SerializeDestReceiver *receiver = (SerializeDestReceiver*) self;
+ MemoryContextDelete(receiver->memoryContext);
+}
+
+static void
+serializeAnalyzeDestroy(DestReceiver *self)
+{
+ pfree(self);
+}
+
+/* format is either 0 (text) or 1 (binary) */
+static DestReceiver *CreateSerializeDestReceiver(int16 format)
+{
+ /* Attention must palloc0 (like calloc) here.
+ * infos and nattrs are expected to be 0 or NULL initially */
+ SerializeDestReceiver *self = (SerializeDestReceiver*) palloc0(
+ sizeof(SerializeDestReceiver));
+ self->destRecevier.receiveSlot = serializeAnalyzeReceive;
+ self->destRecevier.rStartup = serializeAnalyzeStartup;
+ self->destRecevier.rShutdown = serializeAnalyzeCleanup;
+ self->destRecevier.rDestroy = serializeAnalyzeDestroy;
+ self->destRecevier.mydest = DestNone;
+ self->format = format;
+ return (DestReceiver *) self;
+}
+
+static uint64 GetNumberOfSerializedBytes(DestReceiver *dest)
+{
+ SerializeDestReceiver *receiver = (SerializeDestReceiver*) dest;
+ return receiver->bytesSend;
+}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3d3e632a0c..777c7578cc 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -47,6 +47,16 @@ typedef struct ExplainState
bool summary; /* print total planning and execution timing */
bool settings; /* print modified settings */
bool generic; /* generate a generic plan */
+ bool serialize; /* serialize all values, only applicable
+ * when "analyze" is active. serializing
+ * invokes the type's output function in
+ * the same way printtup.c does then tuples
+ * are send to the client. this includes
+ * detoasting of toasted values and makes
+ * EXPLAIN(ANALYZE,SERIALIZE) return realistic
+ * (longer runtimes */
+ bool serializebinary;/* serialize in binary format */
+
ExplainFormat format; /* output format */
/* state for output formatting --- not reset for each new plan tree */
int indent; /* current indentation level */
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index 1aca77491b..76e8932a7c 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -559,3 +559,40 @@ select explain_filter('explain (verbose) select * from int8_tbl i8');
Query Identifier: N
(3 rows)
+-- Test that SERIALIZE is accepted as a parameter to explain
+-- timings are filtered out by explain_filter
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (analyze,serialize) select * from test_serialize');
+ explain_filter
+--------------------------------------------------------------------------------------------------
+ Seq Scan on test_serialize (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
+ Planning Time: N.N ms
+ Serialized Bytes: N bytes
+ Execution Time: N.N ms
+(4 rows)
+
+drop table test_serialize;
+-- Test that SERIALIZEBINARY is accepted as a parameter to explain
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (analyze,serializebinary) select * from test_serialize');
+ explain_filter
+--------------------------------------------------------------------------------------------------
+ Seq Scan on test_serialize (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
+ Planning Time: N.N ms
+ Serialized Bytes: N bytes
+ Execution Time: N.N ms
+(4 rows)
+
+drop table test_serialize;
+-- Test SERIALIZE is _not_ accepted as a parameter to explain unless ANALYZE is specified
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (serialize) select * from test_serialize');
+ERROR: EXPLAIN option SERIALIZE requires ANALYZE
+CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
+drop table test_serialize;
+-- Test SERIALIZEBINARY is _not_ accepted as a parameter to explain unless ANALYZE is specified
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (serializebinary) select * from test_serialize');
+ERROR: EXPLAIN option SERIALIZEBINARY requires ANALYZE
+CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
+drop table test_serialize;
diff --git a/src/test/regress/sql/explain.sql b/src/test/regress/sql/explain.sql
index b6b7beab27..bdabc0c36e 100644
--- a/src/test/regress/sql/explain.sql
+++ b/src/test/regress/sql/explain.sql
@@ -154,3 +154,26 @@ select explain_filter('explain (verbose) select * from t1 where pg_temp.mysin(f1
-- Test compute_query_id
set compute_query_id = on;
select explain_filter('explain (verbose) select * from int8_tbl i8');
+
+-- Test that SERIALIZE is accepted as a parameter to explain
+-- timings are filtered out by explain_filter
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (analyze,serialize) select * from test_serialize');
+drop table test_serialize;
+
+-- Test that SERIALIZEBINARY is accepted as a parameter to explain
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (analyze,serializebinary) select * from test_serialize');
+drop table test_serialize;
+
+-- Test SERIALIZE is _not_ accepted as a parameter to explain unless ANALYZE is specified
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (serialize) select * from test_serialize');
+drop table test_serialize;
+
+-- Test SERIALIZEBINARY is _not_ accepted as a parameter to explain unless ANALYZE is specified
+create table test_serialize(id bigserial, val text);
+select explain_filter('explain (serializebinary) select * from test_serialize');
+drop table test_serialize;
+
+