On Mon, Jun 24, 2019 at 01:42:32AM +0200, Tomas Vondra wrote:
On Sun, Jun 23, 2019 at 10:23:19PM +0200, Tomas Vondra wrote:
On Sun, Jun 23, 2019 at 08:48:26PM +0100, Dean Rasheed wrote:
On Sat, 22 Jun 2019 at 15:10, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
One annoying thing I noticed is that the base_frequency tends to end up
being 0, most likely due to getting too small. It's a bit strange, though,
because with statistic target set to 10k the smallest frequency for a
single column is 1/3e6, so for 2 columns it'd be ~1/9e12 (which I think is
something the float8 can represent).
Yeah, it should be impossible for the base frequency to underflow to
0. However, it looks like the problem is with mcv_list_items()'s use
of %f to convert to text, which is pretty ugly.
Yeah, I realized that too, eventually. One way to fix that would be
adding %.15f to the sprintf() call, but that just adds ugliness. It's
probably time to rewrite the function to build the tuple from datums,
instead of relying on BuildTupleFromCStrings.
OK, attached is a patch doing this. It's pretty simple, and it does
resolve the issue with frequency precision.
There's one issue with the signature, though - currently the function
returns null flags as bool array, but values are returned as simple
text value (formatted in array-like way, but still just a text).
In the attached patch I've reworked both to proper arrays, but obviously
that'd require a CATVERSION bump - and there's not much apetite for that
past beta2, I suppose. So I'll just undo this bit.
Meh, I forgot to attach the patch, of couse ...
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 9aa3ce9bfdcab50a6d5dcf289bcf715e4a8aed9c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Mon, 24 Jun 2019 00:50:53 +0200
Subject: [PATCH 1/3] rework pg_mcv_list_items
---
src/backend/statistics/mcv.c | 106 ++++++++++++--------------------
src/include/catalog/pg_proc.dat | 2 +-
2 files changed, 42 insertions(+), 66 deletions(-)
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 5fe61ea0a4..1b1a5c9846 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1134,9 +1134,6 @@ Datum
pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
{
FuncCallContext *funcctx;
- int call_cntr;
- int max_calls;
- AttInMetadata *attinmeta;
/* stuff done only on the first call of the function */
if (SRF_IS_FIRSTCALL())
@@ -1166,13 +1163,13 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function returning record
called in context "
"that cannot accept
type record")));
+ tupdesc = BlessTupleDesc(tupdesc);
/*
* generate attribute metadata needed later to produce tuples
from raw
* C strings
*/
- attinmeta = TupleDescGetAttInMetadata(tupdesc);
- funcctx->attinmeta = attinmeta;
+ funcctx->attinmeta = TupleDescGetAttInMetadata(tupdesc);
MemoryContextSwitchTo(oldcontext);
}
@@ -1180,18 +1177,17 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
/* stuff done on every call of the function */
funcctx = SRF_PERCALL_SETUP();
- call_cntr = funcctx->call_cntr;
- max_calls = funcctx->max_calls;
- attinmeta = funcctx->attinmeta;
-
- if (call_cntr < max_calls) /* do when there is more left to send */
+ if (funcctx->call_cntr < funcctx->max_calls) /* do when there is
more left to send */
{
- char **values;
+ Datum values[5];
+ bool nulls[5];
HeapTuple tuple;
Datum result;
+ int dims[1];
+ int lbs[1];
- StringInfoData itemValues;
- StringInfoData itemNulls;
+ Datum *itemNulls;
+ Datum *itemValues;
int i;
@@ -1203,24 +1199,16 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
mcvlist = (MCVList *) funcctx->user_fctx;
- Assert(call_cntr < mcvlist->nitems);
-
- item = &mcvlist->items[call_cntr];
-
- /*
- * Prepare a values array for building the returned tuple. This
should
- * be an array of C strings which will be processed later by
the type
- * input functions.
- */
- values = (char **) palloc0(5 * sizeof(char *));
+ Assert(funcctx->call_cntr < mcvlist->nitems);
- values[0] = (char *) palloc(64 * sizeof(char)); /* item index */
- values[3] = (char *) palloc(64 * sizeof(char)); /* frequency */
- values[4] = (char *) palloc(64 * sizeof(char)); /* base
frequency */
+ item = &mcvlist->items[funcctx->call_cntr];
outfuncs = (Oid *) palloc0(sizeof(Oid) * mcvlist->ndimensions);
fmgrinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) *
mcvlist->ndimensions);
+ itemValues = (Datum *) palloc(sizeof(Datum) *
mcvlist->ndimensions);
+ itemNulls = (Datum *) palloc(sizeof(Datum) *
mcvlist->ndimensions);
+
for (i = 0; i < mcvlist->ndimensions; i++)
{
bool isvarlena;
@@ -1230,61 +1218,49 @@ pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
fmgr_info(outfuncs[i], &fmgrinfo[i]);
}
- /* build the arrays of values / nulls */
- initStringInfo(&itemValues);
- initStringInfo(&itemNulls);
-
- appendStringInfoChar(&itemValues, '{');
- appendStringInfoChar(&itemNulls, '{');
-
for (i = 0; i < mcvlist->ndimensions; i++)
{
- Datum val,
- valout;
-
- if (i > 0)
- {
- appendStringInfoString(&itemValues, ", ");
- appendStringInfoString(&itemNulls, ", ");
- }
+ Datum valout;
if (item->isnull[i])
- valout = CStringGetDatum("NULL");
+ valout = (Datum) 0;
else
- {
- val = item->values[i];
- valout = FunctionCall1(&fmgrinfo[i], val);
- }
+ valout = FunctionCall1(&fmgrinfo[i],
item->values[i]);
- appendStringInfoString(&itemValues,
DatumGetCString(valout));
- appendStringInfoString(&itemNulls, item->isnull[i] ?
"t" : "f");
+ itemNulls[i] = BoolGetDatum(item->isnull[i]);
+ itemValues[i] = valout;
}
- appendStringInfoChar(&itemValues, '}');
- appendStringInfoChar(&itemNulls, '}');
+ dims[0] = mcvlist->ndimensions;
+ lbs[0] = 1;
+
+ values[0] = Int32GetDatum(funcctx->call_cntr);
- snprintf(values[0], 64, "%d", call_cntr);
- snprintf(values[3], 64, "%f", item->frequency);
- snprintf(values[4], 64, "%f", item->base_frequency);
+ values[1] = PointerGetDatum(construct_md_array(itemValues,
+
item->isnull,
+
1,
+
dims,
+
lbs,
+
CSTRINGOID, -2,
+
false, 'c'));
- values[1] = itemValues.data;
- values[2] = itemNulls.data;
+ values[2] = PointerGetDatum(construct_array(itemNulls,
+
mcvlist->ndimensions,
+
BOOLOID, sizeof(bool),
+
true, 'c'));
+
+ values[3] = Float8GetDatum(item->frequency);
+ values[4] = Float8GetDatum(item->base_frequency);
+
+ /* no NULLs in the tuple */
+ memset(nulls, 0, sizeof(nulls));
/* build a tuple */
- tuple = BuildTupleFromCStrings(attinmeta, values);
+ tuple = heap_form_tuple(funcctx->attinmeta->tupdesc, values,
nulls);
/* make the tuple into a datum */
result = HeapTupleGetDatum(tuple);
- /* clean up (this is not really necessary) */
- pfree(itemValues.data);
- pfree(itemNulls.data);
-
- pfree(values[0]);
- pfree(values[3]);
- pfree(values[4]);
- pfree(values);
-
SRF_RETURN_NEXT(funcctx, result);
}
else /* do when there is no
more left */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87335248a0..36ae2ac9d8 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5019,7 +5019,7 @@
{ oid => '3427', descr => 'details about MCV list items',
proname => 'pg_mcv_list_items', prorows => '1000', proretset => 't',
provolatile => 's', prorettype => 'record', proargtypes => 'pg_mcv_list',
- proallargtypes => '{pg_mcv_list,int4,text,_bool,float8,float8}',
+ proallargtypes => '{pg_mcv_list,int4,_text,_bool,float8,float8}',
proargmodes => '{i,o,o,o,o,o}',
proargnames => '{mcv_list,index,values,nulls,frequency,base_frequency}',
prosrc => 'pg_stats_ext_mcvlist_items' },
--
2.20.1