On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov
<a.lepik...@postgrespro.ru> wrote:
I agreed that this task needs to implement an API for
serialization/deserialization of statistics:
pg_load_relation_statistics(json_string text);
pg_get_relation_statistics(relname text);
We can use a version number for resolving conflicts with different
statistics implementations.
"Load" function will validate the values[] anyarray while deserializing
the input json string to the datatype of the relation column.

This is a valuable feature. Analysing a foreign table by fetching rows
from the foreign server isn't very efficient. In fact the current FDW
API for doing that forges that in-efficiency by requiring the FDW to
return a sample of rows that will be analysed by the core. That's why
I see that your patch introduces a new API to get foreign rel stat. I
don't think there's any point in maintaining these two APIs just for

ANALYSING table. Instead we should have only one FDW API which will do
whatever it wants and return statistics that can be understood by the
core and let core install it in the catalogs. I believe that's doable.
I think the same.

In case of PostgreSQL it could get the stats available as is from the
foreign server, convert it into a form that the core understands and
returns. The patch introduces a new function postgres_fdw_stat() which
will be available only from version 14 onwards. Can we use
row_to_json(), which is available in all the supported versions,
instead?
I started from here. But we need to convert starelid, staop[] stacoll[] oids into portable format. Also we need to explicitly specify the type of each values[] array. And no one guaranteed that anyarray values[] can't contained an array of complex type values, containing oids, that can't be correctly converted to database objects on another server... These considerations required me to add new postgres_fdw_stat() routine that can be moved into the core.

In case of some other foreign server, an FDW will be responsible to
return statistics in a form that the core will understand. It may
fetch rows from the foreign server or be a bit smart and fetch the
statistics and convert.
I don't think I fully understood your idea. Please explain in more detail if possible.

This also means that FDWs will have to deal with the statistics format
that the core understands and thus will need changes in their code
with every version in the worst case. But AFAIR, PostgreSQL supports
different forms of statistics so the problem may not remain that
severe if FDWs and core agree on some bare minimum format that the
core supports for long.
I don't think FDW needs to know anything about the internals of statistics. It only need to execute query like
"SELECT extract_statistics(namespace.relation);"
and apply the text representation by the function call like this:
store_statistics(const char *stat);
All validation and update pg_statistic operations will be performed into the core.

I think the patch has some other problems like it works only for
regular tables on foreign server but a foreign table can be pointing
to any relation like a materialized view, partitioned table or a
foreign table on the foreign server all of which have statistics
associated with them.
Ok. It was implemented for discussion, test and as a base of development.

I didn't look closely but it does not consider
that the foreign table may not have all the columns from the relation
on the foreign server or may have different names.
Here we get full statistics from remote server and extract statistics only for columns, included into the tuple descriptor of foreign table.

But I think those
problems are kind of secondary. We have to agree on the design first.
+1.
I only want to point out the following. In previous threads statistics was converted row-by-row. I want to suggest to serialize all statistics tuples for the relation into single json string. On apply phase we can filter unneeded attributes.

--
regards,
Andrey Lepikhov
Postgres Professional


Reply via email to