How to know referenced sub-fields of a composite type?

Kohei KaiGai Tue, 28 May 2019 20:14:42 -0700

Hello,

A recent revision of PG-Strom implements columnar storage using Apache
Arrow format files on top of the FDW infrastructure. Because of its
columnar nature, it can load only the values that are referenced by the
query, which maximizes the efficiency of the storage bandwidth.
http://heterodb.github.io/pg-strom/arrow_fdw/


Apache Arrow defines various primitive types that can be mapped onto
PostgreSQL data types. For example, FloatingPoint (precision=Single) in
Arrow is equivalent to float4 in PostgreSQL.
One interesting data type in Apache Arrow is the "Struct" data type. It
is equivalent to a composite type in PostgreSQL. The "Struct" type has
sub-fields, and each sub-field has its own values array.
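
To make the layout concrete, here is a minimal sketch of a Struct column
held as one values array per sub-field. It is a simplified model in plain
Python, not the Arrow file format or any Arrow library API; the sub-field
names (x, y, label) and the helper load_subfields() are hypothetical and
exist only for illustration.

```python
# Hypothetical, simplified model of a columnar "Struct" column.
# This is NOT the Arrow file format; it only illustrates that every
# sub-field keeps its own values array.

# The same data as a row store would hold it, for a composite type
# (x float4, y float4, label text).
rows = [
    (1.0, 2.0, "a"),
    (3.0, 4.0, "b"),
    (5.0, 6.0, "c"),
]

field_names = ["x", "y", "label"]

# Columnar layout: one values array per sub-field.
struct_column = {
    name: [row[i] for row in rows]
    for i, name in enumerate(field_names)
}


def load_subfields(column, referenced):
    """Return only the values arrays of the referenced sub-fields."""
    return {name: column[name] for name in referenced}


# A query that touches only (col).x needs one of the three arrays.
loaded = load_subfields(struct_column, ["x"])
```

In this model, a reader that knows only x is referenced can leave the
arrays for y and label untouched.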

This means we can skip loading the unreferenced sub-fields, if the query
planner can track referenced and unreferenced sub-fields correctly.
On the other hand, it looks to me that RelOptInfo and the other
optimizer-related structures don't have this kind of information.
RelOptInfo->attr_needed tells an extension which attributes are
referenced by other relations; however, its granularity is per attribute,
which is not sufficient for sub-fields.

Probably, all we can do right now is walk the RelOptInfo list and look
for FieldSelect nodes to see which sub-fields are referenced. Does anyone
have a better idea than this expensive approach?
# Right now, PG-Strom loads all the sub-fields of a Struct column from an
# arrow_fdw foreign table, regardless of whether they are referenced.
# It is just a second-best solution.
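
The walk described above can be sketched as follows. This is a toy model
in Python, not PostgreSQL code: the classes mirror only a few fields of
the real Var, FieldSelect, OpExpr and Const nodes, and collect_subfields()
is a hypothetical helper. It assumes that a bare reference to a column
means every sub-field of that column is needed.

```python
from dataclasses import dataclass

# Toy stand-ins for a few PostgreSQL expression nodes. Only the fields
# used by the walk are modeled.


@dataclass
class Var:
    varno: int       # range-table index of the relation
    varattno: int    # attribute number of the column


@dataclass
class FieldSelect:
    arg: object      # expression yielding a composite value
    fieldnum: int    # 1-based number of the selected sub-field


@dataclass
class OpExpr:
    args: list       # operator arguments


@dataclass
class Const:
    value: object


ALL_FIELDS = None    # marker: the whole column is referenced


def collect_subfields(node, relid, needed):
    """Record, per attribute of relation `relid`, the referenced sub-fields."""
    if isinstance(node, FieldSelect):
        if isinstance(node.arg, Var):
            if node.arg.varno == relid:
                current = needed.setdefault(node.arg.varattno, set())
                if current is not ALL_FIELDS:
                    current.add(node.fieldnum)
        else:
            # e.g. nested FieldSelect: descend to the underlying column
            collect_subfields(node.arg, relid, needed)
    elif isinstance(node, Var):
        if node.varno == relid:
            needed[node.varattno] = ALL_FIELDS
    elif isinstance(node, OpExpr):
        for arg in node.args:
            collect_subfields(arg, relid, needed)
    # Const and anything else: nothing to record


# Roughly: SELECT (s).a FROM t WHERE (s).c > 0 AND id = 1
# with id as attribute 1 and s as attribute 2 of relation 1.
exprs = [
    FieldSelect(Var(1, 2), 1),
    OpExpr([FieldSelect(Var(1, 2), 3), Const(0)]),
    OpExpr([Var(1, 1), Const(1)]),
]

needed = {}
for expr in exprs:
    collect_subfields(expr, relid=1, needed=needed)
```

Here attr_needed-style tracking would only say that attributes 1 and 2
are referenced, while the walk also shows that only sub-fields 1 and 3 of
attribute 2 are needed. In an actual extension, the equivalent would
presumably be a C walker built on expression_tree_walker(), applied to
the relation's target expressions and restriction clauses; that is an
assumption about how one might implement it, not something PG-Strom does
today.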

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kai...@heterodb.com>
