Nicholas Jiang created FLINK-29756:
--------------------------------------

             Summary: Support materialized column to improve query performance 
for complex types
                 Key: FLINK-29756
                 URL: https://issues.apache.org/jira/browse/FLINK-29756
             Project: Flink
          Issue Type: New Feature
          Components: Table Store
    Affects Versions: table-store-0.3.0
            Reporter: Nicholas Jiang
             Fix For: table-store-0.3.0


In data warehousing, it is very common to read one or more subfields from a 
complex-typed column such as a map, or to pack many subfields into such a 
column. These operations can greatly affect query performance because:
 # They waste I/O. For example, if a field of type Map contains dozens of 
subfields, the entire column must be read even when only one subfield is 
needed, and Spark then traverses the entire map to get the value of the 
target key.
 # Vectorized reads cannot be used when reading nested type columns.
 # Filter pushdown cannot be used when reading nested columns.

It is therefore necessary to introduce a materialized column feature in Flink 
Table Store that transparently solves the above problems for arbitrary 
columnar storage formats (not just Parquet).
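
As a rough illustration of the idea only (this is not Flink Table Store code, 
and the class and method names below are hypothetical), the following Python 
sketch models a toy column store. It assumes that a map subfield can be copied 
into its own top-level column and that a read of that subfield is then 
redirected to the new column without the caller changing its query. The 
`values_read` counter is a stand-in for I/O cost.

```python
from typing import Any, Dict, List, Tuple


class ColumnarTable:
    """Toy column store: each column is a list holding one entry per row."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        self.columns = columns
        # (map_column, key) -> name of the materialized column (hypothetical scheme).
        self.materialized: Dict[Tuple[str, str], str] = {}
        # Number of map entries or scalar values touched by reads.
        self.values_read = 0

    def materialize(self, map_column: str, key: str) -> str:
        """Copy one map subfield into its own top-level column."""
        name = f"{map_column}__{key}"
        self.columns[name] = [row.get(key) for row in self.columns[map_column]]
        self.materialized[(map_column, key)] = name
        return name

    def read_map_key(self, map_column: str, key: str) -> List[Any]:
        """Read map_column[key] for every row, using a materialized column if present."""
        target = self.materialized.get((map_column, key))
        if target is not None:
            # Transparent rewrite: only the flat column is scanned.
            column = self.columns[target]
            self.values_read += len(column)
            return list(column)
        result = []
        for row in self.columns[map_column]:
            # Without materialization the whole map is read for each row.
            self.values_read += len(row)
            result.append(row.get(key))
        return result


table = ColumnarTable({
    "attrs": [
        {"city": "Berlin", "os": "linux", "lang": "de"},
        {"city": "Paris", "os": "mac", "lang": "fr"},
    ]
})

before = table.read_map_key("attrs", "city")
cost_before = table.values_read

table.values_read = 0
table.materialize("attrs", "city")
after = table.read_map_key("attrs", "city")
cost_after = table.values_read
```

In this sketch the same `read_map_key` call returns the same values before 
and after materialization, but touches only one value per row afterwards 
instead of every entry of the map. A flat column of this kind is also what 
vectorized reads and filter pushdown normally operate on.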



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
