[
https://issues.apache.org/jira/browse/IMPALA-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xuebin Su reassigned IMPALA-9874:
---------------------------------
Assignee: Xuebin Su (was: Abhishek Rawat)
> Reduce or avoid I/O for pruned columns
> --------------------------------------
>
> Key: IMPALA-9874
> URL: https://issues.apache.org/jira/browse/IMPALA-9874
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Xuebin Su
> Priority: Critical
> Labels: parquet
>
> Skipping decoding of values may not reduce I/O in some cases, because we
> start the I/O in StartScans() but don't wait for it until we actually read
> the first data page from the column reader. Whether the I/O for a pruned
> column actually happens is therefore a race.
> There are a couple of things we can do here.
> * The basic thing is to issue reads for the column readers in the order in
> which they are needed. We may be able to get this for free by ordering the
> column readers based on materialisation order.
> * We also want to avoid issuing I/O for columns that are not needed, if
> predicates are highly selective. This is probably harder and involves more
> trade-offs, since delaying the issuing of reads may increase scan latency.
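
The two ideas above can be sketched roughly as follows. This is a toy model in Python, not Impala's C++ scanner code: ColumnReader, issue_read(), materialize_order and scan_row_group() are hypothetical names invented for illustration, and the "I/O" is only a log entry. It assumes the predicate column is read first and that reads for the remaining columns are issued, in materialisation order, only if at least one row survives the predicate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ColumnReader:
    """Hypothetical stand-in for a Parquet column reader (not Impala's class)."""
    name: str
    values: List[int]        # pretend on-disk data for one row group
    materialize_order: int   # lower means the column is needed earlier
    io_issued: bool = False

    def issue_read(self, io_log: List[str]) -> None:
        # Record the read once; a real scanner would start an async I/O here.
        if not self.io_issued:
            self.io_issued = True
            io_log.append(self.name)

    def read(self, io_log: List[str]) -> List[int]:
        # Reading implies the I/O has been issued (issue it lazily if not).
        self.issue_read(io_log)
        return self.values


def scan_row_group(
    readers: List[ColumnReader],
    predicate_col: str,
    predicate: Callable[[int], bool],
    io_log: List[str],
) -> List[Dict[str, int]]:
    # Idea 1: visit readers in the order in which they are needed.
    ordered = sorted(readers, key=lambda r: r.materialize_order)
    by_name = {r.name: r for r in ordered}

    # Only the predicate column is read up front.
    pred_values = by_name[predicate_col].read(io_log)
    surviving = [i for i, v in enumerate(pred_values) if predicate(v)]

    # Idea 2: if no row survives, never issue I/O for the other columns.
    if not surviving:
        return []

    # Otherwise issue the deferred reads in materialisation order.
    columns = {r.name: r.read(io_log) for r in ordered}
    return [{name: vals[i] for name, vals in columns.items()} for i in surviving]
```

In this sketch the deferred reads start only after the predicate has been evaluated, which is exactly the scan-latency cost the description mentions; a real implementation would have to weigh that against the I/O saved.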