[
https://issues.apache.org/jira/browse/IMPALA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Carlin updated IMPALA-12961:
----------------------------------
Parent: IMPALA-14404
Issue Type: Sub-task (was: Bug)
> Use a Map instead of an ArrayList for Expr in HDFS RelNode
> ----------------------------------------------------------
>
> Key: IMPALA-12961
> URL: https://issues.apache.org/jira/browse/IMPALA-12961
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Steve Carlin
> Priority: Major
>
> This came up in code review in ImpalaHdfsScanRel:
> "For wide tables where we are only needing a few columns projected, we will
> end up with a long list with mostly nulls. A LinkedHashMap (preserves
> insertion order) where the key is position and value is the SlotRef would be
> better suited despite the CPU cost of hashing. In general, in a query
> planner, memory is the most precious commodity since the plan search space
> can be large, so anything we can do to reduce memory footprint would be
> preferred."
> One counterargument: the list is used in other RelNodes, and it seems more
> natural there. For instance, the Project RelNode will have a RexInputRef
> RexNode such as "$2", which refers to an input field by position, so an
> array fits naturally. Every other RelNode works this way except for the
> ScanNode.
> To add to the counterargument: take a worst-case scenario of a query that
> has 10 tables with 500 columns apiece. If we are allocating 8-byte pointers,
> we would need 10 * 500 * 8 = 40,000 bytes to hold this information. While
> reducing the memory footprint is generally the higher priority, reducing it
> by 40,000 bytes really isn't going to make an impact. Even if we take into
> account that multiple queries would be running simultaneously, this is a
> very short-lived code path. So should we go with the more natural approach
> over the less memory-intensive approach?
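
For illustration only, the two layouts discussed above can be sketched as follows. This is not the ImpalaHdfsScanRel code: the class name, the method names, and the use of String as a stand-in for SlotRef are all hypothetical, and the byte estimate counts only the 8-byte references in the list, as the description does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; String stands in for the real SlotRef/Expr type.
public class ScanExprLayoutSketch {

  // Dense layout: one entry per table column, null where the column is
  // not projected. Lookup by position matches RexInputRef-style "$n".
  public static List<String> denseLayout(int numColumns, int[] projected) {
    List<String> exprs =
        new ArrayList<>(Collections.nCopies(numColumns, (String) null));
    for (int pos : projected) {
      exprs.set(pos, "slot$" + pos);
    }
    return exprs;
  }

  // Sparse layout: only projected positions are stored, keyed by column
  // position; LinkedHashMap keeps them in insertion order.
  public static Map<Integer, String> sparseLayout(int[] projected) {
    Map<Integer, String> exprs = new LinkedHashMap<>();
    for (int pos : projected) {
      exprs.put(pos, "slot$" + pos);
    }
    return exprs;
  }

  // Reference storage for the dense layout only (ignores object headers
  // and the expressions themselves), mirroring the 10 * 500 * 8 estimate.
  public static long denseReferenceBytes(
      int tables, int columnsPerTable, int bytesPerReference) {
    return (long) tables * columnsPerTable * bytesPerReference;
  }

  public static void main(String[] args) {
    int[] projected = {2, 7, 499};
    List<String> dense = denseLayout(500, projected);
    Map<Integer, String> sparse = sparseLayout(projected);
    System.out.println(dense.size() + " list entries vs "
        + sparse.size() + " map entries");
    System.out.println("dense $2 = " + dense.get(2)
        + ", sparse $2 = " + sparse.get(2));
    System.out.println(denseReferenceBytes(10, 500, 8) + " bytes");
  }
}
```

Both layouts answer a positional lookup such as "$2" the same way; the difference is that the list stores a reference for every column while the map stores only the projected ones. Note that each LinkedHashMap entry is itself larger than a single list reference, so the map only saves memory when the projection is sparse.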