[ https://issues.apache.org/jira/browse/HIVE-13920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311812#comment-15311812 ]
Gopal V commented on HIVE-13920:
--------------------------------

Preallocating might actually allocate a large number of arrays without any real need: a 10,000-row split shouldn't need 128 vector batches pre-created, only to add to the garbage.

The allocation is currently inside the reader's inner loop, which is a problem because the allocation is actually performed on the first 128 invocations (and more).

The problem is the 128x more allocations than the regular Tez codepath; moving the allocation to a different part of the code doesn't fix the actual perf issue.

> LLAP: pre-populate CVB cache
> ----------------------------
>
>                 Key: HIVE-13920
>                 URL: https://issues.apache.org/jira/browse/HIVE-13920
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> To avoid allocations on the main path, we should pre-populate the CVB cache.
> The main difficulty is propagating column indexes earlier to allocate correct
> vectors.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
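
To make the allocation-count argument in the comment concrete, here is a minimal sketch, not Hive code. `Batch`, `LazyBatchPool`, and `readSplit` are hypothetical names, and the 1024 rows per batch is an assumed batch size. Only the 10,000-row split and the 128 pre-created batches come from the comment. The sketch also assumes the consumer hands each batch back before the next one is read, which is the best case for reuse. If the consumer holds on to batches, a lazy pool will allocate more.

```java
import java.util.ArrayDeque;

final class BatchPoolSketch {

    // Hypothetical stand-in for a column vector batch; not Hive's CVB class.
    static final class Batch {
        final long[] values;

        Batch(int rows) {
            this.values = new long[rows];
        }
    }

    // Hypothetical pool that counts every batch it allocates.
    static final class LazyBatchPool {
        private final ArrayDeque<Batch> free = new ArrayDeque<>();
        private final int rowsPerBatch;
        private int allocations = 0;

        LazyBatchPool(int rowsPerBatch) {
            this.rowsPerBatch = rowsPerBatch;
        }

        // Eagerly fill the pool up front, as a pre-population step would.
        void prepopulate(int count) {
            for (int i = 0; i < count; i++) {
                free.push(new Batch(rowsPerBatch));
                allocations++;
            }
        }

        // Reuse a free batch if one exists; allocate only on a miss.
        Batch take() {
            Batch b = free.poll();
            if (b == null) {
                b = new Batch(rowsPerBatch);
                allocations++;
            }
            return b;
        }

        void release(Batch b) {
            free.push(b);
        }

        int allocations() {
            return allocations;
        }
    }

    // Simulates reading a split; returns the number of batches read.
    // Assumption: each batch is released before the next one is taken.
    static int readSplit(LazyBatchPool pool, int rows, int rowsPerBatch) {
        int batches = (rows + rowsPerBatch - 1) / rowsPerBatch; // ceiling division
        for (int i = 0; i < batches; i++) {
            Batch b = pool.take();
            pool.release(b);
        }
        return batches;
    }

    public static void main(String[] args) {
        LazyBatchPool lazy = new LazyBatchPool(1024);
        int batches = readSplit(lazy, 10_000, 1024);

        LazyBatchPool eager = new LazyBatchPool(1024);
        eager.prepopulate(128);
        readSplit(eager, 10_000, 1024);

        System.out.println("batches read:      " + batches);
        System.out.println("lazy allocations:  " + lazy.allocations());
        System.out.println("eager allocations: " + eager.allocations());
    }
}
```

Under these assumptions the split is read in 10 batches. The pool that allocates on a miss creates a single batch and reuses it, while the pre-populated pool has already created 128 before the first row is read. Moving those 128 allocations out of the inner loop changes where they happen, not how many there are.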