Ádám Szita created HIVE-22583:
---------------------------------
Summary: LLAP cache always misses with non-vectorized serde
readers such as OpenCSV
Key: HIVE-22583
URL: https://issues.apache.org/jira/browse/HIVE-22583
Project: Hive
Issue Type: Bug
Components: llap
Reporter: Ádám Szita
Assignee: Ádám Szita
Although after the first read LLAP cache stores data of tables that are not
using the LazySimple serde, the stored data is then never used in the future
subsequent queries, causing a full cache miss and re-read each time.
Problem is rooted in SerdeEncodedDataReader#cacheFileData is not taking care of
creating an entry for the root/struct column of the table. The only cases this
is taken care of are when a vectorized reader is used _(e.g. LazySimpleSerde's
LazySimpleDeserializeRead)_, where SerdeEncodedDataReader#processAsyncCacheData
takes care of this.
This can be reproduced by either using a custom serde, like OpenCSV or using
LazySimpleSerde, but turning off _hive.llap.io.encode.vector.serde.enabled_.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)