[ https://issues.apache.org/jira/browse/HIVE-17669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mithun Radhakrishnan updated HIVE-17669:
----------------------------------------
    Attachment: HIVE-17699.2.patch

Is this more palatable? I've switched over to using a configurably bounded Guava {{Cache}}. I'm not sure what the default max-size should be. I've left it at 100. Is that acceptable?

[~prasanth_j], do you suppose we should leave out the MD5/SHA-1 hashing? Would that be overkill, given the small cache-size?

I'm working on a {{branch-2}} port. (I'll have to rewrite the lambda bits.)

> Cache to optimize SearchArgument deserialization
> ------------------------------------------------
>
>                 Key: HIVE-17669
>                 URL: https://issues.apache.org/jira/browse/HIVE-17669
>             Project: Hive
>          Issue Type: Improvement
>          Components: ORC, Query Processor
>    Affects Versions: 2.2.0, 3.0.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Chris Drome
>         Attachments: HIVE-17699.1.patch, HIVE-17699.2.patch
>
>
> And another, from [~selinazh] and [~cdrome]. (YHIVE-927)
> When a mapper needs to process multiple ORC files, it might end up having
> to use essentially the same {{SearchArgument}} over several files. It would be
> good not to have to deserialize from string, over and over again. Caching the
> object against the string-form should speed things up.
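
For readers following along without the patch, here is a rough sketch of the idea: a size-bounded cache that maps the serialized string form to the deserialized object, so repeated lookups for the same string skip deserialization. This is not the code in the attached patch. To stay dependency-free it uses a {{LinkedHashMap}} in LRU mode as a stand-in for the Guava {{Cache}}, and the class name {{BoundedDeserializationCache}}, its methods, and the {{deserializer}} function are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Sketch only: a size-bounded LRU cache keyed by the serialized string.
 * A LinkedHashMap stands in for the Guava Cache mentioned above; all
 * names here are hypothetical and not taken from the patch.
 */
final class BoundedDeserializationCache<V> {
  private final Map<String, V> entries;
  private final Function<String, V> deserializer;
  private int misses = 0;

  BoundedDeserializationCache(final int maxSize, Function<String, V> deserializer) {
    this.deserializer = deserializer;
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.entries = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Evict the least recently used entry once the bound is exceeded.
        return size() > maxSize;
      }
    };
  }

  /** Returns the cached object, deserializing only on a miss. */
  synchronized V get(String serialized) {
    V value = entries.get(serialized);
    if (value == null) {
      misses++;
      value = deserializer.apply(serialized);
      entries.put(serialized, value);
    }
    return value;
  }

  synchronized int misses() {
    return misses;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

A caller would construct it once with the bound (100 in the comment above) and a function that turns the string into the object, then call {{get(serializedString)}} for each file. Keying on the full string avoids the hashing step at the cost of holding the strings in memory, which seems tolerable when the cache is this small.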