[ https://issues.apache.org/jira/browse/HIVE-23006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Panagiotis Garefalakis updated HIVE-23006:
------------------------------------------
    Attachment:     (was: HIVE-23006.01.patch)

> Compiler support for Probe MapJoin
> ----------------------------------
>
>                 Key: HIVE-23006
>                 URL: https://issues.apache.org/jira/browse/HIVE-23006
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23006.01.patch
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The decision to push information down to the record reader (potentially
> reducing decoding time through row-level filtering) should be made at query
> compilation time.
> This patch adds an extra optimisation step whose goal is to find Table
> Scan (TS) operators that could reduce the number of rows decoded at runtime
> by using extra available information.
> It currently looks for all available MapJoin operators that could use the
> smaller HashTable on the probing side (where the TS is) to filter out rows
> that would never match.
> To do so, the HashTable information is pushed down to the TS properties and
> then propagated as part of MapWork.
> If a single TS is used by multiple operators (shared-work), this rule
> cannot be applied.
> This rule can be extended to support static filter expressions like:
> _select * from sales where sold_state = 'PR';_
> This optimisation mainly targets the Tez execution engine running on LLAP.
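The rule described in the issue can be illustrated with a minimal, self-contained sketch. This is not the code from the patch and does not use Hive's operator classes: the TableScan and MapJoin types, the annotate method, and the property keys "probe.hashtable.id" and "probe.key.column" are all hypothetical names chosen for illustration. The sketch only shows the shape of the decision: for each MapJoin, take the scan on the probing side, skip it if it has more than one consumer, and otherwise record the hash table information in the scan's properties.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy operator model. These classes are hypothetical and are NOT Hive's own.
class ProbeMapJoinSketch {

    /** A table scan with its consumers and a property bag for pushed-down info. */
    static class TableScan {
        final String alias;
        final List<Object> consumers = new ArrayList<>();
        final Map<String, String> properties = new HashMap<>();

        TableScan(String alias) { this.alias = alias; }
    }

    /** A map join: the small side is hashed, the big side probes the hash table. */
    static class MapJoin {
        final TableScan probeSide;    // scan feeding the big (streamed) table
        final String hashTableId;     // identifies the small-side hash table
        final String probeKeyColumn;  // join key column on the probing side

        MapJoin(TableScan probeSide, String hashTableId, String probeKeyColumn) {
            this.probeSide = probeSide;
            this.hashTableId = hashTableId;
            this.probeKeyColumn = probeKeyColumn;
            probeSide.consumers.add(this); // register as a consumer of the scan
        }
    }

    /** Annotates eligible probe-side scans and returns the scans that were annotated. */
    static List<TableScan> annotate(List<MapJoin> joins) {
        List<TableScan> annotated = new ArrayList<>();
        for (MapJoin join : joins) {
            TableScan ts = join.probeSide;
            // A scan shared by several operators cannot be filtered for just one of them.
            if (ts.consumers.size() != 1) {
                continue;
            }
            // Hypothetical property keys standing in for the pushed-down information.
            ts.properties.put("probe.hashtable.id", join.hashTableId);
            ts.properties.put("probe.key.column", join.probeKeyColumn);
            annotated.add(ts);
        }
        return annotated;
    }

    public static void main(String[] args) {
        TableScan sales = new TableScan("sales");
        TableScan items = new TableScan("items");
        items.consumers.add("another consumer"); // simulate a shared scan

        List<MapJoin> joins = new ArrayList<>();
        joins.add(new MapJoin(sales, "ht-1", "store_id"));
        joins.add(new MapJoin(items, "ht-2", "item_id"));

        for (TableScan ts : annotate(joins)) {
            System.out.println(ts.alias + " -> " + ts.properties);
        }
    }
}
```

In this toy run only the "sales" scan is annotated, because the "items" scan has a second consumer. How the real patch detects shared scans and how the information reaches MapWork is not shown in the issue text, so both are left out here.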