[
https://issues.apache.org/jira/browse/PIG-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082373#comment-14082373
]
Rohini Palaniswamy commented on PIG-4094:
-----------------------------------------
Initial thoughts on what needs to be done:
- List<String> getPredicateFields(String location, Job job) needs to be
changed to return field information for complex types. For example, a loader
should be able to specify that it accepts predicate pushdown for a specific
field in a tuple, for a specific key in a map, or for the whole tuple or map.
Tuples need support for multiple levels of nested references.
- A new OpType in Expression, other than TERM_COL, to support columns with
field references, again with multiple levels of nested references for tuples.
- FilterExtractor needs to consider complex data types in filter conditions
and extract them.
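To make the three points above concrete, here is a rough sketch of one way a nested field reference could be represented. Everything in it is an assumption for illustration: FieldRef, Step, Kind, and covers() are hypothetical names and are not part of Pig's LoadPredicatePushdown or Expression APIs. The idea is that a loader could return such references instead of plain column names, and a new Expression operand could carry the same path.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Pig.
// A FieldRef is a top-level column followed by zero or more nested steps.
final class FieldRef {
    enum Kind { COLUMN, TUPLE_FIELD, MAP_KEY }

    static final class Step {
        final Kind kind;
        final String name;

        Step(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    private final List<Step> steps;

    private FieldRef(List<Step> steps) {
        this.steps = Collections.unmodifiableList(steps);
    }

    // Reference to a whole top-level column (primitive, tuple, or map).
    static FieldRef column(String name) {
        List<Step> s = new ArrayList<>();
        s.add(new Step(Kind.COLUMN, name));
        return new FieldRef(s);
    }

    // Descend into a named tuple field; may be chained for nested tuples.
    FieldRef tupleField(String name) {
        return append(Kind.TUPLE_FIELD, name);
    }

    // Descend into a specific map key.
    FieldRef mapKey(String key) {
        return append(Kind.MAP_KEY, key);
    }

    private FieldRef append(Kind kind, String name) {
        List<Step> s = new ArrayList<>(steps);
        s.add(new Step(kind, name));
        return new FieldRef(s);
    }

    // True if this reference equals 'other' or is a parent of it, e.g. a
    // loader that accepts the whole tuple also accepts any field inside it.
    boolean covers(FieldRef other) {
        if (steps.size() > other.steps.size()) {
            return false;
        }
        for (int i = 0; i < steps.size(); i++) {
            Step a = steps.get(i);
            Step b = other.steps.get(i);
            if (a.kind != b.kind || !a.name.equals(b.name)) {
                return false;
            }
        }
        return true;
    }

    // Renders the path in Pig Latin style: '.' for tuple fields, '#' for map keys.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Step st : steps) {
            switch (st.kind) {
                case COLUMN:
                    sb.append(st.name);
                    break;
                case TUPLE_FIELD:
                    sb.append('.').append(st.name);
                    break;
                case MAP_KEY:
                    sb.append("#'").append(st.name).append('\'');
                    break;
            }
        }
        return sb.toString();
    }
}
```

With something like this, FilterExtractor could check whether a filter operand such as user.address.zip is covered by any reference the loader declared, and push the condition down only in that case.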
I haven't thought much about bags, as filtering on bags always follows a
FLATTEN statement. Do we want to support predicate pushdown on bags as well in
the future?
> Predicate pushdown to support complex data types
> ------------------------------------------------
>
> Key: PIG-4094
> URL: https://issues.apache.org/jira/browse/PIG-4094
> Project: Pig
> Issue Type: Sub-task
> Reporter: Rohini Palaniswamy
> Fix For: 0.14.0
>
>
> Parquet has support for pushing predicates on tuples, maps and bags
> according to [~aniket486]. ORC currently only supports primitives, but will
> add support for structs (tuples) in the future. The API needs to be there
> even if not implemented, as it will be hard to change the interface once released.