[ 
https://issues.apache.org/jira/browse/PIG-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082373#comment-14082373
 ] 

Rohini Palaniswamy commented on PIG-4094:
-----------------------------------------

Initial thoughts on what needs to be done:
   - List<String> getPredicateFields(String location, Job job) needs to be 
changed to return field information for complex types. For example, a Loader 
should be able to specify that it accepts predicate pushdown for a specific 
field in a tuple, for a specific key in a map, or for the whole tuple or map. 
Multiple levels of nested references should be supported for tuples.
   - A new OpType in Expression, in addition to TERM_COL, is needed to support 
columns with field references, including multiple levels of nested references 
for tuples.
   - FilterExtractor needs to consider complex data types in filter conditions 
and extract them.
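
To make the first two points concrete, here is a rough sketch of what a nested 
field reference could look like. Everything in it is hypothetical: FieldRef, 
Step, Kind and accepts() are illustrative names, not part of Pig, and the rule 
that accepting a whole tuple or map also covers references nested inside it is 
an assumption, not a decided design.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch only; none of these types exist in Pig. */
public class PredicateFieldSketch {

    /** How one level of a nested reference is addressed. */
    public enum Kind { TUPLE_FIELD, MAP_KEY }

    /** One level of nesting: a named tuple field or a map key. */
    public static final class Step {
        final Kind kind;
        final String name;

        Step(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        boolean matches(Step other) {
            return kind == other.kind && name.equals(other.name);
        }
    }

    public static Step field(String name) { return new Step(Kind.TUPLE_FIELD, name); }

    public static Step key(String name) { return new Step(Kind.MAP_KEY, name); }

    /** A top-level column plus an optional path into a tuple or map. */
    public static final class FieldRef {
        final String column;
        final List<Step> path;

        public FieldRef(String column, Step... path) {
            this.column = column;
            this.path = Collections.unmodifiableList(Arrays.asList(path.clone()));
        }

        /** Assumed rule: a reference covers anything nested at or below it. */
        public boolean covers(FieldRef other) {
            if (!column.equals(other.column) || path.size() > other.path.size()) {
                return false;
            }
            for (int i = 0; i < path.size(); i++) {
                if (!path.get(i).matches(other.path.get(i))) {
                    return false;
                }
            }
            return true;
        }

        /** Renders the reference in Pig Latin style: t.f for tuples, m#'k' for maps. */
        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder(column);
            for (Step s : path) {
                sb.append(s.kind == Kind.TUPLE_FIELD ? "." + s.name : "#'" + s.name + "'");
            }
            return sb.toString();
        }
    }

    /** True if any reference the loader declared covers the filter's reference. */
    public static boolean accepts(List<FieldRef> accepted, FieldRef candidate) {
        for (FieldRef ref : accepted) {
            if (ref.covers(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<FieldRef> accepted = Arrays.asList(
                new FieldRef("user", field("address"), field("zip")), // nested tuple field
                new FieldRef("props", key("country")),                // one map key
                new FieldRef("info"));                                // whole tuple or map
        System.out.println(accepted);
        System.out.println(accepts(accepted, new FieldRef("info", field("age"))));
        System.out.println(accepts(accepted, new FieldRef("user", field("name"))));
    }
}
```

A structured return type along these lines would let FilterExtractor check each 
dereference in a filter condition against what the loader declared, instead of 
matching on top-level column names only.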

I haven't thought much about bags, since filtering on bags always follows a 
FLATTEN statement. Do we want to support predicate pushdown on bags as well in 
the future?

> Predicate pushdown to support complex data types
> ------------------------------------------------
>
>                 Key: PIG-4094
>                 URL: https://issues.apache.org/jira/browse/PIG-4094
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>             Fix For: 0.14.0
>
>
>   Parquet has support for pushing predicates on tuples, maps and bags 
> according to [~aniket486]. ORC currently only supports primitives, but will 
> add support for structs(tuples) in the future.  The API needs to be there 
> even if not implemented, as it will be hard to change the interface once released.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
