Yes, I do use AS in the load statement. I thought filters were always pushed as close to the Load operators as possible? What kind of Foreach is added?

Thanks,
Jeff

On Fri, Mar 15, 2013 at 10:57 AM, Daniel Dai <[email protected]> wrote:
> getPartitionKeys should be called by default. Did you use an "AS" clause
> in the load statement? That could add a foreach between Load and Filter,
> and getPartitionKeys will only be invoked if the filter is right after
> the load. Do an explain to check for it.
>
> Thanks,
> Daniel
>
> On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan <[email protected]> wrote:
>> Hi all,
>>
>> For CustomLoader (a class I'm implementing), which extends LoadFunc and
>> implements LoadMetadata, the "getPartitionKeys" function is supposed
>> to be called by "PartitionFilterOptimizer", right? I put some debug
>> statements in "getPartitionKeys", but this function doesn't seem to
>> ever be called.
>>
>> I've read through some of the Pig source. Optimization rules can be
>> disabled by properties, but by default the "PartitionFilterOptimizer"
>> should be enabled. Also, in "PartitionFilterOptimizer", I saw some
>> other checks, such as that the Filter operator cannot have a dependency
>> other than the load, which is true in my case. Anyway, can someone shed
>> some light on this? Am I understanding this interface incorrectly?
>>
>> My script is very simple (line 1 is load, line 2 is filter, and line 3
>> is store), so the Logical Plan should be very simple. Also, I'm
>> testing this in Pig local mode; not sure if that matters.
>>
>> Greatly appreciate any hints!
