For some reason, I thought there was a blocker there. Since Iceberg does not use org.apache.parquet.filter2.predicate.FilterApi in its Parquet reader, it makes sense to fix it, of course.

> On 5 Mar 2019, at 18:38, Ryan Blue <rb...@netflix.com.INVALID> wrote:
>
> Would it make sense to add support for IN expressions instead? I'd rather get
> that done than build work-arounds.
>
> On Tue, Mar 5, 2019 at 10:33 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
> Hey,
>
> Iceberg Spark data source rewrites IN predicates as a mix of OR/EQ. I am
> wondering if it makes sense to introduce a threshold when this rewrite
> happens until [1] is resolved. We can have something similar to
> “spark.sql.parquet.pushdown.inFilterThreshold” in Spark.
>
> We have experienced a performance degradation on a few queries. One of the
> queries had 5 predicates and 2 of them were IN. In this specific case, IN
> predicates didn’t help to filter out files and just made the overall row
> filter more complicated.
>
> Thanks,
> Anton
>
>
> [1] - https://github.com/apache/incubator-iceberg/issues/39
>
> --
> Ryan Blue
> Software Engineer
> Netflix
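
For illustration only, the threshold idea from the quoted message could look roughly like the sketch below. It is not Iceberg's or Spark's actual code: `Eq`, `Or`, `rewrite_in`, and the default value of `IN_FILTER_THRESHOLD` are hypothetical names and values made up for this example. The assumption is that an IN predicate is expanded into an OR chain of equality tests only when it has at most `threshold` distinct values, and is otherwise not pushed down at all.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Optional, Sequence, Union


@dataclass(frozen=True)
class Eq:
    """Hypothetical equality predicate: column = value."""
    column: str
    value: object


@dataclass(frozen=True)
class Or:
    """Hypothetical disjunction of two predicates."""
    left: "Expr"
    right: "Expr"


Expr = Union[Eq, Or]

# Hypothetical default, playing the role that
# spark.sql.parquet.pushdown.inFilterThreshold plays in Spark.
IN_FILTER_THRESHOLD = 10


def rewrite_in(column: str, values: Sequence[object],
               threshold: int = IN_FILTER_THRESHOLD) -> Optional[Expr]:
    """Rewrite `column IN (values)` as a left-nested OR of Eq terms.

    Returns None when the rewrite is skipped (empty list, or more
    distinct values than the threshold); the caller would then leave
    the IN predicate for the engine to evaluate after the scan.
    """
    # Drop duplicates while keeping order; values must be hashable.
    distinct = list(dict.fromkeys(values))
    if not distinct or len(distinct) > threshold:
        return None
    terms = [Eq(column, v) for v in distinct]
    # A single term stays a plain Eq; more terms fold into Or(Or(a, b), c)...
    return reduce(Or, terms)


if __name__ == "__main__":
    print(rewrite_in("id", [1, 2, 3]))
    print(rewrite_in("id", list(range(100))))
```

Skipping the pushdown above the threshold keeps the row filter small in cases like the one described, where the IN predicates did not help prune files anyway. Native IN support, as suggested in the reply, would make this kind of work-around unnecessary.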