You should look at it at both levels: there is one bloom filter for the ORC data on disk and one for the data in memory.
It is already a good step towards integrating the on-disk format with the in-memory representation of columnar data.

> On 22 Jun 2016, at 14:01, BaiRan <liz...@icloud.com> wrote:
>
> After building a bloom filter on existing data, does the Spark engine
> utilise the bloom filter during query processing?
> Is there any plan for predicate pushdown using bloom filters in ORC /
> Parquet?
>
> Thanks
> Ran
>
>> On 22 Jun 2016, at 10:48 am, Reynold Xin <r...@databricks.com> wrote:
>>
>> SPARK-12818 is about building a bloom filter on existing data. It has
>> nothing to do with the ORC bloom filter, which can be used to do
>> predicate pushdown.
>>
>>> On Tue, Jun 21, 2016 at 7:45 PM, BaiRan <liz...@icloud.com> wrote:
>>> Hi all,
>>>
>>> I have a question about the bloom filter implementation in the
>>> SPARK-12818 issue. If I have an ORC file with bloom filter metadata,
>>> how can I utilise it from Spark SQL?
>>> Thanks.
>>>
>>> Best,
>>> Ran
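For concreteness, here is a minimal Scala sketch of the two levels, assuming Spark 2.x. The in-memory side uses the DataFrameStatFunctions.bloomFilter API that SPARK-12818 added; the ORC side assumes your Spark version's ORC data source passes the orc.bloom.filter.* options through to the writer and honours spark.sql.orc.filterPushdown on read. The path, column name, and sizes are illustrative, not from the thread.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("bloom-demo").getOrCreate()
    val df = spark.range(0L, 1000000L).toDF("id")

    // Level 1: in-memory bloom filter (SPARK-12818, Spark 2.0+).
    // df.stat.bloomFilter scans the column and returns an
    // org.apache.spark.util.sketch.BloomFilter to the driver; it is a
    // client-side sketch, not a storage-level index.
    val bf = df.stat.bloomFilter("id", expectedNumItems = 1000000L, fpp = 0.03)
    println(bf.mightContain(42L))  // true: 42 is in the data
    println(bf.mightContain(-1L))  // usually false; ~3% false-positive rate

    // Level 2: ORC-level bloom filter, written into the file metadata so
    // the reader can skip stripes/row groups during predicate pushdown.
    // Assumption: these options reach the underlying ORC writer.
    df.write
      .option("orc.bloom.filter.columns", "id")
      .option("orc.bloom.filter.fpp", "0.05")
      .orc("/tmp/bloom_demo_orc")

    // Assumption: the reader consults the file-level bloom filter when
    // ORC filter pushdown is enabled.
    spark.conf.set("spark.sql.orc.filterPushdown", "true")
    spark.read.orc("/tmp/bloom_demo_orc").where("id = 42").show()

Note the difference between the two: the first filter lives in driver memory and is queried explicitly by your code, while the second lives in the ORC file's metadata and is consulted by the reader itself, which is what makes predicate pushdown possible.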