Re: Question about Bloom Filter in Spark 2.0

2016-06-22 Thread Jörn Franke
You should look at it at both levels: there is one bloom filter for ORC data and one for in-memory data. It is already a good step towards integrating the file format with the in-memory representation for columnar data.
> On 22 Jun 2016, at 14:01, BaiRan wrote:
>
> After building bloom filter on existi

Re: Question about Bloom Filter in Spark 2.0

2016-06-22 Thread BaiRan
After building a bloom filter on existing data, does the Spark engine utilise the bloom filter during query processing? Is there any plan for predicate pushdown using bloom filters in ORC / Parquet?
Thanks,
Ran
> On 22 Jun, 2016, at 10:48 am, Reynold Xin wrote:
>
> SPARK-12818 is about building a bl

Re: Question about Bloom Filter in Spark 2.0

2016-06-21 Thread Reynold Xin
SPARK-12818 is about building a bloom filter on existing data. It has nothing to do with the ORC bloom filter, which can be used to do predicate pushdown.
On Tue, Jun 21, 2016 at 7:45 PM, BaiRan wrote:
> Hi all,
>
> I have a question about bloom filter implementation in Spark-12818 issue.
> If
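
The thread separates two things: building a bloom filter over existing data, and a storage format consulting a per-block filter to skip data for an equality predicate. A minimal, self-contained sketch of that difference follows. It is a toy illustration only, not Spark's or ORC's implementation; the `BloomFilter` class, the `build_filters` and `scan_equals` helpers, and the sizing parameters are all hypothetical names chosen for this example.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def build_filters(blocks):
    """Step 1: build one filter per block of existing data."""
    filters = []
    for block in blocks:
        bf = BloomFilter()
        for value in block:
            bf.add(value)
        filters.append(bf)
    return filters


def scan_equals(blocks, filters, wanted):
    """Step 2: pushdown-style scan that skips blocks the filter rules out."""
    hits, scanned = [], 0
    for block, bf in zip(blocks, filters):
        if not bf.might_contain(wanted):
            continue  # the block cannot contain the value, so it is never read
        scanned += 1
        hits.extend(v for v in block if v == wanted)
    return hits, scanned


blocks = [["alice", "bob"], ["carol", "dave"], ["erin", "frank"]]
filters = build_filters(blocks)
hits, scanned = scan_equals(blocks, filters, "carol")
print(hits, "found after scanning", scanned, "of", len(blocks), "blocks")
```

Building the filters (step 1) gives no speed-up by itself; the benefit only appears when the reader checks the filter before touching a block (step 2). False positives can cause an occasional unnecessary block read, but a bloom filter never produces a false negative, so no matching row is skipped.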