You should look at it at both levels: there is one bloom filter for the ORC data on disk and one for the data in memory.
It is already a good step towards integrating the on-disk format with the in-memory representation of columnar data.

> On 22 Jun 2016, at 14:01, BaiRan <liz...@icloud.com> wrote:
>
> After building a bloom filter on existing data, does the Spark engine
> utilise the bloom filter during query processing?
> Is there any plan for predicate pushdown using bloom filters in ORC /
> Parquet?
>
> Thanks
> Ran
>
>> On 22 Jun 2016, at 10:48 am, Reynold Xin <r...@databricks.com> wrote:
>>
>> SPARK-12818 is about building a bloom filter on existing data. It has
>> nothing to do with the ORC bloom filter, which can be used to do
>> predicate pushdown.
>>
>>> On Tue, Jun 21, 2016 at 7:45 PM, BaiRan <liz...@icloud.com> wrote:
>>> Hi all,
>>>
>>> I have a question about the bloom filter implementation in the
>>> SPARK-12818 issue. If I have an ORC file with bloom filter metadata,
>>> how can I utilise it from Spark SQL?
>>> Thanks.
>>>
>>> Best,
>>> Ran
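For concreteness, here is a minimal Scala sketch of the two levels, assuming Spark 2.x. The in-memory side uses the DataFrameStatFunctions.bloomFilter API that SPARK-12818 added; the ORC side assumes your Spark version's ORC data source passes the orc.bloom.filter.* options through to the writer and honours spark.sql.orc.filterPushdown on read. The path, column name, and sizes are illustrative, not from the thread.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("bloom-demo").getOrCreate()
    val df = spark.range(0L, 1000000L).toDF("id")

    // Level 1: in-memory bloom filter (SPARK-12818, Spark 2.0+).
    // df.stat.bloomFilter scans the column and returns an
    // org.apache.spark.util.sketch.BloomFilter to the driver; it is a
    // client-side sketch, not a storage-level index.
    val bf = df.stat.bloomFilter("id", expectedNumItems = 1000000L, fpp = 0.03)
    println(bf.mightContain(42L))  // true: 42 is in the data
    println(bf.mightContain(-1L))  // usually false; ~3% false-positive rate

    // Level 2: ORC-level bloom filter, written into the file metadata so
    // the reader can skip stripes/row groups during predicate pushdown.
    // Assumption: these options reach the underlying ORC writer.
    df.write
      .option("orc.bloom.filter.columns", "id")
      .option("orc.bloom.filter.fpp", "0.05")
      .orc("/tmp/bloom_demo_orc")

    // Assumption: the reader consults the file-level bloom filter when
    // ORC filter pushdown is enabled.
    spark.conf.set("spark.sql.orc.filterPushdown", "true")
    spark.read.orc("/tmp/bloom_demo_orc").where("id = 42").show()

Note the difference between the two: the first filter lives in driver memory and is queried explicitly by your code, while the second lives in the ORC file's metadata and is consulted by the reader itself, which is what makes predicate pushdown possible.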