Hi Nishadi,
I have not seen bloom filters in Spark. They are mentioned as part of the ORC
file format, but I don't know whether Spark uses them:
https://orc.apache.org/docs/spec-index.html. Parquet stores block-level min/max
values, null counts, etc. for leaf columns in its metadata. I don't believe
Sp
Thank you for the response.
May I ask why bitmap indexes are not appropriate for
big data?
Rather than using traditional bitmap indexing techniques, we are
planning to implement a combination of novel bitmap indexing techniques
such as bit-sliced indexes and projection indexes.
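
For readers unfamiliar with the term, here is a minimal sketch of a bit-sliced
index in Python. It is only an illustration, not the project's design: the
function names (build_bit_sliced_index, sum_rows) are made up for this example,
Python integers stand in for bitmaps, and values are assumed to be non-negative
and smaller than 2**num_bits.

```python
from typing import List


def build_bit_sliced_index(values: List[int], num_bits: int) -> List[int]:
    """Build one bitmap per bit position of the column.

    Each bitmap is a Python int; bit `row` of slices[b] is set
    when bit `b` of values[row] is set.
    Assumes 0 <= value < 2**num_bits (an assumption of this sketch).
    """
    slices = [0] * num_bits
    for row, value in enumerate(values):
        for b in range(num_bits):
            if (value >> b) & 1:
                slices[b] |= 1 << row
    return slices


def sum_rows(slices: List[int], selection: int) -> int:
    """Sum the column over the rows whose bits are set in `selection`.

    For each slice, count the selected rows that have that bit set
    and weight the count by 2**b. No row value is read individually.
    """
    total = 0
    for b, bitmap in enumerate(slices):
        total += bin(bitmap & selection).count("1") << b
    return total


values = [5, 3, 12, 7]
slices = build_bit_sliced_index(values, num_bits=4)

# Select rows 0, 1 and 3 (values 5, 3 and 7).
selection = 0b1011
print(sum_rows(slices, selection))  # 15
```

The aggregate is computed from num_bits bitmap intersections and bit counts,
which is why the approach is attractive for OLAP-style sums over a filtered
set of rows.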
Is this traditional bitmap indexing? I would not recommend it for big data.
You could use in-memory bloom filters and min/max indexes, which look more
appropriate. If you still want to use bitmap indexes, then you would have to
do it as you say. However, bitmap indexes may consume a lo
Hi All,
I am a CSE undergraduate, and for our final-year project we plan to
build a cluster-based, bit-oriented analytic platform (storage engine)
that provides fast query performance for OLAP workloads by using
novel bitmap indexing techniques where appropriate.
For