Re: Bitmap Indexing to increase OLAP query performance

2016-06-30 Thread Michael Allman
Hi Nishadi, I have not seen Bloom filters in Spark. They are mentioned as part of the ORC file format, but I don't know if Spark uses them: https://orc.apache.org/docs/spec-index.html. Parquet has block-level min/max values, null counts, etc., for leaf columns in its metadata. I don't believe Sp

Re: Bitmap Indexing to increase OLAP query performance

2016-06-29 Thread Nishadi Kirielle
Thank you for the response. May I ask why bitmap indexes are not appropriate for big data? Rather than using traditional bitmap indexing techniques, we are planning to implement a combination of novel bitmap indexing techniques such as bit-sliced indexes and projection indexes.
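
For readers unfamiliar with the term, here is a minimal sketch of a bit-sliced index, assuming a small in-memory column of non-negative integers. The function names (`build_bit_slices`, `sliced_sum`, `equality_bitmap`) and the sample data are hypothetical and are not taken from the project discussed in this thread. Each bitmap is held in a plain Python integer, where bit i stands for row i.

```python
def build_bit_slices(values, width):
    """Build one bitmap per bit position of the values.

    Bit `row` of slices[j] is set when bit j of values[row] is 1.
    Assumes every value is a non-negative integer below 2**width.
    """
    slices = [0] * width
    for row, value in enumerate(values):
        for j in range(width):
            if (value >> j) & 1:
                slices[j] |= 1 << row
    return slices


def equality_bitmap(column):
    """Build a traditional equality-encoded bitmap index: one bitmap per distinct value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def sliced_sum(slices, selection):
    """Sum the indexed column over the rows set in `selection`.

    Slice j contributes 2**j for every selected row whose bit j is set,
    so the sum needs only bitwise ANDs and population counts.
    """
    return sum(
        (1 << j) * bin(bitmap & selection).count("1")
        for j, bitmap in enumerate(slices)
    )


# Hypothetical sample data: six rows of (region, sales).
region = ["EU", "US", "EU", "APAC", "US", "EU"]
sales = [5, 12, 7, 3, 9, 1]

region_index = equality_bitmap(region)
sales_slices = build_bit_slices(sales, width=4)

# Equivalent of: SELECT SUM(sales) WHERE region = 'EU'
eu_total = sliced_sum(sales_slices, region_index["EU"])
us_total = sliced_sum(sales_slices, region_index["US"])
```

The aggregate is computed without reading the original `sales` values, which is the property that makes bit-sliced indexes interesting for OLAP-style sums. A real engine would use compressed bitmaps rather than Python integers.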

Re: Bitmap Indexing to increase OLAP query performance

2016-06-29 Thread Jörn Franke
Is it the traditional bitmap indexing? I would not recommend it for big data. You could use in-memory Bloom filters and min/max indexes, which look more appropriate. If you do want to use bitmap indexes, then you would have to do it as you say. However, bitmap indexes may consume a lo
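
As a rough illustration of the alternative suggested here, the following sketch combines a per-block min/max summary with a Bloom filter to decide which blocks a point lookup has to scan. It assumes small in-memory blocks of integers; the `BloomFilter` class, `summarize`, `blocks_to_scan`, and the sizing parameters are hypothetical and do not reflect how Spark, ORC, or Parquet implement these structures.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter backed by a Python integer used as a bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def summarize(block):
    """Build the per-block summary: min, max, and a Bloom filter of its values."""
    bloom = BloomFilter()
    for value in block:
        bloom.add(value)
    return {"min": min(block), "max": max(block), "bloom": bloom}


def blocks_to_scan(summaries, value):
    """Return indexes of blocks that may contain `value`; the rest can be skipped."""
    return [
        i
        for i, s in enumerate(summaries)
        if s["min"] <= value <= s["max"] and s["bloom"].might_contain(value)
    ]


# Hypothetical data: three blocks of a sorted integer column.
blocks = [[1, 4, 9], [10, 15, 22], [23, 40, 41]]
summaries = [summarize(block) for block in blocks]

candidates = blocks_to_scan(summaries, 15)  # only the middle block qualifies
```

The min/max check rules out blocks whose range cannot contain the value, and the Bloom filter can additionally rule out blocks whose range matches but which never stored the value, at the cost of occasional false positives.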

Bitmap Indexing to increase OLAP query performance

2016-06-29 Thread Nishadi Kirielle
Hi All, I am a CSE undergraduate, and for our final-year project we plan to build a cluster-based, bit-oriented analytic platform (storage engine) that provides fast query performance for OLAP by using novel bitmap indexing techniques when and where appropriate. For