On Mon, Apr 25, 2016 at 11:20 AM Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Bráulio Bhavamitra wrote:
> > Hi all,
> >
> > I'm finally having performance issues with PostgreSQL when doing big
> > analytics queries over almost the entire database of more than 100 GB of
> > data.
> >
> > And what I keep reading all over the web is many databases switching to
> > columnar store (RedShift, Cassandra, cstore_fdw, etc) and having great
> > performance on queries in general and giant boosts with big analytics
> > queries.
> >
> > I wonder if there are any plans to move PostgreSQL entirely to a columnar
> > store (or at least make it an option), maybe for version 10?
>
> This is a pretty interesting question. I wrote an answer, then thought
> it would make a good blog post, so it's at
> http://blog.2ndquadrant.com/column-store-plans/
> I reproduce it below.
>
> Completely replacing the current row-based store wouldn't be a good
> idea: it has served us extremely well and I’m pretty sure that replacing
> it entirely with a columnar store would be disastrous performance-wise
> for OLTP use cases.
>
> That doesn't mean columnar stores are a bad idea in general -- because
> they aren't. They just have a more limited use case than "the whole
> database". For analytical queries on append-mostly data, a columnar
> store is a much more appropriate representation than the regular
> row-based store, but not all databases are analytical.
>
> However, in order to attain interesting performance gains you need to do
> a lot more than just change the underlying storage: you need to ensure
> that the rest of the system can take advantage of the changed
> representation, so that it can execute queries optimally; for instance,
> you may want aggregates that operate in a SIMD mode rather than
> one-value-at-a-time as it is today. This, in itself, is a large
> undertaking, and there are other challenges too.
>
> As it turns out, there's a team at 2ndQuadrant working precisely on
> these matters. We posted a patch last year, but it wasn’t terribly
> interesting -- it only made a single-digit percentage improvement in
> TPC-H scores; not enough to bother the development community with (it
> was a fairly invasive patch). We want more than that.
>
> In our design, columnar or not is going to be an option: you're going to
> be able to say "Dear server, for this table kindly set up columnar
> storage for me, would you? Thank you very much." And then you’re going
> to get a table which may be slower for regular usage but which will rock
> for analytics. For most of your tables the current row-based store will
> still likely be the best option, because row-based storage is much
> better suited to the more general cases.
>

Nice, Alvaro, I think that's the right approach. Wishing you good work on that :)

cheers,
bráulio

>
> We don’t have a timescale yet. Stay tuned.
>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
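
For anyone following the thread, here is a rough sketch of the row-versus-column trade-off discussed above. It is a toy illustration in plain Python, not how PostgreSQL or the 2ndQuadrant patch stores data. The table, its column names, and the helper functions are all hypothetical. The point is only that an aggregate over one column touches every field of every row in a row layout, but just one contiguous array in a column layout.

```python
from array import array

# Hypothetical toy table: (order_id, customer_id, amount).
rows = [
    (1, 10, 25.0),
    (2, 11, 40.0),
    (3, 10, 15.5),
]


def to_columns(rows):
    """Pivot row tuples into one contiguous typed array per column."""
    order_ids, customer_ids, amounts = zip(*rows)
    return {
        "order_id": array("q", order_ids),        # 64-bit signed integers
        "customer_id": array("q", customer_ids),  # 64-bit signed integers
        "amount": array("d", amounts),            # double-precision floats
    }


def sum_amount_row_store(rows):
    """Row layout: each whole row is unpacked just to read one field."""
    total = 0.0
    for _order_id, _customer_id, amount in rows:
        total += amount
    return total


def sum_amount_column_store(columns):
    """Column layout: only the 'amount' array is scanned."""
    return sum(columns["amount"])


columns = to_columns(rows)
row_total = sum_amount_row_store(rows)
column_total = sum_amount_column_store(columns)
```

Both functions return the same total. The difference is in how much data each one has to walk through, which is also why the column layout is awkward for OLTP-style work: inserting or updating one row means touching every column array.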