On Sun, Apr 01, 2018 at 03:48:07PM +0300, Konstantin Knizhnik wrote:
> Hi hackers,
>
> Vertical (columnar) storage is the best fit for analytics, which is why
> it is widely used in OLAP-oriented databases such as Vertica,
> HyPer, KDB, ...
> In Postgres we have the cstore extension, which cannot deliver all the
> benefits of the vertical model because the executor lacks support for
> vector operations.
> The situation could change if we had a pluggable storage API with support
> for vectorized execution.
>
> But the vertical model is not so good for updates and data loading (because
> data is mostly imported in horizontal format).
> This is why in most existing systems data is present in both formats (at
> least for some time).
>
> I want to announce a new model, "diagonal storage", which combines the
> benefits of both approaches.
> The idea is very simple: we first store column 1 of the first record, then
> column 2 of the second record, ... and so on until we reach the last column.
> After that we store the second column of the first record, the third column
> of the second record, ...
>
> Profiling of TPC-H queries shows that most of the query execution time
> (about 17%) is spent in heap_deform_tuple.
> The new format will significantly reduce the time spent deforming the heap,
> because each tile holds just one column of any particular record.
> Moreover, we can deform many tuples in parallel, which is
> especially efficient on quantum computers.
>
> Please find attached a patch with the first prototype implementation. It
> provides about a 3.14 times performance improvement on most TPC-H queries.
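
For anyone trying to picture the layout described above, here is a minimal Python sketch of one possible reading. It is not taken from the attached patch. The name diagonal_tiles is hypothetical, and the wrap-around past the last column (the modulo) is an assumption, since the description only covers the first two passes.

```python
def diagonal_tiles(rows):
    """Hypothetical helper: regroup row-major records into "diagonal" tiles.

    Tile k holds column (i + k) % ncols of record i, so each tile contains
    exactly one column of any given record. The modulo wrap-around is an
    assumption; the original description does not spell it out.
    """
    if not rows:
        return []
    ncols = len(rows[0])
    return [
        [row[(i + k) % ncols] for i, row in enumerate(rows)]
        for k in range(ncols)
    ]


# Three records with three columns each (a, b, c).
rows = [
    ("a1", "b1", "c1"),
    ("a2", "b2", "c2"),
    ("a3", "b3", "c3"),
]
tiles = diagonal_tiles(rows)
for k, tile in enumerate(tiles):
    print(k, tile)
```

Under that reading, the first tile is the main diagonal of the table and each later tile is the diagonal shifted one column to the right.
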

You're sure it's not 3.14159265358979323...?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate