Hi Alvaro,

As you may know, I have implemented IMCS (in-memory columnar store) as a PostgreSQL extension. It was not very successful, mostly because people prefer to use standard SQL rather than special functions for accessing columnar storage (CS). Now I am thinking about a second incarnation of IMCS, based on FDW and CSP (custom nodes). This is why I am very interested in your patch. I have investigated the previous version of the patch and have some questions.
I would be glad if you could clarify them for me:

1. CS API.
I agree with you that the FDW API does not seem sufficient to support efficient work with CS.
At the very least we need batch insert.
But maybe it is better to extend the FDW API rather than create a special API for CS?
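For example, a bulk load like the one below is currently pushed through the FDW one row at a time; with a batch insert callback the FDW could receive whole groups of rows at once (table names here are invented, just for illustration):

-- quotes_cs: a foreign table backed by CS; quotes_staging: a plain heap table.
-- Today every row produced by the SELECT goes through the FDW's per-row
-- insert callback; a batch API could amortize this overhead.
insert into quotes_cs select * from quotes_staging;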

2. Horizontal<->Vertical data mapping. As far as I understand this patch, the CS model assumes that some table columns are stored in horizontal format (in the heap) and some in vertical format (in CS), with a one-to-one mapping between the horizontal and vertical parts of a row via CTID. But having been involved in several projects requiring OLAP, I found that in most cases a one-to-many mapping is more convenient. Assume some trading system dealing with stock quotes.
Data looks something like this:

Symbol  Day         Open  Close  High  Low   Volume
AAA     12/22/2015  10.0  12.0   13.0   8.0     100
AAB     12/22/2015   9.0   8.0   10.0   9.0     200
...
AAA     12/23/2015  12.0  11.5   12.5  11.0      50
AAB     12/23/2015   8.0   8.8    8.5   8.0     300

So it can be represented using the following table:

create table Quote (Symbol char(10), Day date, Open real, High real, Low real, Close real, Volume integer);

Most likely we need to calculate some statistics for a particular symbol or set of symbols. For example, a portfolio is a set of symbols, and we need to somehow analyze the instruments in this portfolio.

There are several thousand symbols, tens of instruments per portfolio, and tens of thousands of quotes per symbol (in other cases the time series are much larger - millions of elements).
How can we efficiently execute a query like this one:

select Symbol, sum(Close*Volume)/sum(Volume) as VWAP
from Quote
where Day between '01/01/2001' and '01/01/2010'
  and Symbol in ('AAA', 'AAB', 'ABB', ...)
group by Symbol;

If we have an index on Symbol, it will contain a lot of duplicates. And it is not clear how to efficiently combine an index scan by symbol name with a time slice.
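For concreteness, the straightforward indexes would be something like this (index names are just for illustration):

create index quote_symbol_idx on Quote(Symbol); -- tens of thousands of duplicates per key
create index quote_day_idx on Quote(Day);       -- covers only the time slice

Combining them means something like a BitmapAnd of two bitmap scans, which still has to visit every matching heap tuple.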

One possible solution is to embed the time series into tuples.
In this case we will have something like this:

create table Quote (Symbol char(10), Day timeseries(date), Open timeseries(real), High timeseries(real), Low timeseries(real), Close timeseries(real), Volume timeseries(integer));

We are using here a non-existent type, timeseries. It is similar to an array, but its content is stored in columnar storage rather than in the record's TOAST data. In this case we can efficiently locate records by symbol (there are only a few thousand entries in the table) and then perform CS operations on the located time series.
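A query against this representation might then look as follows (cs_mul and cs_sum are hypothetical vector functions operating on whole time series, in the spirit of the IMCS API):

-- cs_mul: element-wise product of two time series;
-- cs_sum: sum of all elements of a time series.
select Symbol,
       cs_sum(cs_mul(Close, Volume)) / cs_sum(Volume) as VWAP
from Quote
where Symbol in ('AAA', 'AAB', 'ABB');

The heap scan touches only one row per symbol; all per-quote work happens inside the CS operations (the time slice would be one more such operation on the located series).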

So here we also split the tuple into a horizontal and a vertical part; in the horizontal part we store just the identifier of the time series. The query plan should combine standard nodes with custom CS nodes. Mixing horizontal and vertical operations significantly complicates the optimizer and restricts flexibility: with the proposed representation it is difficult to efficiently calculate some characteristic for all symbols in a specified time range. This is why I am not sure that it is the only possible or most efficient approach. But in any case there should be some efficient plan for queries like the one above.
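To make the hard case concrete: a characteristic over all symbols in a time range would have to slice every stored series individually, along the lines of the following (cs_slice and cs_max are again hypothetical):

-- cs_slice: restrict a time series to the elements whose Day falls in the range;
-- cs_max: maximum element of a time series.
select Symbol, cs_max(cs_slice(High, Day, '01/01/2001', '01/01/2010'))
from Quote;

Nothing narrows this scan: every row must be visited and every series sliced, whereas a plain horizontal table could at least use an index on Day.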

3. Transposing data and the role of CS.
Let's look once again at the Quote example above. Data is received in time-ascending order, but most queries require grouping it by symbol, so at some stage we have to "transpose" the data. To efficiently append data to a time series we need to buffer it somewhere and then append a whole range of values at once. In the Fujitsu approach two different representations of the data are used: reader-optimized and writer-optimized. In the IMCS approach, CS is just a temporary projection of normal PostgreSQL tables, so we do not need to worry about durability - it is enforced by PostgreSQL.
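One possible buffering scheme, as a sketch (QuoteBuffer and quote_cs are invented names; quote_cs stands for whatever CS-backed table or projection ends up holding the data):

-- Writer-optimized buffer: rows are appended here in arrival (time) order.
create table QuoteBuffer (like Quote including all);

-- Periodic flush: re-sort by symbol so that CS can append a whole range
-- of values to each per-symbol time series in one operation.
insert into quote_cs select * from QuoteBuffer order by Symbol, Day;
truncate QuoteBuffer;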

So the question is: should CS be the only storage for the data, or just a (possibly transient) copy of a normal table?

Best regards,
Konstantin

On 22.12.2015 17:43, Alvaro Herrera wrote:
Michael Paquier wrote:
On Wed, Dec 9, 2015 at 3:10 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
Could we get this rebased past the merge of the parallel execution commits?
+1. Alvaro, Tomas, Simon, what are the next plans with those patches?
Yeah, I've been working intermittently on getting the whole tree rebased
and squashed, because after the last submission we made a lot of
progress.  I'll repost later.  I think it should be marked "returned
with feedback" for now.



