On Thu, Feb 15, 2018 at 8:00 AM, Shay Rojansky <r...@roji.org> wrote:
>> I am an author of one of the proposals (autoprepare, which is in the
>> commit fest now), but I think that sooner or later Postgres has to come
>> to a solution with shared DB caches/prepared plans.
>> Please correct me if I am wrong, but it seems to me that most other top
>> DBMSes have something like this.
>> Such a solution can provide a lot of different advantages:
>> 1. Better memory utilization: no need to store the same data N times,
>> where N is the number of backends, and no time spent warming the cache.
>> 2. The optimizer can spend more time choosing a better plan, which can
>> then be used by all clients. Even now, the compilation time of some
>> queries exceeds their execution time several times over.
>> 3. It is simpler to add facilities for query plan tuning and maintenance
>> (storing, comparing, ...).
>> 4. It makes it possible to control the amount of memory used by caches.
>> Right now, the catalog cache for a DB with hundreds of thousands of
>> tables and indexes, multiplied by hundreds of backends, can consume
>> terabytes of memory.
>> 5. Shared caches can simplify the invalidation mechanism.
>> 6. Almost all enterprise systems working with Postgres have to use some
>> kind of connection pooling (pgbouncer, pgpool, ...), which almost
>> excludes the possibility of using prepared statements. This can slow
>> down performance by up to two times.
>
> Just wanted to say I didn't see this email before my previous response,
> but I agree with all of the above. The last point is particularly
> important, especially for short-lived connection scenarios, the most
> typical of which is web.
>
>> There is just one (but very important) problem which needs to be solved:
>> access to the shared cache should be synchronized.
>> But there are a lot of other shared resources in Postgres (procarray,
>> shared buffers, ...), so I do not think it is an unsolvable problem or
>> that it would cause performance degradation.
>>
>> So it seems obvious that shared caches/plans can provide a lot of
>> advantages. But the value of these advantages for real customers is
>> still not clear to me.
>> Using the -M prepared protocol in a pgbench workload can improve speed
>> by up to two times. But I have asked real Postgres users at Avito,
>> Yandex and MyOffice, and they told me that on their workloads the
>> advantage of prepared statements is about 10%.
>> A 10% performance improvement is definitely not good compensation for
>> rewriting a substantial part of the Postgres core...
>
> Just wanted to say that I've seen more than a 10% improvement in some
> real-world applications when preparation was done properly. Also, I'm
> assuming that implementing this wouldn't involve "rewriting a substantial
> part of Postgres core", and that even 10% is quite a big gain, especially
> if it's a transparent/free one as far as the user is concerned (no
> application changes).

A 10% improvement in the real world can be pretty significant. I don't
know how complicated it would be to implement this in the Postgres core,
but how about adding it to the GSoC 2018 ideas[1]?

[1] https://wiki.postgresql.org/wiki/GSoC_2018
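
For anyone who wants to reproduce the kind of comparison mentioned above,
a minimal sketch looks something like this (the database name "bench", the
scale factor and the run length are illustrative, not taken from the tests
cited in the thread):

    pgbench -i -s 100 bench                          # create and populate the test tables
    pgbench -S -M simple   -c 16 -j 4 -T 60 bench    # select-only, plain query protocol
    pgbench -S -M prepared -c 16 -j 4 -T 60 bench    # select-only, named prepared statements

The difference between the two tps figures is what the "up to two times"
number refers to; as noted above, real-world workloads with heavier
queries tend to see a much smaller gap.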