2017-12-21 14:25 GMT+01:00 Konstantin Knizhnik <k.knizh...@postgrespro.ru>:
> I continue experiments with my pthread prototype.
> Latest results are the following:
>
> 1. I have eliminated all (I hope) calls of non-reentrant functions
> (getopt, setlocale, setitimer, localtime, ...). So now the parallel tests
> pass.
>
> 2. I have implemented deallocation of the top memory context (at thread
> exit) and cleanup of all opened file descriptors.
> I had to replace several places where malloc is used with top_malloc:
> allocation in the top context.
>
> 3. My prototype now passes all regression tests, but handling of errors
> is still far from complete.
>
> 4. I have performed experiments with replacing the synchronization
> primitives used in Postgres with pthread analogues.
> Unfortunately this had almost no influence on performance.
>
> 5. Handling a large number of connections.
> The maximal number of Postgres connections is almost the same: 100k.
> But the memory footprint in the pthread case was significantly smaller:
> 18Gb vs 38Gb.
> And the difference in performance was much higher: 60k TPS vs. 600k TPS.
> Compare that with the performance for 10k clients: 1300k TPS.
> This is a read-only pgbench -S test with 1000 connections.
> Since pgbench doesn't allow specifying more than 1000 clients, I spawned
> several instances of pgbench.
>
> Why is handling a large number of connections important?
> It allows applications to access Postgres directly, without pgbouncer or
> any other external connection pooling tool.
> In this case an application can use prepared statements, which can make
> simple queries almost twice as fast.
>

From what I know, MySQL does not have good experience with a high number of
threads - that is why there is a thread pool in the enterprise (and now in
the MariaDB) versions.

Regards

Pavel


> Unfortunately Postgres sessions are not lightweight. Each backend
> maintains its private catalog and relation caches, prepared statement
> cache, ...
> For a real database the size of these in-memory caches will be several
> megabytes, and warming them up can take a significant amount of time.
> So if we really want to support a large number of connections, we should
> rewrite the caches to be global (shared).
> That would save a lot of memory but add synchronization overhead.
> Also, on NUMA machines private caches may be more efficient than one
> global cache.
>
> My prototype can be found at:
> git://github.com/postgrespro/postgresql.pthreads.git
>
>
> --
>
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
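
As an illustration of point 1 above, here is a minimal sketch of replacing a
non-reentrant libc call (localtime) with its reentrant counterpart. The
helper format_log_time() is made up for this example and is not taken from
the prototype:

#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: format the current time for a log line.
 * localtime() returns a pointer to static storage and is not thread-safe;
 * localtime_r() writes into a caller-supplied struct tm, so each thread
 * can use its own stack-allocated buffer.
 */
static void
format_log_time(char *buf, size_t buflen)
{
    time_t      now = time(NULL);
    struct tm   tm;                 /* per-thread, on the stack */

    localtime_r(&now, &tm);         /* instead of localtime(&now) */
    strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", &tm);
}

The same pattern applies to strtok vs. strtok_r, getpwuid vs. getpwuid_r,
and so on.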
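
Regarding point 2, a minimal sketch of one way to run per-session cleanup
(deleting a top memory context, closing file descriptors) when a backend
thread exits, using a pthread key destructor. SessionState, session_cleanup()
and the other names are hypothetical and only illustrate the mechanism, not
what the prototype actually does:

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-session state owned by each backend thread. */
typedef struct SessionState
{
    void   *top_memory_context;   /* would be a MemoryContext in Postgres */
    int    *open_fds;             /* file descriptors opened by the session */
    int     n_open_fds;
} SessionState;

static pthread_key_t session_key;

/* Runs automatically when a thread exits with a non-NULL key value. */
static void
session_cleanup(void *arg)
{
    SessionState *s = (SessionState *) arg;

    for (int i = 0; i < s->n_open_fds; i++)
        close(s->open_fds[i]);
    free(s->open_fds);
    free(s->top_memory_context);  /* stands in for MemoryContextDelete() */
    free(s);
}

/* Called once at process start. */
void
session_key_init(void)
{
    pthread_key_create(&session_key, session_cleanup);
}

/* Called by each backend thread after it has set up its session state. */
void
session_register(SessionState *s)
{
    pthread_setspecific(session_key, s);
}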
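
On the shared (global) catalog cache idea and its synchronization overhead,
here is a minimal sketch of protecting a shared lookup table with a pthread
read-write lock, so that concurrent readers do not block each other. The
CacheEntry layout and the cache_lookup()/cache_store() functions are
hypothetical placeholders, not the design the prototype uses:

#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SIZE 64

/* Hypothetical shared relation-name cache, global to all backend threads. */
typedef struct CacheEntry
{
    int             valid;
    unsigned int    key;
    char            name[64];
} CacheEntry;

static CacheEntry cache[CACHE_SIZE];
static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Look up a key; returns 1 and copies the name on a hit, 0 on a miss. */
int
cache_lookup(unsigned int key, char *out, size_t outlen)
{
    int hit = 0;

    pthread_rwlock_rdlock(&cache_lock);   /* readers run concurrently */
    CacheEntry *e = &cache[key % CACHE_SIZE];
    if (e->valid && e->key == key)
    {
        strncpy(out, e->name, outlen - 1);
        out[outlen - 1] = '\0';
        hit = 1;
    }
    pthread_rwlock_unlock(&cache_lock);
    return hit;
}

/* Insert or replace an entry; writers take the lock exclusively. */
void
cache_store(unsigned int key, const char *name)
{
    pthread_rwlock_wrlock(&cache_lock);
    CacheEntry *e = &cache[key % CACHE_SIZE];
    e->key = key;
    strncpy(e->name, name, sizeof(e->name) - 1);
    e->name[sizeof(e->name) - 1] = '\0';
    e->valid = 1;
    pthread_rwlock_unlock(&cache_lock);
}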