On Fri, Dec 22, 2017 at 10:07 AM, Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:
> While experimenting with the pthreads version of Postgres, I found that I
> cannot create more than 100k backends, even on a system with 4Tb of RAM.
> I do not want to discuss here the idea of creating so large a number of
> backends - yes, most real production systems use pgbouncer or a similar
> connection pooling tool to restrict the number of connections to the
> database. But there are 144 cores on this system, and if we want to
> utilize all system resources, the optimal number of backends will be
> several hundred (especially taking into account that Postgres backends
> are usually not CPU bound and have to read data from disk, so the number
> of backends should be much larger than the number of cores).
>
> There are several per-backend arrays in Postgres whose size depends on
> the maximal number of backends.
> For max_connections=100000, Postgres allocates 26Mb for each snapshot:
>
>     CurrentRunningXacts->xids = (TransactionId *)
>         malloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId));
>
> This seems to be a greatly overestimated value, because
> TOTAL_MAX_CACHED_SUBXIDS is defined as:
>
>     /*
>      * During Hot Standby processing we have a data structure called
>      * KnownAssignedXids, created in shared memory. Local data structures are
>      * also created in various backends during GetSnapshotData(),
>      * TransactionIdIsInProgress() and GetRunningTransactionData(). All of the
>      * main structures created in those functions must be identically sized,
>      * since we may at times copy the whole of the data structures around. We
>      * refer to this size as TOTAL_MAX_CACHED_SUBXIDS.
>      *
>      * Ideally we'd only create this structure if we were actually doing hot
>      * standby in the current run, but we don't know that yet at the time
>      * shared memory is being set up.
>      */
>     #define TOTAL_MAX_CACHED_SUBXIDS \
>         ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
>
> Another 12Mb array is used for deadlock detection:
>
>     #2  0x00000000008ac397 in InitDeadLockChecking () at deadlock.c:196
>     196         (EDGE *) palloc(maxPossibleConstraints * sizeof(EDGE));
>     (gdb) list
>     191      * last MaxBackends entries in possibleConstraints[] are reserved as
>     192      * output workspace for FindLockCycle.
>     193      */
>     194     maxPossibleConstraints = MaxBackends * 4;
>     195     possibleConstraints =
>     196         (EDGE *) palloc(maxPossibleConstraints * sizeof(EDGE));
>     197
>
> As a result, the amount of dynamic memory allocated for each backend
> exceeds 50Mb, so 100k backends cannot be launched even on a system with
> 4Tb! I think we should use a more accurate allocation policy in these
> places and not waste memory in such a manner (even if it is virtual).

Don't forget that each thread also has its own stack. I don't think you
can expect 100k threads to ever work. If you get to that point, you really
need to consider async query execution. There has been a lot of work
related to that in other threads; you may want to take a look.