On Thu, Oct 24, 2019 at 03:48:57PM -0300, Alvaro Herrera wrote:
On 2019-Oct-23, kato-...@fujitsu.com wrote:
Hello
To benchmark with tpcb model, I tried to create a foreign key in the
partitioned history table, but backend process killed by OOM.
the number of partitions is 8192. I tried in master(commit: ad4b7aeb84).
I did the same thing in another server which has 200GB memory, but creating
foreign key did not end in 24 hours.
Thanks for reporting. It sounds like there must be a memory leak here.
I am fairly pressed for time at present so I won't be able to
investigate this until, at least, mid November.
I've briefly looked into this, and I think the main memory leak is in
RelationBuildPartitionDesc. It gets called with PortalContext, it
allocates a lot of memory building the descriptor, copies it into
CacheContext but does not even try to free anything. So we end up with
something like this:
TopMemoryContext: 215344 total in 11 blocks; 47720 free (12 chunks); 167624 used
pgstat TabStatusArray lookup hash table: 32768 total in 3 blocks; 9160 free (4
chunks); 23608 used
TopTransactionContext: 4194304 total in 10 blocks; 1992968 free (18 chunks);
2201336 used
RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used
MessageContext: 8192 total in 1 blocks; 3256 free (1 chunks); 4936 used
Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
smgr relation table: 32768 total in 3 blocks; 16768 free (8 chunks); 16000 used
TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264
used
Portal hash: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
TopPortalContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 used
PortalContext: 1557985728 total in 177490 blocks; 9038656 free (167645 chunks); 1548947072 used:
Relcache by OID: 16384 total in 2 blocks; 3424 free (3 chunks); 12960 used
CacheMemoryContext: 17039424 total in 13 blocks; 7181480 free (9 chunks);
9857944 used
partition key: 1024 total in 1 blocks; 168 free (0 chunks); 856 used: history
index info: 2048 total in 2 blocks; 568 free (1 chunks); 1480 used:
pg_class_tblspc_relfilenode_index
...
index info: 2048 total in 2 blocks; 872 free (0 chunks); 1176 used:
pg_class_oid_index
WAL record construction: 49776 total in 2 blocks; 6344 free (0 chunks); 43432
used
PrivateRefCount: 8192 total in 1 blocks; 2584 free (0 chunks); 5608 used
MdSmgr: 8192 total in 1 blocks; 5976 free (0 chunks); 2216 used
LOCALLOCK hash: 65536 total in 4 blocks; 18584 free (12 chunks); 46952 used
Timezones: 104128 total in 2 blocks; 2584 free (0 chunks); 101544 used
ErrorContext: 8192 total in 1 blocks; 6840 free (4 chunks); 1352 used
Grand total: 1580997216 bytes in 177834 blocks; 18482808 free (167857 chunks);
1562514408 used
(At which point I simply interrupted the query, it'd allocate more and
more memory until an OOM).
The attached patch trivially fixes that by adding a memory context
tracking all the temporary data, and then just deletes it as a whole at
the end of the function. This significantly reduces the memory usage for
me, not sure it's 100% correct.
FWIW, even with this fix it still takes an awful lot to create the
foreign key, because the CPU is stuck doing this
60.78% 60.78% postgres postgres [.] bms_equal
32.58% 32.58% postgres postgres [.] get_eclass_for_sort_expr
3.83% 3.83% postgres postgres [.]
add_child_rel_equivalences
0.23% 0.00% postgres [unknown] [.] 0x0000000000000005
0.22% 0.00% postgres [unknown] [.] 0000000000000000
0.18% 0.18% postgres postgres [.] AllocSetCheck
...
Haven't looked into the details yet.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/partitioning/partdesc.c
b/src/backend/partitioning/partdesc.c
index b207b765f2..7f15774266 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -72,6 +72,15 @@ RelationBuildPartitionDesc(Relation rel)
MemoryContext oldcxt;
int *mapping;
+ MemoryContext tmp;
+ MemoryContext old;
+
+ tmp = AllocSetContextCreate(CurrentMemoryContext,
+ "tmp context",
+
ALLOCSET_SMALL_SIZES);
+
+ old = MemoryContextSwitchTo(tmp);
+
/*
* Get partition oids from pg_inherits. This uses a single snapshot to
* fetch the list of children, so while more children may be getting
added
@@ -231,6 +240,9 @@ RelationBuildPartitionDesc(Relation rel)
}
}
+ MemoryContextSwitchTo(old);
+ MemoryContextDelete(tmp);
+
rel->rd_partdesc = partdesc;
}