On 11/19/18 10:28 AM, Masahiko Sawada wrote:
On Mon, Nov 19, 2018 at 6:52 AM Tomas Vondra <[email protected]> wrote:Hi, It seems we have pretty annoying problem with logical decoding when performing VACUUM FULL / CLUSTER on a table with toast-ed data. The trouble is that the rewritten heap is WAL-logged using XLOG/FPI records, the TOAST data is logged as regular INSERT records. XLOG/FPI is ignored in logical decoding, and so reorderbuffer never gets those records. But we do decode the TOAST data, and reorderbuffer stashes them in toast_hash hash table. Which gets reset only when handling a row from the main heap, and that never arrives. So we end up stashing all the TOAST data in memory :-( So do VACUUM FULL (or CLUSTER) on a sufficiently large table, and you're likely to break any logical replication connection. And it does not matter if you replicate this particular table. Luckily enough, this can leverage some of the pieces introduced by e9edc1ba which was meant to deal with rewrites of system tables, and in raw_heap_insert it added this: /* * The new relfilenode's relcache entrye doesn't have the necessary * information to determine whether a relation should emit data for * logical decoding. Force it to off if necessary. */ if (!RelationIsLogicallyLogged(state->rs_old_rel)) options |= HEAP_INSERT_NO_LOGICAL; As raw_heap_insert is used only for heap rewrites, we can simply remove the if condition and use the HEAP_INSERT_NO_LOGICAL flag for all TOAST data logged from here.This fix seems fine to me.This does fix the issue, because we still decode the TOAST changes but there are no data and so if (change->data.tp.newtuple != NULL) { dlist_delete(&change->node); ReorderBufferToastAppendChunk(rb, txn, relation, change); } ends up not stashing the change in the hash table.With the below change you proposed can we remove the above condition because toast-insertion changes being processed by the reorder buffer always have a new tuple? If a toast-insertion record doesn't have a new tuple it has already ignored when decoding.
Good point. I think you're right the reorderbuffer part may be simplified as you propose.
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
