On July 23, 2021 6:16 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Fri, Jul 23, 2021 at 2:27 PM Rahila Syed <rahilasye...@gmail.com> wrote:
> >
> > The column comparison for row filtering happens before the unchanged
> > toast columns are filtered. Unchanged toast columns are filtered just
> > before writing the tuple to output stream.
> >
> 
> To perform filtering, you need to use the tuple from WAL and that tuple 
> doesn't
> seem to have unchanged toast values, so how can we do filtering? I think it 
> is a
> good idea to test this once.

I agreed.

Currently, both unchanged toasted key column and unchanged toasted non-key
column is not logged. So, we cannot get the toasted value directly for these
columns when doing row filtering.

I tested the current patch for toasted data and found a problem: In the current
patch, it will try to fetch the toast data from toast table when doing row
filtering[1]. But, it's unsafe to do that in walsender. We can see it use
HISTORIC snapshot in heap_fetch_toast_slice() and also the comments of
init_toast_snapshot() have said "Detoasting *must* happen in the same
transaction that originally fetched the toast pointer.". The toast data could
have been changed when doing row filtering. For exmaple, I tested the following
steps and get an error.

1) UPDATE a nonkey column in publisher.
2) Use debugger to block the walsender process in function
   pgoutput_row_filter_exec_expr().
3) Open another psql to connect the publisher, and drop the table which updated
   in 1).
4) Unblock the debugger in 2), and then I can see the following error:
---
ERROR:  could not read block 0 in file "base/13675/16391"
---

[1]
(1)------publisher------
CREATE TABLE toasted_key (
    id serial,
    toasted_key text PRIMARY KEY,
    toasted_col1 text,
    toasted_col2 text
);
select repeat('9999999999', 200) as tvalue \gset
CREATE PUBLICATION pub FOR TABLE toasted_key WHERE (toasted_col2 = :'tvalue');
ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey;
ALTER TABLE toasted_key ALTER COLUMN toasted_key SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col2 SET STORAGE EXTERNAL;
INSERT INTO toasted_key(toasted_key, toasted_col1, toasted_col2) 
VALUES(repeat('1234567890', 200), repeat('9876543210', 200), 
repeat('9999999999', 200));

(2)------subscriber------
CREATE TABLE toasted_key (
    id serial,
    toasted_key text PRIMARY KEY,
    toasted_col1 text,
    toasted_col2 text
);

CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=10000' PUBLICATION pub;

(3)------publisher------
UPDATE toasted_key SET toasted_col1 = repeat('1111113113', 200);

Based on the above steps, the row filter will ge through the following path
and fetch toast data in walsender.
------
pgoutput_row_filter_exec_expr
    ...
    texteq
        ...
        text *targ1 = DatumGetTextPP(arg1);
            pg_detoast_datum_packed
                detoast_attr
------

Reply via email to