On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
On 2/16/23 2:38 PM, Alvaro Herrera wrote:On 2023-Feb-16, Jonathan S. Katz wrote:On 2/16/23 12:53 PM, Alvaro Herrera wrote:I don't think this is the fault of logical replication. Consider that for the backend server, the function source code is just an opaque string that is given to the plpgsql engine to interpret. So there's no way for the logical DDL replication engine to turn this into runnable code if the table name is not qualified.Sure, that's fair. That said, the example above would fall under a "typical use case", i.e. I'm replicating functions that call tables without schemaqualification. This is pretty common, and as logical replication becomesused for more types of workloads (e.g. high availability), we'll definitelysee this.Hmm, I think you're saying that replay should turn check_function_bodies off, and I think I agree with that.Yes, exactly. +1
I drilled into this a bit more using the SQL standard bodies (BEGIN ATOMIC) to see if there were any other behaviors we needed to account for. Overall, it worked well but I ran into one issue.
First, functions with "BEGIN ATOMIC" ignores "check_function_bodies" which is by design based on how this feature works. We should still turn "check_function_bodies" to "off" though, per above discussion.
In the context of DDL replication, "BEGIN ATOMIC" does support schema-unqualified functions, presumably because it includes the parsed content?
I created an updated example[1] where I converted the SQL functions to use the standard syntax and I returned the table names to be schema unqualified. This seemed to work, but I ran into a weird case with this function:
CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int, calendar_date date)
RETURNS void
LANGUAGE SQL
BEGIN ATOMIC
WITH delete_calendar AS (
DELETE FROM calendar
WHERE
room_id = $1 AND
calendar_date = $2
)
INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
SELECT $1, c.status, $2, c.calendar_range
FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
END;
This produced an error on the subscriber, with the following message:
2023-02-16 20:58:24.096 UTC [26864] ERROR: missing FROM-clause entry
for table "calendar_1" at character 322
2023-02-16 20:58:24.096 UTC [26864] CONTEXT: processing remote data for
replication origin "pg_18658" during message type "DDL" in transaction
980, finished at 0/C099A7D8
2023-02-16 20:58:24.096 UTC [26864] STATEMENT: CREATE OR REPLACE
FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN
calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql
VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100
BEGIN ATOMIC
WITH delete_calendar AS (
DELETE FROM public.calendar
WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=)
calendar_manage.room_id) AND (calendar_1.calendar_date
OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
)
INSERT INTO public.calendar (room_id, status, calendar_date,
calendar_range) SELECT calendar_manage.room_id,
c.status,
calendar_manage.calendar_date,
c.calendar_range
FROM
public.calendar_generate_calendar(calendar_manage.room_id,
pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with
time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+)
1))::timestamp with time zone)) c(status, calendar_range);
END
This seemed to add an additional, incorrect reference to the origin
table for the "room_id" and "calendar_date" attributes within the CTE of
this function. I don't know if this is directly related to the DDL
replication patch, but reporting it as I triggered the behavior through it.
Thanks, Jonathan [1] https://gist.github.com/jkatz/fe29006b724fd6f32ee849a96dc01608
OpenPGP_signature
Description: OpenPGP digital signature
