On Fri, Nov 13, 2020 at 01:39:31PM -0300, Alvaro Herrera wrote:
> On 2020-Nov-13, Justin Pryzby wrote:
> 
> > I saw a bunch of these in my logs:
> > 
> > log_time | 2020-10-25 22:59:45.619-07
> > database | 
> > left     | could not open relation with OID 292103095
> > left     | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
> > 
> > Those happen following a REINDEX job on that index.
> > 
> > I think that should be more like an INFO message, since that's what vacuum
> > does (vacuum_open_relation), and a queued work item is even more likely to
> > hit a dropped relation.
> 
> Ah, interesting.  Yeah, I agree this is a bug.  I think it can be fixed
> by using try_relation_open() on the index; if that returns NULL, discard
> the work item.
> 
> Does this patch solve the problem?

Your patch didn't actually call try_relation_open(), so it didn't work.
But it does work if I make that change and also close the table.
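
For anyone following along, here is a rough standalone sketch of the pattern
being discussed: open both relations with a "try" variant that returns NULL
for a dropped OID, and quietly discard the work item while releasing whatever
was already opened. This is only a toy model under my own assumptions. Every
name in it (ToyRelation, toy_try_open, toy_process_work_item, and so on) is
hypothetical and is not a PostgreSQL type or function; it does no locking and
no catalog access.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only: none of these are PostgreSQL types or functions. */
typedef unsigned int ToyOid;

typedef struct ToyRelation
{
	ToyOid		oid;
	int			open_count;		/* how many times it is currently open */
	int			dropped;		/* nonzero once the relation is gone */
} ToyRelation;

#define TOY_NRELS 2

ToyRelation toy_catalog[TOY_NRELS] = {
	{1001, 0, 0},				/* plays the role of the table */
	{1002, 0, 0},				/* plays the role of the BRIN index */
};

/* Open a relation by OID; return NULL instead of failing if it is gone. */
ToyRelation *
toy_try_open(ToyOid oid)
{
	for (size_t i = 0; i < TOY_NRELS; i++)
	{
		if (toy_catalog[i].oid == oid && !toy_catalog[i].dropped)
		{
			toy_catalog[i].open_count++;
			return &toy_catalog[i];
		}
	}
	return NULL;
}

void
toy_close(ToyRelation *rel)
{
	assert(rel != NULL && rel->open_count > 0);
	rel->open_count--;
}

/* Simulate a concurrent drop (for example, the old index of a reindex). */
void
toy_drop(ToyOid oid)
{
	for (size_t i = 0; i < TOY_NRELS; i++)
	{
		if (toy_catalog[i].oid == oid)
			toy_catalog[i].dropped = 1;
	}
}

/*
 * Process one queued work item.  Returns 1 if the work was done, or 0 if
 * the item was silently discarded because a relation no longer exists.
 */
int
toy_process_work_item(ToyOid heapoid, ToyOid indexoid)
{
	ToyRelation *heap = toy_try_open(heapoid);		/* table first */
	ToyRelation *index = toy_try_open(indexoid);	/* then the index */

	if (heap == NULL || index == NULL)
	{
		/* Release whichever one we did manage to open, then give up. */
		if (heap != NULL)
			toy_close(heap);
		if (index != NULL)
			toy_close(index);
		return 0;
	}

	/* ... the actual summarization work would happen here ... */

	toy_close(index);
	toy_close(heap);
	return 1;
}
```

With both toy relations present, toy_process_work_item(1001, 1002) does the
work; after toy_drop(1002) the same call discards the item and leaves nothing
open. Note that the toy version also closes the index when only the table is
missing, which the attached patch does not do; whether the real code needs
that is not something this sketch settles.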

I tested like:

pryzbyj=# ALTER SYSTEM SET backtrace_functions='try_relation_open,relation_open';
pryzbyj=# ALTER SYSTEM SET autovacuum_naptime=3; SELECT pg_reload_conf();
pryzbyj=# CREATE TABLE tt AS SELECT generate_series(1,9999)i;
pryzbyj=# CREATE INDEX ON tt USING brin(i) WITH(autosummarize,pages_per_range=1);
pryzbyj=# \! while :; do psql -h /tmp -qc 'SET client_min_messages=info' -c 'REINDEX INDEX CONCURRENTLY tt_i_idx'; done&

-- Run this 5-10 times until it hits the "...was not recorded" message, which
-- for whatever reason triggers the race condition involving the work queue.
pryzbyj=# UPDATE tt SET i=1+i;

2020-11-13 11:50:46.093 CST [30687] ERROR:  could not open relation with OID 1110882
2020-11-13 11:50:46.093 CST [30687] CONTEXT:  processing work entry for relation "pryzbyj.public.tt_i_idx"
2020-11-13 11:50:46.093 CST [30687] BACKTRACE:
        postgres: autovacuum worker pryzbyj(+0xb9ce8) [0x55acf2af0ce8]
        postgres: autovacuum worker pryzbyj(index_open+0xb) [0x55acf2bab59b]
        postgres: autovacuum worker pryzbyj(brin_summarize_range+0x8f) [0x55acf2b5b5bf]
        postgres: autovacuum worker pryzbyj(DirectFunctionCall2Coll+0x62) [0x55acf2f40372]
        ...

-- 
Justin
From e08c6d3e2b10964633904ff247e70330077d31b4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Fri, 13 Nov 2020 13:39:31 -0300
Subject: [PATCH v2] error_severity of brin work item

---
 src/backend/access/brin/brin.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 1f72562c60..8278a5209c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -887,8 +887,10 @@ brin_summarize_range(PG_FUNCTION_ARGS)
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
 	 * passed indexoid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
+	 * Rather than emitting a not-very-helpful error message, prepare to
+	 * return without doing anything.  This allows autovacuum work-items to be
+	 * silently discarded rather than uselessly accumulating error messages in
+	 * the server log.
 	 */
 	heapoid = IndexGetRelation(indexoid, true);
 	if (OidIsValid(heapoid))
@@ -896,7 +898,14 @@ brin_summarize_range(PG_FUNCTION_ARGS)
 	else
 		heapRel = NULL;
 
-	indexRel = index_open(indexoid, ShareUpdateExclusiveLock);
+	indexRel = try_relation_open(indexoid, ShareUpdateExclusiveLock);
+	if (heapRel == NULL || indexRel == NULL)
+	{
+		if (heapRel != NULL)
+			table_close(heapRel, ShareUpdateExclusiveLock);
+
+		PG_RETURN_INT32(0);
+	}
 
 	/* Must be a BRIN index */
 	if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
-- 
2.17.0
