Re: reporting TID/table with corruption error

Andrey Borodin Mon, 10 Jan 2022 01:11:11 -0800

> On 19 Aug 2021, at 21:37, Alvaro Herrera <[email protected]> wrote:
> 
> A customer recently hit this error message:
> 
> ERROR:  t_xmin is uncommitted in tuple to be updated

Hi!

I'm currently observing this on one of our production clusters. The problem 
occurs at random points in time, seems to be covered by retries on the client 
side, and so far has not caused any harm (apart from waking up engineers).

A few facts:
0. PostgreSQL 12.9 (with some unrelated patches)
1. amcheck, heapcheck and pg_visibility have never flagged the cluster and remain silent
2. I observe the problem ~once a day
3. The tuple seems to be updated in a highly contended trigger function; 
autovacuum kicks in ~20-30 seconds after the message appears in the logs

[ 2022-01-10 09:07:17.671 MSK [unknown],????,????_????s,310759,XX001 ]:ERROR:  t_xmin 696079792 is uncommitted in tuple (1419011,109) to be updated in table "????s_statistics"
[ 2022-01-10 09:07:17.671 MSK [unknown],????,????_????s,310759,XX001 ]:CONTEXT:  SQL statement "UPDATE ????_????s.????s_statistics os
             SET ????_????_found_ts = COALESCE(os.????_????_found_ts, NOW()),
                 last_????_found_ts = NOW(),
                 num_????s = os.num_????s + 1
             WHERE ????_id = NEW.????_id"
        PL/pgSQL function statistics__update_from_new_????() line 3 at SQL statement
[ 2022-01-10 09:07:17.671 MSK [unknown],????,????_????s,310759,XX001 ]:STATEMENT:  
        INSERT INTO ????_????s.????s_????s AS ????s 

4. t_xmin is relatively new, not ancient
5. pageinspect shows a dead tuple after some time
6. no suspicious activity in logs nearby
7. vacuum (disable_page_skipping) and repack of indexes did not change anything
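
A minimal sketch, assuming the message format shown in the log above, of how 
the xmin, TID and table name could be pulled out of such a line so that the 
block number and offset can then be looked up with pageinspect. The function 
name parse_tuple_error and the regular expression are illustrative assumptions, 
not anything shipped with PostgreSQL.

```python
import re

# Assumed message format, taken from the log excerpt above.
TUPLE_ERROR_RE = re.compile(
    r't_xmin (?P<xmin>\d+) is uncommitted in tuple '
    r'\((?P<block>\d+),(?P<offset>\d+)\) to be updated in table '
    r'"(?P<table>[^"]+)"'
)


def parse_tuple_error(message):
    """Return xmin, block, offset and table name from the error text, or None."""
    # Log lines may be wrapped (e.g. by a mail client); collapse whitespace first.
    flat = " ".join(message.split())
    match = TUPLE_ERROR_RE.search(flat)
    if match is None:
        return None
    return {
        "xmin": int(match.group("xmin")),
        "block": int(match.group("block")),
        "offset": int(match.group("offset")),
        "table": match.group("table"),
    }


# The wrapped error text from the report, with the redacted name kept as-is.
sample = (
    "ERROR:  \n"
    "t_xmin 696079792 is uncommitted in tuple (1419011,109) to be updated in table \n"
    '"????s_statistics"'
)
info = parse_tuple_error(sample)
print(info)
```

The block and offset correspond to the page number and line pointer (lp) that 
pageinspect's get_raw_page() and heap_page_items() work with.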


I suspect this may be some relatively new concurrency issue. At least, I have 
never seen this before on clusters with clean amcheck and heapcheck results.

Alvaro, did you observe this on binaries from the August 13 minor release, or older?

Thanks!

Best regards, Andrey Borodin.