My first question is why are you not using "WHERE CURRENT OF" cursor_name?
The link to the page follows. But effectively, you are locking the row and that is the row you want to update (the current row of the cursor). I wonder if that addresses the problem... https://www.postgresql.org/docs/current/sql-update.html On Fri, Nov 25, 2022 at 2:58 AM Ryo Yamaji (Fujitsu) <yamaji....@fujitsu.com> wrote: > Hi All. > > When multiple sessions tried to acquire a row lock on the same tuple > sequentially, I expected > that the first session waiting for the lock would acquire the lock first > (FIFO). However, when we > actually tested it, there were cases where a session that was behind a > first session acquired > a row lock ahead of first session. > Is this behavior is expected specifications or is it a bug? > > The following is a list of the cases where the next session gets the lock > first. > ・Multiple sessions request row locks for the same tuple > ・Update occurs for target tuple > > The above behavior may have the following problems: > ・If the lock acquiring order is FIFO in Postgres, it is not satisfied. > ・Any sessions which cannot acquire a lock for a long time result in a long > transaction > > If the above behavior is a specification, I think it is necessary to > document design considerations > when dealing with row locks. > > [Condition] > ・Version > - PostgreSQL 14.0 (compiled with LOCK_DEBUG) > ・postgresql.conf > - log_lock_waits = on > - deadlock_timeout = 10 > - log_line_prefix = '%m [%p:%l:%x] ' > - trace_locks = on > - trace_lock_oidmin = 16000 > > [Table Definition] > create table t1 (col1 int, col2 int, col3 char (10)); > insert into t1 values (1,0, ''); insert into t1 values (100, 9999, ''); > > [Application] > <a.sql> --------------------------- > BEGIN; > SELECT * FROM t1 WHERE col1 = 1 FOR UPDATE; > SELECT current_timestamp AS TRANSTART, clock_timestamp() AS NOW; > UPDATE t1 set col2 = :val, col3 = :valb WHERE col1 = 1; > SELECT pg_sleep(1); > COMMIT; > ------------------------------------ > > [Test] > The test set is attached. > 1. Run the following in 100 multiplex > for i in `seq 100`; do psql postgres -f ./a.sql -v val=${i} -v > valb="'a.sh'" &; done > 2.After 10 seconds (1. After completion of execution), run the following > in 10 multiplexes > for i in `seq 10`; do ii=`expr ${i} + 900`; psql postgres -f ./a.sql -v > val=${ii} -v valb="'b.sh'" &; done > > All a.sh were expected to complete processing first (FIFO). However, b.sh > completed the process > before part of a.sh. > > [Log] > We are checking two types of logs. > > 1. ShareLock has one wait, the rest is in AccessExclusiveLock > > 1-1. Only 1369555 is aligned with ShareLock, the transaction ID obtained > by 1369547, and the rest with > AccessExclusiveLock, the tuple obtained by 1369555. > This is similar to a pattern in which no updates have occurred to the > tuple. > -------------------------------------------------------------- > 2022-10-26 01:20:08.881 EDT [1369555:19:0] LOG: process 1369555 still > waiting for ShareLock on transaction 2501 after 10.072 ms > 2022-10-26 01:20:08.881 EDT [1369555:20:0] DETAIL: Process holding the > lock: 1369547. Wait queue: 1369555. > ~ > 2022-10-26 01:21:58.918 EDT [1369898:17:0] LOG: process 1369898 acquired > AccessExclusiveLock on tuple (1, 0) of relation 16546 of database 13779 > after 10.321 ms > 2022-10-26 01:21:58.918 EDT [1369898:18:0] DETAIL: Process holding the > lock: 1369555. Wait queue: 1369558, 1369561, 1369564, 1369567, 1369570, > 1369573, 1369576, ... > -------------------------------------------------------------- > > > 2. All processes wait with ShareLock > > 2-1. With 1369558 holding the t1 (0, 4) lock, the queue head is 1369561. > -------------------------------------------------------------- > 2022-10-26 01:22:27.230 EDT [1369623:46:2525] LOG: process 1369623 still > waiting for ShareLock on transaction 2504 after 10.133 msprocess 1369623 > still waiting for ShareLock on transaction 2504 after 10.133 ms > 2022-10-26 01:22:27.242 EDT [1369877:47:2604] DETAIL: Process holding the > lock: 1369558. Wait queue: 1369561, 1369623, 1369626, ... > -------------------------------------------------------------- > > 2-2. When 1369558 locks are released, the first 1369561 in the Wait queue > was expected to acquire the lock, > but the process actually acquired 1369787 > -------------------------------------------------------------- > 2022-10-26 01:22:28.237 EDT [1369623:63:2525] LOG: process 1369623 still > waiting for ShareLock on transaction 2577 after 10.028 ms > 2022-10-26 01:22:28.237 EDT [1369623:64:2525] DETAIL: Process holding the > lock: 1369787. Wait queue: 1369623, 1369610, 1369614, 1369617, 1369620. > -------------------------------------------------------------- > > 2-3. Checking that the 1369561 is rearranging. > -------------------------------------------------------------- > 2022-10-26 01:22:28.237 EDT [1369629:64:2527] DETAIL: Process holding the > lock: 1369623. Wait queue: 1369629, 1369821, 1369644, ... 1369561, ... > -------------------------------------------------------------- > > > Regerds, Ryo >