Re: Increase NUM_XLOGINSERT_LOCKS

Japin Li Wed, 22 Jan 2025 18:30:31 -0800

On Sat, 18 Jan 2025 at 14:53, Yura Sokolov <y.soko...@postgrespro.ru> wrote:
> Since it seems Andres missed my request to send a copy of his answer,
> here it is:
>
> On 2025-01-16 18:55:47 +0300, Yura Sokolov wrote:
>> 16.01.2025 18:36, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:
>>>> Good day, hackers.
>>>>
>>>> Zhiguo Zhow proposed transforming xlog reservation into a lock-free algorithm
>>>> in order to increase NUM_XLOGINSERT_LOCKS on very large (480 vCPU) servers. [1]
>>>>
>>>> While I believe lock-free reservation makes sense on huge servers, it is hard
>>>> to measure on small servers and personal computers/notebooks.
>>>>
>>>> But increasing NUM_XLOGINSERT_LOCKS gives a measurable performance gain (using a
>>>> synthetic test) even on my work notebook:
>>>>
>>>>    Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
>>>
>>> I've experimented with this in the past.
>>>
>>>
>>> Unfortunately increasing it substantially can make the contention on the
>>> spinlock *substantially* worse.
>>>
>>> c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && \
>>>   pgbench -n -M prepared -c$c -j$c \
>>>     -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));";) \
>>>     -P1 -T15
>>>
>>> On a 2x Xeon Gold 5215, with max_wal_size = 150GB and the workload run a few
>>> times to ensure WAL is already allocated.
>>>
>>> With
>>> NUM_XLOGINSERT_LOCKS = 8:       1459 tps
>>> NUM_XLOGINSERT_LOCKS = 80:      2163 tps
>>
>> So, even in your test you have a +50% gain from increasing
>> NUM_XLOGINSERT_LOCKS.
>>
>> (And that is why I'm keen on a smaller increase, like up to 64, not 128).
>
> Oops, I swapped the results around when reformatting them, sorry! It's
> the opposite way, i.e. increasing the locks hurts.
>
> Here's that issue fixed and a few more NUM_XLOGINSERT_LOCKS values.  This is a
> slightly different disk (the other seems to have gone the way of the dodo),
> so the results aren't expected to be exactly the same.
>
> NUM_XLOGINSERT_LOCKS    TPS
> 1                       2583
> 2                       2524
> 4                       2711
> 8                       2788
> 16                      1938
> 32                      1834
> 64                      1865
> 128                     1543
>
>
>>>
>>> The main reason is that the increase in insert locks puts a lot more pressure
>>> on the spinlock.
>>
>> That is addressed by Zhiguo Zhow and me in the other thread [1]. But increasing
>> NUM_XLOGINSERT_LOCKS gives benefits right now (at least on smaller
>> installations), and "lock-free reservation" should be measured against it.
>
> I know that there's that thread, I just don't see how we can increase
> NUM_XLOGINSERT_LOCKS due to the regressions it can cause.
>
>
>>> Secondarily it's also that we spend more time iterating
>>> through the insert locks when waiting, and that that causes a lot of cacheline
>>> pingpong.
>>
>> Waiting is done with LWLockWaitForVar, and there is no wait if `insertingAt`
>> is in the future. It looks very efficient in the master branch code.
>
> But LWLockWaitForVar is called from WaitXLogInsertionsToFinish, which just
> iterates over all locks.
>


Hi, Yura Sokolov

I tested the patch on a Hygon C86 7490 64-core machine using benchmarksql 5.0 with
500 warehouses and 256 terminals, with a run time of 10 minutes:

| case               | min          | avg          | max          |
|--------------------+--------------+--------------+--------------|
| master (4108440)   | 891,225.77   | 904,868.75   | 913,708.17   |
| lock 64            | 1,007,716.95 | 1,012,013.22 | 1,018,674.00 |
| lock 64 attempt 1  | 1,016,716.07 | 1,017,735.55 | 1,019,328.36 |
| lock 64 attempt 2  | 1,015,328.31 | 1,018,147.74 | 1,021,513.14 |
| lock 128           | 1,010,147.38 | 1,014,128.11 | 1,018,672.01 |
| lock 128 attempt 1 | 1,018,154.79 | 1,023,348.35 | 1,031,365.42 |
| lock 128 attempt 2 | 1,013,245.56 | 1,018,984.78 | 1,023,696.00 |
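
For a rough sense of scale, the averages above can be turned into a relative
gain over master. This is only a sketch: the numbers are copied from the table,
and the names `avg` and `gain_pct` are made up for illustration.

```python
# Average throughput per case, copied from the table above.
avg = {
    "master (4108440)": 904_868.75,
    "lock 64": 1_012_013.22,
    "lock 64 attempt 1": 1_017_735.55,
    "lock 64 attempt 2": 1_018_147.74,
    "lock 128": 1_014_128.11,
    "lock 128 attempt 1": 1_023_348.35,
    "lock 128 attempt 2": 1_018_984.78,
}

baseline = avg["master (4108440)"]

# Relative change of each patched case versus master, in percent.
gain_pct = {
    case: (value / baseline - 1.0) * 100.0
    for case, value in avg.items()
    if case != "master (4108440)"
}

for case, pct in gain_pct.items():
    print(f"{case:<20} {pct:+.2f}%")
```

On these averages every patched case lands roughly 12 to 13% above master, and
the difference between 64 and 128 locks is small compared with that.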

I didn't test NUM_XLOGINSERT_LOCKS with 16 and 32; however, I tested it with 256
and got the following error:

2025-01-23 02:23:23.828 CST [333524] PANIC:  too many LWLocks taken
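
A plausible explanation, stated as an assumption rather than something verified
against this patch: lwlock.c caps the number of LWLocks a backend may hold at
once (MAX_SIMUL_LWLOCKS, assumed here to be 200), and
WALInsertLockAcquireExclusive takes every insertion lock, so 256 locks would
exceed the cap. If that happens inside a critical section, the ERROR would be
promoted to PANIC. The Python sketch below only models that counting; the
function name is hypothetical and it is not PostgreSQL code.

```python
# Assumption: mirrors MAX_SIMUL_LWLOCKS in lwlock.c (believed to be 200).
MAX_SIMUL_LWLOCKS = 200


def acquire_all_insert_locks(num_xloginsert_locks, already_held=0):
    """Hypothetical model of taking every WAL insertion lock at once.

    Returns the number of locks held afterwards, or raises RuntimeError
    when the per-backend held-lock limit would be exceeded.
    """
    held = already_held
    for _ in range(num_xloginsert_locks):
        # The limit is checked before each acquisition.
        if held >= MAX_SIMUL_LWLOCKS:
            raise RuntimeError("too many LWLocks taken")
        held += 1
    return held


for n in (64, 128, 256):
    try:
        print(n, "ok, held:", acquire_all_insert_locks(n))
    except RuntimeError as exc:
        print(n, "fails:", exc)
```

Under that assumption 64 and 128 fit and 256 does not, which matches the test
result above.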

I hope this test will be helpful.

-- 
Regards,
Japin Li
