Nice work.

> On Mar 20, 2023, at 20:58, Hang Chen <chenh...@apache.org> wrote:
> 
> If there are no objections, I will send out the vote.
> 
> Thanks,
> Hang
> 
>> Hang Chen <chenh...@apache.org> wrote on Thu, Mar 16, 2023 at 10:15:
>> 
>> ### Motivation
>> The bookie server processes add-entry requests through the following pipeline:
>> - Get one request from the Netty socket channel
>> - Choose one thread to process the add request
>> - Write the entry into the target ledger entry logger's write
>> cache (memory table)
>> - Put the entry into the journal's pending queue
>> - The journal thread takes the entry from the pending queue and writes it
>> into the PageCache/journal disk
>> - Write the callback response to the Netty buffer and flush it to the
>> bookie client side
>> 
>> For every add-entry request, the bookie server goes through the above
>> steps one by one, which introduces a lot of thread context switches.
>> 
>> Instead, we can batch the add requests per Netty socket channel and
>> write a batch of entries into the ledger entry logger and journal
>> disk at once.
>> 
>> ### Modifications
>> This PR changes the add-request processing pipeline to the following steps:
>> - Get a batch of add-entry requests from the socket channel until the
>> socket channel is empty or the batch reaches the max capacity (default: 1,000)
>> - Choose one thread to process the batch of add-entry requests
>> - Write the entries into the target ledger entry logger's write cache one by
>> one
>> - Put the batch of entries into the journal's pending queue
>> - The journal thread drains a batch of entries from the pending queue and
>> writes them into the PageCache/journal disk
>> - Write the callback response to the Netty buffer and flush it to the
>> bookie client side
>> 
>> With this change, we can save a lot of thread context switches.
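>> 
>> Here is a rough sketch of the batched flow, again with made-up names
>> rather than the real bookie classes, just to illustrate the batching
>> idea:
>> 
>> ```java
>> import java.util.ArrayList;
>> import java.util.List;
>> import java.util.concurrent.LinkedBlockingQueue;
>> 
>> // Hypothetical batched pipeline: requests are drained from the channel in
>> // groups, and the journal thread drains its pending queue in groups too,
>> // so the per-request thread handoffs are amortized over the whole batch.
>> public class BatchedPipeline {
>>     private static final int MAX_BATCH = 1_000;   // max add requests per batch
>>     private final LinkedBlockingQueue<byte[]> journalQueue = new LinkedBlockingQueue<>();
>> 
>>     // Hypothetical stand-in for reading decoded requests off the Netty channel.
>>     interface Channel { byte[] poll(); }   // returns null when the channel is empty
>> 
>>     // Keep reading until the channel is empty or the batch hits MAX_BATCH,
>>     // then process the whole batch on one thread.
>>     public void handleBatch(Channel channel) throws InterruptedException {
>>         List<byte[]> batch = new ArrayList<>();
>>         byte[] entry;
>>         while (batch.size() < MAX_BATCH && (entry = channel.poll()) != null) {
>>             batch.add(entry);
>>         }
>>         for (byte[] e : batch) {
>>             writeToWriteCache(e);    // write cache, still one entry at a time
>>         }
>>         for (byte[] e : batch) {
>>             journalQueue.put(e);     // enqueue the whole batch for the journal
>>         }
>>     }
>> 
>>     // Journal thread: drain a batch of pending entries and write them together.
>>     public void journalLoop() throws InterruptedException {
>>         List<byte[]> pending = new ArrayList<>();
>>         while (!Thread.currentThread().isInterrupted()) {
>>             pending.add(journalQueue.take());              // block until work arrives
>>             journalQueue.drainTo(pending, MAX_BATCH - 1);  // then grab what is queued
>>             writeToJournalDisk(pending);
>>             pending.forEach(this::sendCallbackResponse);   // ack each entry
>>             pending.clear();
>>         }
>>     }
>> 
>>     private void writeToWriteCache(byte[] entry) { /* placeholder */ }
>>     private void writeToJournalDisk(List<byte[]> entries) { /* placeholder */ }
>>     private void sendCallbackResponse(byte[] entry) { /* placeholder */ }
>> }
>> ```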
>> 
>> 
>> ### Performance
>> I started one bookie on my laptop and used the BookKeeper benchmark to
>> test the performance:
>> ```shell
>> bin/benchmark writes -ensemble 1 -quorum 1 -ackQuorum 1 -ledgers 50
>> -throttle 20000
>> ```
>> 
>> **Before this change**
>> 
>> | run | ops/sec | p95 latency (ms) | p99 latency (ms) |
>> | --- | --- | --- | --- |
>> | 1 | 147507 | 114.93 | 122.42 |
>> | 2 | 154571 |  111.46 | 115.86 |
>> | 3 | 141459 | 117.23 | 124.18 |
>> | 4 | 142037 | 121.75 | 128.54 |
>> | 5 | 143682 | 121.05 | 127.97 |
>> 
>> 
>> **After this change**
>> 
>> | run | ops/sec | p95 latency (ms) | p99 latency (ms) |
>> | --- | --- | --- | --- |
>> | 1 | 157328 | 118.30 | 121.79 |
>> | 2 | 165774 |  112.86 | 115.69 |
>> | 3 | 144790 | 128.94 | 133.24 |
>> | 4 | 151984 | 121.88 | 125.32 |
>> | 5 | 154574 | 121.57 | 124.57 |
>> 
>> The new change shows about a 2.2% improvement in throughput.
>> 
>> Do you guys have any ideas?
>> 
>> Thanks,
>> Hang
