On 6/30/2025 9:40 PM, Dongsheng Yang wrote:

On 6/30/2025 9:30 PM, Mikulas Patocka wrote:

On Tue, 24 Jun 2025, Dongsheng Yang wrote:

Hi Mikulas,
    This is V1 for dm-pcache, please take a look.

Code:
     https://github.com/DataTravelGuide/linux tags/pcache_v1

Changelogs from RFC-V2:
    - use crc32c to replace crc32
    - only retry a pcache_req when the cache is full: add the pcache_req to
      the defer_list and wait for cache invalidation to happen.
    - new format for the pcache table, so it can be extended more easily
      with new parameters later.
    - remove __packed.
    - use spin_lock_irq in req_complete_fn to replace
      spin_lock_irqsave.
    - fix bug in backing_dev_bio_end with spin_lock_irqsave.
    - queue_work() inside spinlock.
    - introduce inline_bvecs in backing_dev_req.
    - use kmalloc_array for bvecs allocation.
    - calculate ->off with dm_target_offset() before using it.
Hi

The out-of-memory handling still doesn't seem right.

If the GFP_NOWAIT allocation doesn't succeed (which may happen at any time,
for example when the machine is receiving network packets faster than the
swapper can swap out data), create_cache_miss_req returns NULL, the caller
converts that to -ENOMEM, cache_read returns -ENOMEM, and the error is
propagated up to end_req, which sets the status to BLK_STS_RESOURCE. So it
may randomly fail I/Os with an error.

Properly, you should use mempools. The mempool allocation will wait until
some other process frees data into the mempool.
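
For example, the setup could look something like this (pcache_cache_key,
key_cache and pcache_key_pool are made-up names for illustration, not the
actual dm-pcache symbols):

#include <linux/mempool.h>
#include <linux/rbtree.h>
#include <linux/slab.h>
#include <linux/types.h>

/* Illustrative key object and pool; names are not the dm-pcache symbols. */
struct pcache_cache_key {
        struct rb_node  rb_node;
        u64             off;
        u32             len;
};

static struct kmem_cache *key_cache;
static mempool_t *pcache_key_pool;

static int pcache_key_pool_init(void)
{
        key_cache = KMEM_CACHE(pcache_cache_key, 0);
        if (!key_cache)
                return -ENOMEM;

        /* Keep a small reserve so allocation can always make progress. */
        pcache_key_pool = mempool_create_slab_pool(16, key_cache);
        if (!pcache_key_pool) {
                kmem_cache_destroy(key_cache);
                return -ENOMEM;
        }

        return 0;
}

static struct pcache_cache_key *pcache_key_alloc(void)
{
        /*
         * May sleep, but never fails: if the slab allocator cannot satisfy
         * the request, we wait until another request frees a key back into
         * the pool.  GFP_NOIO prevents recursion into the I/O path.
         */
        return mempool_alloc(pcache_key_pool, GFP_NOIO);
}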

If you need to allocate memory inside a spinlock, you can't do it reliably
(because you can't sleep inside a spinlock and a non-sleeping memory
allocation may fail at any time). So, in this case, you should drop the
spinlock, allocate the memory from a mempool with GFP_NOIO and jump back
to grab the spinlock - now you are holding the allocated object, so you
can use it while you hold the spinlock.
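
In pseudo-kernel-C, that retry pattern could look roughly like this,
building on the sketch above (the tree type, the walk and the split are
placeholders, not the real pcache code):

struct cache_tree {
        spinlock_t      lock;
        struct rb_root  root;
};

static void cache_insert_key(struct cache_tree *tree)
{
        struct pcache_cache_key *new_key = NULL;
        bool need_split;

again:
        spin_lock(&tree->lock);

        /* ... walk the subtree; set need_split if a key must be split ... */
        need_split = true;              /* placeholder result of the walk */

        if (need_split && !new_key) {
                /*
                 * We must not sleep under the spinlock, so drop it, allocate
                 * from the mempool with GFP_NOIO (sleeps instead of failing)
                 * and redo the walk: the subtree may have changed meanwhile.
                 */
                spin_unlock(&tree->lock);
                new_key = pcache_key_alloc();
                goto again;
        }

        /*
         * Lock held and new_key already allocated: the split is safe now.
         * (If the re-walk decided no split is needed after all, give the
         * object back with mempool_free().)
         */
        /* ... split the found key using new_key ... */

        spin_unlock(&tree->lock);
}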


Hi Mikulas,

    Thanx for your suggestion, I will cook a GFP_NOIO version of the memory allocation for the pcache data path.

Hi Mikulas,

    The reason why we don’t release the spinlock here is that if we do, the subtree could change.

For example, in the `fixup_overlap_contained()` function, we may need to split a certain `cache_key`, and that requires allocating a new `cache_key`.

If we drop the spinlock at this point and then re-acquire it after the allocation, the subtree might already have been modified, and we cannot safely continue with the split operation.

    In this case, we would have to restart the entire subtree search and walk. But the new walk might require more memory, or less, so it's very difficult to know in advance how much memory will be needed before acquiring the spinlock.

    So allocating memory inside a spinlock is actually a more direct and feasible approach. `GFP_NOWAIT` fails too easily; maybe `GFP_ATOMIC` is more appropriate.
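
A minimal sketch of what I mean, reusing the illustrative names from the
sketches above (note that `GFP_ATOMIC` dips into emergency reserves but can
still fail, so an -ENOMEM path would remain):

static int cache_split_under_lock(struct cache_tree *tree)
{
        struct pcache_cache_key *new_key;

        spin_lock(&tree->lock);

        /* ... walk finds a cache_key that has to be split ... */

        new_key = kmem_cache_alloc(key_cache, GFP_ATOMIC);
        if (!new_key) {
                /* Rarer than with GFP_NOWAIT, but still possible. */
                spin_unlock(&tree->lock);
                return -ENOMEM;
        }

        /* The subtree cannot have changed: the lock was never dropped. */
        /* ... perform the split using new_key ... */

        spin_unlock(&tree->lock);
        return 0;
}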


What do you think?




Another comment:
set_bit/clear_bit use atomic instructions which are slow. As you already
hold a spinlock when calling them, you don't need the atomicity, so you
can replace them with __set_bit and __clear_bit.
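
For example (flag and lock names are illustrative; this is only valid
because every access to seg->flags is serialized by the same spinlock):

        /* seg->flags is an unsigned long */
        spin_lock(&cache->lock);
        __set_bit(SEG_DIRTY, &seg->flags);      /* plain RMW, no LOCK-prefixed insn */
        __clear_bit(SEG_WRITEBACK, &seg->flags);
        spin_unlock(&cache->lock);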


Good idea.


Thanx

Dongsheng


Mikulas


