On 7/9/2025 5:45 PM, Dongsheng Yang wrote:

On 7/8/2025 4:16 AM, Mikulas Patocka wrote:

On Mon, 7 Jul 2025, Dongsheng Yang wrote:

Hi Mikulas,
    This is V2 for dm-pcache, please take a look.

Code:
     https://github.com/DataTravelGuide/linux tags/pcache_v2

Changelogs

V2 from V1:
    - introduce req_alloc() and req_init() in backing_dev.c, then we
      can do req_alloc() before holding spinlock and do req_init()
      in subtree_walk().
    - introduce pre_alloc_key and pre_alloc_req in walk_ctx, that
      means we can pre-allocate cache_key or backing_dev_request
      before subtree walking.
    - use mempool_alloc() with NOIO for the allocation of cache_key
      and backing_dev_req.
    - some coding style changes from comments of Jonathan.
Hi

mempool_alloc with GFP_NOIO never fails - so you don't have to check the
returned value for NULL and propagate the error upwards.


Hi Mikulas:

   I looked at the implementation of mempool_alloc(): when an allocation fails, it waits for 5 seconds and retries.

With this in mind, I propose that we handle -ENOMEM inside defer_req() using a similar mechanism, something like this commit:


https://github.com/DataTravelGuide/linux/commit/e6fc2e5012b1fe2312ed7dd02d6fbc2d038962c0


Here are two key reasons why:

(1) If we handle -ENOMEM in defer_req(), we don't need to convert every lower-level allocation to a mempool to avoid failures, for example cache_key, backing_req, and the kmem.bvecs you mentioned. More importantly, some allocation failures are not easy to prevent: bio_init_clone(), for instance, could still return -ENOMEM.

(2) A mempool blocks and waits indefinitely when memory is unavailable, which prevents the process from exiting.

With defer_req(), however, the user can still stop the pcache device manually with dmsetup remove and release some memory if they want to.
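
For illustration only, points (1) and (2) can be modelled in a small user-space sketch. This is not the code from the commit linked above. Only the name defer_req() comes from this thread; struct req, struct dev, handle_req(), submit_req() and retry_deferred() are hypothetical, and the free_objs counter is a stand-in for "memory is currently available".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: only the fields this sketch needs. */
struct req {
	struct req *next;	/* link in the deferred list */
	bool done;		/* set once the request was handled */
};

/* Hypothetical device state: a FIFO of deferred requests. */
struct dev {
	struct req *head, *tail;
	int free_objs;		/* stand-in for available memory */
	bool stopping;		/* set when the user removes the device */
};

/* Stand-in for any lower-level path that may fail with -ENOMEM. */
static int handle_req(struct dev *d, struct req *r)
{
	if (d->free_objs == 0)
		return -ENOMEM;
	d->free_objs--;
	r->done = true;
	return 0;
}

/* Queue the request for a later retry instead of failing the I/O. */
static void defer_req(struct dev *d, struct req *r)
{
	assert(!r->done);
	r->next = NULL;
	if (d->tail)
		d->tail->next = r;
	else
		d->head = r;
	d->tail = r;
}

/* Submit path: only -ENOMEM is deferred; other errors propagate. */
static int submit_req(struct dev *d, struct req *r)
{
	int ret = handle_req(d, r);

	if (ret == -ENOMEM) {
		defer_req(d, r);
		return 0;
	}
	return ret;
}

/*
 * Retry pass, e.g. run periodically from a worker. Returns the number
 * of requests still deferred. While the device is stopping nothing is
 * retried; requests stay queued so teardown code can fail them.
 */
static int retry_deferred(struct dev *d)
{
	struct req *r = d->head;
	int left = 0;

	d->head = d->tail = NULL;
	while (r) {
		struct req *next = r->next;

		if (d->stopping || handle_req(d, r) == -ENOMEM) {
			defer_req(d, r);
			left++;
		}
		r = next;
	}
	return left;
}
```

In this model a single retry point covers every allocation below handle_req(), and the retry loop never sleeps inside the allocator, so a "stopping" flag can still take effect while memory is short.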


What do you think?


BTW, I added a test case for the -ENOMEM scenario using failslab:


https://github.com/DataTravelGuide/dtg-tests/blob/main/pcache.py.data/pcache_failslab.sh


Thanx

Dongsheng


"backing_req->kmem.bvecs = kmalloc_array(n_vecs, sizeof(struct bio_vec),
GFP_NOIO)" - this call may fail and you should handle the error gracefully
(i.e. don't end the bio with an error). Would it be possible to trim the
request to BACKING_DEV_REQ_INLINE_BVECS vectors and retry it?
Alternatively, you can create a mempool for the largest possible n_vecs
and allocate from this mempool if kmalloc_array fails.
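
As a rough user-space sketch of the second suggestion (fall back to a worst-case reserve when the normal allocation fails): the values 4 and 256, and the names struct vec, struct reserve, struct breq, try_alloc(), breq_get_vecs() and breq_put_vecs() are all assumptions made for illustration, not dm-pcache code. A real mempool would sleep until an element is returned; this model reports "busy" instead so the behaviour is easy to observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define INLINE_VECS 4	/* assumed; stands in for BACKING_DEV_REQ_INLINE_BVECS */
#define MAX_VECS    256	/* assumed upper bound on n_vecs */

struct vec { void *page; unsigned int len, off; };

/* One reserved array sized for the worst case: a one-element "mempool". */
struct reserve {
	struct vec vecs[MAX_VECS];
	bool in_use;
};

struct breq {
	struct vec inline_vecs[INLINE_VECS];
	struct vec *vecs;	/* points at inline, heap, or reserve storage */
};

/* Test hook simulating allocator failure (similar in spirit to failslab). */
static bool fail_alloc;

static struct vec *try_alloc(size_t n)
{
	return fail_alloc ? NULL : calloc(n, sizeof(struct vec));
}

/*
 * Pick storage for n vectors: inline first, then the heap, then the
 * reserve. Returns false only if the reserve is busy, in which case the
 * caller would wait or defer rather than fail the I/O.
 */
static bool breq_get_vecs(struct breq *r, struct reserve *rsv, size_t n)
{
	assert(n <= MAX_VECS);
	if (n <= INLINE_VECS) {
		r->vecs = r->inline_vecs;
		return true;
	}
	r->vecs = try_alloc(n);
	if (r->vecs)
		return true;
	if (rsv->in_use)
		return false;
	rsv->in_use = true;
	r->vecs = rsv->vecs;
	return true;
}

/* Release whichever storage breq_get_vecs() picked. */
static void breq_put_vecs(struct breq *r, struct reserve *rsv)
{
	if (r->vecs == rsv->vecs)
		rsv->in_use = false;
	else if (r->vecs != r->inline_vecs)
		free(r->vecs);
	r->vecs = NULL;
}
```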

I'm sending two patches for dm-pcache - the first patch adds the include
file linux/bitfield.h - it is needed in my config. The second patch makes
slab caches per-module rather than per-device; if you have them
per-device, there are warnings about duplicate cache names.


BTW. What kind of persistent memory do you use? (afaik Intel killed the
Optane products and I don't know of any replacement)

Some time ago I created a filesystem for persistent memory - see
git://leontynka.twibright.com/nvfs.git - I'd be interested if you can test
it on your persistent memory implementation.

Mikulas


