We see high order allocation warnings:

  kernel: order 10 >= 10, gfp 0x40c00
  kernel: WARNING: CPU: 5 PID: 182 at mm/page_alloc.c:5630 __alloc_pages+0x1d7/0x3f0
  kernel: process_compressed_read+0x6f/0x590 [dm_qcow2]

This happens because we have 1M clusters, and in case of zstd
compression the buffer size used for decompression is:

  clu_size + sizeof(ZSTD_DCtx) + ZSTD_BLOCKSIZE_MAX +
  clu_size + ZSTD_BLOCKSIZE_MAX + 64 = 2520776

This does not fit into an order-9 (2M) slab, so kmalloc has to make an
order-10 (4M) allocation. Such a big contiguous allocation has a very
high probability of failing. Let's fix it by switching to kvmalloc,
which falls back to vmalloc when contiguous pages are not available.

note 1: It does not look like we can shrink the buffer instead, as the
decompression buffer size is dictated by the cluster size.

note 2: There already are several kvmalloc(GFP_NOIO) users in the
mainline kernel; it is fine to use since commit 451769ebb7e79
("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc").

note 3: Another option is to switch to a custom memory pool for these
allocations, but the downsides of this approach are that: a) a big pool
always consumes a lot of memory; b) a small pool may slow down
compressed reads when there are multiple concurrent readers; and c)
scalable pools would require kvmalloc(GFP_NOIO) for refills in this
stack anyway, and we would also need to implement some monitor to
resize the pool on the side, which looks like overkill for this
problem. A rough sketch of such a pool is shown below.
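For illustration, here is a hypothetical sketch of what such a pool
could look like, built on the kernel's mempool API; all names here are
made up for the example and are not from the driver, and each pool
element is assumed to be one clu_size + dctxlen buffer:

  #include <linux/mempool.h>
  #include <linux/mm.h>
  #include <linux/slab.h>

  static void *dbuf_alloc(gfp_t gfp_mask, void *pool_data)
  {
          /* Pool refills still need kvmalloc(GFP_NOIO), see (c) */
          return kvmalloc((size_t)pool_data, gfp_mask);
  }

  static void dbuf_free(void *element, void *pool_data)
  {
          kvfree(element);
  }

  /*
   * A "small" pool of 2 buffers: several MB stay pinned for the whole
   * lifetime of the device (a), and mempool_alloc(pool, GFP_NOIO)
   * sleeps once both buffers are taken by concurrent readers (b).
   */
  static mempool_t *dbuf_pool_create(size_t bufsize)
  {
          return mempool_create(2, dbuf_alloc, dbuf_free,
                                (void *)bufsize);
  }

Growing such a pool under memory pressure would come back to
kvmalloc(GFP_NOIO) anyway, so plain kvmalloc wins on simplicity.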
https://virtuozzo.atlassian.net/browse/VSTOR-94596

Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
---
 drivers/md/dm-qcow2-map.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-qcow2-map.c b/drivers/md/dm-qcow2-map.c
index 6585f3fac6e7b..112f6dde44af1 100644
--- a/drivers/md/dm-qcow2-map.c
+++ b/drivers/md/dm-qcow2-map.c
@@ -3671,7 +3671,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		dctxlen = zlib_inflate_workspacesize();
 
-	buf = kmalloc(qcow2->clu_size + dctxlen, GFP_NOIO);
+	buf = kvmalloc(qcow2->clu_size + dctxlen, GFP_NOIO);
 	if (!buf) {
 		end_qios(read_list, BLK_STS_RESOURCE);
 		return;
 	}
@@ -3681,7 +3681,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		arg = zstd_init_dstream(qcow2->clu_size, buf + qcow2->clu_size, dctxlen);
 		if (!arg) {
 			end_qios(read_list, BLK_STS_RESOURCE);
-			kfree(buf);
+			kvfree(buf);
 			return;
 		}
 	} else {
@@ -3716,7 +3716,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		list_add_tail(&qio->link, cow_list);
 	}
 
-	kfree(buf);
+	kvfree(buf);
 }
 
 static int prepare_sliced_data_write(struct qcow2 *qcow2, struct qio *qio,
-- 
2.47.0