Re: [PATCH 13/15] raid: Remove now superfluous sentinel element from ctl_table array
On Thu, Sep 28, 2023 at 6:20 AM Joel Granados via B4 Relay wrote:
>
> From: Joel Granados
>
> This commit comes at the tail end of a greater effort to remove the
> empty elements at the end of the ctl_table arrays (sentinels) which
> will reduce the overall build time size of the kernel and run time
> memory bloat by ~64 bytes per sentinel (further information Link :
> https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)
>
> Remove sentinel from raid_table
>
> Signed-off-by: Joel Granados
> ---
>  drivers/md/md.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index a104a025084d..3866d8f754a0 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -304,8 +304,7 @@ static struct ctl_table raid_table[] = {
>                 .maxlen         = sizeof(int),
>                 .mode           = S_IRUGO|S_IWUSR,
>                 .proc_handler   = proc_dointvec,
> -       },
> -       { }
> +       }
>  };

Please keep "}," as Greg suggested.

Otherwise,

Acked-by: Song Liu

Thanks,
Song
Re: [PATCH] treewide: const qualify ctl_tables where applicable
On Thu, Jan 9, 2025 at 5:16 AM Joel Granados wrote:
>
[...]
>  drivers/base/firmware_loader/fallback_table.c |  2 +-
>  drivers/cdrom/cdrom.c                         |  2 +-
>  drivers/char/hpet.c                           |  2 +-
>  drivers/char/ipmi/ipmi_poweroff.c             |  2 +-
>  drivers/char/random.c                         |  2 +-
>  drivers/gpu/drm/i915/i915_perf.c              |  2 +-
>  drivers/gpu/drm/xe/xe_observation.c           |  2 +-
>  drivers/hv/hv_common.c                        |  2 +-
>  drivers/infiniband/core/iwcm.c                |  2 +-
>  drivers/infiniband/core/ucma.c                |  2 +-
>  drivers/macintosh/mac_hid.c                   |  2 +-
>  drivers/md/md.c                               |  2 +-

For md bits:

Reviewed-by: Song Liu

Thanks,
Song

[...]
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Thu, Apr 17, 2025 at 9:05 AM T.J. Mercier wrote:
>
> On Wed, Apr 16, 2025 at 9:56 PM Song Liu wrote:
> >
> > On Wed, Apr 16, 2025 at 7:09 PM T.J. Mercier wrote:
> > >
> > > On Wed, Apr 16, 2025 at 6:26 PM Song Liu wrote:
> > [...]
> > > >
> > > > Here is another rookie question, it appears to me there is a file descriptor
> > > > associated with each DMA buffer, can we achieve the same goal with
> > > > a task-file iterator?
> > >
> > > That would find almost all of them, but not the kernel-only
> > > allocations. (kernel_rss in the dmabuf_dump output I attached earlier.
> > > If there's a leak, it's likely to show up in kernel_rss because some
> > > driver forgot to release its reference(s).) Also wouldn't that be a
> > > ton more iterations since we'd have to visit every FD to find the
> > > small portion that are dmabufs? I'm not actually sure if buffers that
> > > have been mapped, and then have had their file descriptors closed
> > > would show up in task_struct->files; if not I think that would mean
> > > scanning both files and vmas for each task.
> >
> > I don't think scanning all FDs to find a small portion of specific FDs
> > is a real issue. We have a tool that scans all FDs in the system and
> > only dump data for perf_event FDs. I think it should be easy to
> > prototype a tool by scanning all files and all vmas. If that turns out
> > to be very slow, which I highly doubt will be, we can try other
> > approaches.
>
> But this will not find *all* the buffers, and that defeats the purpose
> of having the iterator.

Do you mean this approach cannot get kernel-only allocations? If
that's the case, we probably do need a separate iterator. I am
interested in other folks' thoughts on this.

> > OTOH, I am wondering whether we can build a more generic iterator
> > for a list of objects. Adding a iterator for each important kernel lists
> > seems not scalable in the long term.
>
> I think the wide variety of differences in locking for different
> objects would make this difficult to do in a generic way.

Agreed it is not easy to build a generic solution. But with the help
from BTF, we can probably build something that covers a large number
of use cases.

Thanks,
Song
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Wed, Apr 16, 2025 at 4:40 PM T.J. Mercier wrote:
>
> On Wed, Apr 16, 2025 at 4:08 PM Song Liu wrote:
> >
> > On Wed, Apr 16, 2025 at 3:51 PM T.J. Mercier wrote:
> > [...]
> > > >
> > > > IIUC, the iterator simply traverses elements in a linked list. I feel it is
> > > > an overkill to implement a new BPF iterator for it.
> > >
> > > Like other BPF iterators such as kmem_cache_iter or task_iter.
> > > Cgroup_iter iterates trees instead of lists. This is iterating over
> > > kernel objects just like the docs say, "A BPF iterator is a type of
> > > BPF program that allows users to iterate over specific types of kernel
> > > objects". More complicated iteration should not be a requirement here.
> > >
> > > > Maybe we simply
> > > > use debugging tools like crash or drgn for this? The access with
> > > > these tools will not be protected by the mutex. But from my personal
> > > > experience, this is not a big issue for user space debugging tools.
> > >
> > > drgn is *way* too slow, and even if it weren't the dependencies for
> > > running it aren't available. crash needs debug symbols which also
> > > aren't available on user builds. This is not just for manual
> > > debugging, it's for reporting memory use in production. Or anything
> > > else someone might care to extract like attachment info or refcounts.
> >
> > Could you please share more information about the use cases and
> > the time constraint here, and why drgn is too slow. Is most of the delay
> > comes from parsing DWARF? This is mostly for my curiosity, because
> > I have been thinking about using drgn to do some monitoring in
> > production.
> >
> > Thanks,
> > Song
>
> These RunCommands have 10 second timeouts for example. It's rare that
> I see them get exceeded but it happens occasionally.:
> https://cs.android.com/android/platform/superproject/main/+/main:frameworks/native/cmds/dumpstate/dumpstate.cpp;drc=98bdc04b7658fde0a99403fc052d1d18e7d48ea6;l=2008

Thanks for sharing this information.
> The last time I used drgn (admittedly back in 2023) it took over a
> minute to iterate through less than 200 cgroups. I'm not sure what the
> root cause of the slowness was, but I'd expect the DWARF processing to
> be done up-front once and the slowness I experienced was not just at
> startup. Eventually I switched over to tracefs for that issue, which
> we still use for some telemetry.

I haven't tried drgn on Android. On the server side, iterating through
200 cgroups should be fairly fast (< 5 seconds, where DWARF parsing is
the most expensive part).

> Other uses are by statsd for telemetry, memory reporting on app kills
> or death, and for "dumpsys meminfo".

Here is another rookie question: it appears to me there is a file
descriptor associated with each DMA buffer. Can we achieve the same
goal with a task-file iterator?

Thanks,
Song
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Wed, Apr 16, 2025 at 7:09 PM T.J. Mercier wrote:
>
> On Wed, Apr 16, 2025 at 6:26 PM Song Liu wrote:
[...]
> >
> > Here is another rookie question, it appears to me there is a file descriptor
> > associated with each DMA buffer, can we achieve the same goal with
> > a task-file iterator?
>
> That would find almost all of them, but not the kernel-only
> allocations. (kernel_rss in the dmabuf_dump output I attached earlier.
> If there's a leak, it's likely to show up in kernel_rss because some
> driver forgot to release its reference(s).) Also wouldn't that be a
> ton more iterations since we'd have to visit every FD to find the
> small portion that are dmabufs? I'm not actually sure if buffers that
> have been mapped, and then have had their file descriptors closed
> would show up in task_struct->files; if not I think that would mean
> scanning both files and vmas for each task.

I don't think scanning all FDs to find a small portion of specific FDs
is a real issue. We have a tool that scans all FDs in the system and
only dumps data for perf_event FDs. I think it should be easy to
prototype a tool by scanning all files and all vmas. If that turns out
to be very slow, which I highly doubt it will be, we can try other
approaches.

OTOH, I am wondering whether we can build a more generic iterator for
a list of objects. Adding an iterator for each important kernel list
seems not scalable in the long term.

Thanks,
Song
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Wed, Apr 16, 2025 at 3:51 PM T.J. Mercier wrote:
[...]
> >
> > IIUC, the iterator simply traverses elements in a linked list. I feel it is
> > an overkill to implement a new BPF iterator for it.
>
> Like other BPF iterators such as kmem_cache_iter or task_iter.
> Cgroup_iter iterates trees instead of lists. This is iterating over
> kernel objects just like the docs say, "A BPF iterator is a type of
> BPF program that allows users to iterate over specific types of kernel
> objects". More complicated iteration should not be a requirement here.
>
> > Maybe we simply
> > use debugging tools like crash or drgn for this? The access with
> > these tools will not be protected by the mutex. But from my personal
> > experience, this is not a big issue for user space debugging tools.
>
> drgn is *way* too slow, and even if it weren't the dependencies for
> running it aren't available. crash needs debug symbols which also
> aren't available on user builds. This is not just for manual
> debugging, it's for reporting memory use in production. Or anything
> else someone might care to extract like attachment info or refcounts.

Could you please share more information about the use cases and the
time constraints here, and why drgn is too slow. Does most of the
delay come from parsing DWARF? This is mostly for my curiosity,
because I have been thinking about using drgn to do some monitoring
in production.

Thanks,
Song
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Mon, Apr 14, 2025 at 3:53 PM T.J. Mercier wrote:
[...]
> +
> +BTF_ID_LIST_GLOBAL_SINGLE(bpf_dmabuf_btf_id, struct, dma_buf)
> +DEFINE_BPF_ITER_FUNC(dmabuf, struct bpf_iter_meta *meta, struct dma_buf *dmabuf)
> +
> +static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	struct dma_buf *dmabuf, *ret = NULL;
> +
> +	if (*pos) {
> +		*pos = 0;
> +		return NULL;
> +	}
> +	/* Look for the first buffer we can obtain a reference to.
> +	 * The list mutex does not protect a dmabuf's refcount, so it can be
> +	 * zeroed while we are iterating. Therefore we cannot call get_dma_buf()
> +	 * since the caller of this program may not already own a reference to
> +	 * the buffer.
> +	 */
> +	mutex_lock(&dmabuf_debugfs_list_mutex);
> +	list_for_each_entry(dmabuf, &dmabuf_debugfs_list, list_node) {
> +		if (file_ref_get(&dmabuf->file->f_ref)) {
> +			ret = dmabuf;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&dmabuf_debugfs_list_mutex);

IIUC, the iterator simply traverses elements in a linked list. I feel
it is an overkill to implement a new BPF iterator for it. Maybe we
simply use debugging tools like crash or drgn for this? The access
with these tools will not be protected by the mutex. But from my
personal experience, this is not a big issue for user space debugging
tools.

Thanks,
Song

> +
> +	return ret;
> +}
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Mon, Apr 14, 2025 at 3:53 PM T.J. Mercier wrote:
>
> The dmabuf iterator traverses the list of all DMA buffers. The list is
> maintained only when CONFIG_DEBUG_FS is enabled.
>
> DMA buffers are refcounted through their associated struct file. A
> reference is taken on each buffer as the list is iterated to ensure each
> buffer persists for the duration of the bpf program execution without
> holding the list mutex.
>
> Signed-off-by: T.J. Mercier
> ---
>  include/linux/btf_ids.h  |   1 +
>  kernel/bpf/Makefile      |   3 +
>  kernel/bpf/dmabuf_iter.c | 130 +++
>  3 files changed, 134 insertions(+)
>  create mode 100644 kernel/bpf/dmabuf_iter.c
>
> diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> index 139bdececdcf..899ead57d89d 100644
> --- a/include/linux/btf_ids.h
> +++ b/include/linux/btf_ids.h
> @@ -284,5 +284,6 @@ extern u32 bpf_cgroup_btf_id[];
>  extern u32 bpf_local_storage_map_btf_id[];
>  extern u32 btf_bpf_map_id[];
>  extern u32 bpf_kmem_cache_btf_id[];
> +extern u32 bpf_dmabuf_btf_id[];

This is not necessary. See below.

>
>  #endif
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 70502f038b92..5b30d37ef055 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -53,6 +53,9 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
>  obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
> +ifeq ($(CONFIG_DEBUG_FS),y)
> +obj-$(CONFIG_BPF_SYSCALL) += dmabuf_iter.o
> +endif
>
>  CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE)
>  CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE)
> diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c
> new file mode 100644
> index ..b4b8be1d6aa4
> --- /dev/null
> +++ b/kernel/bpf/dmabuf_iter.c

Maybe we should add this file to drivers/dma-buf. I would like to hear
other folks' thoughts on this.
> @@ -0,0 +1,130 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2025 Google LLC */
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +BTF_ID_LIST_GLOBAL_SINGLE(bpf_dmabuf_btf_id, struct, dma_buf)

Use BTF_ID_LIST_SINGLE(), then we don't need this in btf_ids.h.

> +DEFINE_BPF_ITER_FUNC(dmabuf, struct bpf_iter_meta *meta, struct dma_buf *dmabuf)
> +
> +static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	struct dma_buf *dmabuf, *ret = NULL;
> +
> +	if (*pos) {
> +		*pos = 0;
> +		return NULL;
> +	}
> +	/* Look for the first buffer we can obtain a reference to.
> +	 * The list mutex does not protect a dmabuf's refcount, so it can be
> +	 * zeroed while we are iterating. Therefore we cannot call get_dma_buf()
> +	 * since the caller of this program may not already own a reference to
> +	 * the buffer.
> +	 */

We should use kernel comment style for new code. IOW,

	/*
	 * Look for ...
	 */

Thanks,
Song

[...]
Re: [PATCH 3/4] selftests/bpf: Add test for dmabuf_iter
On Mon, Apr 14, 2025 at 3:53 PM T.J. Mercier wrote:
>
> This test creates a udmabuf and uses a BPF program that prints dmabuf
> metadata with the new dmabuf_iter to verify it can be found.
>
> Signed-off-by: T.J. Mercier
> ---
[...]
> +
> +
> +static void subtest_dmabuf_iter_check_udmabuf(struct dmabuf_iter *skel)
> +{
> +	static const char test_buffer_name[] = "udmabuf_test_buffer_for_iter";
> +	const size_t test_buffer_size = 10 * getpagesize();
> +
> +	ASSERT_LE(sizeof(test_buffer_name), DMA_BUF_NAME_LEN, "NAMETOOLONG");
> +
> +	int memfd = memfd_create("memfd_test", MFD_ALLOW_SEALING);
> +	ASSERT_OK_FD(memfd, "memfd_create");

Please do not mix variable declarations with the rest of the code.
Also, please run checkpatch.pl on the patches. I see a few warnings.

> +
> +	ASSERT_OK(ftruncate(memfd, test_buffer_size), "ftruncate");
> +
> +	ASSERT_OK(fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK), "seal");
> +
> +	int dev_udmabuf = open("/dev/udmabuf", O_RDONLY);
> +	ASSERT_OK_FD(dev_udmabuf, "open udmabuf");
> +
[...]
> +
> +	ASSERT_TRUE(found_test_udmabuf, "found_test_buffer");
> +
> +	free(line);
> +	fclose(iter_file);
> +	close(iter_fd);
> +	close(udmabuf);
> +	close(memfd);
> +}
> +
> +void test_dmabuf_iter(void)
> +{
> +	struct dmabuf_iter *skel = NULL;
> +	char buf[256];
> +	int iter_fd;
> +
> +	skel = dmabuf_iter__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "dmabuf_iter__open_and_load"))
> +		return;
> +
> +	if (!ASSERT_OK(dmabuf_iter__attach(skel), "skel_attach"))
> +		goto destroy;
> +
> +	iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector));
> +	if (!ASSERT_GE(iter_fd, 0, "iter_create"))

Use ASSERT_OK_FD here?

> +		goto destroy;
> +
> +	memset(buf, 0, sizeof(buf));
> +	while (read(iter_fd, buf, sizeof(buf) > 0)) {
> +		/* Read out all contents */
> +	}
> +
[...]
Re: [PATCH 2/4] bpf: Add dmabuf iterator
On Fri, Apr 18, 2025 at 8:25 AM T.J. Mercier wrote:
>
> On Thu, Apr 17, 2025 at 1:26 PM Song Liu wrote:
> >
> > On Thu, Apr 17, 2025 at 9:05 AM T.J. Mercier wrote:
> > >
> > > On Wed, Apr 16, 2025 at 9:56 PM Song Liu wrote:
> > > >
> > > > On Wed, Apr 16, 2025 at 7:09 PM T.J. Mercier wrote:
> > > > >
> > > > > On Wed, Apr 16, 2025 at 6:26 PM Song Liu wrote:
> > > > [...]
> > > > > >
> > > > > > Here is another rookie question, it appears to me there is a file descriptor
> > > > > > associated with each DMA buffer, can we achieve the same goal with
> > > > > > a task-file iterator?
> > > > >
> > > > > That would find almost all of them, but not the kernel-only
> > > > > allocations. (kernel_rss in the dmabuf_dump output I attached earlier.
> > > > > If there's a leak, it's likely to show up in kernel_rss because some
> > > > > driver forgot to release its reference(s).) Also wouldn't that be a
> > > > > ton more iterations since we'd have to visit every FD to find the
> > > > > small portion that are dmabufs? I'm not actually sure if buffers that
> > > > > have been mapped, and then have had their file descriptors closed
> > > > > would show up in task_struct->files; if not I think that would mean
> > > > > scanning both files and vmas for each task.
> > > >
> > > > I don't think scanning all FDs to find a small portion of specific FDs
> > > > is a real issue. We have a tool that scans all FDs in the system and
> > > > only dump data for perf_event FDs. I think it should be easy to
> > > > prototype a tool by scanning all files and all vmas. If that turns out
> > > > to be very slow, which I highly doubt will be, we can try other
> > > > approaches.
> > >
> > > But this will not find *all* the buffers, and that defeats the purpose
> > > of having the iterator.
> >
> > Do you mean this approach cannot get kernel only allocations? If
> > that's the case, we probably do need a separate iterator. I am
> > interested in other folks thoughts on this.
>
> Correct.

I read more into the code, and realized that the udmabuf fd is not for
the same file here. I guess this also justifies a separate iterator.

Thanks,
Song
Re: [PATCH bpf-next v4 5/5] selftests/bpf: Add test for open coded dmabuf_iter
On Thu, May 8, 2025 at 11:21 AM T.J. Mercier wrote:
>
> Use the same test buffers as the traditional iterator and a new BPF map
> to verify the test buffers can be found with the open coded dmabuf
> iterator.

The way we split 4/5 and 5/5 makes the code trickier to follow. I
guess the motivation is to backport the default iter alone to older
kernels. But I think we can still make the code cleaner.

>
> Signed-off-by: T.J. Mercier
> ---
[...]
>
> -static int create_udmabuf(void)
> +static int create_udmabuf(int map_fd)
>  {
>         struct udmabuf_create create;
>         int dev_udmabuf;
> +       bool f = false;
>
>         udmabuf_test_buffer_size = 10 * getpagesize();
>
> @@ -63,10 +64,10 @@ static int create_udmabuf(void)
>         if (!ASSERT_OK(ioctl(udmabuf, DMA_BUF_SET_NAME_B, udmabuf_test_buffer_name), "name"))
>                 return 1;
>
> -       return 0;
> +       return bpf_map_update_elem(map_fd, udmabuf_test_buffer_name, &f, BPF_ANY);

We don't really need this bpf_map_update_elem() inside
create_udmabuf(), right?

> }
>
> -static int create_sys_heap_dmabuf(void)
> +static int create_sys_heap_dmabuf(int map_fd)
> {
>         sysheap_test_buffer_size = 20 * getpagesize();
>
> @@ -77,6 +78,7 @@ static int create_sys_heap_dmabuf(void)
>                 .heap_flags = 0,
>         };
>         int heap_fd, ret;
> +       bool f = false;
>
>         if (!ASSERT_LE(sizeof(sysheap_test_buffer_name), DMA_BUF_NAME_LEN, "NAMETOOLONG"))
>                 return 1;
> @@ -95,18 +97,18 @@ static int create_sys_heap_dmabuf(void)
>         if (!ASSERT_OK(ioctl(sysheap_dmabuf, DMA_BUF_SET_NAME_B, sysheap_test_buffer_name), "name"))
>                 return 1;
>
> -       return 0;
> +       return bpf_map_update_elem(map_fd, sysheap_test_buffer_name, &f, BPF_ANY);

Same for this bpf_map_update_elem(); we can call this directly from
create_test_buffers().
> }
>
> -static int create_test_buffers(void)
> +static int create_test_buffers(int map_fd)
>  {
>         int ret;
>
> -       ret = create_udmabuf();
> +       ret = create_udmabuf(map_fd);
>         if (ret)
>                 return ret;
>
> -       return create_sys_heap_dmabuf();
> +       return create_sys_heap_dmabuf(map_fd);

Personally, I would prefer we just merge all the logic of
create_udmabuf() and create_sys_heap_dmabuf() into
create_test_buffers().

> }
> [...]
> +
>  void test_dmabuf_iter(void)
>  {
>         struct dmabuf_iter *skel = NULL;
> +       int iter_fd, map_fd;
>         char buf[256];
> -       int iter_fd;
>
>         skel = dmabuf_iter__open_and_load();
>         if (!ASSERT_OK_PTR(skel, "dmabuf_iter__open_and_load"))
>                 return;
>
> -       if (!ASSERT_OK(create_test_buffers(), "create_buffers"))
> +       map_fd = bpf_map__fd(skel->maps.testbuf_hash);
> +       if (!ASSERT_OK_FD(map_fd, "map_fd"))
> +               goto destroy_skel;
> +
> +       if (!ASSERT_OK(create_test_buffers(map_fd), "create_buffers"))
>                 goto destroy;
>
>         if (!ASSERT_OK(dmabuf_iter__attach(skel), "skel_attach"))
> @@ -215,10 +246,13 @@ void test_dmabuf_iter(void)
>
>         if (test__start_subtest("default_iter"))
>                 subtest_dmabuf_iter_check_default_iter(skel);
> +       if (test__start_subtest("open_coded"))
> +               subtest_dmabuf_iter_check_open_coded(skel, map_fd);
>
>         close(iter_fd);
>
> destroy:
>         destroy_test_buffers();
> +destroy_skel:
>         dmabuf_iter__destroy(skel);
> }

[...]
Re: [PATCH bpf-next v4 4/5] selftests/bpf: Add test for dmabuf_iter
On Thu, May 8, 2025 at 11:20 AM T.J. Mercier wrote:
[...]
> +
> +void test_dmabuf_iter(void)
> +{
> +	struct dmabuf_iter *skel = NULL;
> +	char buf[256];
> +	int iter_fd;
> +
> +	skel = dmabuf_iter__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "dmabuf_iter__open_and_load"))
> +		return;
> +
> +	if (!ASSERT_OK(create_test_buffers(), "create_buffers"))
> +		goto destroy;
> +
> +	if (!ASSERT_OK(dmabuf_iter__attach(skel), "skel_attach"))
> +		goto destroy;

From here...

> +	iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector));
> +	if (!ASSERT_OK_FD(iter_fd, "iter_create"))
> +		goto destroy;
> +
> +	while (read(iter_fd, buf, sizeof(buf)) > 0)
> +		; /* Read out all contents */
> +
> +	/* Next reads should return 0 */
> +	ASSERT_EQ(read(iter_fd, buf, sizeof(buf)), 0, "read");

... to here, can be a separate subtest. Then iter_fd can be moved to
that subtest.

> +
> +	if (test__start_subtest("default_iter"))
> +		subtest_dmabuf_iter_check_default_iter(skel);
> +
> +	close(iter_fd);
> +
> +destroy:
> +	destroy_test_buffers();
> +	dmabuf_iter__destroy(skel);

[...]
Re: [PATCH bpf-next v5 5/5] selftests/bpf: Add test for open coded dmabuf_iter
On Mon, May 12, 2025 at 10:41 AM T.J. Mercier wrote:
>
> Use the same test buffers as the traditional iterator and a new BPF map
> to verify the test buffers can be found with the open coded dmabuf
> iterator.
>
> Signed-off-by: T.J. Mercier
> Acked-by: Christian König

Acked-by: Song Liu

With a nitpick below.

[...]
>
> -static int create_test_buffers(void)
> +static int create_test_buffers(int map_fd)
>  {
> +	bool f = false;
> +
>  	udmabuf = create_udmabuf();
>  	sysheap_dmabuf = create_sys_heap_dmabuf();
>
>  	if (udmabuf < 0 || sysheap_dmabuf < 0)
>  		return -1;
>
> -	return 0;
> +	return bpf_map_update_elem(map_fd, udmabuf_test_buffer_name, &f, BPF_ANY) ||
> +	       bpf_map_update_elem(map_fd, sysheap_test_buffer_name, &f, BPF_ANY);

nit: Instead of passing map_fd in here, we can just call
bpf_map_update_elem() in test_dmabuf_iter().

[...]
Re: [PATCH bpf-next v4 1/5] dma-buf: Rename debugfs symbols
On Thu, May 8, 2025 at 11:20 AM T.J. Mercier wrote:
>
> Rename the debugfs list and mutex so it's clear they are now usable
> without the need for CONFIG_DEBUG_FS. The list will always be populated
> to support the creation of a BPF iterator for dmabufs.
>
> Signed-off-by: T.J. Mercier
> Reviewed-by: Christian König

Acked-by: Song Liu
Re: [PATCH bpf-next v4 5/5] selftests/bpf: Add test for open coded dmabuf_iter
On Fri, May 9, 2025 at 2:43 PM T.J. Mercier wrote:
>
[...]
> >
> > Personally, I would prefer we just merge all the logic of
> > create_udmabuf() and create_sys_heap_dmabuf()
> > into create_test_buffers().
>
> That's a lot of different stuff to put in one place. How about
> returning file descriptors from the buffer create functions while
> having them clean up after themselves:

I do like this version better. Some nitpicks though.

>
> -static int memfd, udmabuf;
> +static int udmabuf;

About this, and ...

> static const char udmabuf_test_buffer_name[DMA_BUF_NAME_LEN] = "udmabuf_test_buffer_for_iter";
> static size_t udmabuf_test_buffer_size;
> static int sysheap_dmabuf;
> static const char sysheap_test_buffer_name[DMA_BUF_NAME_LEN] = "sysheap_test_buffer_for_iter";
> static size_t sysheap_test_buffer_size;
>
> -static int create_udmabuf(int map_fd)
> +static int create_udmabuf(void)
> {
>        struct udmabuf_create create;
> -       int dev_udmabuf;
> -       bool f = false;
> +       int dev_udmabuf, memfd, udmabuf;

... here. It is not ideal to have a global udmabuf and a local
udmabuf. If we want the global version, let's rename the local one.

[...]
>
> static int create_test_buffers(int map_fd)
> {
> -       int ret;
> +       bool f = false;
> +
> +       udmabuf = create_udmabuf();
> +       sysheap_dmabuf = create_sys_heap_dmabuf();
>
> -       ret = create_udmabuf(map_fd);
> -       if (ret)
> -               return ret;
> +       if (udmabuf < 0 || sysheap_dmabuf < 0)
> +               return -1;

We also need destroy_test_buffers() on the error path here, or at the
caller.

>
> -       return create_sys_heap_dmabuf(map_fd);
> +       return bpf_map_update_elem(map_fd, udmabuf_test_buffer_name, &f, BPF_ANY) ||
> +              bpf_map_update_elem(map_fd, sysheap_test_buffer_name, &f, BPF_ANY);
> }
>
> static void destroy_test_buffers(void)
> {
>        close(udmabuf);
> -       close(memfd);
>        close(sysheap_dmabuf);

For the two global fds, let's reset them to -1 right after close().

Thanks,
Song
Re: [PATCH bpf-next v5 4/5] selftests/bpf: Add test for dmabuf_iter
On Mon, May 12, 2025 at 10:41 AM T.J. Mercier wrote:
[...]
> +
> +static int udmabuf;

static int udmabuf = -1;

> +static const char udmabuf_test_buffer_name[DMA_BUF_NAME_LEN] = "udmabuf_test_buffer_for_iter";
> +static size_t udmabuf_test_buffer_size;
> +static int sysheap_dmabuf;

static int sysheap_dmabuf = -1;

> +static const char sysheap_test_buffer_name[DMA_BUF_NAME_LEN] = "sysheap_test_buffer_for_iter";
> +static size_t sysheap_test_buffer_size;
> +
> +static int create_udmabuf(void)
> +{
> +	struct udmabuf_create create;

nit: zero initialize create to be future proof.

> +	int dev_udmabuf, memfd, local_udmabuf;
> +
> +	udmabuf_test_buffer_size = 10 * getpagesize();

[...]
> +static void subtest_dmabuf_iter_check_default_iter(struct dmabuf_iter *skel)
> +{
> +	bool found_test_sysheap_dmabuf = false;
> +	bool found_test_udmabuf = false;
> +	struct DmabufInfo bufinfo;
> +	size_t linesize = 0;
> +	char *line = NULL;
> +	FILE *iter_file;
> +	int iter_fd, f = INODE;
> +
> +	iter_fd = bpf_iter_create(bpf_link__fd(skel->links.dmabuf_collector));
> +	ASSERT_OK_FD(iter_fd, "iter_create");

Should we check ASSERT_OK_FD() and exit early on failures?

> +
> +	iter_file = fdopen(iter_fd, "r");
> +	ASSERT_OK_PTR(iter_file, "fdopen");

Same here.

[...]
> +/*
> + * Fields output by this iterator are delimited by newlines. Convert any
> + * newlines in user-provided printed strings to spaces.
> + */
> +static void sanitize_string(char *src, size_t size)
> +{
> +	for (char *c = src; c && (size_t)(c - src) < size; ++c)

Should this be:

	for (char *c = src; *c && (size_t)(c - src) < size; ++c)

?

Thanks,
Song

[...]
Re: [PATCH bpf-next v4 2/5] bpf: Add dmabuf iterator
On Thu, May 8, 2025 at 11:20 AM T.J. Mercier wrote:
>
> The dmabuf iterator traverses the list of all DMA buffers.
>
> DMA buffers are refcounted through their associated struct file. A
> reference is taken on each buffer as the list is iterated to ensure each
> buffer persists for the duration of the bpf program execution without
> holding the list mutex.
>
> Signed-off-by: T.J. Mercier
> Reviewed-by: Christian König

Acked-by: Song Liu

With one nitpick below.

> ---
[...]
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 8ff4add71f88..7af2ea839f58 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -634,4 +634,6 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map);
>  void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map);
>  int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
>  void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
> +struct dma_buf *dma_buf_iter_begin(void);
> +struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf);
>  #endif /* __DMA_BUF_H__ */
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 70502f038b92..3a335c50e6e3 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -53,6 +53,9 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
>  obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
> +ifeq ($(CONFIG_DMA_SHARED_BUFFER),y)
> +obj-$(CONFIG_BPF_SYSCALL) += dmabuf_iter.o
> +endif
>
>  CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE)
>  CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE)
> diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c
> new file mode 100644
> index ..96b4ba7f0b2c
> --- /dev/null
> +++ b/kernel/bpf/dmabuf_iter.c
> @@ -0,0 +1,102 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2025 Google LLC */
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +BTF_ID_LIST_SINGLE(bpf_dmabuf_btf_id, struct, dma_buf)
> +DEFINE_BPF_ITER_FUNC(dmabuf, struct bpf_iter_meta *meta, struct dma_buf *dmabuf)

nit: It is better to move these two lines later, to where they are
about to be used.

> +
> +static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	if (*pos)
> +		return NULL;
> +
> +	return dma_buf_iter_begin();
> +}
> +

[...]
Re: [PATCH bpf-next v4 3/5] bpf: Add open coded dmabuf iterator
On Thu, May 8, 2025 at 11:20 AM T.J. Mercier wrote:
>
> This open coded iterator allows for more flexibility when creating BPF
> programs. It can support output in formats other than text. With an open
> coded iterator, a single BPF program can traverse multiple kernel data
> structures (now including dmabufs), allowing for more efficient analysis
> of kernel data compared to multiple reads from procfs, sysfs, or
> multiple traditional BPF iterator invocations.
>
> Signed-off-by: T.J. Mercier

Acked-by: Song Liu

With one nitpick below:

> ---
>  kernel/bpf/dmabuf_iter.c | 47 
>  kernel/bpf/helpers.c     |  5 +
>  2 files changed, 52 insertions(+)
>
> diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c
> index 96b4ba7f0b2c..8049bdbc9efc 100644
> --- a/kernel/bpf/dmabuf_iter.c
> +++ b/kernel/bpf/dmabuf_iter.c
> @@ -100,3 +100,50 @@ static int __init dmabuf_iter_init(void)
>  }
>
>  late_initcall(dmabuf_iter_init);
> +
> +struct bpf_iter_dmabuf {
> +	/* opaque iterator state; having __u64 here allows to preserve correct
> +	 * alignment requirements in vmlinux.h, generated from BTF
> +	 */

nit: comment style.

> +	__u64 __opaque[1];
> +} __aligned(8);
> +
> +/* Non-opaque version of bpf_iter_dmabuf */
> +struct bpf_iter_dmabuf_kern {
> +	struct dma_buf *dmabuf;
> +} __aligned(8);
> +

[...]
Re: [PATCH bpf-next v4 4/5] selftests/bpf: Add test for dmabuf_iter
On Thu, May 8, 2025 at 11:20 AM T.J. Mercier wrote:
[...]
> diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> new file mode 100644
> index ..35745f4ce0f8
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> @@ -0,0 +1,224 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 Google */
> +
> +#include
> +#include
> +#include
> +#include "dmabuf_iter.skel.h"
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +
> +static int memfd, udmabuf;

Global fds are weird. AFAICT, we don't really need them to be global?
If we really need them to be global, please initialize them to -1,
just in case we close(0) by accident.

> +static const char udmabuf_test_buffer_name[DMA_BUF_NAME_LEN] = "udmabuf_test_buffer_for_iter";
> +static size_t udmabuf_test_buffer_size;
> +static int sysheap_dmabuf;
> +static const char sysheap_test_buffer_name[DMA_BUF_NAME_LEN] = "sysheap_test_buffer_for_iter";
> +static size_t sysheap_test_buffer_size;
Re: [PATCH bpf-next v6 4/5] selftests/bpf: Add test for dmabuf_iter
On Tue, May 13, 2025 at 9:36 AM T.J. Mercier wrote: > > This test creates a udmabuf, and a dmabuf from the system dmabuf heap, > and uses a BPF program that prints dmabuf metadata with the new > dmabuf_iter to verify they can be found. > > Signed-off-by: T.J. Mercier > Acked-by: Christian König Acked-by: Song Liu With one more comment below. [...] > diff --git a/tools/testing/selftests/bpf/progs/dmabuf_iter.c > b/tools/testing/selftests/bpf/progs/dmabuf_iter.c > new file mode 100644 > index ..2a1b5397196d > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/dmabuf_iter.c > @@ -0,0 +1,53 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (c) 2025 Google LLC */ > +#include > +#include > +#include > + > +/* From uapi/linux/dma-buf.h */ > +#define DMA_BUF_NAME_LEN 32 > + > +char _license[] SEC("license") = "GPL"; > + > +/* > + * Fields output by this iterator are delimited by newlines. Convert any > + * newlines in user-provided printed strings to spaces. > + */ > +static void sanitize_string(char *src, size_t size) > +{ > + for (char *c = src; *c && (size_t)(c - src) < size; ++c) We should do the size check first, right? IOW: for (char *c = src; (size_t)(c - src) < size && *c; ++c) > + if (*c == '\n') > + *c = ' '; > +} > + [...]
Re: [PATCH bpf-next v6 5/5] selftests/bpf: Add test for open coded dmabuf_iter
On Tue, May 13, 2025 at 9:36 AM T.J. Mercier wrote: > > Use the same test buffers as the traditional iterator and a new BPF map > to verify the test buffers can be found with the open coded dmabuf > iterator. > > Signed-off-by: T.J. Mercier > Acked-by: Christian König > Acked-by: Song Liu > --- > .../testing/selftests/bpf/bpf_experimental.h | 5 +++ > .../selftests/bpf/prog_tests/dmabuf_iter.c| 41 +++ > .../testing/selftests/bpf/progs/dmabuf_iter.c | 38 + > 3 files changed, 84 insertions(+) > > diff --git a/tools/testing/selftests/bpf/bpf_experimental.h > b/tools/testing/selftests/bpf/bpf_experimental.h > index 6535c8ae3c46..5e512a1d09d1 100644 > --- a/tools/testing/selftests/bpf/bpf_experimental.h > +++ b/tools/testing/selftests/bpf/bpf_experimental.h > @@ -591,4 +591,9 @@ extern int bpf_iter_kmem_cache_new(struct > bpf_iter_kmem_cache *it) __weak __ksym > extern struct kmem_cache *bpf_iter_kmem_cache_next(struct > bpf_iter_kmem_cache *it) __weak __ksym; > extern void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it) > __weak __ksym; > > +struct bpf_iter_dmabuf; > +extern int bpf_iter_dmabuf_new(struct bpf_iter_dmabuf *it) __weak __ksym; > +extern struct dma_buf *bpf_iter_dmabuf_next(struct bpf_iter_dmabuf *it) > __weak __ksym; > +extern void bpf_iter_dmabuf_destroy(struct bpf_iter_dmabuf *it) __weak > __ksym; > + > #endif > diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c > b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c > index dc740bd0e2bd..6c2b0c3dbcd8 100644 > --- a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c > +++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c > @@ -219,14 +219,52 @@ static void > subtest_dmabuf_iter_check_default_iter(struct dmabuf_iter *skel) > close(iter_fd); > } > > +static void subtest_dmabuf_iter_check_open_coded(struct dmabuf_iter *skel, > int map_fd) > +{ > + LIBBPF_OPTS(bpf_test_run_opts, topts); > + char key[DMA_BUF_NAME_LEN]; > + int err, fd; > + bool found; > + > + /* No need to 
attach it, just run it directly */ > + fd = bpf_program__fd(skel->progs.iter_dmabuf_for_each); > + > + err = bpf_prog_test_run_opts(fd, &topts); > + if (!ASSERT_OK(err, "test_run_opts err")) > + return; > + if (!ASSERT_OK(topts.retval, "test_run_opts retval")) > + return; > + > + if (!ASSERT_OK(bpf_map_get_next_key(map_fd, NULL, key), "get next key")) > + return; > + > + do { > + ASSERT_OK(bpf_map_lookup_elem(map_fd, key, &found), "lookup"); > + ASSERT_TRUE(found, "found test buffer"); This check failed once in the CI, on s390: Error: #89/3 dmabuf_iter/open_coded 9309 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts err 0 nsec 9310 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts retval 0 nsec 9311 subtest_dmabuf_iter_check_open_coded:PASS:get next key 0 nsec 9312 subtest_dmabuf_iter_check_open_coded:PASS:lookup 0 nsec 9313 subtest_dmabuf_iter_check_open_coded:FAIL:found test buffer unexpected found test buffer: got FALSE But it passed in the rerun. It is probably a bit flaky. Maybe we need some barrier somewhere. Here is the failure: https://github.com/kernel-patches/bpf/actions/runs/15002058808/job/42234864754 To see the log, you need to log in to GitHub. Thanks, Song > + } while (bpf_map_get_next_key(map_fd, key, key)); > +}