the first page fault triggers, so
any later access will not take a page fault.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 26 --
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers
as "pgcnt".
make sure pass self test.
remove v1's patch4
v3
https://lore.kernel.org/all/20240813090518.3252469-1-l...@vivo.com/
v2
https://lore.kernel.org/all/20240805032550.3912454-1-l...@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
H
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/d
t use page array to map, instead, use pfn array.
With this, page usage is removed from udmabuf entirely.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/Kconfig | 1 +
drivers/dma-buf/udmabuf.c | 22 +++---
2 files changed, 16 insertions(+), 7 deletions(-
pgcnt*8, which may waste some memory when large folios are used.
Array access is faster than list traversal, and with 4K folios the array
also uses less memory than the list.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 80 ++-
1 file changed, 37 insertions(+), 43 deletions(-)
, folios may use vmalloc to get memory, which can't
be cached but returns to the pcp (or buddy) on vfree. So each pin may waste
some time allocating the folios array.
This patch also reuses the folios array when iterating over the create head, sized
to the largest item.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c
in mmap,
when creating a large size udmabuf, this represents a considerable
overhead.
This patch fills the vma area with pfns when the first page fault triggers, so
any later access will not take a page fault.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 26
se of folios when iter create head, just use max size
of item.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 165 +++---
1 file changed, 101 insertions(+), 64 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 0bbc9df
array record each folio is ok.
A udmabuf_folio takes 24 bytes, while a folio array entry takes 8 bytes. The array
does need pgcnt*8 bytes, which may waste some memory when large folios are used.
Array access is faster than list traversal, and with 4K folios the array
also uses less memory than the list.
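The size comparison above can be worked through with a small sketch. It is only an illustration: the 24-byte and 8-byte figures come from the text, while the helper names `list_bytes` and `array_bytes` are hypothetical and not part of the driver.

```c
#include <stddef.h>

#define LIST_NODE_BYTES   24u  /* udmabuf_folio size quoted in the text */
#define ARRAY_ENTRY_BYTES  8u  /* one folio pointer per page slot */

/* Bytes used by a per-folio list: one node for each pinned folio. */
size_t list_bytes(size_t pgcnt, size_t pages_per_folio)
{
    size_t nr_folios = (pgcnt + pages_per_folio - 1) / pages_per_folio;

    return nr_folios * LIST_NODE_BYTES;
}

/* Bytes used by a per-page array: one entry per page, whatever the folio size. */
size_t array_bytes(size_t pgcnt)
{
    return pgcnt * ARRAY_ENTRY_BYTES;
}
```

For 1024 pages backed by 4K folios, the list needs 24576 bytes and the array 8192 bytes. With 2M folios (512 pages each) the list shrinks to 48 bytes while the array stays at 8192 bytes, which is the waste the message mentions.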
Signed-off-by: Huan Yang
---
drivers/dm
vo.com/
v3
https://lore.kernel.org/all/20240813090518.3252469-1-l...@vivo.com/
v2
https://lore.kernel.org/all/20240805032550.3912454-1-l...@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
Huan Yang (7):
udmabuf: pre-fault when first page fault
udmabuf: chan
table, and then pre-fault each pfn
into the vma on first access. Note that if anything goes wrong during
pre-fault, the error is not reported then; it is reported when the task
first accesses that address.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/d
t use page array to map, instead, use pfn array.
With this, page usage is removed from udmabuf entirely.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/Kconfig | 1 +
drivers/dma-buf/udmabuf.c | 22 +++---
2 files changed, 16 ins
. The loop ends when pgcnt or nr_folios is reached.
This makes the code more readable.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 132 --
1 file changed, 71 insertions(+), 61 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index
.
This patch adds helper functions for init and deinit, which
reduces duplicate code.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 52 +++
1 file changed, 31 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf
e can iterate through the folios array during release and
unpin any folio that is different from the ones previously accessed.
This not only saves the overhead of the udmabuf_folio data structure
but also makes array access more cache-friendly.
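The release walk described above can be sketched in plain C. This is a userspace model rather than the driver code: `unpin_distinct` and its callback are hypothetical names, and it assumes that all pages of one folio occupy consecutive, non-NULL slots in the array.

```c
#include <stddef.h>

/*
 * Walk a per-page folio array and handle each run of identical entries once.
 * Returns how many distinct runs were seen. The unpin callback is optional.
 */
size_t unpin_distinct(void *const *folios, size_t pgcnt,
                      void (*unpin)(void *folio))
{
    void *prev = NULL;
    size_t i, unpinned = 0;

    for (i = 0; i < pgcnt; i++) {
        if (folios[i] == prev)
            continue;           /* same folio as the previous slot */
        prev = folios[i];
        if (unpin)
            unpin(prev);        /* one unpin per folio, not per page */
        unpinned++;
    }
    return unpinned;
}
```

Because only the previous entry is compared, no separate list node per folio is needed, and the walk is a single linear pass over contiguous memory.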
Signed-off-by: Huan Yang
---
drivers/dma-buf/udma
size is huge, it needs to fall back to vmalloc, which is costly because
each page is allocated individually and mapped into the vmalloc area.
Since we iterate over each udmabuf item and pin the folios in its range,
we can reuse a folios array sized for the largest range.
Signed-off-by: Huan Yang
---
d
iterates through folios, while the inner loop correctly
sets the folio and corresponding offset into the udmabuf starting from
the offset. The loop ends when pgcnt or nr_folios is reached.
This makes the code more readable.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 132
t the first time.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 35 +--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 047c3cd2ceff..0a8c231a36e1 1
correctness
in the commit messages of other patches as well.
I'll fix it in next-version
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 52 +++
1 file changed, 31 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dm
cture
but also makes array access more cache-friendly.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 65 +--
1 file changed, 29 insertions(+), 36 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 254d9ec
, then pin its range folios,
we can reuse the maximum size range's folios array.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 34 --
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf
, end of loop.
This makes the code more readable.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 132 --
1 file changed, 71 insertions(+), 61 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 456db58446e1..ca2b21c5c57f
ttps://lore.kernel.org/all/20240805032550.3912454-1-l...@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
Huan Yang (7):
udmabuf: pre-fault when first page fault
udmabuf: change folios array from kmalloc to kvmalloc
udmabuf: fix vmap_udmabuf error page set
udmabuf:
truly accessed
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 33 +++--
1 file changed, 31 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index
.
This patch adds helper functions for init and deinit, which
reduces duplicate code.
Signed-off-by: Huan Yang
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 52 +++
1 file changed, 31 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf
t use page array to map, instead, use pfn array.
With this, page usage is removed from udmabuf entirely.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/Kconfig | 1 +
drivers/dma-buf/udmabuf.c | 22 +++---
2 files changed, 16 ins
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/d
traversal process.
This makes the code more readable.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 134 +-
1 file changed, 76 insertions(+), 58 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
size is huge, it needs to fall back to vmalloc, which is costly because
each page needs to be allocated and mapped into the vmalloc area.
Since we iterate over each udmabuf item and pin the folios in its range,
we can reuse a folios array sized for the largest range.
Signed-off-by: Huan Yang
Acked
e can iterate through the folios array during release and
unpin any folio that is different from the ones previously accessed.
This not only saves the overhead of the udmabuf_folio data structure
but also makes array access more cache-friendly.
Signed-off-by: Huan Yang
Acked-by: Vivek Kasi
_heap_name);
printf("\n-\n");
clock_gettime(CLOCK_MONOTONIC, &ts_start);
dmabuf_heap_test(type, dmabuf_heap_name);
clock_gettime(CLOCK_MONOTONIC, &ts_end);
start = ts_start.tv_sec * 1000000000ULL + ts_start.tv_nsec;
Example for DMA_HEAP_IOCTL_ALLOC_AND_READ used in system_heap.
With this, it both allocates memory and triggers I/O to load the file
into each batch of allocated memory.
Signed-off-by: Huan Yang
---
drivers/dma-buf/heaps/system_heap.c | 53 ++---
1 file changed, 49 insertions
file_fd which you want to load into dma-buf; then
it promises that if you get a dma-buf fd, it will contain the file content.
Note that how file_fd is opened is up to the user, so both buffered
I/O and direct I/O are supported.
Signed-off-by: Huan Yang
---
drivers/dma-buf/dma-heap.c| 525
Hi Christian,
Thanks for your reply.
On 2024/7/11 17:00, Christian König wrote:
On 11.07.24 at 09:42, Huan Yang wrote:
Some users may need to load a file into a dma-buf. The current
way is:
1. allocate a dma-buf, get dma-buf fd
2. mmap dma-buf fd into vaddr
3. read(file_fd, vaddr, fsz)
This is too
Hi Christian,
On 2024/7/11 19:39, Christian König wrote:
On 11.07.24 at 11:18, Huan Yang wrote:
Hi Christian,
Thanks for your reply.
On 2024/7/11 17:00, Christian König wrote:
On 11.07.24 at 09:42, Huan Yang wrote:
Some users may need to load a file into a dma-buf. The current
way is:
1. allocate a dma
On 2024/7/12 9:59, Huan Yang wrote:
Hi Christian,
On 2024/7/11 19:39, Christian König wrote:
On 11.07.24 at 11:18, Huan Yang wrote:
Hi Christian,
Thanks for your reply.
On 2024/7/11 17:00, Christian König wrote:
On 11.07.24 at 09:42, Huan Yang wrote:
Some users may need to load a file into a dma-buf
Hi Christian,
On 2024/7/12 15:10, Christian König wrote:
On 12.07.24 at 04:14, Huan Yang wrote:
On 2024/7/12 9:59, Huan Yang wrote:
Hi Christian,
On 2024/7/11 19:39, Christian König wrote:
On 11.07.24 at 11:18, Huan Yang wrote:
Hi Christian,
Thanks for your reply.
On 2024/7/11 17:00, Christian
On 2024/7/12 15:41, Christian König wrote:
On 12.07.24 at 09:29, Huan Yang wrote:
Hi Christian,
On 2024/7/12 15:10, Christian König wrote:
On 12.07.24 at 04:14, Huan Yang wrote:
On 2024/7/12 9:59, Huan Yang wrote:
Hi Christian,
On 2024/7/11 19:39, Christian König wrote:
On 11.07.24 at 11:18,
On 2024/7/12 18:59, Christian König wrote:
On 12.07.24 at 09:52, Huan Yang wrote:
On 2024/7/12 15:41, Christian König wrote:
On 12.07.24 at 09:29, Huan Yang wrote:
Hi Christian,
On 2024/7/12 15:10, Christian König wrote:
On 12.07.24 at 04:14, Huan Yang wrote:
On 2024/7/12 9:59, Huan Yang wrote:
Hi
I just researched udmabuf; please correct me if I'm wrong.
On 2024/7/15 20:32, Christian König wrote:
On 15.07.24 at 11:11, Daniel Vetter wrote:
On Thu, Jul 11, 2024 at 11:00:02AM +0200, Christian König wrote:
On 11.07.24 at 09:42, Huan Yang wrote:
Some users may need to load a file into dm
On 2024/7/16 17:31, Daniel Vetter wrote:
On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote:
I just researched udmabuf; please correct me if I'm wrong.
On 2024/7/15 20:32, Chri
does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 047c3cd2ceff..71182a6d35b1 100644
--- a/driver
On 2024/7/16 20:07, Christian König wrote:
On 16.07.24 at 11:31, Daniel Vetter wrote:
On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote:
I just researched udmabuf; please correct me if I'm wrong.
On 2024/7/15 20:32, Christian König wrote:
On 15.07.24 at 11:11, Daniel Vetter wrote
On 2024/7/18 1:03, Christoph Hellwig wrote:
On Wed, Jul 17, 2024 at 05:15:07PM +0200, Daniel Vetter wrote:
I'm talking about memfd
On 2024/7/18 11:08, Christoph Hellwig wrote:
On Thu, Jul 18, 2024 at 09:51:39AM +0800, Huan Yang wrote:
Yes, actually, if dm
On 2024/7/18 1:03, Christoph Hellwig wrote:
copy_file_range only works inside the same file system anyway, so
it is completely irrelevant here.
What should work just fine is using sendfile (or splice if you like it
complicated) to write TO the dma buf. That just iterates over the page
cache on the
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
---
Changelog:
v2 -> v1: rebase, change offset and mempin folio array use kvmalloc,
change description.
drivers/dma-buf/udmabuf.c | 24
1 file changed, 12 insertio
9b70db2e-e562-4771-be6b-1fa8df19e...@amd.com/
[7]
https://patchew.org/linux/20230209102954.528942-1-dhowe...@redhat.com/20230209102954.528942-7-dhowe...@redhat.com/
[8] https://lore.kernel.org/all/20240711074221.459589-1-l...@vivo.com/
[9] https://lore.kernel.org/all/5ccbe705-883c-4651-9e66-6b452c
ation. When the allocate_async_read ops heap
is not implemented, it will wait for the dma-buf to be allocated before
reading the file (sync).
Signed-off-by: Huan Yang
---
drivers/dma-buf/dma-heap.c | 14 ++
include/linux/dma-heap.h | 8 ++--
2 files changed, 16 insertions(+), 6 dele
.
Therefore, the larger the size of the file that needs to be read, the
greater the corresponding benefits will be.
Signed-off-by: Huan Yang
---
drivers/dma-buf/heaps/system_heap.c | 70 +++--
1 file changed, 66 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf
file read work is executed serially. Considering that the default
I/O amount initiated at a time is 128MB, which is already quite large,
multiple threads will not help accelerate I/O performance.
So, this is better suited to reading large files into dma-buf.
Signed-off-by: Huan Yang
---
drivers
he file size must be page aligned.
Therefore, for the user, len and file_fd are mutually exclusive,
and they are combined using a union.
Once the user obtains the dma-buf fd, the dma-buf directly contains the
file content.
Signed-off-by: Huan Yang
---
drivers/dm
ff-by: Huan Yang
---
drivers/dma-buf/dma-heap.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index df1b2518f126..2b69cf3ca570 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -417,6 +4
On 2024/7/30 16:03, Christian König wrote:
On 30.07.24 at 09:57, Huan Yang wrote:
Background
Some users may need to load a file into a dma-buf. The current way is:
1. allocate a dma-buf, get dma-buf fd
2. mmap dma-buf fd into user vaddr
3. read(file_fd, vaddr, fsz)
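The three steps listed above can be sketched from userspace as follows. This is a minimal illustration, not code from the thread: it assumes a Linux system with the dma-heap UAPI header available, the function name `load_file_via_mmap` is hypothetical, and the heap path (for example `/dev/dma_heap/system`) is supplied by the caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-heap.h>

/* Returns a dma-buf fd holding up to len bytes of the file, or -1 on error. */
int load_file_via_mmap(const char *heap_path, const char *file_path, size_t len)
{
    struct dma_heap_allocation_data alloc = {
        .len = len,
        .fd_flags = O_RDWR | O_CLOEXEC,
    };
    void *vaddr;
    size_t done = 0;
    int heap_fd, file_fd, ret;

    heap_fd = open(heap_path, O_RDONLY | O_CLOEXEC);
    if (heap_fd < 0)
        return -1;

    /* 1. allocate a dma-buf, get the dma-buf fd */
    ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc);
    close(heap_fd);
    if (ret < 0)
        return -1;

    /* 2. mmap the dma-buf fd into a user virtual address */
    vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                 (int)alloc.fd, 0);
    if (vaddr == MAP_FAILED)
        goto err_buf;

    /* 3. read the file into the mapping */
    file_fd = open(file_path, O_RDONLY | O_CLOEXEC);
    if (file_fd < 0)
        goto err_map;
    while (done < len) {
        ssize_t n = read(file_fd, (char *)vaddr + done, len - done);

        if (n < 0) {
            close(file_fd);
            goto err_map;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    close(file_fd);
    munmap(vaddr, len);
    return (int)alloc.fd;

err_map:
    munmap(vaddr, len);
err_buf:
    close((int)alloc.fd);
    return -1;
}
```

Every byte passes through the user mapping in step 3, which is the copy the thread is trying to avoid.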
Due to dma-buf user map
On 2024/7/30 16:37, Christian König wrote:
On 30.07.24 at 10:14, Huan Yang wrote:
On 2024/7/30 16:03, Christian König wrote:
On 30.07.24 at 09:57, Huan Yang wrote:
Background
Some users may need to load a file into a dma-buf. The current way is:
1. allocate a dma-buf, get dma-buf fd
2. mmap dma
On 2024/7/30 16:56, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 03:57:44PM +0800, Huan Yang wrote:
UDMA-BUF step:
1. memfd_create
2. open file(buffer/direct)
3. udmabuf create
4
On 2024/7/30 18:42, Christian König wrote:
On 30.07.24 at 11:05, Huan Yang wrote:
On 2024/7/30 16:56, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 03:57:44PM +0800, Huan Yang wrote:
UDMA-BUF
On 2024/7/30 18:43, Christian König wrote:
On 30.07.24 at 10:46, Huan Yang wrote:
On 2024/7/30 16:37, Christian König wrote:
On 30.07.24 at 10:14, Huan Yang wrote:
On 2024/7/30 16:03, Christian König wrote:
On 30.07.24 at 09:57, Huan Yang wrote:
Background
Some users may need to load a file into
On 2024/7/30 17:05, Huan Yang wrote:
On 2024/7/30 16:56, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 03:57:44PM +0800, Huan Yang wrote:
UDMA-BUF step:
1. memfd_create
2. open file(buffer
On 2024/7/31 1:19, T.J. Mercier wrote:
On Tue, Jul 30, 2024 at 1:14 AM Huan Yang wrote:
On 2024/7/30 16:03, Christian König wrote:
On 30.07.24 at 09:57, Huan Yang wrote:
Background
Some users may need to load a file into a dma-buf. The current way is:
1. allocate a dma-buf, get dma-buf fd
2
On 2024/7/30 21:11, Christian König wrote:
On 30.07.24 at 13:36, Huan Yang wrote:
Either drop the whole approach or change udmabuf to do what you
want to do.
OK, if so, do I need to send a patch to make dma-buf support sendfile?
Well the udmabuf approach doesn't need to use sendfile,
dedicated to the allocation and
deallocation of udmabuf_folio. This is expected to improve the
performance of allocation and deallocation within the expected range,
while also avoiding memory waste.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 18 +++---
1 file changed, 15
dedicated to the allocation and
deallocation of udmabuf_folio. This is expected to improve the
performance of allocation and deallocation within the expected range,
while also avoiding memory waste.
Signed-off-by: Huan Yang
---
v2 -> v1: fix double unregister, remove unlikely
drivers/dma-
On 2024/7/31 14:26, Huan Yang wrote:
The current udmabuf_folio contains a list_head and the corresponding
folio pointer, with a size of 24 bytes. udmabuf_folio uses kmalloc to
allocate memory.
However, kmalloc is a public pool, starting from 64 bytes. This means
that each udmabuf_folio allocation
,
while also avoiding memory waste.
Signed-off-by: Huan Yang
---
v3 -> v2: fix error description.
v2 -> v1: fix double unregister, remove unlikely.
drivers/dma-buf/udmabuf.c | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drive
page, and then map into vmalloc area
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 047c3cd2ceff..6604d91e7072 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/dr
nto loop variables.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 40 ---
1 file changed, 21 insertions(+), 19 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 6604d91e7072..0285194e6b51 100644
--- a/drivers/dma-
On 2024/8/1 1:11, Christophe JAILLET wrote:
On 31/07/2024 at 09:37, Huan Yang wrote:
The current udmabuf_
On 2024/8/1 4:46, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 08:04:04PM +0800, Huan Yang wrote:
On 2024/7/30 17:05, Huan Yang wrote:
On 2024/7/30 16:56, Daniel Vetter wrote:
On Tue, Jul 30, 2024
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index d69aea
.
Otherwise, try to allocate a new sgt and set it with cmpxchg.
If the swap fails, another process has already set the sg correctly,
so we reuse that sg. If triggered by a device, map needs to be invoked to
sync it.
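The allocate-then-cmpxchg pattern described above can be modelled in userspace with C11 atomics. This is a sketch under assumptions: `struct table` is a hypothetical stand-in for the sg table, `get_or_create_table` is an invented name, and freeing the losing allocation is one reading of the message, not something the snippet confirms.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct table {              /* hypothetical stand-in for an sg table */
    int nents;
};

/* Return the table published in *slot, creating and publishing one if needed. */
struct table *get_or_create_table(_Atomic(struct table *) *slot, int nents)
{
    struct table *expected = NULL;
    struct table *fresh;
    struct table *cur = atomic_load(slot);

    if (cur)
        return cur;         /* fast path: already published */

    fresh = malloc(sizeof(*fresh));
    if (!fresh)
        return NULL;
    fresh->nents = nents;

    /* Publish only if the slot is still empty. */
    if (atomic_compare_exchange_strong(slot, &expected, fresh))
        return fresh;

    /* Lost the race: expected now holds the winner's table. */
    free(fresh);
    return expected;
}
```

No lock is taken on either path; at most one allocation is wasted per racing caller.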
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmab
access to virtual addresses needs to trap into
kernel mode.
Therefore, when creating a large size udmabuf, this represents a
considerable overhead.
Therefore, the current patch removes the page fault method of mmap and
instead fills it directly when mmap is triggered.
Signed-off-by: Huan Yang
page, and then map into vmalloc area
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 17 +
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index a915714c5dce..7ed532342d7f 100644
--- a/drivers/dma-buf/udma
* PAGESIZE = sum(folios_size(folios[i])) i=0->nr_folios
pagecount * PAGESIZE = sum(item_size[i]) i=0, item_count (do not
record)
item_offset is used to record each memfd offset if it exists, else 0.
Huan Yang (5):
udmabuf: cancel mmap page fault, direct map it
udmabuf: change folios array from
s, we can accept the overhead of the udmabuf_folio structure
and the performance loss of traversing the list during unpinning.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 149 +-
1 file changed, 66 insertions(+), 83 deletions(-)
diff --git a/drive
On 2024/8/1 18:50, Christian König wrote:
On 01.08.24 at 12:45, Huan Yang wrote:
The current udmabuf mmap uses a page fault mechanism to populate the
vma.
However, the current udmabuf has already obtained and pinned the folio
upon completion of the creation. This means that the physical memory
arious data structures in udmabuf have the
following corresponding relationships:
pagecount * PAGESIZE = sum(folios_size(folios[i])) i=0->nr_folios
pagecount * PAGESIZE = sum(item_size[i]) i=0, item_count (do not
record)
item_offset is used to record each memfd offset if it exists, else 0.
Huan Y
page, and then map into vmalloc area
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index af2391cea0bf..9737f063b6b3 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/dr
access to virtual addresses needs to trap into
kernel mode.
Therefore, when creating a large size udmabuf, this represents a
considerable overhead.
The current patch removes the page fault method of mmap and
instead fills it directly when mmap is triggered.
Signed-off-by: Huan Yang
---
drivers/dma
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/d
ns such as "pgcnt".
make sure pass self test.
remove v1's patch4
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
Huan Yang (4):
udmabuf: cancel mmap page fault, direct map it
udmabuf: change folios array from kmalloc to kvmalloc
fix vmap_udmabuf erro
g the list during unpinning.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 167 ++
1 file changed, 61 insertions(+), 106 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 9737f063b6b3..442ed99d8b33 100644
--- a/dr
allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 475268d4ebb1..af2391cea0bf 100644
--- a/driv
But ubuf->folios only contains each folio's head page.
That means we repeatedly mapped the folio head page into the vmalloc area.
This patch fixes it by setting each folio's page correctly, so that the pages array
contains the right pages, and then maps them into the vmalloc area.
Signed-off-by: Huan Yang
---
not work if THP is enabled.
Considering the existence of HVO, I also feel the need to find further
optimization methods.
Thanks.
Thanks,
Vivek
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 167 ++
1 file changed, 61 insertions(+), 106
sound, but it may not save memory, only increase
context switch overhead.
whether opengl is available in the environment or not.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 39 ---
1 file changed, 16 insertions(+), 23 deletions(-)
diff --
make sure pass self test.
remove v1's patch4
v2
https://lore.kernel.org/all/20240805032550.3912454-1-l...@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
Huan Yang (5):
udmabuf: cancel mmap page fault, direct map it
udmabuf: change folios array
t use page array to map, instead, use pfn array.
Signed-off-by: Huan Yang
Suggested-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 22 +++---
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 3ec72d47bb
a considerable
overhead.
The current patch removes the page fault method of mmap and
instead fills pfn directly when mmap is triggered.
Signed-off-by: Huan Yang
Suggested-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 37 +++--
1 file changed, 15 insertions
e can iterate through the folios array during release and
unpin any folio that is different from the ones previously accessed.
This not only saves the overhead of the udmabuf_folio data structure
but also makes array access more cache-friendly.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udma
patch iterates through folios, while the inner
loop correctly sets the folio and corresponding offset into the udmabuf
starting from the offset. The loop ends when pgcnt or nr_folios is reached.
This makes the code more readable.
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 65
cation for any size and does
not affect the performance of kmalloc allocations.
Signed-off-by: Huan Yang
Acked-by: Christian König
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/d
mmap,
when creating a large size udmabuf, this represents a considerable
overhead.
The current patch removes the page fault method of mmap and
instead fills pfn directly when mmap is triggered.
Signed-off-by: Huan Yang
Suggested-by: Vivek Kasireddy
---
drivers/dma-buf/udmabuf.c | 37
>folios only contains each folio's head page.
That means we repeatedly mapped the folio head page into the vmalloc area.
Because udmabuf can use hugetlb, tail pages may not exist if HVO is enabled,
so we can't use a page array to map; instead, use a pfn array.
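The idea of mapping by pfn instead of by page can be illustrated with a small userspace model. It is a sketch only: `fill_pfns`, `head_pfn` and `offset` are hypothetical names, and a 4 KiB base page size is assumed.

```c
#include <stddef.h>

#define PAGE_SHIFT_4K 12    /* assumes 4 KiB base pages */

/*
 * For each page slot, derive the pfn from the folio's head pfn plus the
 * byte offset of that slot inside the folio. No tail struct page is needed.
 */
void fill_pfns(unsigned long *pfns, const unsigned long *head_pfn,
               const unsigned long *offset, size_t pgcnt)
{
    size_t i;

    for (i = 0; i < pgcnt; i++)
        pfns[i] = head_pfn[i] + (offset[i] >> PAGE_SHIFT_4K);
}
```

Three slots of one folio with head pfn 1000 and offsets 0, 4096 and 8192 yield pfns 1000, 1001 and 1002, rather than the head page three times.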
Signed-off-by: Huan Ya
folio to the unpin_list.
The outer loop of this patch iterates through folios, while the inner
loop correctly sets the folio and corresponding offset into the udmabuf
starting from the offset. The loop ends when pgcnt or nr_folios is reached.
This makes the code more readable.
Signed-off-by: Huan Yang
for memfds backed by shmem, but I suspect
this may not work if THP is enabled.
Thanks,
Vivek
This not only saves the overhead of the udmabuf_folio data structure
but also makes array access more cache-friendly.
Signed-off-by: Huan Yang
---
drivers/dm
f: convert udmabuf driver to use folios")
OK, I'll update it
Thanks,
Vivek
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
Acked-by: Vivek Kasireddy
---
drivers/dma-buf/Kconfig | 1 +
drivers/dma-buf/udmabuf.c | 22 +++---
2 files changed, 16 inser
modifies
the original loop condition, using the pinned folio as the external
loop condition, and sets the offset and folio during the traversal process.
This makes the code more readable.
Suggested-by: Vivek Kasireddy
Signed-off-by: Huan Yang
---
drivers/dma-buf/udmabuf.c | 134
74914-1-l...@vivo.com/
v3
https://lore.kernel.org/all/20240813090518.3252469-1-l...@vivo.com/
v2
https://lore.kernel.org/all/20240805032550.3912454-1-l...@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-l...@vivo.com/
Huan Yang (7):
udmabuf: pre-fault when first pa