Re: [PATCH RFC v5 00/21] DEPT(Dependency Tracker)

2022-03-20 Thread Byungchul Park
On Fri, Mar 18, 2022 at 04:49:45PM +0900, Byungchul Park wrote:
> On Wed, Mar 16, 2022 at 11:39:19PM -0400, Theodore Ts'o wrote:
> > On Wed, Mar 16, 2022 at 11:26:12AM +0900, Byungchul Park wrote:
> > > I'm gonna re-add RFC for a while at Ted's request. But hard testing is
> > > needed to find false alarms for now that there's no false alarm with my
> > > system. I'm gonna look for other systems that might produce false
> > > alarms. And it'd be appreciated if you share it when you see any alarms
> > > with yours.
> > 
> > Is dept1.18_on_v5.17-rc7 roughly equivalent to the v5 version sent to
> 
> Yes.
> 
> > the list.  The commit date is March 16th, so I assume it was.  I tried
> > merging it with the ext4 dev branch, and tried enabling CONFIG_DEPT
> > and running xfstests.  The result was nearly test failing, because a 
> > DEPT warning.
> > 
> > I assume that this is due to some misconfiguration of DEPT on my part?
> 
> I guess it was becasue of the commit b1fca27d384e8("kernel debug:
> support resetting WARN*_ONCE"). Your script seems to reset WARN*_ONCE
> repeatedly.
> 
> But, yeah. It's *too much* that Dept warns it on the lack of pools. I
> will switch it to just pr_warn_once().
> 
> Plus, I will implement a new functionality to expand pools to prevent
> facing the situation in advance.
> 
> > And I'm curious why DEPT_WARN_ONCE is apparently getting many, many
> > times?
> > 
> > [  760.990409] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> > [  770.319656] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> > [  772.460360] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> > [  784.039676] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> > 
> > (and this goes on over and over...)
> > 
> > Here's the full output of the DEPT warning from trying to run
> > generic/001.  There is a similar warning for generic/002, generic/003,
> > etc., for a total of 468 failures out of 495 tests run.
> 
> Sorry for the noise. I will prevent this as described above.
> 
> > [  760.945068] run fstests generic/001 at 2022-03-16 08:16:53
> > [  760.985440] [ cut here ]
> > [  760.990409] DEPT_WARN_ONCE: Pool(ecxt) is empty.
> > [  760.995166] WARNING: CPU: 1 PID: 73369 at kernel/dependency/dept.c:297 
> > from_pool+0xc2/0x110
> > [  761.003915] CPU: 1 PID: 73369 Comm: bash Tainted: GW 
> > 5.17.0-rc7-xfstests-00649-g5456f2312272 #520
> > [  761.014389] Hardware name: Google Google Compute Engine/Google Compute 
> > Engine, BIOS Google 01/01/2011
> > [  761.024363] RIP: 0010:from_pool+0xc2/0x110
> > [  761.028598] Code: 3d 32 62 96 01 00 75 c2 48 6b db 38 48 c7 c7 00 94 f1 
> > ad 48 89 04 24 c6 05 1a 62 96 01 01 48 8b b3 20 9a 2f ae e8 2f dd bf 00 
> > <0f> 0b 48 8b 04 24 eb 98 48 63 c2 48 0f af 86 28 9a 2f ae 48 03 86
> > [  761.048189] RSP: 0018:a7ce4425fd48 EFLAGS: 00010086
> > [  761.053617] RAX:  RBX: 00a8 RCX: 
> > 
> > [  761.060965] RDX: 0001 RSI: adfb95e0 RDI: 
> > 
> > [  761.068322] RBP: 001dc598 R08:  R09: 
> > a7ce4425fb90
> > [  761.075789] R10: fffe0aa0 R11: fffe0ae8 R12: 
> > 9768e07f0600
> > [  761.083063] R13:  R14: 0246 R15: 
> > 
> > [  761.090312] FS:  7fd4ecc4c740() GS:97699940() 
> > knlGS:
> > [  761.098623] CS:  0010 DS:  ES:  CR0: 80050033
> > [  761.104580] CR2: 563c61657eb0 CR3: 0001328fa001 CR4: 
> > 003706e0
> > [  761.111921] DR0:  DR1:  DR2: 
> > 
> > [  761.119171] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [  761.126617] Call Trace:
> > [  761.129175]  
> > [  761.131385]  add_ecxt+0x54/0x1c0
> > [  761.134736]  ? simple_attr_write+0x87/0x100
> > [  761.139063]  dept_event+0xaa/0x1d0
> > [  761.142687]  ? simple_attr_write+0x87/0x100
> > [  761.147089]  __mutex_unlock_slowpath+0x60/0x2d0
> > [  761.151866]  simple_attr_write+0x87/0x100
> > [  761.155997]  debugfs_attr_write+0x40/0x60
> > [  761.160124]  vfs_write+0xec/0x390
> > [  761.163557]  ksys_write+0x68/0xe0
> > [  761.167004]  do_syscall_64+0x43/0x90
> > [  761.170782]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  761.176204] RIP: 0033:0x7fd4ecd3df33
> > [  761.180010] Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff 
> > eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 
> > <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
> > [  761.199551] RSP: 002b:7ffe772d4808 EFLAGS: 0246 ORIG_RAX: 
> > 0001
> > [  761.207240] RAX: ffda RBX: 0002 RCX: 
> > 7fd4ecd3df33
> > [  761.214583] RDX: 0002 RSI: 563c61657eb0 RDI: 
> > 0001
> > [  761.221835] RBP: 563c61657eb0 R08: 000a R09: 
> > 0001
> > [  761.229537] R10: 563c61902240 R11: 0246 R12: 
> > 0

Re: [PATCH v4 00/24] DEPT(Dependency Tracker)

2022-03-20 Thread Byungchul Park
On Fri, Mar 18, 2022 at 04:51:29PM +0900, Byungchul Park wrote:
> On Wed, Mar 16, 2022 at 09:30:02AM +, Hyeonggon Yoo wrote:
> > On Wed, Mar 16, 2022 at 01:32:13PM +0900, Byungchul Park wrote:
> > > On Sat, Mar 12, 2022 at 01:53:26AM +, Hyeonggon Yoo wrote:
> > > > On Fri, Mar 04, 2022 at 04:06:19PM +0900, Byungchul Park wrote:
> > > > > Hi Linus and folks,
> > > > > 
> > > > > I've been developing a tool for detecting deadlock possibilities by
> > > > > tracking wait/event rather than lock(?) acquisition order to try to
> > > > > cover all synchonization machanisms. It's done on v5.17-rc1 tag.
> > > > > 
> > > > > https://github.com/lgebyungchulpark/linux-dept/commits/dept1.14_on_v5.17-rc1
> > > > >
> > > > 
> > > > Small feedback unrelated to thread:
> > > > I'm not sure "Need to expand the ring buffer" is something to call
> > > > WARN(). Is this stack trace useful for something?
> > > > 
> > > > 
> > > > Hello Byungchul. These are two warnings of DEPT on system.
> > > 
> > > Hi Hyeonggon,
> > > 
> > > Could you run scripts/decode_stacktrace.sh and share the result instead
> > > of the raw format below if the reports still appear with PATCH v5? It'd
> > > be appreciated (:
> > >
> > 
> > Hi Byungchul.
> > 
> > on dept1.18_on_v5.17-rc7, the kernel_clone() warning has gone.
> > There is one warning remaining on my system:
> > 
> > It warns when running kunit-try-catch-test testcase.
> 
> Hi Hyeonggon,
> 
> I can reproduce it thanks to you. I will let you know on all works done.

Hi Hyeonggon,

All works wrt this issue have been done. I've just updated the same
branch.

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.18_on_v5.17-rc7

This is just for your information.

Thanks,
Byungchul

> 
> Thanks,
> Byungchul
> 
> > ===
> > DEPT: Circular dependency has been detected.
> > 5.17.0-rc7+ #4 Not tainted
> > ---
> > summary
> > ---
> > *** AA DEADLOCK ***
> > 
> > context A
> > [S] (unknown)(&try_completion:0)
> > [W] wait_for_completion_timeout(&try_completion:0)
> > [E] complete(&try_completion:0)
> > 
> > [S]: start of the event context
> > [W]: the wait blocked
> > [E]: the event not reachable
> > ---
> > context A's detail
> > ---
> > context A
> > [S] (unknown)(&try_completion:0)
> > [W] wait_for_completion_timeout(&try_completion:0)
> > [E] complete(&try_completion:0)
> > 
> > [S] (unknown)(&try_completion:0):
> > (N/A)
> > 
> > [W] wait_for_completion_timeout(&try_completion:0):
> > kunit_try_catch_run (lib/kunit/try-catch.c:78 (discriminator 1)) 
> > stacktrace:
> > dept_wait (kernel/dependency/dept.c:2149) 
> > wait_for_completion_timeout (kernel/sched/completion.c:119 (discriminator 
> > 4) kernel/sched/completion.c:165 (discriminator 4)) 
> > kunit_try_catch_run (lib/kunit/try-catch.c:78 (discriminator 1)) 
> > kunit_test_try_catch_successful_try_no_catch (lib/kunit/kunit-test.c:43) 
> > kunit_try_run_case (lib/kunit/test.c:333 lib/kunit/test.c:374) 
> > kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:30) 
> > kthread (kernel/kthread.c:379) 
> > ret_from_fork (arch/arm64/kernel/entry.S:757)
> > 
> > [E] complete(&try_completion:0):
> > kthread_complete_and_exit (kernel/kthread.c:327) 
> > stacktrace:
> > dept_event (kernel/dependency/dept.c:2376 (discriminator 2)) 
> > complete (kernel/sched/completion.c:33 (discriminator 4)) 
> > kthread_complete_and_exit (kernel/kthread.c:327) 
> > kunit_try_catch_throw (lib/kunit/try-catch.c:18) 
> > kthread (kernel/kthread.c:379) 
> > ret_from_fork (arch/arm64/kernel/entry.S:757) 
> > 
> > ---
> > information that might be helpful
> > ---
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> > dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:186) 
> > show_stack (arch/arm64/kernel/stacktrace.c:193) 
> > dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
> > dump_stack (lib/dump_stack.c:114) 
> > print_circle (./arch/arm64/include/asm/atomic_ll_sc.h:112 
> > ./arch/arm64/include/asm/atomic.h:30 
> > ./include/linux/atomic/atomic-arch-fallback.h:511 
> > ./include/linux/atomic/atomic-instrumented.h:258 
> > kernel/dependency/dept.c:140 kernel/dependency/dept.c:748) 
> > cb_check_dl (kernel/dependency/dept.c:1083 kernel/dependency/dept.c:1064) 
> > bfs (kernel/dependency/dept.c:833) 
> > add_dep (kernel/dependency/dept.c:1409) 
> > do_event (kernel/dependency/dept.c:175 kernel/dependency/dept.c:1644) 
> > dept_event (kernel/dependency/dept.c:2376 (discriminator 2)) 
> > complete (kernel/sched/completion.c:33 (discriminator 4)) 
> > kthread_complete_and_exit (kernel/kthread.c:327) 
> > kunit_try_catch_throw (lib/kunit/try-catch.c:18) 
> > kthread (kernel/kthread.c:379) 
> > ret_fr

Re: [PATCH v7 3/7] drm/i915: Prepare for multiple GTs

2022-03-20 Thread Andi Shyti
Hi Michal,

[...]

> > +static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
> > +{
> > +   int ret;
> > +
> > +   if (!gt_is_root(gt)) {
> > +   struct intel_uncore_mmio_debug *mmio_debug;
> > +   struct intel_uncore *uncore;
> > +
> > +   uncore = kzalloc(sizeof(*uncore), GFP_KERNEL);
> > +   if (!uncore)
> > +   return -ENOMEM;
> > +
> > +   mmio_debug = kzalloc(sizeof(*mmio_debug), GFP_KERNEL);
> > +   if (!mmio_debug) {
> > +   kfree(uncore);
> > +   return -ENOMEM;
> > +   }
> > +
> > +   gt->uncore = uncore;
> > +   gt->uncore->debug = mmio_debug;
> > +
> > +   __intel_gt_init_early(gt);
> > +   }
> > +
> > +   intel_uncore_init_early(gt->uncore, gt);
> > +
> > +   ret = intel_uncore_setup_mmio(gt->uncore, phys_addr);
> > +   if (ret)
> > +   return ret;
> 
> (little guessing as in this patch we don't have non-root gt yet)
> 
> if the future, when we will be doing setup of non-root gt, if we return
> here then likely we will leak both uncore/mmio_debug as gt will not be
> added to i915->gts thus it will not be visible in for_each_gt loop used
> to release/cleanup all gts.
> 
> since in above code you are doing cleanup in case of kzalloc failure,
> same should be done in case of mmio setup failure.

that's a good point. In the next patch I am going to add support
for the first multitile platform and, because it's too old to
remember, I had a look and I think this part is not properly
managed.

Thanks for the note!

Andi


Re: [PATCH v2] drm: Fix a infinite loop condition when order becomes 0

2022-03-20 Thread Arunpravin Paneer Selvam



On 16/03/22 5:01 pm, Matthew Auld wrote:
> On 16/03/2022 06:34, Arunpravin Paneer Selvam wrote:
>> handle a situation in the condition order-- == min_order,
>> when order = 0 and min_order = 0, leading to order = -1,
>> it now won't exit the loop. To avoid this problem,
>> added a order check in the same condition, (i.e)
>> when order is 0, we return -ENOSPC
>>
>> v2: use full name in email program and in Signed-off tag
>>
>> Signed-off-by: Arunpravin Paneer Selvam 
>> ---
>>   drivers/gpu/drm/drm_buddy.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>> index 72f52f293249..5ab66aaf2bbd 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/drivers/gpu/drm/drm_buddy.c
>> @@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>>  if (!IS_ERR(block))
>>  break;
>>   
>> -if (order-- == min_order) {
>> +if (!order || order-- == min_order) {
> 
> It shouldn't be possible to enter an infinite loop here, without first 
> tripping up the BUG_ON(order < min_order) further up, and for that, as 
> we discussed here[1], it sounded like the conclusion was to rather add a 
> simple check somewhere in drm_buddy_alloc_blocks() to reject any size 
> not aligned to the min_page_size?

I sent a patch adding a check to reject any size not aligned to the
min_page_size. Please review
> 
> [1] 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F477414%2F%3Fseries%3D101108%26rev%3D1&data=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7C51e60632b18847b73da808da074085d6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637830270945242954%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=wVVqKzjUoL6jL00uxXiXZz7%2FzJQeuC%2BuBdB8hPKqQao%3D&reserved=0
> 
>>  err = -ENOSPC;
>>  goto err_free;
>>  }
>>
>> base-commit: 3bd60c0259406c5ca3ce5cdc958fb910ad4b8175


[PATCH] drm: add a check to verify the size alignment

2022-03-20 Thread Arunpravin Paneer Selvam
add a simple check to reject any size not aligned to the
min_page_size.

Signed-off-by: Arunpravin Paneer Selvam 
---
 drivers/gpu/drm/drm_buddy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..b503c88786b0 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -661,6 +661,9 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (range_overflows(start, size, mm->size))
return -EINVAL;
 
+   if (WARN_ON(!IS_ALIGNED(size, min_page_size)))
+   return -EINVAL;
+
/* Actual range allocation */
if (start + size == end)
return __drm_buddy_alloc_range(mm, start, size, blocks);
-- 
2.25.1



Re: [PATCH] drm: Fix a infinite loop condition when order becomes 0

2022-03-20 Thread Arunpravin Paneer Selvam



On 16/03/22 12:28 pm, Paul Menzel wrote:
> Dear Arunprivin,
> 
> 
> Am 16.03.22 um 07:49 schrieb Arunpravin Paneer Selvam:
> 
>> On 15/03/22 9:14 pm, Paul Menzel wrote:
> 
>>> Am 15.03.22 um 16:42 schrieb Arunpravin:
>>>
 On 15/03/22 2:35 pm, Paul Menzel wrote:
>>>
> Am 15.03.22 um 10:01 schrieb Arunpravin:
>
>> On 15/03/22 1:49 pm, Paul Menzel wrote:
>
>>> Am 14.03.22 um 20:40 schrieb Arunpravin:
 handle a situation in the condition order-- == min_order,
 when order = 0, leading to order = -1, it now won't exit
 the loop. To avoid this problem, added a order check in
 the same condition, (i.e) when order is 0, we return
 -ENOSPC

 Signed-off-by: Arunpravin 
>>>
>>> Please use your full name.
>> okay
>
> You might also configure that in your email program.
 yes
>>>
>>> Not done yet though. ;-)
>>>
>> done in v2 :)
 ---
  drivers/gpu/drm/drm_buddy.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
 index 72f52f293249..5ab66aaf2bbd 100644
 --- a/drivers/gpu/drm/drm_buddy.c
 +++ b/drivers/gpu/drm/drm_buddy.c
>>>
>>> In what tree is that file?
>>>
>> drm-tip - 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm-tip%2Ftree%2F&data=04%7C01%7Carunpravin.paneerselvam%40amd.com%7C439b31d360ef495ab13408da071a6e1f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637830107357395422%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Z8KNmbUXmhk0xA8z7yHJN2j%2BRJ5VwpuMXww21mrC8x8%3D&reserved=0
>> drm-misc-next - 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Ftree%2F&data=04%7C01%7Carunpravin.paneerselvam%40amd.com%7C439b31d360ef495ab13408da071a6e1f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637830107357395422%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Mwqy6NVTiR%2FoHFpLvXnQdE95kHoJJUEiig0Juz37ATQ%3D&reserved=0
>>>
>>> Thank Outlook. Now everybody feels safe.
>>>
 @@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (!IS_ERR(block))
break;
  
 -  if (order-- == min_order) {
 +  if (!order || order-- == min_order) {
err = -ENOSPC;
goto err_free;
}
>
> Thank you for the hint. So the whole function is:
>
>   do {
>   order = min(order, (unsigned int)fls(pages) - 1);
>   BUG_ON(order > mm->max_order);
>   BUG_ON(order < min_order);
>
>   do {
>   if (flags & DRM_BUDDY_RANGE_ALLOCATION)
>   /* Allocate traversing within the range */
>   block = alloc_range_bias(mm, start, end, order);
>   else
>   /* Allocate from freelist */
>   block = alloc_from_freelist(mm, order, flags);
>
>   if (!IS_ERR(block))
>   break;
>
>   if (order-- == min_order) {
>   err = -ENOSPC;
>   goto err_free;
>   }
>   } while (1);
>
>   mark_allocated(block);
>   mm->avail -= drm_buddy_block_size(mm, block);
>   kmemleak_update_trace(block);
>   list_add_tail(&block->link, &allocated);
>
>   pages -= BIT(order);
>
>   if (!pages)
>   break;
>   } while (1);
>
> Was the BUG_ON triggered for your case?
>
>   BUG_ON(order < min_order);
 no, this BUG_ON is not triggered for this bug
>
> Please give more details.

 there is a chance when there is no space to allocate, order value
 decrements and reaches to 0 at one point, here we should exit the loop,
 otherwise, further order value decrements to -1 and do..while loop
 doesn't exit. Hence added a check to exit the loop if order value becomes 
 0.
>>>
>>> Sorry, I do not see it. How can that be with order ≥ min_order and the
>>> check `order-- == min_order`? Is min_order 0? Please explain that in the
>>> next commit message.
>>>
>> please check v2, yes when min_order is 0, the above said situation may
>> occur.And, since the order is unsigned int, I think it will not trigger
>> the BUG_ON(order < min_order) when order becomes -1. Hence I think we
>> needed a check !order to exit the loop.
> 
> Thank you f

Re: [PATCH v2] drm: Fix a infinite loop condition when order becomes 0

2022-03-20 Thread Arunpravin Paneer Selvam



On 16/03/22 6:02 pm, Christian König wrote:
> Am 16.03.22 um 12:31 schrieb Matthew Auld:
>> On 16/03/2022 06:34, Arunpravin Paneer Selvam wrote:
>>> handle a situation in the condition order-- == min_order,
>>> when order = 0 and min_order = 0, leading to order = -1,
>>> it now won't exit the loop. To avoid this problem,
>>> added a order check in the same condition, (i.e)
>>> when order is 0, we return -ENOSPC
>>>
>>> v2: use full name in email program and in Signed-off tag
>>>
>>> Signed-off-by: Arunpravin Paneer Selvam 
>>> 
>>> ---
>>>   drivers/gpu/drm/drm_buddy.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>>> index 72f52f293249..5ab66aaf2bbd 100644
>>> --- a/drivers/gpu/drm/drm_buddy.c
>>> +++ b/drivers/gpu/drm/drm_buddy.c
>>> @@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>>>   if (!IS_ERR(block))
>>>   break;
>>>   -    if (order-- == min_order) {
>>> +    if (!order || order-- == min_order) {
>>
>> It shouldn't be possible to enter an infinite loop here, without first 
>> tripping up the BUG_ON(order < min_order) further up, and for that, as 
>> we discussed here[1], it sounded like the conclusion was to rather add 
>> a simple check somewhere in drm_buddy_alloc_blocks() to reject any 
>> size not aligned to the min_page_size?
> 
> Sounds good to me as well.
> 
> And just to make it clear: I don't review the functionality here, that's 
> up to you guys.
> 
> Just giving my Ack on things like style and pushing the end result 
> upstream as necessary.
> 
> So let me know when you have settled on a solution.

I sent a drm buddy patch adding a simple check to verify the size
aligned to the min_page_size.

In amdgpu part patch, I have a check if size is not aligned to the
min_page_size, we are enabling the is_contiguous flag, therefore the
size value round up to the power of two and allocated, Later, the
allocated block trimmed to the original requested size.

> 
> Regards,
> Christian.
> 
>>
>> [1] 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F477414%2F%3Fseries%3D101108%26rev%3D1&data=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7Cf03d93b31c7d47b1389c08da074915cf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637830307722869478%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=i3Pw3Go9Niu20Z4CF0IiWKPgTYTBPyK0SJYVq02fx0A%3D&reserved=0
>>
>>>   err = -ENOSPC;
>>>   goto err_free;
>>>   }
>>>
>>> base-commit: 3bd60c0259406c5ca3ce5cdc958fb910ad4b8175
> 


Re: [PATCH] drm: add a check to verify the size alignment

2022-03-20 Thread Paul Menzel

Dear Arunpravin,


Am 21.03.22 um 06:59 schrieb Arunpravin Paneer Selvam:

add a simple check to reject any size not aligned to the
min_page_size.


Nit: I’d start sentences with a capital letter.

Could you please add a summary of the discussion to the commit message, 
so the question “Why?” is answered?



Kind regards,

Paul



Signed-off-by: Arunpravin Paneer Selvam 
---
  drivers/gpu/drm/drm_buddy.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..b503c88786b0 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -661,6 +661,9 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (range_overflows(start, size, mm->size))
return -EINVAL;
  
+	if (WARN_ON(!IS_ALIGNED(size, min_page_size)))

+   return -EINVAL;
+
/* Actual range allocation */
if (start + size == end)
return __drm_buddy_alloc_range(mm, start, size, blocks);


Re: [Intel-gfx] Small bar recovery vs compressed content on DG2

2022-03-20 Thread Thomas Hellström



On 3/18/22 19:12, Daniel Vetter wrote:

Maybe also good to add dri-devel to these discussions.

I'm not sure where exactly we landed with dgpu error capture (maybe I
should check the code but it's really w/e here), but I think we can
also toss in "you need a non-recoverable context for error capture to
work on dgpu".


Error capture now works even with multiple pipelined migrations simply 
by looking at the vma_resource for the metadata snapshot when the 
request was queued instead of the vma for metadata, so no additional 
restrictions. (Also fixed up the gfp mode and completely avoided the 
contiguous allocations for the page directories).



Since that simplifies things even more. Maybe Thomas
forgot to add that to the list of restrictions.

Anyway on the "we can't capture lmem-only compressed buffers", I think
that's totally fine. Those are for render targets, and we don't
capture those. Adding Lionel and Ken to confirm.

OK.

-Daniel

On Fri, 18 Mar 2022 at 17:26, Bloomfield, Jon  wrote:

@Thomas Hellström - I agree :-)

My question was really to @Joonas Lahtinen, who was saying we could always 
migrate in the CPU fault handler. I am pushing back on that unless we have no 
choice. It's the very complication we were trying to avoid with the current 
SAS. If that's what's needed, then so be it. But I'm asking whether we can 
instead handle this specially, instead of adding generic complexity to the 
primary code paths.

Jon


-Original Message-
From: Thomas Hellström 
Sent: Friday, March 18, 2022 2:48 AM
To: Bloomfield, Jon ; Joonas Lahtinen
; Intel Graphics Development ; Auld, Matthew ; C,
Ramalingam ; Vetter, Daniel

Subject: Re: Small bar recovery vs compressed content on DG2

Hi,

On Thu, 2022-03-17 at 18:21 +, Bloomfield, Jon wrote:

+@Vetter, Daniel

Let's not start re-inventing this on the fly again. That's how we got
into trouble in the past. The SAS/Whitepaper does currently require
the SMEM+LMEM placement for mappable, for good reasons.

Just to avoid any misunderstandings here:

We have two hard requirements from Arch that clash, main problem is
compressed bos can't be captured on error with current designs.

 From an engineering point of view we can do little more than list
options available to resolve this and whether they are hard or not so
hard to implemement. But IMHO Arch needs to agree on what's got to
give.

Thanks,
Thomas



We cannot 'always migrate to mappable in the fault handler'. Or at
least, this is not as trivial as it is to write in a sentence due to
the need to spill out other active objects, and all the usual
challenges with context synchronization etc. It is possible, perhaps
with a lot of care, but it is challenging to guarantee, easy to
break, and not needed for 99.9% of software. We are trying to
simplify our driver stack.

If we need a special mechanism for debug, we should devise a special
mechanism, not throw out the general LMEM+SMEM requirement. Are

there

any identified first-class clients that require such access, or is it
only debugging tools?

If only debug, then why can't the tool use a copy engine submission
to access the data in place? Or perhaps a bespoke ioctl to access
this via the KMD (and kmd submitted copy-engine BB)?

Thanks,

Jon


-Original Message-
From: Thomas Hellström 
Sent: Thursday, March 17, 2022 2:35 AM
To: Joonas Lahtinen ; Bloomfield,
Jon
; Intel Graphics Development ; Auld, Matthew ;
C,
Ramalingam 
Subject: Re: Small bar recovery vs compressed content on DG2

On Thu, 2022-03-17 at 10:43 +0200, Joonas Lahtinen wrote:

Quoting Thomas Hellström (2022-03-16 09:25:16)

Hi!

Do we somehow need to clarify in the headers the semantics for
this?

  From my understanding when discussing the CCS migration series
with
Ram, the kernel will never do any resolving (compressing /
decompressing) migrations or evictions which basically implies
the
following:

*) Compressed data must have LMEM only placement, otherwise the

GPU

would read garbage if accessing from SMEM.

This has always been the case, so it should be documented in the
uAPI
headers and kerneldocs.


*) Compressed data can't be assumed to be mappable by the CPU,
because
in order to ensure that on small BAR, the placement needs to be
LMEM+SMEM.

Not strictly true, as we could always migrate to the mappable
region
in
the CPU fault handler. Will need the same set of tricks as with
limited
mappable GGTT in past.

In addition to Matt's reply:

Yes, if there is sufficient space. I'm not sure we want to
complicate
this to migrate only part of the buffer to mappable on a fault
basis?
Otherwise this is likely to fail.

One option is to allow cpu-mapping from SYSTEM like TTM is doing
for
evicted buffers, even if SYSTEM is not in the placement list, and
then
migrate back to LMEM for gpu access.

But can user-space even interpret the compressed data when CPU-
mapping?
without access to the CCS metadata?


*) Neither can compressed data be part of a CAPTURE buffer,
because
that
re