Hi,
On 9/1/26 17:22, Matthew Brost wrote:
On Fri, Jan 09, 2026 at 12:27:50PM +1100, Jordan Niethe wrote:
Hi,
On 9/1/26 11:31, Matthew Brost wrote:
On Fri, Jan 09, 2026 at 11:01:13AM +1100, Jordan Niethe wrote:
Hi,
On 8/1/26 16:42, Jordan Niethe wrote:
Hi,
On 8/1/26 13:25, Jordan Niethe wrote:
Hi,
On 8/1/26 05:36, Matthew Brost wrote:
Thanks for the series. For some reason Intel's CI couldn't apply this
series to drm-tip to get results [1]. I'll manually apply it, run all
our SVM tests, get back to you with the results, and review the changes
here. For future reference, if you want to use our CI system the series
must apply to drm-tip; feel free to rebase this series and just send it
to the intel-xe list if you want CI results.
Thanks, I'll rebase on drm-tip and send to the intel-xe list.
For reference, the rebase on drm-tip is on the intel-xe list:
https://patchwork.freedesktop.org/series/159738/
I'll watch the CI results.
The series causes some failures in the intel-xe tests:
https://patchwork.freedesktop.org/series/159738/#rev4
Working through the failures now.
Yea, I saw the failures. I haven't had time to look at the patches on
my end quite yet. Scrambling to get a few things into the 6.20/7.0 PR,
so I may not have bandwidth to look in depth until mid next week, but
digging in is on my TODO list.
Sure, that's completely fine. The failures seem pretty directly related
to the series, so I think I'll be able to make good progress.
For example:
https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-159738v4/bat-bmg-2/igt@[email protected]
It looks like I missed that xe_pagemap_destroy_work() needs to be
updated to remove the call to devm_release_mem_region() now that we are
no longer reserving a mem region.
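Roughly what I have in mind (a sketch from memory against this series;
the struct and field names here are approximate, not exact):

	static void xe_pagemap_destroy_work(struct work_struct *work)
	{
		/* Struct/field names approximate, per this series. */
		struct xe_pagemap *xpagemap =
			container_of(work, struct xe_pagemap, destroy_work);

		/*
		 * The devm_release_mem_region() call that used to live
		 * here goes away: memremap_device_private_pagemap() no
		 * longer reserves a mem region, so there is no resource
		 * left to release.
		 */

		/* ... rest of the teardown unchanged ... */
	}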
+1
So this is the one I'd be most concerned about [1].
xe_exec_system_allocator is our SVM test, which does almost all the
ridiculous things possible in user space to stress SVM. It's blowing up
in the core MM, but the source of the bug could be anywhere (e.g., Xe
SVM, GPU SVM, migrate device layer, or core MM). I'll try to help when
I have bandwidth.
Matt
[1]
https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-159738v4/shard-bmg-9/igt@xe_exec_system_alloca...@threads-many-large-execqueues-free-nomemset.html
A similar fault in lruvec_stat_mod_folio() can be repro'd if
memremap_device_private_pagemap() is called with NUMA_NO_NODE instead
of (say) numa_node_id() for the nid parameter.
The xe_svm driver uses devm_memremap_device_private_pagemap(), which
uses dev_to_node() for the nid parameter. I suspect this is causing
something similar to happen.
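i.e. only the nid argument matters here (other arguments elided, as I'm
quoting the call shape from memory):

	/* Faults later in lruvec_stat_mod_folio() (sketch): */
	memremap_device_private_pagemap(..., NUMA_NO_NODE);

	/* Works (sketch): */
	memremap_device_private_pagemap(..., numa_node_id());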
When memremap_pages() calls pagemap_range(), we have the following
logic:

	if (nid < 0)
		nid = numa_mem_id();

I think we might need to add this to memremap_device_private_pagemap()
to handle the NUMA_NO_NODE case. Still confirming.
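Something along these lines (sketch only; I've abbreviated the argument
list since I don't have the exact signature in front of me):

	/* Sketch: argument list abbreviated, not the exact signature. */
	int memremap_device_private_pagemap(struct dev_pagemap *pgmap, int nid)
	{
		/*
		 * Mirror pagemap_range(): callers can legitimately pass
		 * NUMA_NO_NODE (e.g. dev_to_node() on a device without
		 * NUMA affinity), so fall back to the local memory node
		 * before nid is used for any node-indexed allocation.
		 */
		if (nid < 0)
			nid = numa_mem_id();

		/* ... rest of the function unchanged ... */
		return 0;
	}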
Thanks,
Jordan.
I was also wondering if Nvidia could help review one of our core MM
patches [2], which is gating the enabling of 2M device pages?
Matt
[1] https://patchwork.freedesktop.org/series/159738/
[2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1