On 3/24/25 23:14, Christian König wrote:
> On 24.03.25 at 12:23, Bert Karwatzki wrote:
>> On Sunday, 23.03.2025 at 17:51 +1100, Balbir Singh wrote:
>>> On 3/22/25 23:23, Bert Karwatzki wrote:
>>>> ...
>>>> So why is use_dma32 enabled with nokaslr?
On 3/24/25 22:23, Bert Karwatzki wrote:
> On Sunday, 23.03.2025 at 17:51 +1100, Balbir Singh wrote:
>> On 3/22/25 23:23, Bert Karwatzki wrote:
>>> The problem occurs in this part of ttm_tt_populate(), in the nokaslr case
>>> the loop is entered and repeatedly
On 3/27/25 21:53, Ingo Molnar wrote:
>
> * Balbir Singh wrote:
>
>>> Yes, turning off CONFIG_HSA_AMD_SVM fixes the issue, the strange memory
>>> resource
>>> afe-aff : :03:00.0
>>> is gone.
>>>
>>> If one wou
On 3/27/25 09:58, Linus Torvalds wrote:
> On Wed, 26 Mar 2025 at 15:00, Bert Karwatzki wrote:
>>
>> As Balbir Singh found out this memory comes from amdkfd
>> (kgd2kfd_init_zone_device()) with CONFIG_HSA_AMD_SVM=y. The memory gets
>> placed
>> by devm_request_f
On 3/26/25 21:10, Bert Karwatzki wrote:
> On Wednesday, 26.03.2025 at 12:50 +1100, Balbir Singh wrote:
>> On 3/26/25 10:43, Balbir Singh wrote:
>>> On 3/26/25 10:21, Bert Karwatzki wrote:
>>>> On Wednesday, 26.03.2025 at 09:45 +1100, Balbir Singh wrote:
>>
On 3/26/25 10:43, Balbir Singh wrote:
> On 3/26/25 10:21, Bert Karwatzki wrote:
>> On Wednesday, 26.03.2025 at 09:45 +1100, Balbir Singh wrote:
>>>
>>>
>>> The second region seems to be additional, I suspect that is HMM mapping
>>> from kg
On 3/26/25 10:21, Bert Karwatzki wrote:
> On Wednesday, 26.03.2025 at 09:45 +1100, Balbir Singh wrote:
>>
>>
>> The second region seems to be additional, I suspect that is HMM mapping from
>> kgd2kfd_init_zone_device()
>>
>> Balbir Singh
>
On 3/25/25 18:35, Christian König wrote:
> On 24.03.25 at 23:48, Balbir Singh wrote:
>>>> lspci -v reports 8G of memory at 0xfc so I assumed that is the
>>>> GPU RAM.
>>>> 03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23
>
On 3/25/25 19:36, Christian König wrote:
> On 25.03.25 at 00:07, Bert Karwatzki wrote:
>> Here's the dmesg from linux-next-6.14-rc7-next20250321 (CONFIG_PCI_P2PDMA
>> not set)
>> The memory ranges of (afe-aff) or
>> (3ffe-3fff) are
>> mentioned in neither of them.
&amdgpu_bo_driver, adev->dev,
> adev_to_drm(adev)->anon_inode->i_mapping,
> adev_to_drm(adev)->vma_offset_manager,
> adev->need_swiotlb,
> false /* use_dma32 */);
> if (r) {
> DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
> return r;
> }
>
I think this brings us really close. Instead of forcing use_dma32 to false, I
wonder if we need something like
	uint64_t dma_bits = fls64(dma_get_mask(adev->dev));
and then, in the call to ttm_device_init(), pass the last argument (use_dma32)
as dma_bits < 32?
Thanks,
Balbir Singh
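
The suggestion above, deriving use_dma32 from the device's DMA mask rather
than hard-coding it, can be sketched in userspace C. This is only an
illustration under stated assumptions: dma_bit_mask() and fls64_sketch() are
stand-ins written here for the kernel's DMA_BIT_MASK() and fls64(), and
use_dma32_from_mask() is a hypothetical helper name, not an amdgpu or TTM
function. The comparison is the one proposed in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's DMA_BIT_MASK(n): a mask with the low n bits set. */
static uint64_t dma_bit_mask(unsigned int n)
{
	return n >= 64 ? ~0ULL : (1ULL << n) - 1;
}

/*
 * Stand-in for the kernel's fls64(): 1-based position of the highest
 * set bit, or 0 when no bit is set.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Hypothetical helper: decide use_dma32 from the DMA mask using the
 * comparison proposed in the message (dma_bits < 32).
 */
static bool use_dma32_from_mask(uint64_t dma_mask)
{
	unsigned int dma_bits = fls64_sketch(dma_mask);

	return dma_bits < 32;
}
```

With these stand-ins, a 44-bit mask gives dma_bits == 44 and use_dma32 is
false. A mask of exactly 32 bits gives dma_bits == 32, so the proposed
comparison also yields false there; whether "<" or "<=" is the intended test
would need to be checked against what the driver expects.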
On 3/22/25 19:04, Ingo Molnar wrote:
>
> * Balbir Singh wrote:
>
>>> How frequently does this happen and what is the impact to users if
>>> this happens?
>>
>> It happens 100% of the time when the BAR space lies beyond the
>> 10TiB region.
>
On 3/21/25 23:26, Bert Karwatzki wrote:
>>>
>>
>> I am not an expert in amdgpu or gtt_mgr, but I wonder if some of the deletes
>> are coming from forceful eviction of memory during allocation?
>>
>> Have you filed a bug report for the nokaslr
On 3/21/25 21:24, Ingo Molnar wrote:
>
> * Balbir Singh wrote:
>
>> On 3/20/25 20:01, Ingo Molnar wrote:
>>>
>>> * Balbir Singh wrote:
>>>
>>>> On 3/17/25 00:09, Bert Karwatzki wrote:
>>>>> This is related to the amdgpu
laris (just opening the game then closing it again from the
> game menu)
>
> The findings are:
> (a) The size of the RB tree is the same in the working and non-working case
> (50-60)
> (b) The number of calls to amdgpu_gtt_mgr_new() is ~2000 in both cases
> (c) In the non-working case amdgpu_gtt_mgr_del() is called far more often than
> in the working case:
> Non-working case (cmdline: nokaslr) 834 calls to amdgpu_gtt_mgr_del()
> Working case (cmdline: nokaslr amdgpu.vramlimit=512) 51 calls to
> amdgpu_gtt_mgr_del()
> Working case (cmdline: no additional arguments) 44 calls to
> amdgpu_gtt_mgr_del()
>
I am not an expert in amdgpu or gtt_mgr, but I wonder if some of the deletes
are coming from forceful eviction of memory during allocation?
Have you filed a bug report for the nokaslr case?
Balbir Singh
On 3/20/25 20:01, Ingo Molnar wrote:
>
> * Balbir Singh wrote:
>
>> On 3/17/25 00:09, Bert Karwatzki wrote:
>>> This is related to the amdgpu.gttsize. My laptop has the maximum amount
>>> of memory (64G) and usually gttsize is half of main memory size. I just
On 3/17/25 00:09, Bert Karwatzki wrote:
> This is related to the amdgpu.gttsize. My laptop has the maximum amount
> of memory (64G) and usually gttsize is half of main memory size. I just
> tested with cmdline="nokaslr amdgpu.gttsize=2048" and the problem does
> not occur. So I did some more tes
On 3/14/25 17:14, Balbir Singh wrote:
> On 3/14/25 09:22, Bert Karwatzki wrote:
>> On Friday, 14.03.2025 at 08:54 +1100, Balbir Singh wrote:
>>> On 3/14/25 05:12, Bert Karwatzki wrote:
>>>> On Thursday, 13.03.2025 at 22:47 +1100, Balbir Singh wrote:
>
On 3/15/25 01:18, Bert Karwatzki wrote:
> On Saturday, 15.03.2025 at 00:34 +1100, Balbir Singh wrote:
>> On 3/14/25 17:14, Balbir Singh wrote:
>>> On 3/14/25 09:22, Bert Karwatzki wrote:
>>>> On Friday, 14.03.2025 at 08:54 +1100, Balbir Singh wrote:
>>
On 3/14/25 09:22, Bert Karwatzki wrote:
> On Friday, 14.03.2025 at 08:54 +1100, Balbir Singh wrote:
>> On 3/14/25 05:12, Bert Karwatzki wrote:
>>> On Thursday, 13.03.2025 at 22:47 +1100, Balbir Singh wrote:
>>>>
>>>>
>>>> Anywa
On 3/14/25 05:12, Bert Karwatzki wrote:
> On Thursday, 13.03.2025 at 22:47 +1100, Balbir Singh wrote:
>>
>>
>> Anyway, I think the nokaslr result is interesting; it seems that with nokaslr
>> even the older kernels have problems with the game
>>
>> Cou