Hi folks,
anyone interested in a bunch of amdgpu kmemleak reports from latest Linus tree
+ tip?
GPU is:
[ 11.317312] [drm] amdgpu kernel modesetting enabled.
[ 11.363627] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874
0x103C:0x807E 0xC4).
[ 11.364077] [drm] register mmio bas
+ amd-gfx@lists.freedesktop.org
On Sun, May 05, 2024 at 09:59:22PM +0300, Tranton Baddy wrote:
> I have had this in my dmesg since version 6.8.6, not sure when it appeared.
> Does the amdgpu driver have a bug?
> [ 64.253144]
> ==
> [ 64.2531
On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote:
> +void mce_setup_global(struct mce *m)
We usually call those things "common":
mce_setup_common().
> +{
> + memset(m, 0, sizeof(struct mce));
> +
> + m->cpuid= cpuid_eax(1);
> + m->cpuvendor= boot_cpu_data.x86
On Sat, Nov 18, 2023 at 01:32:31PM -0600, Yazen Ghannam wrote:
> Current AMD systems may report MCA errors using the ACPI Boot Error
> Record Table (BERT). The BERT entries for MCA errors will be an x86
> Common Platform Error Record (CPER) with an MSR register context that
> matches the MCAX/SMCA
On Sat, Nov 18, 2023 at 01:32:33PM -0600, Yazen Ghannam wrote:
> @@ -714,14 +721,10 @@ static bool legacy_mce_is_memory_error(struct mce *m)
> */
> static bool smca_mce_is_memory_error(struct mce *m)
> {
> - enum smca_bank_types bank_type;
> -
> if (XEC(m->status, 0x3f))
>
On Sat, Nov 18, 2023 at 01:32:34PM -0600, Yazen Ghannam wrote:
> +/* GPU UMCs have MCATYPE=0x1.*/
> +bool smca_gpu_umc_bank_type(u64 ipid)
> +{
> + if (!smca_umc_bank_type(ipid))
> + return false;
> +
> + return FIELD_GET(MCI_IPID_MCATYPE, ipid) == 0x1;
> +}
And now this tells
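As an aside on the quoted helper: the FIELD_GET() pattern masks a register value and shifts the field down to bit 0. Below is a minimal userspace sketch of that idea. All demo_* names are hypothetical, and the mask position (bits 63:48) is an assumption made for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the MCATYPE mask; bits 63:48 are an assumption. */
#define DEMO_IPID_MCATYPE_MASK 0xFFFF000000000000ULL

/*
 * Extract a bit field: keep only the masked bits, then divide by the
 * lowest set bit of the mask, which shifts the field down to bit 0.
 */
static uint64_t demo_field_get(uint64_t mask, uint64_t value)
{
	assert(mask != 0);
	/* mask & (~mask + 1) isolates the lowest set bit of the mask. */
	return (value & mask) / (mask & (~mask + 1));
}

/* Mirrors the shape of the quoted check: a UMC bank whose type field is 0x1. */
static bool demo_is_gpu_umc(uint64_t ipid, bool is_umc_bank)
{
	if (!is_umc_bank)
		return false;

	return demo_field_get(DEMO_IPID_MCATYPE_MASK, ipid) == 0x1;
}
```

With these assumptions, an IPID value whose top 16 bits hold 0x1 is classified as a GPU UMC only if the bank was already identified as a UMC.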
On Wed, Jul 28, 2021 at 02:17:27PM +0100, Christoph Hellwig wrote:
> So common checks obviously make sense, but I really hate the stupid
> multiplexer. Having one well-documented helper per feature is much
> easier to follow.
We had that in x86 - it was called cpu_has_<xxx> where xxx is the
feature bi
ecrypted()
> - memremap_is_efi_data()
> - memremap_is_setup_data()
> - early_memremap_is_setup_data()
>
> And finally, phys_mem_access_encrypted() is conditionally built as well,
> but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is
> not set.
>
> Cc: Thomas Gleixne
On Fri, Aug 13, 2021 at 11:59:22AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/include/asm/protected_guest.h
> b/arch/x86/include/asm/protected_guest.h
> new file mode 100644
> index ..51e4eefd9542
> --- /dev/null
> +++ b/arch/x86/include/asm/protected_guest.h
> @@ -0,0 +1,29 @@
On Fri, Aug 13, 2021 at 11:59:21AM -0500, Tom Lendacky wrote:
> In prep for other protected virtualization technologies, introduce a
> generic helper function, prot_guest_has(), that can be used to check
> for specific protection attributes, like memory encryption. This is
> intended to eliminate h
On Sun, Aug 15, 2021 at 08:53:31AM -0500, Tom Lendacky wrote:
> It's not a cross-vendor thing as opposed to a KVM or other hypervisor
> thing where the family doesn't have to be reported as AMD or HYGON.
What would be the use case? A HV starts a guest which is supposed to be
encrypted using the AM
On Tue, Aug 17, 2021 at 12:22:33PM +0200, Borislav Petkov wrote:
> This one wants to be part of the previous patch.
... and the three following patches too - the treewide patch does a
single atomic :) replacement and that's it.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tg
On Fri, Aug 13, 2021 at 11:59:24AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index edc67ddf065d..5635ca9a1fbe 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char
On Fri, Aug 13, 2021 at 11:59:25AM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/kernel/machine_kexec_64.c
> b/arch/x86/kernel/machine_kexec_64.c
> index 8e7b517ad738..66ff788b79c9 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -167,7 +167,7
ture support is added for other memory encryption
> technologies, the use of PATTR_GUEST_PROT_STATE can be updated, as
> required, to specifically use PATTR_SEV_ES.
>
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: Borislav Petkov
> Signed-off-by: Tom Lendacky
> ---
>
On Fri, Aug 13, 2021 at 11:59:28AM -0500, Tom Lendacky wrote:
> The mem_encrypt_active() function has been replaced by prot_guest_has(),
> so remove the implementation.
>
> Reviewed-by: Joerg Roedel
> Signed-off-by: Tom Lendacky
> ---
> include/linux/mem_encrypt.h | 4
> 1 file changed, 4
On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
> Introduce a powerpc version of the prot_guest_has() function. This will
> be used to replace the powerpc mem_encrypt_active() implementation, so
> the implementation will initially only support the PATTR_MEM_ENCRYPT
> attribute.
>
> C
On Tue, Aug 17, 2021 at 10:22:52AM -0500, Tom Lendacky wrote:
> I can change it to be an AMD/HYGON check... although, I'll have to check
> to see if any (very) early use of the function will work with that.
We can always change it later if really needed. It is just that I'm not
a fan of such "pre
On Tue, Aug 17, 2021 at 10:26:18AM -0500, Tom Lendacky wrote:
> >>/*
> >> - * If SME is active we need to be sure that kexec pages are
> >> - * not encrypted because when we boot to the new kernel the
> >> + * If host memory encryption is active we need to be sure that kexec
> >> + * pa
On Tue, Aug 17, 2021 at 09:46:58AM -0500, Tom Lendacky wrote:
> I'm ok with letting the TDX folks make changes to these calls to be SME or
> SEV specific, if necessary, later.
Yap, exactly. Let's add the specific stuff only when really needed.
Thx.
--
Regards/Gruss,
Boris.
https://people.k
On Thu, Aug 19, 2021 at 10:52:53AM +0100, Christoph Hellwig wrote:
> Which suggest that the name is not good to start with. Maybe protected
> hardware, system or platform might be a better choice?
Yah, coming up with a proper name here hasn't been easy.
prot_guest_has() is not the first variant.
On Mon, Aug 23, 2021 at 03:49:39PM -0400, Alex Deucher wrote:
> Maybe fixed with this patch?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5706cb3c910cc8283f344bc37a889a8d523a2c6d
Nope, this one is already in:
$ git tag --contains 5706cb3c910cc8283f344bc37a889a8d
On Mon, Aug 23, 2021 at 04:31:42PM -0400, Alex Deucher wrote:
> Thanks. I think that should do the trick. Care to send that as a
> formal patch?
Sure, but let me run it through the randconfigs tests first to make sure
nothing else breaks. It is late here so if I don't manage now I'll send
you a fo
From: Borislav Petkov
Building a randconfig here triggered:
ERROR: modpost: "pm_suspend_target_state"
[drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.
The ifdef
On Tue, Aug 24, 2021 at 06:38:41PM +0530, Lazar, Lijo wrote:
> Without CONFIG_PM_SLEEP and with CONFIG_SUSPEND
Can you even create such a .config?
> I remember giving a reviewed-by for this one, looks like it never got in.
> https://www.spinics.net/lists/amd-gfx/msg66166.html
A better version of
On Tue, Aug 24, 2021 at 07:22:46PM +0530, Lazar, Lijo wrote:
> 'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP
> is set/enabled.
pm_suspend_target_state is available only when CONFIG_SUSPEND is
enabled. The extern thing is only a forward declaration.
> OTOH, when both SUSPEND and
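The point about the symbol existing only when its defining file is built can be sketched in plain C. This is a minimal illustration, not the amdgpu fix: DEMO_CONFIG_SUSPEND and every demo_* name are hypothetical stand-ins. The extern declaration alone never causes a link error; only code that actually references the variable does, so the reference is confined to the configured branch and a stub covers the other case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical suspend states, loosely modelled on the kernel's. */
enum demo_suspend_state {
	DEMO_SUSPEND_ON,
	DEMO_SUSPEND_TO_IDLE,
	DEMO_SUSPEND_MEM,
	DEMO_SUSPEND_MAX
};

#ifdef DEMO_CONFIG_SUSPEND
/* Only this branch references the variable, so only it needs the definition at link time. */
extern enum demo_suspend_state demo_suspend_target_state;

static inline enum demo_suspend_state demo_target_state(void)
{
	return demo_suspend_target_state;
}
#else
/* Stub for builds without the defining file: no reference to the variable. */
static inline enum demo_suspend_state demo_target_state(void)
{
	return DEMO_SUSPEND_ON;
}
#endif

/* Callers use the wrapper and never need their own ifdef. */
static inline bool demo_entering_s2idle(void)
{
	enum demo_suspend_state s = demo_target_state();

	assert(s < DEMO_SUSPEND_MAX);
	return s == DEMO_SUSPEND_TO_IDLE;
}
```

Built without DEMO_CONFIG_SUSPEND, the wrapper reports that no suspend-to-idle transition is in progress and the program links without the variable being defined anywhere.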
On Wed, Sep 08, 2021 at 05:58:33PM -0500, Tom Lendacky wrote:
> In prep for other confidential computing technologies, introduce a generic
preparation
> helper function, cc_platform_has(), that can be used to check for specific
> active confidential computing attributes, like memory encryption. T
On Wed, Sep 08, 2021 at 05:58:34PM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
> new file mode 100644
> index ..3c9bacd3c3f3
> --- /dev/null
> +++ b/arch/x86/kernel/cc_platform.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier
On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
> Introduce a powerpc version of the cc_platform_has() function. This will
> be used to replace the powerpc mem_encrypt_active() implementation, so
> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
> attribute.
>
On Tue, Sep 14, 2021 at 04:47:41PM +0200, Christophe Leroy wrote:
> Yes, see
> https://lore.kernel.org/linuxppc-dev/20210914123919.58203...@canb.auug.org.au/T/#t
Aha, more compiler magic stuff ;-\
Oh well, I guess that fix will land upstream soon.
Thx.
--
Regards/Gruss,
Boris.
https://pe
On Wed, Sep 08, 2021 at 05:58:36PM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 18fe19916bc3..4b54a2377821 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char
On Wed, Sep 15, 2021 at 10:28:59AM +1000, Michael Ellerman wrote:
> I don't love it, a new C file and an out-of-line call to then call back
> to a static inline that for most configuration will return false ... but
> whatever :)
Yeah, hch thinks it'll cause a big mess otherwise:
https://lore.kern
On Wed, Sep 08, 2021 at 05:58:31PM -0500, Tom Lendacky wrote:
> This patch series provides a generic helper function, cc_platform_has(),
> to replace the sme_active(), sev_active(), sev_es_active() and
> mem_encrypt_active() functions.
>
> It is expected that as new confidential computing technolo
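The idea of one attribute-based query replacing several vendor-specific predicates can be sketched as follows. This is a simplified userspace model under stated assumptions: the enum values, the state structure and the demo_* names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attributes a caller can ask about. */
enum demo_cc_attr {
	DEMO_ATTR_MEM_ENCRYPT,       /* any memory encryption is active */
	DEMO_ATTR_HOST_MEM_ENCRYPT,  /* host-side encryption is active  */
	DEMO_ATTR_GUEST_MEM_ENCRYPT  /* guest-side encryption is active */
};

/* Hypothetical platform state that a vendor backend would fill in. */
struct demo_platform_state {
	bool host_enc;
	bool guest_enc;
};

/* Single entry point: callers name the attribute, not the vendor feature. */
static bool demo_cc_platform_has(const struct demo_platform_state *s,
				 enum demo_cc_attr attr)
{
	assert(s != NULL);

	switch (attr) {
	case DEMO_ATTR_MEM_ENCRYPT:
		return s->host_enc || s->guest_enc;
	case DEMO_ATTR_HOST_MEM_ENCRYPT:
		return s->host_enc;
	case DEMO_ATTR_GUEST_MEM_ENCRYPT:
		return s->guest_enc;
	}

	/* Unknown attributes are reported as absent. */
	return false;
}
```

A new technology would then add a backend or an attribute, while existing call sites stay unchanged.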
On Wed, Sep 15, 2021 at 07:18:34PM +0200, Christophe Leroy wrote:
> Could you please provide a more explicit explanation of why inlining such a
> helper is considered bad practice and messy?
Tom already told you to look at the previous threads. Let's read them
together. This one, for example:
htt
On Wed, Sep 15, 2021 at 10:26:06AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> I have an Intel variant patch (please check following patch). But it includes
> TDX changes as well. Shall I move TDX changes to a different patch and just
> create a separate patch for adding intel_cc_platform_has()?
Yes,
On Tue, Sep 21, 2021 at 12:04:58PM -0500, Tom Lendacky wrote:
> Looks like instrumentation during early boot. I worked with Boris offline to
> exclude arch/x86/kernel/cc_platform.c from some of the instrumentation and
> that allowed an allyesconfig to boot.
And here's the lineup I have so far, I'd
On Wed, Sep 22, 2021 at 12:20:59AM +0300, Kirill A. Shutemov wrote:
> I still believe calling cc_platform_has() from __startup_64() is totally
> broken as it lacks proper wrapping while accessing global variables.
Well, one of the issues on the AMD side was using boot_cpu_data too
early and the In
On Sun, Sep 12, 2021 at 10:13:10PM -0400, Mukul Joshi wrote:
> Export smca_get_bank_type for use in the AMD GPU
> driver to determine MCA bank while handling correctable
> and uncorrectable errors in GPU UMC.
>
> v1->v2:
> - Drop the function is_smca_umc_v2().
> - Drop the patch to introduce a new
On Sun, Sep 12, 2021 at 10:13:11PM -0400, Mukul Joshi wrote:
> On Aldebaran, GPU driver will handle bad page retirement
> even though UMC is host managed. As a result, register a
> bad page retirement handler on the mce notifier chain to
> retire bad pages on Aldebaran.
>
> v1->v2:
> - Use smca_ge
; Want me to ACK this and you can carry it through your tree along with the
> > second patch?
>
> That would be great. Thanks!
Ok, with the above changelog removed:
Acked-by: Borislav Petkov
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Sep 22, 2021 at 05:30:15PM +0300, Kirill A. Shutemov wrote:
> Not fine, but waiting to blowup with random build environment change.
Why is it not fine?
Are you suspecting that the compiler might generate something else and
not a rip-relative access?
--
Regards/Gruss,
Boris.
https:/
On Thu, Sep 23, 2021 at 02:29:07PM +, Yazen Ghannam wrote:
> > + /*
> > +* If the error was generated in UMC_V2, which belongs to GPU UMCs,
> > +* and error occurred in DramECC (Extended error code = 0) then only
> > +* process the error, else bail out.
> > +*/
> > + if (!m
On Thu, Sep 23, 2021 at 05:23:21PM +, Yazen Ghannam wrote:
> Shouldn't the error still be reported to EDAC for decoding and counting? I
> think users want this.
You know what happens with users getting ECCs reported, right? They
think immediately their hw is going bad and start wanting to repl
On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> Unless we find other way to guarantee RIP-relative access, we must use
> fixup_pointer() to access any global variables.
Yah, I've asked compiler folks about any guarantees we have wrt
rip-relative addresses but it doesn't look
On Fri, Sep 24, 2021 at 12:41:32PM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 23, 2021 at 08:21:03PM +0200, Borislav Petkov wrote:
> > On Thu, Sep 23, 2021 at 12:05:58AM +0300, Kirill A. Shutemov wrote:
> > > Unless we find other way to guarantee RIP-relative a
On Fri, Sep 24, 2021 at 07:46:10PM +, Yazen Ghannam wrote:
> I agree with you in general. But this device isn't really a GPU. And
> users of this device seem to want to count *every* error, at least for
> now.
Aha, so something accelerator-y where they do general purpose computation.
So what'
Signed-off-by: Borislav Petkov
---
arch/x86/include/asm/mem_encrypt.h | 2 --
arch/x86/kernel/sev.c | 6 +++---
arch/x86/mm/mem_encrypt.c | 24 +++-
arch/x86/realmode/init.c | 3 +--
4 files changed, 7 insertions(+), 28 deletions(-)
diff --git
: Tom Lendacky
Signed-off-by: Borislav Petkov
---
arch/Kconfig| 3 ++
include/linux/cc_platform.h | 88 +
2 files changed, 91 insertions(+)
create mode 100644 include/linux/cc_platform.h
diff --git a/arch/Kconfig b/arch/Kconfig
index
From: Tom Lendacky
Introduce an x86 version of the cc_platform_has() function. This will be
used to replace vendor specific calls like sme_active(), sev_active(),
etc.
Signed-off-by: Tom Lendacky
Signed-off-by: Borislav Petkov
---
arch/x86/Kconfig | 1 +
arch/x86/include
-by: Borislav Petkov
---
arch/x86/include/asm/mem_encrypt.h | 2 --
arch/x86/kernel/crash_dump_64.c| 4 +++-
arch/x86/kernel/kvm.c | 3 ++-
arch/x86/kernel/kvmclock.c | 4 ++--
arch/x86/kernel/machine_kexec_64.c | 4 ++--
arch/x86/kvm/svm/svm.c | 3
nally, phys_mem_access_encrypted() is conditionally built as well,
but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is
not set.
Signed-off-by: Tom Lendacky
Signed-off-by: Borislav Petkov
---
arch/x86/include/asm/io.h | 8
arch/x86/mm/ioremap.c | 2 +-
2
From: Borislav Petkov
Hi all,
here's v4 of the cc_platform_has() patchset with feedback incorporated.
I'm going to route this through tip if there are no objections.
Thx.
Tom Lendacky (8):
x86/ioremap: Selectively build arch override encryption functions
arch/cc: Introduce a f
: Borislav Petkov
Acked-by: Michael Ellerman
---
arch/powerpc/platforms/pseries/Kconfig | 1 +
arch/powerpc/platforms/pseries/Makefile | 2 ++
arch/powerpc/platforms/pseries/cc_platform.c | 26
3 files changed, 29 insertions(+)
create mode 100644 arch/powerpc/platforms
sev_active() that are really geared
towards detecting if SME is active.
Signed-off-by: Tom Lendacky
Signed-off-by: Borislav Petkov
---
arch/x86/include/asm/kexec.h | 2 +-
arch/x86/include/asm/mem_encrypt.h | 2 --
arch/x86/kernel/machine_kexec_64.c | 15 ---
arch/x86/kernel
implementation of mem_encrypt_active(), cc_platform_has()
does not need to be implemented in s390 (the config option
ARCH_HAS_CC_PLATFORM is not set).
Signed-off-by: Tom Lendacky
Signed-off-by: Borislav Petkov
---
arch/powerpc/include/asm/mem_encrypt.h | 5 -
arch/powerpc/platforms/pseries/svm.c| 5
On Tue, Sep 28, 2021 at 12:19:49PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> Intel CC support patch is not included in this series. You want me
> to address the issue raised by Joerg before merging it?
Did you not see my email to you today:
https://lkml.kernel.org/r/yvl4zughfsh1q...@zn.tnic
?
On Tue, Sep 28, 2021 at 01:48:46PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> Just read it. If you want to use cpuid_has_tdx_guest() directly in
> cc_platform_has(), then you want to rename intel_cc_platform_has() to
> tdx_cc_platform_has()?
Why?
You simply do:
if (cpuid_has_tdx_guest()
On Tue, Sep 28, 2021 at 02:01:57PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> Yes. But, since the check is related to TDX, I just want to confirm whether
> you are fine with naming the function as intel_*().
Why is this such a big deal?!
There's amd_cc_platform_has() and intel_cc_platform_h
On Tue, Oct 05, 2021 at 04:29:41PM +0200, Paul Menzel wrote:
> Selecting the symbol `AMD_MEM_ENCRYPT` – as
> done in Debian 5.13.9-1~exp1 [1] – also selects
> `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT`, as it defaults to yes,
I'm assuming that "selecting" is done automatically: alldefconfig,
olddefconfig
On Tue, Oct 05, 2021 at 10:48:15AM -0400, Alex Deucher wrote:
> It's not incompatible per se, but SME requires the IOMMU be enabled
> because the C bit used for encryption is beyond the dma_mask of most
> devices. If the C bit is not set, the en/decryption for DMA doesn't
> occur. So you need IOM
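The constraint described above comes down to a bit test: a device can only emit addresses inside its DMA mask, so if the encryption bit lies above that mask, the device cannot set it on its own. A minimal sketch follows. The demo_* names are hypothetical, and the bit position used in the usage note (47) is an assumption for illustration; the real position is platform-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All-ones mask covering the low 'bits' address bits (1..64). */
static uint64_t demo_dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * True if a device limited to 'dma_bits' address bits can emit an
 * address that has the encryption bit at position 'c_bit_pos' set.
 */
static bool demo_device_can_set_c_bit(unsigned int dma_bits,
				      unsigned int c_bit_pos)
{
	uint64_t c_bit;

	assert(c_bit_pos < 64);
	c_bit = 1ULL << c_bit_pos;

	return (c_bit & demo_dma_bit_mask(dma_bits)) != 0;
}
```

Under the assumed position of 47, a device with a 40-bit mask cannot reach the bit, which is the case where address translation has to supply it.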
On Wed, Oct 06, 2021 at 09:23:22AM -0400, Alex Deucher wrote:
> There could be some OEM systems that disable the IOMMU on the platform
> and don't provide a switch in the bios to enable it. The GPU driver
> will still work in that case, it will just not be able to enable KFD
> support for ROCm com
Ok,
so I sat down and wrote something and tried to capture all the stuff we
talked about so that it is clear in the future why we did it.
Thoughts?
---
From: Borislav Petkov
Date: Wed, 6 Oct 2021 19:34:55 +0200
Subject: [PATCH] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
On Wed, Oct 06, 2021 at 02:10:30PM -0400, Alex Deucher wrote:
> This is not limited to Raven.
That's what the innocuous "a.o." wanted to state. :)
> All GPUs (and quite a few other
> devices) have a limited DMA mask. AMD GPUs have between 32 and 48
> bits of DMA depending on what generation the
On Wed, Oct 06, 2021 at 02:36:56PM -0400, Alex Deucher wrote:
> From the x86 model and family info? I think Raven has different
> families from other Zen based CPUs.
Yeah, I'd like to avoid a f/m/s mapping table, if possible. Those things
should be a last resort and they always need adjustment wh
On Wed, Oct 06, 2021 at 02:21:40PM -0400, Alex Deucher wrote:
> And just another general comment, swiotlb + bounce buffers isn't
> really useful on GPUs. You may have 10-100s of MBs of memory mapped
> long term into the GPU's address space for random access. E.g., you
> may have buffers in system
Hi folks,
commit in $Subject breaks rebooting an HP laptop here with a Carrizo
chipset: after typing "reboot" and pressing Enter, it powers off the
machine up to a certain point but the fans remain on, screen goes black
and nothing happens anymore. No reboot. I have to power it off by
holding the
On Fri, Oct 08, 2021 at 11:12:35AM -0400, Alex Deucher wrote:
> Can you try swapping the order of
> amdgpu_device_ip_set_powergating_state() and
> amdgpu_device_ip_set_clockgating_state() in the patch?
Nope, the diff below didn't change things.
Should I comment them out one by one and see whether
On Sat, Oct 09, 2021 at 01:20:39AM +, Quan, Evan wrote:
> Maybe the change below can address your issue.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html
Nope, that one doesn't change anything.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/note
On Sat, Oct 09, 2021 at 09:54:13AM +, Quan, Evan wrote:
> Oops, I just found some necessary changes are missing from the patch of the
> link below.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html
>
> Could you try the patch from the link above + the attached patch?
On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
> I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
> this looks fine. The help text could also be updated to mention problems
> with AMD Raven devices.
This is not only about Raven GPUs but, as Alex explained, pr
t to have SME enabled,
will need to either enable it in their config or use "mem_encrypt=on" on
the kernel command line.
[ tlendacky: Generalize commit message. ]
Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
Reported-by: Paul Menzel
Signed-off-by: Borislav P
On Mon, Oct 11, 2021 at 08:03:51AM +, Quan, Evan wrote:
> OK... Then forget about previous patches. Let's try to narrow down the
> issue first. Please try the attached patch1 first. If it works,
It does.
> please undo the changes of patch1 and try patch2 to narrow down further.
It does too.
On Wed, Oct 13, 2021 at 09:19:45AM +, Quan, Evan wrote:
> So, I need your help to confirm the last two patches (I sent you) do not
> affect the fix for the bug above.
> Please follow the steps below to verify it:
> 1. Launch a video playing
> 2. open another terminal and issue "sudo pm-suspend"
On Thu, Oct 14, 2021 at 02:02:48AM +, Quan, Evan wrote:
> [Quan, Evan] Yes, but not(apply them) at the same time. One by one as you did
> before.
> - try the patch1 first
Ok, first patch worked fine.
> - undo the changes of patch1 and try patch2
Did that, worked fine too except after the fi
On Mon, Oct 18, 2021 at 03:34:32PM +0800, Evan Quan wrote:
> It's confirmed that on some APUs the interaction with SMU(about DPM
> disablement)
> will power off the UVD. That will make the succeeding interactions with UVD
> on the
> suspend path impossible. And the system will hang due to that. T
On Fri, Nov 05, 2021 at 08:05:41AM +, Quan, Evan wrote:
> I'm wondering are you able to give the attached patch(alone) a try.
Yap, looks good.
Tested-by: Borislav Petkov
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Nov 08, 2021 at 09:51:03AM +0100, Paul Menzel wrote:
> Please elaborate the kind of issues.
It fails to reboot on Carrizo-based laptops.
Whoever commits this, pls add
Link: https://lore.kernel.org/r/yv81vidwqlwva...@zn.tnic
so that it is clear what the whole story was.
Thx.
--
Regard
Hi,
so this is a drive-by review using the lore.kernel.org mail because I
wasn't CCed on this.
On Tue, May 11, 2021 at 09:30:58PM -0400, Mukul Joshi wrote:
> +static int amdgpu_bad_page_notifier(struct notifier_block *nb,
> + unsigned long val, void *data)
> +{
> +
On Wed, May 12, 2021 at 07:00:58PM +, Joshi, Mukul wrote:
> SMCA UMCv2 corresponds to GPU's UMC MCA bank and the GPU driver is
> only interested in errors on GPU UMC.
So that thing should be called SMCA_GPU_UMC not SMCA_UMC_V2.
> We cannot know this without is_smca_umc_v2.
You don't need it
On Thu, May 13, 2021 at 03:20:36AM +, Joshi, Mukul wrote:
> Exporting smca_get_bank_type() works fine when CONFIG_X86_MCE_AMD is defined.
> I would need to put #ifdef CONFIG_X86_MCE_AMD in my code to compile the amdgpu
> driver when CONFIG_X86_MCE_AMD is not defined.
> I can avoid all that by u
On Thu, May 13, 2021 at 10:17:47AM -0400, Alex Deucher wrote:
> The bad pages are stored in an EEPROM on the board and the next time
> the driver loads it reads the EEPROM so that it can reserve the bad
> pages at init time so they don't get used again.
And that works automagically on the next boo
On Thu, May 13, 2021 at 10:32:45AM -0400, Alex Deucher wrote:
> Right. The sys admin can query the bad page count and decide when to
> retire the card.
Yap, although the driver should actively "tell" the sysadmin when some
critical counts of retired VRAM pages are reached because I doubt all
admi
On Thu, May 13, 2021 at 11:10:34PM +, Joshi, Mukul wrote:
> That's probably not the best example to look at.
Oh, it is the *perfect* example but...
> smca_get_long_name() is used in drivers/edac/mce_amd.c and this file
> doesn't get compiled when CONFIG_X86_MCE_AMD is not defined.
>
> And amd
On Thu, May 13, 2021 at 11:14:30PM +, Joshi, Mukul wrote:
> Are you OK with a new MCE priority (MCE_PRIO_ACCEL) or do you want us to use
> something else?
I still don't know why a separate priority is needed. Maybe this still
needs answering:
> It is a deferred interrupt that generates an MCE
On Fri, May 14, 2021 at 01:06:33PM +, Joshi, Mukul wrote:
> We have RAS functionality in other ASICs that is not dependent on
> CONFIG_X86_MCE_AMD. So, I don't think we would want to do that just
> for one ASIC.
Lemme try again: you said that those errors do get reported through a
deferred int
On Mon, May 17, 2021 at 03:27:23AM +0500, Mikhail Gavrilov wrote:
> Hi folks.
> 5.13-rc1 after 5.13-rc0 is a disaster because it hangs and hangs again
> after reboot.
> What all the hangs have in common is that they all happen in the
> smp_call_function_many_cond function (I compared all traces [1], [2],
> [3]
On Mon, Jan 17, 2022 at 08:16:09AM +0100, Christian König wrote:
> Interesting to see that even that old stuff is still used.
Well, "used" is a stretch.
This is my way of testing on K8 as pretty much all the big K8 boxes to
which I had access to, got decommissioned so this baby is the only K8
rea
Hi,
this patch breaks X on my box - it is fully responsive and I can log
into it from another machine but both monitors are black and show this:
"The current input timing is not supported by the monitor display. Please
change your input timing to 1920x1200@60Hz or any other monitor
listed ti
Forwarding by mail because I can't find the respective AMD GPU assignee
mail on bugzilla.k.o.
- Forwarded message from bugzilla-dae...@bugzilla.kernel.org -
Date: Sun, 17 Jan 2021 21:13:06 +
From: bugzilla-dae...@bugzilla.kernel.org
To: b...@alien8.de
Subject: [Bug 211245] New: Fedora
Hi folks,
I get the below on -rc2+tip/master. I added printks to your FPU macros:
---
diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h
b/drivers/gpu/drm/amd/display/dc/os_types.h
index 126c2f3a4dd3..49629dc03f99 100644
--- a/drivers/gpu/drm/amd/display/dc/os_types.h
+++ b/drivers/gpu/drm/a
On Fri, Mar 12, 2021 at 06:20:25PM +, Deucher, Alexander wrote:
> Should be fixed with these patches:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15e8b95d5f7509e0b09289be8c422c459c9f0412
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/com
Hi,
patch in $Subject breaks booting on a laptop here, GPU details are
below. The machine stops booting right when it attempts to switch modes
during boot, to a higher mode than the default VGA one. Machine doesn't
ping and is otherwise unresponsive so that a hard reset is the only
thing that help
On Mon, Dec 14, 2020 at 04:53:39PM -0500, Alex Deucher wrote:
> This reverts commit 8353d30e747f4e5cdd867c6b054dbb85cdcc76a9.
>
> This causes a hang on a carrizo based laptop. Revert until we can fix
> it properly.
>
> Cc: Borislav Petkov
Reported-by: me
> Signed
On Tue, Dec 15, 2020 at 10:47:03AM -0500, Rodrigo Siqueira wrote:
> Hi Boris,
>
> Could you check if your branch has this commit:
>
> drm/amd/display: Fix module load hangs when connected to an eDP
>
> If so, could you try this patch:
>
> https://patchwork.freedesktop.org/series/84965/
So I
On Tue, Dec 15, 2020 at 12:04:23PM -0500, Alex Deucher wrote:
> That patch trivially backports to 5.10. See attached backported
> patch. @Borislav Petkov does the attached patch fix 5.10 for you?
Yes, thanks.
Reported-and-tested-by: Borislav Petkov
--
Regards/Gruss,
Boris.
On Tue, Dec 15, 2020 at 02:00:58PM -0500, Rodrigo Siqueira wrote:
> Thanks for reporting this issue and testing the fix.
It was my pleasure. Thanks for the quick fix!
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
> DRM_ERROR("Failed initializing VRAM heap.\n");
> --
Was finally able to test those during workstation hw maintenance so I
was able to install a new kernel and reboot.
Reported-by: Borislav Petkov
Tested-by: Borislav Petkov
Th
Hi,
this below triggers with the latest Linus tree:
51f269a6ecc7 ("Merge tag 'probes-fixes-6.4-rc4' of
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace")
...
[ 16.173593] [drm] radeon kernel modesetting enabled.
[ 16.173743] radeon :29:00.0: vgaarb: deactivate vga console
nd the modesetting
> code when the framebuffer got displayed. It only got unpinned once by
> the fbdev helper radeon_fbdev_destroy_pinned_object(). Hence TTM's BO-
> release function complains about the pin counter. Forcing the outputs
> off also undoes the modesettings pin incre
Hi folks,
this is with Linus' tree from Wed:
041fae9c105a ("Merge tag 'f2fs-for-6.2-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs")
on a CZ laptop:
[7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874
0x103C:0x807E 0xC4)
The splat is kinda messy: