On 10/12/18 7:49 PM, Suvayu Ali wrote:
> Hello Rick,
> 
> On Fri, Oct 12, 2018 at 11:51:40PM +0000, Rick Stevens wrote:
>>>
>>>   [drm] amdgpu kernel modesetting enabled.
>>>   [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x1458:0xD000 0xC6).
>>>   [drm] register mmio base: 0xFE600000
>>>   [drm] register mmio size: 524288
>>>   [drm] add ip block number 0 <soc15_common>
>>>   [drm] add ip block number 1 <gmc_v9_0>
>>>   [drm] add ip block number 2 <vega10_ih>
>>>   [drm] add ip block number 3 <psp>
>>>   [drm] add ip block number 4 <powerplay>
>>>   [drm] add ip block number 5 <dm>
>>>   [drm] add ip block number 6 <gfx_v9_0>
>>>   [drm] add ip block number 7 <sdma_v4_0>
>>>   [drm] add ip block number 8 <vcn_v1_0>
>>>   amdgpu 0000:06:00.0: Direct firmware load for amdgpu/raven_gpu_info.bin failed with error -2
>>>   amdgpu 0000:06:00.0: Failed to load gpu_info firmware "amdgpu/raven_gpu_info.bin"
>>>   amdgpu 0000:06:00.0: Fatal error during GPU init
>>>   [drm] amdgpu: finishing device.
>>>   amdgpu: probe of 0000:06:00.0 failed with error -2
>>
>> Well, error "-2" typically is "file not found", so my guess is the code
>> isn't looking in the right spot for the firmware files. Yes, you have
>> them in the right spot, but it could be a permissions or selinux context
>> issue as well.
>>
>> Have you checked dmesg, journalctl and/or the selinux logs to see what
>> they say?
> 
> What I quoted above _is_ from the journal.  I double checked, there are no 
> other suspicious messages.  The SELinux label for the firmware files is 
> "system_u:object_r:lib_t:s0".  There are also no AVC denials.

Yes, those are the same SELinux attributes I have.
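
If you want to double-check the on-disk side in one go, here is a rough
sketch. It assumes the firmware lives under /usr/lib/firmware (the log only
gives the relative name "amdgpu/raven_gpu_info.bin"), and check_firmware is
just a helper name I made up:

```shell
#!/bin/bash
# Hypothetical helper: report whether a firmware file exists and is
# readable, then show its permissions and SELinux label.
check_firmware() {
    local fw="$1"
    if [ ! -e "$fw" ]; then
        echo "missing: $fw"
        return 2        # mirrors ENOENT (errno 2), the "-2" in the log
    fi
    if [ ! -r "$fw" ]; then
        echo "unreadable: $fw"
        return 13       # mirrors EACCES (errno 13)
    fi
    echo "ok: $fw"
    # -Z adds the SELinux context column; fall back to plain -l if this
    # ls does not support it.
    ls -lZ "$fw" 2>/dev/null || ls -l "$fw"
}
```

Usage, with the assumed path:

        check_firmware /usr/lib/firmware/amdgpu/raven_gpu_info.bin

Note that root can read almost anything, so run it as a normal user if you
want the readability check to mean something.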

> My pre-post searching told me "-2" means "file not found", so I had already 
> checked the usual suspects.  I thought a more verbose message in the journal 
> might reveal the reason behind the "-2", so I replaced "quiet" with "verbose" 
> in the kernel arguments, but that didn't help.  I also looked for any udev 
> rules that might try to load the firmware from a different path 
> (apparently Ubuntu did that at some point), no go.

Yup, Ubuntu has done a few non-standard things in the past. Well,
non-standard from a Fedora/Red Hat/CentOS view at least.

> I also ran `rpm --verify` on the kernel-{core,modules} and linux-firmware 
> packages, all I got is:
> 
>   # rpm --verify kernel-core-4.18.12  # picked a few kernels, same for all
>   .M.......  g /boot/System.map-4.18.12-200.fc28.x86_64
>   .M.......  g /boot/initramfs-4.18.12-200.fc28.x86_64.img
> 
> The permissions are 600.
> 
> My hunch is, as the ROCm installation uses dkms to build the kernel module 
> (which failed btw, that's why ROCm didn't work for me), the uninstallation 
> somehow leaves behind some configuration which persists across kernels.  Do 
> you think that's possible?

Yes, that's possible. I've never even tried ROCm, so I can't speak to
it. You may have to rebuild your initramfs image for the kernel.
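
One thing that can produce exactly this symptom: if amdgpu is loaded from the
initramfs, the firmware has to be inside the image too, otherwise the load
fails with -2 even though the file on disk is fine. A rough sketch of how I'd
check, assuming dracut's lsinitrd is installed and using the kernel version
from your rpm --verify output; listing_has and check_initramfs are names I
made up:

```shell
#!/bin/bash
# Hypothetical helper: read a file listing on stdin (e.g. from lsinitrd)
# and succeed if any line contains the given fixed string.
listing_has() {
    grep -qF -- "$1"
}

# Assumed names, taken from the rpm --verify output quoted above.
kver=4.18.12-200.fc28.x86_64
img=/boot/initramfs-$kver.img

# Needs root, since the image is mode 600.
check_initramfs() {
    if lsinitrd "$img" | listing_has "amdgpu/raven_gpu_info.bin"; then
        echo "firmware is in $img"
    else
        echo "firmware missing from $img; rebuild with:"
        echo "  dracut --force $img $kver"
    fi
}
```

Source the file in a root shell and call check_initramfs. If the firmware
turns out to be missing, the dracut command it prints rebuilds the image for
that kernel.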

It may be easier to simply reinstall the kernel:

        sudo dnf reinstall kernel-core-4.18.12

and see if that clears the issue.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ri...@alldigital.com -
- AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
-                                                                    -
-   I haven't lost my mind.  It's backed up on tape somewhere, but   -
-                       probably not recoverable.                    -
----------------------------------------------------------------------