Control: tags -1 + upstream

Hi,

On Wed, Apr 02, 2025 at 12:22:14PM -0400, Calum McConnell wrote:
> Package: src:linux
> Version: 6.12.20-1
> Severity: normal
> X-Debbugs-Cc: debian-am...@lists.debian.org
> User: debian-am...@lists.debian.org
> Usertags: amd64
> 
> I had a kernel crash and dump occur when running Freecad (version in Trixie
> repos) with DRI_PRIME=1 and an override needed to let it run
> (COIN_GL_NO_CURRENT_CONTEXT_CHECK=1). It's likely that there are bugs in
> Freecad as well, which I will probably try to report, but it was
> successfully running, and I was manipulating constraints when the whole
> system went down with a kdump. The error is a BUG: Null Pointer
> Dereference. I had previously (and successfully) been running other games
> and programs on the amdgpu during this boot, as confirmed by framerates
> and radeontop. I have not yet tried to reproduce the error on a clean
> boot, without a boatload of other programs running.
> 
> Kdump-tools collected the dump. I have attached the DMESG output (xz
> compressed). A complete dump is available, but weighs in at 1.4GB after
> xz compression; a clean boot reproduction would likely be smaller, and
> available on request. I uploaded the dump to:
> https://drive.google.com/file/d/1gro_KMPDqG1kp4BN-VXyZCAM_S8Pg64L/view?usp=sharing
> 
> The OOPS/BUG/kernel standard crash log is below. The 'kernel log' in the
> main body of this message refers to the log of a normal-so-far boot. I
> also want to draw attention to the line "i915 0000:00...". Unlike the
> other lines that occur before the oops, this line is NOT typically
> printed while my machine is operating.
> 
> [611643.107695] [ T388600] pcieport 0000:00:1d.0: Intel SPT PCH root port ACS 
> workaround enabled
> [611643.288862] [ T386248] i915 0000:00:02.0: [drm] *ERROR* Atomic update 
> failure on pipe A (start=1 end=2) time 201 us, min 1073, max 1079, scanline 
> start 1070, end 1083
> [611643.344171] [ T388600] [drm] PCIE GART of 256M enabled (table at 
> 0x000000F400000000).
> [611643.607739] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.1.0 test failed (-110)
> [611643.810417] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.2.0 test failed (-110)
> [611644.014733] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.3.0 test failed (-110)
> [611644.218676] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.0.1 test failed (-110)
> [611644.423194] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.1.1 test failed (-110)
> [611644.627687] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.2.1 test failed (-110)
> [611644.832908] [ T388600] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper 
> [amdgpu]] *ERROR* ring comp_1.3.1 test failed (-110)
> [611644.969406] [ T388600] [drm] UVD and UVD ENC initialized successfully.
> [611645.069388] [ T388600] [drm] VCE initialized successfully.
> [611645.075260] [ T388600] amdgpu 0000:05:00.0: [drm] Cannot find any crtc or 
> sizes
> [611645.172785] [ T388615] [drm] scheduler comp_1.1.0 is not ready, skipping
> [611645.172788] [ T388615] [drm] scheduler comp_1.2.0 is not ready, skipping
> [611645.172791] [ T388615] [drm] scheduler comp_1.3.0 is not ready, skipping
> [611645.172793] [ T388615] [drm] scheduler comp_1.0.1 is not ready, skipping
> [611645.172794] [ T388615] [drm] scheduler comp_1.1.1 is not ready, skipping
> [611645.172795] [ T388615] [drm] scheduler comp_1.2.1 is not ready, skipping
> [611645.172796] [ T388615] [drm] scheduler comp_1.3.1 is not ready, skipping
> [611645.172798] [ T388615] BUG: kernel NULL pointer dereference, address: 
> 0000000000000008
> [611645.172801] [ T388615] #PF: supervisor read access in kernel mode
> [611645.172802] [ T388615] #PF: error_code(0x0000) - not-present page
> [611645.172804] [ T388615] PGD 0 P4D 0 
> [611645.172807] [ T388615] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> [611645.172811] [ T388615] CPU: 0 UID: 1000 PID: 388615 Comm: freecad:cs0 
> Kdump: loaded Tainted: G     U             6.12.19-amd64 #1  Debian 6.12.19-1
> [611645.172815] [ T388615] Tainted: [U]=USER
> [611645.172817] [ T388615] Hardware name: Dell Inc. Latitude 7424 Rugged 
> Extreme/0TJ1W1, BIOS 1.35.0 11/07/2024
> [611645.172818] [ T388615] RIP: 0010:drm_sched_job_arm+0x23/0x60 [gpu_sched]
> [611645.172827] [ T388615] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 
> 00 55 53 48 8b 6f 60 48 85 ed 74 3f 48 89 fb 48 89 ef e8 a1 38 00 00 48 8b 45 
> 10 <48> 8b 50 08 48 89 53 18 8b 45 24 89 43 5c b8 01 00 00 00 f0 48 0f
> [611645.172830] [ T388615] RSP: 0018:ffffaa5246bab808 EFLAGS: 00010206
> [611645.172832] [ T388615] RAX: 0000000000000000 RBX: ffff9c6eda0c9400 RCX: 
> ffff9c6f18e190d0
> [611645.172834] [ T388615] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
> ffff9c6f18e1a838
> [611645.172836] [ T388615] RBP: ffff9c6f18e1a810 R08: ffff9c6f0185b468 R09: 
> ffffaa5246bab648
> [611645.172838] [ T388615] R10: ffffffffac4741e8 R11: 0000000000000003 R12: 
> 0000000000000000
> [611645.172839] [ T388615] R13: ffffaa5246bab888 R14: 0000000000000000 R15: 
> 0000000000000000
> [611645.172841] [ T388615] FS:  00007f80c4bff6c0(0000) 
> GS:ffff9c722fa00000(0000) knlGS:0000000000000000
> [611645.172843] [ T388615] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [611645.172845] [ T388615] CR2: 0000000000000008 CR3: 000000012df78006 CR4: 
> 00000000003726f0
> [611645.172847] [ T388615] Call Trace:
> [611645.172849] [ T388615]  <TASK>
> [611645.172852] [ T388615]  ? __die_body.cold+0x19/0x27
> [611645.172857] [ T388615]  ? page_fault_oops+0x15c/0x2e0
> [611645.172862] [ T388615]  ? exc_page_fault+0x7e/0x180
> [611645.172865] [ T388615]  ? asm_exc_page_fault+0x26/0x30
> [611645.172870] [ T388615]  ? drm_sched_job_arm+0x23/0x60 [gpu_sched]
> [611645.172875] [ T388615]  ? drm_sched_job_arm+0x1f/0x60 [gpu_sched]
> [611645.172879] [ T388615]  amdgpu_cs_ioctl+0x14f2/0x1a20 [amdgpu]
> [611645.173308] [ T388615]  ? psi_group_change+0x138/0x300
> [611645.173315] [ T388615]  ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
> [611645.173638] [ T388615]  drm_ioctl_kernel+0xad/0x100 [drm]
> [611645.173694] [ T388615]  drm_ioctl+0x277/0x4f0 [drm]
> [611645.173737] [ T388615]  ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
> [611645.174084] [ T388615]  amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
> [611645.174393] [ T388615]  __x64_sys_ioctl+0x91/0xd0
> [611645.174396] [ T388615]  do_syscall_64+0x82/0x190
> [611645.174399] [ T388615]  ? gup_fast_pte_range+0xd0/0x380
> [611645.174403] [ T388615]  ? futex_wake+0x8f/0x1b0
> [611645.174407] [ T388615]  ? do_futex+0x125/0x190
> [611645.174409] [ T388615]  ? __x64_sys_futex+0x127/0x1e0
> [611645.174411] [ T388615]  ? sched_clock+0x10/0x30
> [611645.174413] [ T388615]  ? sched_clock_cpu+0xf/0x1d0
> [611645.174416] [ T388615]  ? syscall_exit_to_user_mode+0x4d/0x210
> [611645.174419] [ T388615]  ? do_syscall_64+0x8e/0x190
> [611645.174421] [ T388615]  ? wake_up_q+0x4e/0x90
> [611645.174424] [ T388615]  ? futex_wake+0x187/0x1b0
> [611645.174427] [ T388615]  ? do_futex+0x125/0x190
> [611645.174429] [ T388615]  ? __x64_sys_futex+0x127/0x1e0
> [611645.174431] [ T388615]  ? syscall_exit_to_user_mode+0x4d/0x210
> [611645.174433] [ T388615]  ? do_syscall_64+0x8e/0x190
> [611645.174436] [ T388615]  ? syscall_exit_to_user_mode+0x4d/0x210
> [611645.174438] [ T388615]  ? do_syscall_64+0x8e/0x190
> [611645.174440] [ T388615]  ? do_syscall_64+0x8e/0x190
> [611645.174442] [ T388615]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [611645.174445] [ T388615] RIP: 0033:0x7f80eb3168db
> [611645.174469] [ T388615] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 
> 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 
> 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> [611645.174472] [ T388615] RSP: 002b:00007f80c4bfe750 EFLAGS: 00000246 
> ORIG_RAX: 0000000000000010
> [611645.174475] [ T388615] RAX: ffffffffffffffda RBX: 00000000c0186444 RCX: 
> 00007f80eb3168db
> [611645.174477] [ T388615] RDX: 00007f80c4bfe7e0 RSI: 00000000c0186444 RDI: 
> 000000000000001b
> [611645.174478] [ T388615] RBP: 00007f80c4bfe820 R08: 00007f80c4bfe8a0 R09: 
> 00007f80c4bfe7b0
> [611645.174480] [ T388615] R10: 0000000000000000 R11: 0000000000000246 R12: 
> 00007f80c4bfe7e0
> [611645.174481] [ T388615] R13: 000000000000001b R14: 00007f80c4bfe9e0 R15: 
> 00007f80c4bfe860
> [611645.174484] [ T388615]  </TASK>
> [611645.174485] [ T388615] Modules linked in: snd_usb_audio snd_usbmidi_lib 
> snd_rawmidi uinput sd_mod ccm snd_seq_dummy snd_hrtimer snd_seq 
> snd_seq_device rfcomm cmac algif_hash algif_skcipher af_alg bnep cpuid 
> snd_hda_codec_hdmi dell_pc platform_profile intel_uncore_frequency 
> intel_uncore_frequency_common snd_sof_pci_intel_skl x86_pkg_temp_thermal 
> snd_sof_intel_hda_generic binfmt_misc soundwire_intel 
> soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda_common 
> nls_ascii nls_cp437 snd_soc_hdac_hda vfat snd_sof_intel_hda_mlink fat 
> snd_sof_intel_hda dell_rbtn intel_powerclamp snd_sof_pci snd_sof_xtensa_dsp 
> coretemp snd_sof kvm_intel snd_sof_utils snd_soc_acpi_intel_match 
> snd_soc_acpi soundwire_bus kvm snd_soc_avs snd_soc_hda_codec snd_hda_ext_core 
> iwlmvm snd_ctl_led snd_soc_core snd_hda_codec_realtek snd_hda_codec_generic 
> snd_compress crct10dif_pclmul snd_hda_scodec_component snd_pcm_dmaengine 
> ghash_clmulni_intel intel_rapl_msr mac80211 sha512_ssse3 snd_hda_intel 
> sha256_ssse3 dell_laptop snd_intel_dspcfg sha1_ssse3 mei_wdt
> [611645.174530] [ T388615]  mei_hdcp mei_pxp snd_intel_sdw_acpi aesni_intel 
> snd_hda_codec gf128mul uvcvideo btusb crypto_simd cryptd btrtl btintel rapl 
> videobuf2_vmalloc snd_hda_core uvc processor_thermal_device_pci_legacy 
> processor_thermal_device videobuf2_memops btbcm intel_cstate videobuf2_v4l2 
> btmtk processor_thermal_wt_hint snd_hwdep dell_smm_hwmon dell_wmi libarc4 
> processor_thermal_rfim intel_uncore videodev snd_pcm processor_thermal_rapl 
> bluetooth intel_rapl_common iwlwifi snd_timer dell_smbios iTCO_wdt 
> processor_thermal_wt_req ucsi_acpi dell_wmi_sysman firmware_attributes_class 
> intel_pmc_bxt pcspkr processor_thermal_power_floor snd typec_ucsi dcdbas 
> videobuf2_common mei_me typec processor_thermal_mbox iTCO_vendor_support 
> wmi_bmof intel_xhci_usb_role_switch dell_wmi_descriptor mei mc watchdog 
> ee1004 soundcore intel_pch_thermal intel_soc_dts_iosf roles cfg80211 
> soc_button_array sg joydev dell_smo8800 intel_pmc_core intel_hid intel_vsec 
> int3400_thermal int3403_thermal sparse_keymap acpi_pad pmt_telemetry 
> acpi_thermal_rel
> [611645.174578] [ T388615]  int340x_thermal_zone pmt_class ac rfkill 
> serio_raw evdev msr parport_pc ppdev lp parport nvme_fabrics dm_mod loop 
> nvme_keyring efi_pstore configfs nfnetlink ip_tables x_tables autofs4 btrfs 
> blake2b_generic efivarfs raid10 raid456 async_raid6_recov async_memcpy 
> async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 
> md_mod zstd sr_mod cdrom r8153_ecm cdc_ether usbnet hid_multitouch 
> hid_generic uas usb_storage scsi_mod r8152 mii libphy scsi_common usbhid 
> amdgpu i915 amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit 
> drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper xhci_pci 
> xhci_hcd ttm nvme i2c_hid_acpi i2c_hid usbcore drm_kms_helper hid nvme_core 
> i2c_i801 intel_lpss_pci crc32_pclmul video intel_lpss crc32c_intel e1000e 
> i2c_smbus nvme_auth crc16 idma64 usb_common drm battery wmi button
> [611645.174634] [ T388615] CR2: 0000000000000008

FWIW, the crash looks similar to what was discussed in:
https://lore.kernel.org/all/20250107140240.325899-1-philipp.reis...@linbit.com/

Regards,
Salvatore
