On 01/09/2018 09:44 AM, Johannes Hirte wrote:
On 2018 Jan 03, Johannes Hirte wrote:
On 2018 Jan 03, Johannes Hirte wrote:
This should be fixed already with
https://lists.freedesktop.org/archives/amd-gfx/2017-October/014932.html
but it's still missing upstream.
With this patch, the use-after-free in amdgpu_job_free_cb seems to be
gone. But now I get a use-after-free in
drm_atomic_helper_wait_for_flip_done:
[89387.069387]
==================================================================
[89387.069407] BUG: KASAN: use-after-free in
drm_atomic_helper_wait_for_flip_done+0x24f/0x270
[89387.069413] Read of size 8 at addr ffff880124df0688 by task
kworker/u8:3/31426
[89387.069423] CPU: 1 PID: 31426 Comm: kworker/u8:3 Not tainted
4.15.0-rc6-00001-ge0895ba8d88e #442
[89387.069427] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.10
10/12/2017
[89387.069435] Workqueue: events_unbound commit_work
[89387.069440] Call Trace:
[89387.069448] dump_stack+0x99/0x11e
[89387.069453] ? _atomic_dec_and_lock+0x152/0x152
[89387.069460] print_address_description+0x65/0x270
[89387.069465] kasan_report+0x272/0x360
[89387.069470] ? drm_atomic_helper_wait_for_flip_done+0x24f/0x270
[89387.069475] drm_atomic_helper_wait_for_flip_done+0x24f/0x270
[89387.069483] amdgpu_dm_atomic_commit_tail+0x185e/0x2b90
[89387.069492] ? dm_crtc_duplicate_state+0x130/0x130
[89387.069498] ? drm_atomic_helper_wait_for_dependencies+0x3f2/0x800
[89387.069504] commit_tail+0x92/0xe0
[89387.069511] process_one_work+0x84b/0x1600
[89387.069517] ? tick_nohz_dep_clear_signal+0x20/0x20
[89387.069522] ? _raw_spin_unlock_irq+0xbe/0x120
[89387.069525] ? _raw_spin_unlock+0x120/0x120
[89387.069529] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[89387.069534] ? arch_vtime_task_switch+0xee/0x190
[89387.069539] ? finish_task_switch+0x27d/0x7f0
[89387.069542] ? wq_worker_waking_up+0xc0/0xc0
[89387.069547] ? copy_overflow+0x20/0x20
[89387.069550] ? sched_clock_cpu+0x18/0x1e0
[89387.069558] ? pci_mmcfg_check_reserved+0x100/0x100
[89387.069562] ? pci_mmcfg_check_reserved+0x100/0x100
[89387.069569] ? schedule+0xfb/0x3b0
[89387.069574] ? __schedule+0x19b0/0x19b0
[89387.069578] ? _raw_spin_unlock_irq+0xb9/0x120
[89387.069582] ? _raw_spin_unlock_irq+0xbe/0x120
[89387.069585] ? _raw_spin_unlock+0x120/0x120
[89387.069590] worker_thread+0x211/0x1790
[89387.069597] ? pick_next_task_fair+0x313/0x10f0
[89387.069601] ? trace_event_raw_event_workqueue_work+0x170/0x170
[89387.069606] ? __read_once_size_nocheck.constprop.6+0x10/0x10
[89387.069612] ? tick_nohz_dep_clear_signal+0x20/0x20
[89387.069616] ? account_idle_time+0x94/0x1f0
[89387.069620] ? _raw_spin_unlock_irq+0xbe/0x120
[89387.069623] ? _raw_spin_unlock+0x120/0x120
[89387.069628] ? finish_task_switch+0x27d/0x7f0
[89387.069633] ? sched_clock_cpu+0x18/0x1e0
[89387.069639] ? ret_from_fork+0x1f/0x30
[89387.069644] ? pci_mmcfg_check_reserved+0x100/0x100
[89387.069650] ? cyc2ns_read_end+0x20/0x20
[89387.069657] ? schedule+0xfb/0x3b0
[89387.069662] ? __schedule+0x19b0/0x19b0
[89387.069666] ? remove_wait_queue+0x2b0/0x2b0
[89387.069670] ? arch_vtime_task_switch+0xee/0x190
[89387.069675] ? _raw_spin_unlock_irqrestore+0xc2/0x130
[89387.069679] ? _raw_spin_unlock_irq+0x120/0x120
[89387.069683] ? trace_event_raw_event_workqueue_work+0x170/0x170
[89387.069688] kthread+0x2d4/0x390
[89387.069693] ? kthread_create_worker+0xd0/0xd0
[89387.069697] ret_from_fork+0x1f/0x30
[89387.069705] Allocated by task 2387:
[89387.069712] kasan_kmalloc+0xa0/0xd0
[89387.069717] kmem_cache_alloc_trace+0xd1/0x1e0
[89387.069722] dm_crtc_duplicate_state+0x73/0x130
[89387.069726] drm_atomic_get_crtc_state+0x13c/0x400
[89387.069730] page_flip_common+0x52/0x230
[89387.069734] drm_atomic_helper_page_flip+0xa1/0x100
[89387.069739] drm_mode_page_flip_ioctl+0xc10/0x1030
[89387.069744] drm_ioctl_kernel+0x1b5/0x2c0
[89387.069748] drm_ioctl+0x709/0xa00
[89387.069752] amdgpu_drm_ioctl+0x118/0x280
[89387.069756] do_vfs_ioctl+0x18a/0x1260
[89387.069760] SyS_ioctl+0x6f/0x80
[89387.069764] do_syscall_64+0x220/0x670
[89387.069768] return_from_SYSCALL_64+0x0/0x65
[89387.069772] Freed by task 2533:
[89387.069776] kasan_slab_free+0x71/0xc0
[89387.069780] kfree+0x88/0x1b0
[89387.069784] drm_atomic_state_default_clear+0x2c8/0xa00
[89387.069787] __drm_atomic_state_free+0x30/0xd0
[89387.069791] drm_atomic_helper_update_plane+0xb6/0x350
[89387.069794] __setplane_internal+0x5b4/0x9d0
[89387.069798] drm_mode_cursor_universal+0x412/0xc60
[89387.069801] drm_mode_cursor_common+0x4b6/0x890
[89387.069805] drm_mode_cursor_ioctl+0xd3/0x120
[89387.069809] drm_ioctl_kernel+0x1b5/0x2c0
[89387.069813] drm_ioctl+0x709/0xa00
[89387.069816] amdgpu_drm_ioctl+0x118/0x280
[89387.069819] do_vfs_ioctl+0x18a/0x1260
[89387.069822] SyS_ioctl+0x6f/0x80
[89387.069824] do_syscall_64+0x220/0x670
[89387.069828] return_from_SYSCALL_64+0x0/0x65
[89387.069834] The buggy address belongs to the object at ffff880124df0480
[89387.069839] The buggy address is located 520 bytes inside of
[89387.069843] The buggy address belongs to the page:
[89387.069849] page:00000000b20cc097 count:1 mapcount:0 mapping:
(null) index:0x0 compound_mapcount: 0
[89387.069856] flags: 0x2000000000008100(slab|head)
[89387.069862] raw: 2000000000008100 0000000000000000 0000000000000000
00000001801c001c
[89387.069867] raw: dead000000000100 dead000000000200 ffff8803f3002c40
0000000000000000
[89387.069869] page dumped because: kasan: bad access detected
[89387.069874] Memory state around the buggy address:
[89387.069878] ffff880124df0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[89387.069881] ffff880124df0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[89387.069885] >ffff880124df0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[89387.069888] ^
[89387.069891] ffff880124df0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[89387.069895] ffff880124df0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[89387.069897]
==================================================================
ping? There are two different use-after-frees in kernel code and nobody
cares?
+ Harry and Leo
Hi, is there a particular scenario in which this happens? Can you attach
dmesg output with echo 0x10 > /sys/module/drm/parameters/debug?
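(A short sketch of what that knob does, for anyone following along: drm.debug is a bitmask of debug categories, and 0x10 selects the atomic-modeset category, which is the path involved in this report. The category values below match include/drm/drm_print.h at the time of this thread; the write is guarded so the snippet is safe to run on a box without the knob.)

```shell
# DRM debug categories form a bitmask; 0x10 is the atomic-modeset category.
DRM_UT_CORE=0x01; DRM_UT_DRIVER=0x02; DRM_UT_KMS=0x04
DRM_UT_PRIME=0x08; DRM_UT_ATOMIC=0x10; DRM_UT_VBL=0x20

MASK=$((DRM_UT_ATOMIC))
echo "drm.debug mask: $MASK"

# Apply only when the knob exists and is writable (requires root):
DEBUG_KNOB=/sys/module/drm/parameters/debug
if [ -w "$DEBUG_KNOB" ]; then
    echo "$MASK" > "$DEBUG_KNOB"
fi
```

Categories can be OR'ed together, e.g. $((DRM_UT_KMS | DRM_UT_ATOMIC)) to get both KMS and atomic messages; dmesg -w then follows the output live.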
From a quick look it looks like a bad refcount on the old CRTC state:
drm_atomic_state_put in __setplane_internal causes the CRTC state to be
released outright instead of just decrementing the refcount, as it is
supposed to, since the drm_atomic_commit called from __setplane_internal
should have attached those states to the CRTC objects. I would trace the
refcounts to verify this.
Thanks,
Andrey
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx