Hi Mark,

Are you still seeing the hang?

If so, could you share your playback and transcode streams and the command
lines you used? Then I can try to reproduce it on my side.

Thanks & Best Regards!

James Zhu

On 2018-02-10 11:06 AM, Mark Thompson wrote:

On 08/02/18 23:05, Mark Thompson wrote:
On 08/02/18 22:37, Alex Deucher wrote:
On Thu, Feb 8, 2018 at 5:28 PM, Mark Thompson <s...@jkqxz.net> wrote:
On 06/02/18 20:05, James Zhu wrote:
This whole series is the updated version. The changes are mainly based
on comments from the previous code review by Alex, Leo and Boyuan.

James Zhu (8):
   amd/common:add uvd hevc enc support check in hw query
   winsys/amdgpu:add uvd hevc enc support in amdgpu cs
   radeon/uvd:add uvd hevc enc hw interface header
   radeon/uvd:add uvd hevc enc hw ib implementation
   radeon/uvd:add uvd hevc enc functions
   radeon/uvd:add uvd hevc enc files in Makefile list
   radeonsi:create uvd hevc enc entry
   radeonsi: enable uvd encode for HEVC main

  src/amd/common/ac_gpu_info.c                    |   10 +-
  src/amd/common/ac_gpu_info.h                    |    1 +
  src/gallium/drivers/radeon/Makefile.sources     |    3 +
  src/gallium/drivers/radeon/radeon_uvd_enc.c     |  370 ++++++++
  src/gallium/drivers/radeon/radeon_uvd_enc.h     |  471 ++++++++++
  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 +++++++++++++++++++++++
  src/gallium/drivers/radeonsi/si_get.c           |    4 +-
  src/gallium/drivers/radeonsi/si_uvd.c           |   15 +-
  src/gallium/winsys/amdgpu/drm/amdgpu_cs.c       |    6 +
  9 files changed, 1990 insertions(+), 5 deletions(-)
  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.c
  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.h
  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c

Can you explain what the requirements are for using this (hardware, firmware, 
software)?

From what I can find it should be on Polaris and Vega, but I haven't succeeded
in getting it working on Polaris.

Yes, polaris and vega10.  For polaris, you'll need a kernel that
enables the uvd enc rings.  Patches went upstream last year, 4.14 I
think?  4.15 is a good bet.

Ah, that's where I'm going wrong - despite the dates it's not actually in 4.14,
so I need 4.15.

As for the polaris firmware, you'll need
version FW_1_130_16 or newer:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=2a713be25a44bd6cec90d8affc54b246a2ca9c7b

Right, I have the encoder working with 4.15.2 on an RX 460 / Polaris 11 with
firmware 1.130_16.
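
For anyone checking the kernel requirement discussed above, here is a minimal sketch that compares the running kernel release against 4.15. The helper names (`parse_kernel_release`, `kernel_is_new_enough`) are made up for illustration and are not part of Mesa or the kernel. It only looks at the version string, so it cannot tell whether a distribution has backported the UVD encode ring patches to an older kernel, and it does not check the firmware version at all.

```python
import platform
import re

# From the thread: the UVD encode rings for Polaris are in 4.15, not 4.14.
MIN_KERNEL = (4, 15)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '4.15.2'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release, minimum=MIN_KERNEL):
    """True if the release string is at least the given (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    verdict = "ok" if kernel_is_new_enough(release) else "older than 4.15"
    print(release, verdict)
```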

There seems to be some issue with using both encode and playback at the same
time?  It hangs the amdgpu driver, and all userspace processes interacting with
it become stuck and unkillable, requiring a reboot to recover.  It's completely
repeatable, and only needs a few seconds to die when both mpv (playback) and
ffmpeg (transcode) are running at the same time.

There is no message at all from the stuck driver, but I end up with hung tasks 
like:

[ 1209.317130] INFO: task kworker/u24:0:5 blocked for more than 120 seconds.
[ 1209.317132]       Not tainted 4.15.2 #2
[ 1209.317133] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1209.317133] kworker/u24:0   D    0     5      2 0x80000000
[ 1209.317137] Workqueue: events_unbound commit_work
[ 1209.317138] Call Trace:
[ 1209.317142]  ? __schedule+0x26b/0x840
[ 1209.317144]  ? __update_load_avg_se.isra.37+0x1b6/0x1c0
[ 1209.317145]  schedule+0x28/0x80
[ 1209.317146]  schedule_timeout+0x1de/0x360
[ 1209.317177]  ? dce110_timing_generator_get_position+0x51/0x60 [amdgpu]
[ 1209.317199]  ? dce110_timing_generator_get_crtc_scanoutpos+0x6b/0xa0 [amdgpu]
[ 1209.317201]  dma_fence_default_wait+0x1f6/0x280
[ 1209.317203]  ? dma_fence_release+0x90/0x90
[ 1209.317204]  dma_fence_wait_timeout+0x33/0xe0
[ 1209.317205]  reservation_object_wait_timeout_rcu+0x198/0x340
[ 1209.317227]  amdgpu_dm_do_flip+0x112/0x350 [amdgpu]
[ 1209.317248]  amdgpu_dm_atomic_commit_tail+0x8a4/0x9a0 [amdgpu]
[ 1209.317250]  ? pick_next_task_fair+0x14f/0x5f0
[ 1209.317251]  commit_tail+0x3a/0x70
[ 1209.317252]  process_one_work+0x17c/0x370
[ 1209.317253]  worker_thread+0x2e/0x370
[ 1209.317255]  ? process_one_work+0x370/0x370
[ 1209.317256]  kthread+0x111/0x130
[ 1209.317257]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 1209.317258]  ret_from_fork+0x1f/0x30
[ 1330.152054] INFO: task kworker/u24:0:5 blocked for more than 120 seconds.
[ 1330.152056]       Not tainted 4.15.2 #2
[ 1330.152056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.152057] kworker/u24:0   D    0     5      2 0x80000000
[ 1330.152059] Workqueue: events_unbound commit_work
[ 1330.152060] Call Trace:
[ 1330.152063]  ? __schedule+0x26b/0x840
[ 1330.152065]  ? __update_load_avg_se.isra.37+0x1b6/0x1c0
[ 1330.152066]  schedule+0x28/0x80
[ 1330.152067]  schedule_timeout+0x1de/0x360
[ 1330.152108]  ? dce110_timing_generator_get_position+0x51/0x60 [amdgpu]
[ 1330.152130]  ? dce110_timing_generator_get_crtc_scanoutpos+0x6b/0xa0 [amdgpu]
[ 1330.152132]  dma_fence_default_wait+0x1f6/0x280
[ 1330.152133]  ? dma_fence_release+0x90/0x90
[ 1330.152134]  dma_fence_wait_timeout+0x33/0xe0
[ 1330.152136]  reservation_object_wait_timeout_rcu+0x198/0x340
[ 1330.152158]  amdgpu_dm_do_flip+0x112/0x350 [amdgpu]
[ 1330.152179]  amdgpu_dm_atomic_commit_tail+0x8a4/0x9a0 [amdgpu]
[ 1330.152180]  ? pick_next_task_fair+0x14f/0x5f0
[ 1330.152181]  commit_tail+0x3a/0x70
[ 1330.152183]  process_one_work+0x17c/0x370
[ 1330.152184]  worker_thread+0x2e/0x370
[ 1330.152185]  ? process_one_work+0x370/0x370
[ 1330.152186]  kthread+0x111/0x130
[ 1330.152187]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 1330.152188]  ret_from_fork+0x1f/0x30
[ 1330.152196] INFO: task mpv/vo:3113 blocked for more than 120 seconds.
[ 1330.152197]       Not tainted 4.15.2 #2
[ 1330.152197] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.152198] mpv/vo          D    0  3113   2983 0x80000006
[ 1330.152199] Call Trace:
[ 1330.152200]  ? __schedule+0x26b/0x840
[ 1330.152201]  schedule+0x28/0x80
[ 1330.152202]  schedule_preempt_disabled+0xa/0x10
[ 1330.152204]  __mutex_lock.isra.1+0x18e/0x4c0
[ 1330.152205]  ? drm_release+0x36/0x3b0
[ 1330.152206]  drm_release+0x36/0x3b0
[ 1330.152208]  __fput+0xcd/0x1d0
[ 1330.152210]  task_work_run+0x7b/0xa0
[ 1330.152211]  do_exit+0x2d0/0xb10
[ 1330.152212]  ? __check_object_size+0xaf/0x1b0
[ 1330.152214]  ? _copy_to_user+0x22/0x30
[ 1330.152215]  ? drm_ioctl+0x2ee/0x380
[ 1330.152216]  do_group_exit+0x3a/0xa0
[ 1330.152217]  get_signal+0x260/0x560
[ 1330.152219]  do_signal+0x36/0x690
[ 1330.152231]  ? amdgpu_drm_ioctl+0x6c/0x80 [amdgpu]
[ 1330.152233]  ? do_vfs_ioctl+0xa1/0x610
[ 1330.152234]  exit_to_usermode_loop+0x58/0x90
[ 1330.152235]  do_syscall_64+0xe8/0xf0
[ 1330.152236]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 1330.152238] RIP: 0033:0x7f95a1036e6b
[ 1330.152238] RSP: 002b:00007f959b0fa0b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
[ 1330.152239] RAX: fffffffffffffdfc RBX: 00007f959b0fa0f0 RCX: 00007f95a1036e6b
[ 1330.152240] RDX: ffffffffffffffff RSI: 0000000000000001 RDI: 00007f959b0fa0f0
[ 1330.152240] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f959b0fa400
[ 1330.152241] R10: 0000000000000106 R11: 0000000000000293 R12: 00007f95940376d8
[ 1330.152241] R13: 00007f95943e95a8 R14: 00000000ffffffff R15: 00007f959b0fa0f0
[ 1330.152243] INFO: task ffmpeg_g:3143 blocked for more than 120 seconds.
[ 1330.152243]       Not tainted 4.15.2 #2
[ 1330.152244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.152244] ffmpeg_g        D    0  3143   2402 0x80000006
[ 1330.152245] Call Trace:
[ 1330.152246]  ? __schedule+0x26b/0x840
[ 1330.152247]  schedule+0x28/0x80
[ 1330.152267]  amd_sched_entity_push_job+0xa3/0xf0 [amdgpu]
[ 1330.152269]  ? finish_wait+0x80/0x80
[ 1330.152288]  amdgpu_job_submit+0x9c/0xc0 [amdgpu]
[ 1330.152303]  amdgpu_vm_bo_update_mapping+0x383/0x3f0 [amdgpu]
[ 1330.152318]  ? amdgpu_vm_free_mapping.isra.20+0x20/0x20 [amdgpu]
[ 1330.152331]  amdgpu_vm_clear_freed+0xbb/0x190 [amdgpu]
[ 1330.152345]  amdgpu_gem_object_close+0x19c/0x210 [amdgpu]
[ 1330.152348]  ? drm_gem_object_release_handle+0x2c/0x90
[ 1330.152349]  drm_gem_object_release_handle+0x2c/0x90
[ 1330.152350]  ? drm_gem_object_handle_put_unlocked+0xb0/0xb0
[ 1330.152352]  idr_for_each+0x48/0xe0
[ 1330.152353]  drm_gem_release+0x1c/0x30
[ 1330.152354]  drm_release+0x342/0x3b0
[ 1330.152356]  __fput+0xcd/0x1d0
[ 1330.152357]  task_work_run+0x7b/0xa0
[ 1330.152358]  do_exit+0x2d0/0xb10
[ 1330.152359]  do_group_exit+0x3a/0xa0
[ 1330.152360]  get_signal+0x260/0x560
[ 1330.152361]  do_signal+0x36/0x690
[ 1330.152363]  ? __vma_rb_erase+0x1f6/0x270
[ 1330.152364]  ? SyS_futex+0x12d/0x180
[ 1330.152365]  exit_to_usermode_loop+0x58/0x90
[ 1330.152366]  do_syscall_64+0xe8/0xf0
[ 1330.152367]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 1330.152368] RIP: 0033:0x7f60bb3df7dd
[ 1330.152368] RSP: 002b:00007f60927fbdd0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[ 1330.152369] RAX: fffffffffffffe00 RBX: 0000557f96700178 RCX: 00007f60bb3df7dd
[ 1330.152370] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000557f967001a4
[ 1330.152370] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000557f96879778
[ 1330.152371] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000033
[ 1330.152371] R13: 0000557f96700208 R14: 0000000000000000 R15: 0000557f967001a4
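
A log like the one above can be summarised by counting the "blocked for more than N seconds" reports per task. The following is only a rough sketch: the function name `blocked_tasks` is hypothetical, and the regular expression assumes the hung-task message format seen in this particular 4.15.2 log.

```python
import re
from collections import Counter

# Matches e.g. "INFO: task kworker/u24:0:5 blocked for more than 120 seconds."
# The task name may itself contain ':' so the greedy ".+" leaves only the
# final ":<digits>" before " blocked" as the PID.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(log_text):
    """Count hung-task reports per (task name, pid) in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for (name, pid), count in blocked_tasks(sys.stdin.read()).items():
        print(f"{name} (pid {pid}): reported {count} time(s)")
```

Applied to the trace above, this would list the kworker running commit_work, the mpv/vo thread and ffmpeg_g as the blocked tasks, which is a compact way to describe the hang when reporting it.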


Is that known?  Is there anything else I can do with it?

- Mark

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
