> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> Steve Lhomme
> Sent: Friday, August 7, 2020 3:05 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v3 1/2] dxva: wait until D3D11 buffer
> copies are done before submitting them
>
> I experimented a bit more with this. Here are the 3 scenarios in order of
> fewest late frames:
>
> - GetData waiting for 1/2s and releasing the lock
> - No use of GetData (current code)
> - GetData waiting for 1/2s and keeping the lock
>
> The last option has horrible performance issues and should not be used.
>
> The first option gives about 50% fewer late frames compared to the current
> code. *But* it requires unlocking the Video Context. There are 2 problems
> with this:
>
> - the same ID3D11Asynchronous is used to wait on multiple concurrent
>   threads. This can confuse D3D11, which emits a warning in the logs.
> - another thread might Get/Release some buffers and submit them before
>   this thread is finished processing. That can result in distortions, for
>   example if the second thread/frame depends on the first thread/frame,
>   which is not submitted yet.
>
> The former issue can be solved by using an ID3D11Asynchronous per thread.
> That requires some TLS storage which FFmpeg doesn't seem to support yet.
> With this I get virtually no late frames.
>
> The latter issue only occurs if the wait is too long. For example, waiting
> in increments of 10ms is too long in my test. Using increments of 1ms or 2ms
> works fine in the most stressful sample I have (Sony Camping HDR HEVC high
> bitrate). But this seems hackish. There's still potentially a quick frame
> (alt frame in VPx/AV1 for example) that might get through to the decoder
> too early. (I suppose that's the source of the distortions I see.)
>
> It's also possible to change the order of the buffer sending, by starting
> with the bigger one (D3D11_VIDEO_DECODER_BUFFER_BITSTREAM). But it seems
> to have little influence, regardless of whether we wait for buffer
> submission or not.
>
> The results are consistent between integrated GPU and dedicated GPU.
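
For reference, the short-increment wait described in the quoted message can be sketched roughly as below. This is only an illustration, not the code from the patch: `poll_fn`, `wait_in_steps`, `fake_gpu_poll` and `demo_wait` are hypothetical names, the callback stands in for a non-blocking GetData call on the ID3D11Asynchronous object, and `nanosleep` is a POSIX stand-in for whatever sleep primitive the real code would use. The 1 ms step and 500 ms cap are taken from the numbers quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true once the pending GPU work is done.
 * In real D3D11 code this is where a non-blocking GetData call would go. */
typedef bool (*poll_fn)(void *opaque);

/* Sleep for the given number of milliseconds (POSIX nanosleep). */
static void sleep_ms(int ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Poll in small steps until the work is done or the timeout is reached.
 * Returns the nominal number of milliseconds slept, or -1 on timeout. */
static int wait_in_steps(poll_fn poll, void *opaque, int step_ms, int timeout_ms)
{
    int waited = 0;

    assert(step_ms > 0);
    while (!poll(opaque)) {
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(step_ms);
        waited += step_ms;
    }
    return waited;
}

/* Stand-in for the GPU: reports completion after a fixed number of polls. */
static bool fake_gpu_poll(void *opaque)
{
    int *remaining = opaque;

    if (*remaining == 0)
        return true;
    --*remaining;
    return false;
}

/* Example use: 1 ms steps, 500 ms cap, work "finishes" on the 4th poll. */
static int demo_wait(void)
{
    int polls_left = 3;

    return wait_in_steps(fake_gpu_poll, &polls_left, 1, 500);
}
```

A smaller step shortens the window in which another thread can submit its buffers first, at the cost of more polling; as the quoted message notes, this does not remove the race, it only makes it less likely.
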

Hi Steven,

A while ago I extended the D3D11VA implementation to support single
(non-array) textures for interoperability with Intel QSV+DX11. I noticed a
few bottlenecks that made D3D11VA significantly slower than DXVA2. The
solution was to use ID3D10Multithread_SetMultithreadProtected and remove
all the locks that are currently applied.

Hence, I don't think that your patch is the best possible way.

Regards,
softworkz