> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> Soft Works
> Sent: Friday, August 7, 2020 11:59 PM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v3 1/2] dxva: wait until D3D11 buffer
> copies are done before submitting them
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> > Steve Lhomme
> > Sent: Friday, August 7, 2020 3:05 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH v3 1/2] dxva: wait until D3D11
> > buffer copies are done before submitting them
> >
> > I experimented a bit more with this. Here are the 3 scenarios, in
> > order of fewest late frames:
> >
> > - GetData waiting for 1/2s and releasing the lock
> > - No use of GetData (current code)
> > - GetData waiting for 1/2s and keeping the lock
> >
> > The last option has horrible performance issues and should not be used.
> >
> > The first option gives about 50% fewer late frames compared to the
> > current code. *But* it requires unlocking the Video Context. There are
> > 2 problems with this:
> >
> > - the same ID3D11Asynchronous is used to wait on multiple concurrent
> >   threads. This can confuse D3D11, which emits a warning in the logs.
> > - another thread might Get/Release some buffers and submit them before
> >   this thread is finished processing. That can result in distortions,
> >   for example if the second thread/frame depends on the first
> >   thread/frame, which is not submitted yet.
> >
> > The former issue can be solved by using an ID3D11Asynchronous per
> > thread. That requires thread-local storage (TLS), which FFmpeg doesn't
> > seem to support yet. With this I get virtually no late frames.
> >
> > The latter issue only occurs if the wait is too long. For example,
> > waiting by increments of 10ms is too long in my test. Using increments
> > of 1ms or 2ms works fine in the most stressing sample I have (Sony
> > Camping HDR HEVC high bitrate). But this seems hackish. There's still
> > potentially a quick frame (alt frame in VPx/AV1 for example) that
> > might get through to the decoder too early. (I suppose that's the
> > source of the distortions I see.)
> >
> > It's also possible to change the order of the buffer sending, by
> > starting with the bigger one (D3D11_VIDEO_DECODER_BUFFER_BITSTREAM).
> > But it seems to have little influence, regardless of whether we wait
> > for buffer submission or not.
> >
> > The results are consistent between integrated GPU and dedicated GPU.
>
> Hi Steven,
>
> A while ago I extended the D3D11VA implementation to support single
> (non-array) textures for interoperability with Intel QSV+DX11.
>
> I noticed a few bottlenecks making D3D11VA significantly slower than DXVA2.
>
> The solution was to use ID3D10Multithread_SetMultithreadProtected and
> remove all the locks which are currently applied.
>
> Hence, I don't think that your patch is the best possible way.
>
> Regards,
> softworkz

I almost forgot that I had published that change already:
https://github.com/softworkz/ffmpeg_dx11/commit/c09cc37ce7f513717493e060df740aa0e7374257
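
For anyone who does not want to open the commit, the general shape of the idea (turn on the runtime's own multithread protection instead of taking an external lock around every context call) can be sketched roughly as follows. This is only a minimal sketch and not the code from the commit: the helper name enable_mt_protection, its void * parameter and its return values are made up for illustration, and the non-Windows branch exists only so the snippet compiles where the D3D11 headers are missing.

```c
/* Sketch only: not the code from the linked commit. */
#include <stddef.h>

#ifdef _WIN32
#define COBJMACROS      /* enables the C-style ID3D10Multithread_* macros */
#include <windows.h>
#include <initguid.h>   /* instantiates IID_ID3D10Multithread in this file */
#include <d3d10.h>      /* declares ID3D10Multithread */
#include <d3d11.h>
#endif

/*
 * Hypothetical helper: ask the D3D runtime to serialize access to the
 * device context internally.
 *
 * `device` is expected to be an ID3D11Device *. It is passed as void *
 * only so that this sketch also compiles without the D3D11 headers.
 *
 * Returns 0 on success, -1 on failure or on non-Windows builds.
 */
int enable_mt_protection(void *device)
{
    if (!device)
        return -1;
#ifdef _WIN32
    ID3D10Multithread *mt = NULL;
    HRESULT hr = ID3D11Device_QueryInterface((ID3D11Device *)device,
                                             &IID_ID3D10Multithread,
                                             (void **)&mt);
    if (FAILED(hr))
        return -1;

    /* From here on the runtime takes its own critical section per call. */
    ID3D10Multithread_SetMultithreadProtected(mt, TRUE);
    ID3D10Multithread_Release(mt);
    return 0;
#else
    return -1; /* D3D11 only exists on Windows */
#endif
}
```

The assumption here is that the helper is called once, right after the device is created and before any decoding thread touches it. Which of the existing explicit locks can then be removed still has to be checked against the decoder code.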