[AMD Official Use Only - General]

Hi Mark,

Just got the ffmpeg email, please see my answer below in [rdong].

Thanks,
Ruijing

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Mark Thompson
Sent: Tuesday, November 22, 2022 6:34 PM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain 
mode

On 22/11/2022 20:59, Mark Thompson wrote:
> On 22/11/2022 20:26, Mark Thompson wrote:
>> On 22/11/2022 19:18, Dong, Ruijing wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Hi Mark,
>>>
>>> Sorry for being late to reply to you.
>>>
>>> Your understanding is correct, and I have sent a new patch [v4] for
>>> addressing the current issue and to use driver quirk mechanism to specify 
>>> only AMD VAAPI driver has this behavior, then this could be more specific.
>>>
>>> For AMD hardware, it allocates GPU memory internally for the DPB 
>>> management, the output is always the final one with or without applied 
>>> film-grain.
>>
>> I don't see why this requires you to write the output to the wrong surface.  
>> Why not write it to the correct one instead?
>
> Indeed, this seems to be a trivial fix in Mesa: 
> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fmerge_requests%2F19938&amp;data=05%7C01%7Cruijing.dong%40amd.com%7C0896eeffcf674092aa8408dacce228f3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047570363839738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=3Be3k9TQdRjGJv2KYNsssFjl3ORaNXzdSBqei%2BgYSBg%3D&amp;reserved=0>.
>
> It would be helpful if someone with suitable hardware could test that.

This was too naive, the Mesa driver doesn't make this easy.

It is only set up to write to a single surface, which is the one provided to 
vaBeginPicture().  However, VAAPI does not work that way - it wants you to 
write to both the pre-grain and the post-grain surfaces, where the pre-grain 
surface is the primary target and gets passed the vaBeginPicture() and the 
post-grain surface is supplied in the parameters.

So that's the first problem: the render target which is given as the pre-grain 
surface needs to be replaced by post-grain surface if we want to only write a 
single surface.

[rdong] Well, the render surface targets to output the displayable surface, and 
film grain has many ways to be carried on,  one way is to let application doing 
next step, the other way is the
decoder who can directly output the final applied grain result.  The VAAPI 
interface should accommodate as much as possible hardware instead of limiting 
the only way
for the implementation. AMD decoder for the performance optimization (tiled 
formats and swizzle modes) and other considerations, it will need to manage the 
reference frames internally, and there is no point to implement a new logic to 
output both pre-grain and post-grain surfaces. If grain_apply is set, then the 
decoder will only output the applied grain result.

Is that enough?  Well, no.  The Mesa driver is also messing with the reference 
frames.

The VAAPI model is that the pre-grain surfaces are passed back into the driver 
on subsequent frames when they are used as reference frames.  However, the Mesa 
driver has hidden the pre-grain surface internally and only written the 
post-grain surface.

Therefore, when writing a post-grain output, it magically associates with the 
target surface information about the pre-grain surface which was written 
internally at the same time.  Then, when you later give it that surface as a 
reference frame it ignores the actual content of the frame and looks at the 
associated data to find what to use internally as the reference.

That's the second problem: if the post-grain surface were actually the render 
target then the magic internal reference gets associated with that, and when we 
pass the real reference frame (the pre-grain surface) in later then it won't 
recognise it because it never wrote to that surface.

[rdong] In my understanding the target surface should be the final output, and 
the reason AMD decoder cannot output the reference frames I have already 
described above, in fact, the reference frame buffer pointers are used as 
reference to indicate how the reference picture buffer will managed internally, 
please understand there are some concept differences there.

How should it be fixed, then?

The best way would be to stop hiding the internal information about reference 
frames: if the real reference frames were visible in VAAPI then everything 
would just work and none of the magic internal references would be needed.

[rdong] Well, it has many difficulties to map the well designed AMD decoder to 
the VAAPI interface, however we could not expect everything is perfect.

If we suppose that this can't be done (maybe it is hidden behind opaque 
firmware which the naughty users buying the products are not allowed to see), 
then Mesa needs two changes:

1.  Write the output to the post-grain surface rather than the pre-grain 
surface.  This is nontrivial because it isn't the surface passed to 
vaBeginPicture(), but given the API there isn't really any way around it.

2.  Attach the magic internal reference to the pre-grain surface, /even though 
it wasn't the one written to/.  This makes the reference frames work, since the 
pre-grain surfaces will be the ones passed back in later frames.

[rdong] What is the point of passing back and forth the so called pre-grain 
surfaces as reference frames?

Alternatively: make new API in libva somehow.  Probably wants an attribute 
which indicates that vaBeginPicture() would want the post-grain surface and 
then ignore the surfaces in the picture parameters?  Unclear exactly how this 
should be specified, but whatever it is it needs to be very clear about how the 
references would work.

Hacking FFmpeg to use the API differently based on matching substrings in the 
vendor name does not seem like a good approach here, given that future Mesa 
decode implementations (which could be AMD or could be other hardware) may well 
be more sensible.  It would also doom other VAAPI users, since you can only 
really send this sort of hack to a few big projects like FFmpeg.

[rdong] I agree with you on this idea, however as I checked the current 
substring is in fact only used by AMD driver, if there were some better ideas, 
we would like to consider them as well.


Thanks,

- Mark
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpeg-devel&amp;data=05%7C01%7Cruijing.dong%40amd.com%7C0896eeffcf674092aa8408dacce228f3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047570363839738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=OSXDuJzGJCUpgkzQlV8duiClz8Eb8nVLc2MG01o%2Fs3Q%3D&amp;reserved=0

To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with 
subject "unsubscribe".

<<attachment: winmail.dat>>

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to