> I suggest keeping it simple from a driver perspective and requiring applications to use vaSyncSurface.

Currently our vaSyncSurface doesn't really do what the name suggests. All we do is flush the command buffers, and for good reasons.

Having the application wait for all decoding to complete before handing off the surface to the post-processing/display engine not only makes the application trickier to write (fortunately you already solved that for Kodi), but is also seriously bad for things like power management.

In other words, we do this heavy pipelining of work not only to gain throughput, but also because the kernel driver and the hardware need a good idea of what is coming next. When the application waits for decoding to finish before handing off the post-processing work, we can't make those predictions any more.
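
As a rough illustration of the throughput half of this argument, here is a toy timing model in Python. It is only a sketch: it assumes decode and post-processing run on separate engines that can overlap, that the serialized application waits for each stage before submitting the next, and the per-frame costs are made-up numbers rather than measurements. It does not model the power management heuristics at all.

```python
def serialized_total_ms(frames, decode_ms, post_ms):
    """Application waits for each decode to finish before submitting
    post-processing, so the two stages never overlap."""
    return frames * (decode_ms + post_ms)


def pipelined_total_ms(frames, decode_ms, post_ms):
    """Work is queued ahead of time: once the first frame has filled the
    pipeline, one frame completes every max(decode_ms, post_ms)."""
    if frames == 0:
        return 0
    return decode_ms + (frames - 1) * max(decode_ms, post_ms) + post_ms


# Hypothetical per-frame costs, chosen only for illustration.
FRAMES, DECODE_MS, POST_MS = 10, 4, 6

print("serialized:", serialized_total_ms(FRAMES, DECODE_MS, POST_MS), "ms")
print("pipelined: ", pipelined_total_ms(FRAMES, DECODE_MS, POST_MS), "ms")
```

With these made-up numbers the serialized case needs 100 ms for ten frames and the pipelined case 64 ms. In the serialized case each engine also sits idle while the other one works, which is the situation where the driver can no longer see what is coming next.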

We even discussed using some sort of hack to tell the kernel driver not to drop below a certain power level during video decode, but for GL interop that would mean setting an environment variable for video decoding, because we can't really distinguish the use case from the driver side.

Regards,
Christian.

On 19.03.2017 at 15:44, rainer.hochec...@onlinehome.de wrote:
> for example how does synchronization happen between the two APIs?
Right, VAAPI does not seem as mature as VDPAU in this regard, but Kodi's multithreading design copes with this. We call
vaSyncSurface before feeding VPP and before mapping VA buffers to GL.
I suggest keeping it simple from a driver perspective and requiring applications to use vaSyncSurface.
*Sent:* Sunday, 19 March 2017 at 15:28
*From:* "Christian König" <deathsim...@vodafone.de>
*To:* "Peter Frühberger" <peter.fruehber...@gmail.com>
*Cc:* "rainer.hochec...@onlinehome.de" <rainer.hochec...@onlinehome.de>, mesa-dev@lists.freedesktop.org, lru...@libreelec.tv, "Michel Dänzer" <mic...@daenzer.net>, "Marek Olšák" <mar...@gmail.com>, "Wentland, Harry" <harry.wentl...@amd.com>
*Subject:* Re: 10bit HEVC decoding for RadeonSI v2

    What do you think?

In general I think it might work, but the basic problem is once again the API design.

With VDPAU, the application asks OpenGL to interop with VDPAU, and the two APIs can do all the handshaking internally.

With VA-API, the application exports buffers from VA-API and then imports the same buffer into OpenGL as two surfaces.

That leaves a whole bunch of open questions, for example how does synchronization happen between the two APIs? E.g. the application (Kodi) probably doesn't want to wait for the decoding result before it uses the surface with OpenGL. We don't have a way to sync between the two APIs here except for the handle.

The next problem is how we communicate the layout of the data in the buffer. E.g. we have the format and the offset, but that assumes that you don't have any nasty kind of tiling mode applied here.
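
To make the "format and offset" point concrete, here is a minimal Python sketch of the per-plane description an application would need when importing one linear NV12 or P010 buffer as two single-plane images. The `Plane` type and the `linear_planes` helper are hypothetical names made up for this illustration. The sketch assumes a purely linear layout with the UV plane starting directly after the Y plane; a real application should take the offsets and pitches the driver reports rather than compute them, and none of this holds once a tiling mode is involved. The fourcc values follow the encoding used in the kernel's drm_fourcc.h.

```python
from dataclasses import dataclass


def fourcc(a, b, c, d):
    """Pack four characters into a little-endian DRM fourcc code."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


DRM_FORMAT_R8 = fourcc("R", "8", " ", " ")
DRM_FORMAT_GR88 = fourcc("G", "R", "8", "8")
DRM_FORMAT_R16 = fourcc("R", "1", "6", " ")
DRM_FORMAT_GR1616 = fourcc("G", "R", "3", "2")


@dataclass
class Plane:
    """Hypothetical description of one image imported from the buffer."""
    drm_format: int
    width: int
    height: int
    offset: int  # byte offset of the plane inside the shared buffer
    pitch: int   # bytes per row


def linear_planes(width, height, pitch, bytes_per_sample):
    """Describe the Y and interleaved UV planes of a linear 4:2:0 buffer.

    bytes_per_sample is 1 for NV12 and 2 for P010.
    ASSUMPTION: no tiling and no padding rows between the two planes.
    """
    if bytes_per_sample == 1:
        y_format, uv_format = DRM_FORMAT_R8, DRM_FORMAT_GR88
    elif bytes_per_sample == 2:
        y_format, uv_format = DRM_FORMAT_R16, DRM_FORMAT_GR1616
    else:
        raise ValueError("only 1 (NV12) or 2 (P010) bytes per sample")
    if pitch < width * bytes_per_sample:
        raise ValueError("pitch is smaller than one row of luma samples")

    luma = Plane(y_format, width, height, 0, pitch)
    # Chroma is subsampled by two in both directions; each texel holds a
    # U/V pair, so the row pitch in bytes matches the luma plane.
    chroma = Plane(uv_format, (width + 1) // 2, (height + 1) // 2,
                   pitch * height, pitch)
    return luma, chroma


y_plane, uv_plane = linear_planes(1920, 1080, 1920, 1)
print(y_plane)
print(uv_plane)
```

The two resulting descriptions correspond to the R8/GR88 (or R16/GR32) pair that Kodi imports. Everything the importer knows about the buffer is in these few numbers, which is why a tiled layout cannot be expressed this way.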

I think we can make that work for now (we aren't using tiling modes with UVD much anyway), but this is going to bite us again sooner or later. I'm going to put the whole thing on my todo list once more.

Regards,
Christian.

On 19.03.2017 at 15:06, Peter Frühberger wrote:

    Hi Christian,
    we use it the following way:
    Depending on the surface format (NV12 vs. P010), we use:
    
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1416

    R8 and GR88
    or alternatively:
    
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1493
    R16 and GR32
    There is also the possibility of using BGRA, but this involves
    an internal copy of the YUV surfaces in VAAPI and is therefore not
    well suited (more memory and more load).
    For both images, Y and UV, we use the eglCreateImageKHR extension,
    followed by glEGLImageTargetTexture2DOES.
    See:
    
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1262
    On the VAAPI side:
    VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME with
    either VA_RT_FORMAT_YUV420 or VA_FOURCC_P010 is used.
    I think that method is quite generalizable and nothing is
    Intel-specific.
    What do you think?
    Best regards
    Peter
    2017-03-19 14:49 GMT+01:00 Christian König <deathsim...@vodafone.de>:

        Hi Peter,

        Adding Michel and Marek for the Mesa interop side and Harry
        for the display side.

            How do you want us to display the decoded surfaces?

        Well, to make a long story short: I don't have the slightest
        idea. Ideally we would have the same handling as Intel so that
        you guys don't have anything vendor dependent in your code.

        The first step would be to get the VA-API DRM extension to
        work with EGL. So that Kodi is able to export the YUV surfaces
        and import parts of them as separate R8/R16 or R8G8/R16G16
        surfaces, right?

        What EGL/GL extension do you guys use to import the surfaces?
        Marek, is that stuff fully supported, e.g. do we also handle
        the offsets correctly? I've added the backend code for this
        while doing VDPAU interop, but the EGL/GL frontend code needs
        to handle it gracefully as well.

        The second step is then to teach our DC how to handle RGB
        surfaces with 10bit. I doubt the old code has support for that
        and we probably don't want to add it. So Harry, can you comment
        on how far along we got with that in DC?

        Regards,
        Christian.


        On 19.03.2017 at 13:26, Peter Frühberger wrote:

            Hi Christian,
            thank you for your message. We are still wondering about
            the render part. How do you want us to display the decoded
            surfaces? Looking at mpv, it seems it will only work via
            vaPutSurface and is therefore tied to X11. That means
            it depends on the visuals, which are 8 bit only.
            We are working on a DRM-only Kodi and now ask ourselves:
            Is there a possibility to interop with a DRM extension and
            eglCreateImage on AMD hardware, too? With the Intel-only R32, R8
            linux buf methods we are also running successfully on Mir
            now; Wayland would work the very same way.
            Best regards
            Peter
            2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de:

                Hi Christian,
                I already removed the check for Intel in my dev
                branch. On startup,
                Kodi runs a functional test to check whether VAAPI works. If the
                test passes, it is available
                regardless of the underlying type of hardware/driver.
                Regards,
                Rainer
                *Sent:* Wednesday, 08 March 2017 at 13:29
                *From:* "Christian König" <deathsim...@vodafone.de>
                *To:* mesa-dev@lists.freedesktop.org
                *Cc:* rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
                *Subject:* 10bit HEVC decoding for RadeonSI v2
                Hi guys,

                I finally found time to test this and hammer out
                (hopefully) all the
                remaining bugs. Playing a 10bit HEVC file through
                VAAPI with mpv/ffmpeg git
                master from about two days ago now works flawlessly
                and has only about 15% CPU
                load on one core on a Kaveri system.

                The VDPAU path should work as well, but NVIDIA's
                implementation of this is still
                completely broken, so nobody enables it and we
                don't have a way to test it.

                Rainer/Peter maybe you guys want to take a look and
                enable it in Kodi.

                The next logical step is to get our display code paths
                to be 10bit ready.

                Please review and comment,
                Christian.

-- Key-ID: 0x1A995A9B
            keyserver: pgp.mit.edu
            ==============================================================
            Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B



_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
