Hi

On Tue, Feb 15, 2022 at 7:09 AM Akihiko Odaki <akihiko.od...@gmail.com>
wrote:

>
>
> On 2022/02/15 4:49, Marc-André Lureau wrote:
> > Hi
> >
> > On Mon, Feb 14, 2022 at 5:15 PM Akihiko Odaki <akihiko.od...@gmail.com
> > <mailto:akihiko.od...@gmail.com>> wrote:
> >
> >     On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau
> >     <marcandre.lur...@gmail.com <mailto:marcandre.lur...@gmail.com>>
> wrote:
> >      >
> >      > Hi Akihiko
> >      >
> >      > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki
> >     <akihiko.od...@gmail.com <mailto:akihiko.od...@gmail.com>> wrote:
> >      >>
> >      >> ui/dbus required to have multiple DisplayChangeListeners
> >     (possibly with OpenGL)
> >      >> for a console but that caused several problems:
> >      >> - It broke egl-headless, an unusual display which implements
> >     OpenGL rendering
> >      >>   for non-OpenGL displays. The code to support multiple
> >     DisplayChangeListeners
> >      >>   does not consider the case where non-OpenGL displays listens
> >     OpenGL consoles.
> >      >
> >      >
> >      > Can you provide instructions on what broke? Even better write a
> >     test, please.
> >
> >     The following command segfaults. Adding a test would be nice, but it
> >     would need a binary which uses OpenGL.
> >     qemu-system-x86_64 -device virtio-gpu-gl-pci -display egl-headless
> >     -vnc :0 -m 8G -cdrom Fedora-Workstation-Live-x86_64-34-1.2.iso -accel
> >     kvm
> >
> >
> > Thanks!
> >
> > This is clearly a mistake from commit 7cc712e98 ("ui: dispatch GL events
> > to all listener").
> >
> > It should have taken into account that some listeners do not have GL
> > callbacks, and guard the call.
> >
> > We should wrap the missing ops->dpy_gl_call() with a if
> > (ops->dpy_gl_call) ? I'll send a patch for that.
>
>
> The assumption that OpenGL DisplayChangeListener and non-OpenGL
> DisplayChangeListener are exclusive is now broken so we have to examine
>

Before the changes, there was already such a GL & non-GL listener mixed
situation. It's only because the first listener with GL would be registered
in register_displaychangelistener() that it "worked".

if the whole patch series works correct without the assumption. Other
> problem I have found (and forgot to mention) is:
> - that console_select and register_displaychangelistener may not call
> dpy_gfx_switch and call dpy_gl_scanout_texture instead. It is
> incompatible with non-OpenGL displays and
>

Can you translate that to a reproducible test with expected outcome?


> - that the compatibility check breaks if egl-headless is present and a
> non-OpenGL DisplayChangeListener with con field is being added.
>
>
Same, thanks


> By the way, dbus registers a DisplayChangeListener for the size
> detection, apparently it fails to set "con" field, making it an
> accidental user of console_select.
>

I need more details, please.


>
> >
> >      >
> >      > "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py",
> >     which covers egl-headless usage, works.
> >      >
> >      >>
> >      >> - Multiple OpenGL DisplayChangeListeners of dbus shares one
> >     DisplaySurface and
> >      >>   modifies its texture field, causing OpenGL texture leak and
> >     use-after-free.
> >      >
> >      >
> >      > Again, please provide instructions. I have regularly run -display
> >     dbus with multiple clients and qemu compiled with sanitizers. I
> >     don't see any leak or use after free.
> >
> >     I doubt sanitizers can find this because it is an OpenGL texture. You
> >     may add a probe around surface_gl_create_texture and
> >     surface_gl_destroy_texture.
> >
> >
> > Indeed, a surface is created on each frame, because we create a 2d
> > surface on each qemu_console_resize(), which is called at each virgl
> > scanout. This is a regression introduced by commit ebced09185 ("console:
> > save current scanout details"). I can propose an easy fix, please check
> it.
> >
> > And the root of the leak is actually surface_gl_create_texutre(), it
> > should be idempotent, just like destroy().
>
> That is not enough since it may leave the texture present after
> unregister_displaychangelistener. And calling
> surface_gl_destroy_texture() before unregister_displaychangelistener may
> break the other listeners.
>

The texture is shared, but owned by the console. It is not leaked.

However, given that it is shared, it would be indeed better to refcount the
users to avoid destroying the texture by one listener going away.


> >
> >
> >      >
> >      >>
> >      >> - Introduced extra code to check the compatibility of OpenGL
> >     context providers
> >      >>   and OpenGL texture renderers where those are often inherently
> >     tightly coupled
> >      >>   since they must share the same hardware.
> >      >
> >      >
> >      > So code checks are meant to prevent misusage. They might be too
> >     limited or broken in some ways, but reverting is likely going to
> >     introduce other regressions I was trying to fix.
> >
> >     The misuse will not occur because DisplayChangeListeners will be
> >     merged with OpenGL context providers.
> >
> >
> > Ok, but aren't the checks enough to prevent it already? I have to check
> > the use cases and differences of design, but you might be right that we
> > don't need such a split after all.
>
> Yes, the point is that it requires extra code.
>
> >
> >
> >      >
> >      >> - Needed extra work to broadcast the same change to multiple
> >     dbus listeners.
> >      >>
> >      >
> >      > Compared to what?
> >
> >     Compared to sharing one DisplayChangeListener for multiple dbus
> >     listeners.
> >
> >
> > Well, you just moved the problem to the dbus display, not removed any
> work.
> >
> > What we have currently is more generic, you should be able to add/remove
> > various listeners (in theory, we only really do it for dbus at this
> point).
>
> The each DisplayChangeListeners have to upload the DisplaySurface to the
> graphics accelerator, create a DMA-BUF file descriptor, and make it
> suitable for D-Bus delivery. The duplicate work can be just done once if
> we have only one DisplayChangeListener for one console.
>

And we should avoid duplicating the texture/resources if possible. This is
not specific to DBus, it's the reason why I would rather have the logic in
the console code so we avoid code duplication and we can do further
improvement at the lower-level.


>
> >
> >
> >      >
> >      >>
> >      >> This series solve them by implementing the change broadcast in
> >     ui/dbus, which
> >      >> knows exactly what is needed. Changes for the common code to
> >     support multiple
> >      >> OpenGL DisplayChangeListeners were reverted to fix egl-headless
> >     and reduce
> >      >> the code size.
> >      >
> >      >
> >      > Thanks a lot for your work, I am going to take a look at your
> >     approach. But please help us understand better what the problem
> >     actually is, by giving examples & tests to avoid future regressions
> >     and document the expected behaviour.
> >
> >     The thing is really complicated and I may miss details so please feel
> >     free to ask questions or provide suggestions.
> >
> >
> > Reverting changes and proposing an alternative implementation requires
> > detailed explanation and convincing arguments. It may take a while, but
> > we will try to get through the problems and evaluate the alternative
> > designs. Thanks again for your help!
>
> Rather, I think proposing a large change to common console code requires
> thorough examination and it should be reverted before it reaches the
> release if it is doubtful that it is correct and reduces the complexity
> of a few displays (possibly in the future). "No regression" should come
> first before fix and feature. We can always revisit the change land it
> in a proper form even after reverting the change.
>
>
That's not how we usually work, there was a long RFC&PATCH review period
during which testing was done. It's true that Gerd, the maintainer, didn't
do a thorough review though. If we have to revert, we can do it before the
release. However, I would rather fix the current design since it is meant
to be more generic & flexible. We still have some time.

Thanks again for your help

Regards,
> Akihiko Odaki
>
> >
> >     Regards,
> >     Akihiko Odaki
> >
> >
> >      >
> >      >>
> >      >> Akihiko Odaki (6):
> >      >>   ui/dbus: Share one listener for a console
> >      >>   Revert "console: save current scanout details"
> >      >>   Revert "ui: split the GL context in a different object"
> >      >>   Revert "ui: dispatch GL events to all listeners"
> >      >>   Revert "ui: associate GL context outside of display listener
> >      >>     registration"
> >      >>   Revert "ui: factor out qemu_console_set_display_gl_ctx()"
> >      >>
> >      >>  include/ui/console.h       |  60 +-----
> >      >>  include/ui/egl-context.h   |   6 +-
> >      >>  include/ui/gtk.h           |  11 +-
> >      >>  include/ui/sdl2.h          |   7 +-
> >      >>  include/ui/spice-display.h |   1 -
> >      >>  ui/console.c               | 258 +++++++----------------
> >      >>  ui/dbus-console.c          | 109 ++--------
> >      >>  ui/dbus-listener.c         | 417
> >     +++++++++++++++++++++++++++----------
> >      >>  ui/dbus.c                  |  22 --
> >      >>  ui/dbus.h                  |  36 +++-
> >      >>  ui/egl-context.c           |   6 +-
> >      >>  ui/egl-headless.c          |  20 +-
> >      >>  ui/gtk-egl.c               |  10 +-
> >      >>  ui/gtk-gl-area.c           |   8 +-
> >      >>  ui/gtk.c                   |  25 +--
> >      >>  ui/sdl2-gl.c               |  10 +-
> >      >>  ui/sdl2.c                  |  14 +-
> >      >>  ui/spice-display.c         |  19 +-
> >      >>  18 files changed, 498 insertions(+), 541 deletions(-)
> >      >>
> >      >> --
> >      >> 2.32.0 (Apple Git-132)
> >      >>
> >      >>
> >      >
> >      >
> >      > --
> >      > Marc-André Lureau
> >
> >
> >
> > --
> > Marc-André Lureau
>


-- 
Marc-André Lureau

Reply via email to