from:"Mario Kleiner"

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner


On 06/05/2015 03:57 PM, Mario Kleiner wrote:

On 06/05/2015 03:50 PM, Ilia Mirkin wrote:

This scheme is copied from radeon, does it need a similar fix? I'm away
from computers for another week or so, will be able to look then.



For some reason, no. Testing on Radeon multi-x-screen ZaphodHeads never
showed such problems over multiple years and many mesa versions.
Apparently it does something slightly different, although i saw the
hashing logic with fstat() is the same. Maybe correct by slightly
different design, or maybe we just get lucky because the same fd's
aren't recycled on successive glXQueryVersion() calls?

thanks,
-mario



And to contradict myself, yes, it needs a similar fix for radeon, as 
both code inspection and testing showed. Not sure why this (seemed?) to 
work without problems for me for a long time, but at least on Mesa 10.5+ 
it crashes nicely in the expected way.


I'll send out a patch in the next minute which fixes the problem, as 
tested on a triple-display Radeon HD-5770 setup with different single 
x-screen and multi-x-screen ZaphodHeads configs.


thanks,
-mario




On Jun 5, 2015 4:37 PM, "Mario Kleiner" mailto:mario.kleiner...@gmail.com>> wrote:

The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
nouveau_screen is created for the shared device node of both
screens, but the original fd of x-screen 0 is used as identifying
key in the hash table, instead of the dup()ed fd of x-screen 0
which is owned by the nouveau_screen. nouveau_screen's refcount
is now 2.

3. Regular rendering happens by the client plugin, then the plugin
gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
code then closes the fd of x-screen 0, so now the fd which is
used as key in the hash table is invalid. x-screen 1 gets
destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
the nouveau_screen gets destroyed, but removal of its entry
in the hash table fails, because the invalid fd in the hash
table no longer matches anything (fstat() on the fd is used
for hashing and key comparison, but fstat() on an already closed
fd fails and returns bogus results). x-screen 1 closes its fd.

Now all fd's are closed, the nouveau_screen destroyed, but
there is a dangling reference to the nouveau_screen in the
hash table.

5. Some OpenGL client plugin gets loaded again and calls
glXQueryVersion. Step 2 above repeats, but because a
dangling reference with a matching fd is found in the winsys
hash table, no new nouveau_screen is created this time. Instead
the invalid pointer to the old nouveau_screen is recycled,
which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.

Cc: "10.3 10.4 10.5 10.6" mailto:mesa-sta...@lists.freedesktop.org>>

Signed-off-by: Mario Kleiner mailto:mario.kleiner...@gmail.com>>
Cc: Ilia Mirkin mailto:imir...@alum.mit.edu>>
---
  src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
index 0635246..dbc3cae 100644
--- a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
+++ b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
@@ -120,7 +120,8 @@ nouveau_drm_screen_create(int fd)
 if (!screen)
 goto err;

-   util_hash_table_set(fd_tab, intptr_to_pointer(fd), screen);
+   /* Use dupfd in hash table, to avoid crashes in ZaphodHeads
configs */
+   util_hash_table_set(fd_tab, intptr_to_pointer(dupfd),
screen);
 screen->refcount = 1;
 pipe_mutex_unlock(nouveau_screen_mutex);
 return &screen->base;
--
2.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner

Same problem and fix as for nouveau's ZaphodHeads trouble.

See patch ...

"nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."

... for reference.

Cc: "10.3 10.4 10.5 10.6" 

Signed-off-by: Mario Kleiner 
Cc: Ilia Mirkin 
---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index ba8d143..ebdb19e 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -484,6 +484,10 @@ static void radeon_winsys_destroy(struct radeon_winsys 
*rws)
 if (ws->gen >= DRV_R600) {
 radeon_surface_manager_free(ws->surf_man);
 }
+
+if (ws->fd)
+close(ws->fd);
+
 FREE(rws);
 }
 
@@ -696,7 +700,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
 return NULL;
 }
 
-ws->fd = fd;
+ws->fd = dup(fd);
 
 if (!do_winsys_init(ws))
 goto fail;
@@ -712,7 +716,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
 goto fail;
 
 if (ws->gen >= DRV_R600) {
-ws->surf_man = radeon_surface_manager_new(fd);
+ws->surf_man = radeon_surface_manager_new(ws->fd);
 if (!ws->surf_man)
 goto fail;
 }
@@ -753,7 +757,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
 return NULL;
 }
 
-util_hash_table_set(fd_tab, intptr_to_pointer(fd), ws);
+util_hash_table_set(fd_tab, intptr_to_pointer(ws->fd), ws);
 
 /* We must unlock the mutex once the winsys is fully initialized, so that
  * other threads attempting to create the winsys from the same fd will
@@ -770,6 +774,9 @@ fail:
 ws->kman->destroy(ws->kman);
 if (ws->surf_man)
 radeon_surface_manager_free(ws->surf_man);
+if (ws->fd)
+close(ws->fd);
+
 FREE(ws);
 return NULL;
 }
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner


On 06/28/2015 03:48 AM, Ilia Mirkin wrote:

On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
 wrote:

The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
nouveau_screen is created for the shared device node of both
screens, but the original fd of x-screen 0 is used as identifying
key in the hash table, instead of the dup()ed fd of x-screen 0
which is owned by the nouveau_screen. nouveau_screen's refcount
is now 2.


See below, but it shouldn't matter which fd gets used.



3. Regular rendering happens by the client plugin, then the plugin
gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
code then closes the fd of x-screen 0, so now the fd which is
used as key in the hash table is invalid. x-screen 1 gets
destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
the nouveau_screen gets destroyed, but removal of its entry
in the hash table fails, because the invalid fd in the hash
table no longer matches anything (fstat() on the fd is used
for hashing and key comparison, but fstat() on an already closed
fd fails and returns bogus results). x-screen 1 closes its fd.

Now all fd's are closed, the nouveau_screen destroyed, but
there is a dangling reference to the nouveau_screen in the
hash table.

5. Some OpenGL client plugin gets loaded again and calls
glXQueryVersion. Step 2 above repeats, but because a
dangling reference with a matching fd is found in the winsys
hash table, no new nouveau_screen is created this time. Instead
the invalid pointer to the old nouveau_screen is recycled,
which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.


I need to think about this some more, but... this shouldn't happen :)

In fact, the whole dupfd thing was added there for ZaphodHeads screens
in the first place. See
https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
a59f2bb17bcc which fixed it.

Note that the hash has the following hash/eq functions:

static unsigned hash_fd(void *key)
{
 int fd = pointer_to_intptr(key);
 struct stat stat;
 fstat(fd, &stat);

 return stat.st_dev ^ stat.st_ino ^ stat.st_rdev;
}

static int compare_fd(void *key1, void *key2)
{
 int fd1 = pointer_to_intptr(key1);
 int fd2 = pointer_to_intptr(key2);
 struct stat stat1, stat2;
 fstat(fd1, &stat1);
 fstat(fd2, &stat2);

 return stat1.st_dev != stat2.st_dev ||
stat1.st_ino != stat2.st_ino ||
stat1.st_rdev != stat2.st_rdev;
}

so fd and dupfd should get hashed to the same thing. I suspect there's
something else going on in your application...



My application is a set of dynamically loaded plugins running inside 
another host app (Psychtoolbox-3 inside GNU/Octave), so what happens 
often is that the OpenGL using plugin gets unloaded and reloaded at 
runtime, so a after a OpenGL session has ended with a XCloseDisplay() 
tearing the winsys down, you can easily have it restart at plugin reload 
with another XOpenDisplay -> glXQueryVersion ->  sequence, where the 
new call to glXQueryVersion will trigger a recreation of the winsys, 
which will find the stale entry in the hash table pointing to nowhere 
instead of the previously released nouveau_screen -> use-after-free -> boom!


The reason this fails is because during destruction of the 2nd, 3rd etc. 
x-screen, the already closed fd associated with the 1st x-screen is fed 
into the compare_fd function, so fstat() errors out on the invalid fd. I 
added printf's etc. on both nouveau and now radeon to verify the fstat 
gives me EBADF errors. So the hash calculation goes wrong when trying to 
find the matching element in the hash table with a fd that has a 
matching hash -> The element which should be removed is ignored/not 
removed because it contains an already closed fd for which no proper 
hash can be calculated anymore -> hash comparison during search goes wrong.


This is because multiple x-screens, e.g., 2 x-screens are destroyed in 
order 0, 1 by FreeScreenConfigs() as part of XCloseDisplay(). 
FreeScreenConfigs() calls dri2DestroyScreen() in dri2.c or 
dri3_destroy_scr

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner


On 06/28/2015 05:41 AM, Ilia Mirkin wrote:

Oh duh. Thanks for the super-detailed explanation. To rephrase what
you said in a slightly shorter manner:

See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
need to dupfd (which we are already, although radeon might not be).



Radeon doesn't so far. My 2nd patch for radeon from earlier today adds that.


Except instead of sticking the dup'd fd into the hash table, whose
lifetime matches that of the device, we stick the other one in, which
is effectively owned by the X server. As a result, those fstat's might
fail down the line, and all sorts of hell will break loose.



Yes. Essentially we should make sure the fd's we keep around have the 
same lifetime as the data structure in which we need to use it. That was 
the point, the server owned fd went away while we still needed it.



Would you be opposed to me rewriting your commit message to basically
reflect the above? Perhaps your original one did as well, but it
wasn't clear to me. I'd also like to throw some assert's in to make
sure the fstat's don't fail -- does that sound reasonable?



Oh please, yes! My commit messages often have this disease that they are 
either too terse, when i think the problem is obvious, or too long and 
somewhat convoluted when i'm going overboard in the other direction. Any 
editing/shortening for clarity more than happily accepted :)


thanks,
-mario


Another solution, btw, is to stick  as the real key in the hash, although that'd involve
making pointers. Probably not worth it.

Cheers,

   -ilia

On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
 wrote:

On 06/28/2015 03:48 AM, Ilia Mirkin wrote:


On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
 wrote:


The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
 nouveau_screen is created for the shared device node of both
 screens, but the original fd of x-screen 0 is used as identifying
 key in the hash table, instead of the dup()ed fd of x-screen 0
 which is owned by the nouveau_screen. nouveau_screen's refcount
 is now 2.



See below, but it shouldn't matter which fd gets used.



3. Regular rendering happens by the client plugin, then the plugin
 gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
 nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
 code then closes the fd of x-screen 0, so now the fd which is
 used as key in the hash table is invalid. x-screen 1 gets
 destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
 the nouveau_screen gets destroyed, but removal of its entry
 in the hash table fails, because the invalid fd in the hash
 table no longer matches anything (fstat() on the fd is used
 for hashing and key comparison, but fstat() on an already closed
 fd fails and returns bogus results). x-screen 1 closes its fd.

 Now all fd's are closed, the nouveau_screen destroyed, but
 there is a dangling reference to the nouveau_screen in the
 hash table.

5. Some OpenGL client plugin gets loaded again and calls
 glXQueryVersion. Step 2 above repeats, but because a
 dangling reference with a matching fd is found in the winsys
 hash table, no new nouveau_screen is created this time. Instead
 the invalid pointer to the old nouveau_screen is recycled,
 which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.



I need to think about this some more, but... this shouldn't happen :)

In fact, the whole dupfd thing was added there for ZaphodHeads screens
in the first place. See
https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
a59f2bb17bcc which fixed it.

Note that the hash has the following hash/eq functions:

static unsigned hash_fd(void *key)
{
  int fd = pointer_to_intptr(key);
  struct stat stat;
  fstat(fd, &stat);

  return stat.st_dev ^ stat.st_ino ^ stat.st_rdev;
}

static int compare_fd(void *key1, void *key2)
{
  int fd1 = pointer_to_intptr(key1);
  int fd2 = pointer_to_intptr(key2);
  struct stat stat1, stat2;
  fstat(fd1, &stat1);
  fstat(fd2, &stat2);

  return stat1.st_dev != stat2.st_dev ||
 stat1.st_ino != st

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner

Ok, maybe one thing for the commit message, as i just made the same 
mixup in my reply: The fd's are not owned by the x-server, but by the 
mesa screens (pipe screens? dri screens?) they represent, so if such a 
screen goes away, the fd goes away. Using "X server" would be confusing, 
especially under dri3, although each such "mesa screen" corresponds to a 
x-screen.


-mario

On 06/28/2015 06:03 AM, Mario Kleiner wrote:

On 06/28/2015 05:41 AM, Ilia Mirkin wrote:

Oh duh. Thanks for the super-detailed explanation. To rephrase what
you said in a slightly shorter manner:

See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
need to dupfd (which we are already, although radeon might not be).



Radeon doesn't so far. My 2nd patch for radeon from earlier today adds
that.


Except instead of sticking the dup'd fd into the hash table, whose
lifetime matches that of the device, we stick the other one in, which
is effectively owned by the X server. As a result, those fstat's might
fail down the line, and all sorts of hell will break loose.



Yes. Essentially we should make sure the fd's we keep around have the
same lifetime as the data structure in which we need to use it. That was
the point, the server owned fd went away while we still needed it.


Would you be opposed to me rewriting your commit message to basically
reflect the above? Perhaps your original one did as well, but it
wasn't clear to me. I'd also like to throw some assert's in to make
sure the fstat's don't fail -- does that sound reasonable?



Oh please, yes! My commit messages often have this disease that they are
either too terse, when i think the problem is obvious, or too long and
somewhat convoluted when i'm going overboard in the other direction. Any
editing/shortening for clarity more than happily accepted :)

thanks,
-mario


Another solution, btw, is to stick  as the real key in the hash, although that'd involve
making pointers. Probably not worth it.

Cheers,

   -ilia

On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
 wrote:

On 06/28/2015 03:48 AM, Ilia Mirkin wrote:


On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
 wrote:


The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
 nouveau_screen is created for the shared device node of both
 screens, but the original fd of x-screen 0 is used as identifying
 key in the hash table, instead of the dup()ed fd of x-screen 0
 which is owned by the nouveau_screen. nouveau_screen's refcount
 is now 2.



See below, but it shouldn't matter which fd gets used.



3. Regular rendering happens by the client plugin, then the plugin
 gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
 nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
 code then closes the fd of x-screen 0, so now the fd which is
 used as key in the hash table is invalid. x-screen 1 gets
 destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
 the nouveau_screen gets destroyed, but removal of its entry
 in the hash table fails, because the invalid fd in the hash
 table no longer matches anything (fstat() on the fd is used
 for hashing and key comparison, but fstat() on an already closed
 fd fails and returns bogus results). x-screen 1 closes its fd.

 Now all fd's are closed, the nouveau_screen destroyed, but
 there is a dangling reference to the nouveau_screen in the
 hash table.

5. Some OpenGL client plugin gets loaded again and calls
 glXQueryVersion. Step 2 above repeats, but because a
 dangling reference with a matching fd is found in the winsys
 hash table, no new nouveau_screen is created this time. Instead
 the invalid pointer to the old nouveau_screen is recycled,
 which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.



I need to think about this some more, but... this shouldn't happen :)

In fact, the whole dupfd thing was added there for ZaphodHeads screens
in the first place. See
https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
a59f2bb17bcc which fixed it.

Note that the hash has the following hash/eq functions:

static unsigned hash_fd(void *key)
{

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner


On 06/28/2015 06:25 AM, Ilia Mirkin wrote:

But that's the thing... in this case, the fd lifetime is owned by the
X server. In fact, nouveau doesn't touch that fd in mesa beyond
dup'ing it, and then exclusively using the dup'd fd.



True, it's not owned by nouveau, but instead by mesa core code? I think 
the x-server itself is only involved in authenticating the fd under 
DRI2, after mesa has opened it? Although under DRI3 the ready made fd 
comes from a xcb_dri3_open() trip to the server.


See the open() call on the device file:
dri2CreateScreen() in mesa/src/glx/dri2_glx.c

xcb_dri3_open():
dri3_create_screen() in mesa/src/glx/dri3_glx.c

And the close() calls for the fd's are also inside those files.

Maybe those winsys functions we fixed here are also involved in things 
like EGL for wayland etc.? Haven't checked this.


Other than that your new commit message is fine. Also i am possibly just 
confused about this, and your commit message is an improvement in 
clarity anyway :)


So i think it is fine to leave it as is.

thanks,
-mario



On Sun, Jun 28, 2015 at 12:23 AM, Mario Kleiner
 wrote:

Ok, maybe one thing for the commit message, as i just made the same mixup in
my reply: The fd's are not owned by the x-server, but by the mesa screens
(pipe screens? dri screens?) they represent, so if such a screen goes away,
the fd goes away. Using "X server" would be confusing, especially under
dri3, although each such "mesa screen" corresponds to a x-screen.

-mario


On 06/28/2015 06:03 AM, Mario Kleiner wrote:


On 06/28/2015 05:41 AM, Ilia Mirkin wrote:


Oh duh. Thanks for the super-detailed explanation. To rephrase what
you said in a slightly shorter manner:

See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
need to dupfd (which we are already, although radeon might not be).



Radeon doesn't so far. My 2nd patch for radeon from earlier today adds
that.


Except instead of sticking the dup'd fd into the hash table, whose
lifetime matches that of the device, we stick the other one in, which
is effectively owned by the X server. As a result, those fstat's might
fail down the line, and all sorts of hell will break loose.



Yes. Essentially we should make sure the fd's we keep around have the
same lifetime as the data structure in which we need to use it. That was
the point, the server owned fd went away while we still needed it.


Would you be opposed to me rewriting your commit message to basically
reflect the above? Perhaps your original one did as well, but it
wasn't clear to me. I'd also like to throw some assert's in to make
sure the fstat's don't fail -- does that sound reasonable?



Oh please, yes! My commit messages often have this disease that they are
either too terse, when i think the problem is obvious, or too long and
somewhat convoluted when i'm going overboard in the other direction. Any
editing/shortening for clarity more than happily accepted :)

thanks,
-mario


Another solution, btw, is to stick  as the real key in the hash, although that'd involve
making pointers. Probably not worth it.

Cheers,

-ilia

On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
 wrote:


On 06/28/2015 03:48 AM, Ilia Mirkin wrote:



On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
 wrote:



The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
  nouveau_screen is created for the shared device node of both
  screens, but the original fd of x-screen 0 is used as identifying
  key in the hash table, instead of the dup()ed fd of x-screen 0
  which is owned by the nouveau_screen. nouveau_screen's refcount
  is now 2.




See below, but it shouldn't matter which fd gets used.



3. Regular rendering happens by the client plugin, then the plugin
  gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
  nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
  code then closes the fd of x-screen 0, so now the fd which is
  used as key in the hash table is invalid. x-screen 1 gets
  destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
  the nouveau_screen gets destroyed, but removal of its entry
  in the hash table fails, because the invalid fd in the hash
  table no longer matches anything (fstat() on

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner


Yes, that's good.

On 06/28/2015 07:00 AM, Ilia Mirkin wrote:

How about:

/* Use dupfd in hash table, to avoid errors if the original fd gets
* closed by its owner. The hash key needs to live at least as long as
* the screen.
*/


On Sun, Jun 28, 2015 at 12:57 AM, Mario Kleiner
 wrote:

On 06/28/2015 06:25 AM, Ilia Mirkin wrote:


But that's the thing... in this case, the fd lifetime is owned by the
X server. In fact, nouveau doesn't touch that fd in mesa beyond
dup'ing it, and then exclusively using the dup'd fd.



True, it's not owned by nouveau, but instead by mesa core code? I think the
x-server itself is only involved in authenticating the fd under DRI2, after
mesa has opened it? Although under DRI3 the ready made fd comes from a
xcb_dri3_open() trip to the server.

See the open() call on the device file:
dri2CreateScreen() in mesa/src/glx/dri2_glx.c

xcb_dri3_open():
dri3_create_screen() in mesa/src/glx/dri3_glx.c

And the close() calls for the fd's are also inside those files.

Maybe those winsys functions we fixed here are also involved in things like
EGL for wayland etc.? Haven't checked this.

Other than that your new commit message is fine. Also i am possibly just
confused about this, and your commit message is an improvement in clarity
anyway :)

So i think it is fine to leave it as is.

thanks,
-mario




On Sun, Jun 28, 2015 at 12:23 AM, Mario Kleiner
 wrote:


Ok, maybe one thing for the commit message, as i just made the same mixup
in
my reply: The fd's are not owned by the x-server, but by the mesa screens
(pipe screens? dri screens?) they represent, so if such a screen goes
away,
the fd goes away. Using "X server" would be confusing, especially under
dri3, although each such "mesa screen" corresponds to a x-screen.

-mario


On 06/28/2015 06:03 AM, Mario Kleiner wrote:



On 06/28/2015 05:41 AM, Ilia Mirkin wrote:



Oh duh. Thanks for the super-detailed explanation. To rephrase what
you said in a slightly shorter manner:

See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
need to dupfd (which we are already, although radeon might not be).



Radeon doesn't so far. My 2nd patch for radeon from earlier today adds
that.


Except instead of sticking the dup'd fd into the hash table, whose
lifetime matches that of the device, we stick the other one in, which
is effectively owned by the X server. As a result, those fstat's might
fail down the line, and all sorts of hell will break loose.



Yes. Essentially we should make sure the fd's we keep around have the
same lifetime as the data structure in which we need to use it. That was
the point, the server owned fd went away while we still needed it.


Would you be opposed to me rewriting your commit message to basically
reflect the above? Perhaps your original one did as well, but it
wasn't clear to me. I'd also like to throw some assert's in to make
sure the fstat's don't fail -- does that sound reasonable?



Oh please, yes! My commit messages often have this disease that they are
either too terse, when i think the problem is obvious, or too long and
somewhat convoluted when i'm going overboard in the other direction. Any
editing/shortening for clarity more than happily accepted :)

thanks,
-mario


Another solution, btw, is to stick  as the real key in the hash, although that'd involve
making pointers. Probably not worth it.

Cheers,

 -ilia

On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
 wrote:



On 06/28/2015 03:48 AM, Ilia Mirkin wrote:




On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
 wrote:




The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
   nouveau_screen is created for the shared device node of both
   screens, but the original fd of x-screen 0 is used as
identifying
   key in the hash table, instead of the dup()ed fd of x-screen 0
   which is owned by the nouveau_screen. nouveau_screen's
refcount
   is now 2.





See below, but it shouldn't matter which fd gets used.



3. Regular rendering happens by the client plugin, then the plugin
   gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
   nouveau_drm_screen_unref() drops the refcount to 1, calling
mesa
   code then closes the fd of x-screen 0, so now the fd which is
   use

Re: [Mesa-dev] [PATCH] radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-27 Thread Mario Kleiner

Thanks for the review Ilia. Just saw that the new amdgpu winsys driver 
will need the same fix, once this one is confirmed to be ok.


thanks,
-mario

On 06/28/2015 07:23 AM, Ilia Mirkin wrote:

Without being intimately familiar with the radeon winsys code, this
seems sane (esp after re-reviewing the situation for nouveau).
Reviewed-by: Ilia Mirkin . But one of the radeon
people definitely needs to look at this before pushing.

On Sat, Jun 27, 2015 at 9:02 PM, Mario Kleiner
 wrote:

Same problem and fix as for nouveau's ZaphodHeads trouble.

See patch ...

"nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."

... for reference.

Cc: "10.3 10.4 10.5 10.6" 

Signed-off-by: Mario Kleiner 
Cc: Ilia Mirkin 
---
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index ba8d143..ebdb19e 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -484,6 +484,10 @@ static void radeon_winsys_destroy(struct radeon_winsys 
*rws)
  if (ws->gen >= DRV_R600) {
  radeon_surface_manager_free(ws->surf_man);
  }
+
+if (ws->fd)
+close(ws->fd);
+
  FREE(rws);
  }

@@ -696,7 +700,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
  return NULL;
  }

-ws->fd = fd;
+ws->fd = dup(fd);

  if (!do_winsys_init(ws))
  goto fail;
@@ -712,7 +716,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
  goto fail;

  if (ws->gen >= DRV_R600) {
-ws->surf_man = radeon_surface_manager_new(fd);
+ws->surf_man = radeon_surface_manager_new(ws->fd);
  if (!ws->surf_man)
  goto fail;
  }
@@ -753,7 +757,7 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
  return NULL;
  }

-util_hash_table_set(fd_tab, intptr_to_pointer(fd), ws);
+util_hash_table_set(fd_tab, intptr_to_pointer(ws->fd), ws);

  /* We must unlock the mutex once the winsys is fully initialized, so that
   * other threads attempting to create the winsys from the same fd will
@@ -770,6 +774,9 @@ fail:
  ws->kman->destroy(ws->kman);
  if (ws->surf_man)
  radeon_surface_manager_free(ws->surf_man);
+if (ws->fd)
+close(ws->fd);
+
  FREE(ws);
  return NULL;
  }
--
2.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] libdrm/amdgpu: Use private fd for amdgpu_device and winsys hash table to fix ZaphodHeads.

2015-07-14 Thread Mario Kleiner

The amdgpu_device for a device node needs its own dup'ed fd, instead
of using the original fd passed in for a screen, to make multi-x-screen
ZaphodHeads configurations work on amdgpu.

The original fd's lifetime differs from that of the amdgpu_device, and from the
one stored in the hash. The hash key is the fd, and in order to compare hash
entries we fstat them, so the fd must be around for as long as the amdgpu_device
is.

This patch for libdrm/amdgpu is a translation of the radeon-winsys ZaphodHeads
fix for mesa's radeon-winsys, from mesa commit 
28dda47ae4d974e3e032d60e8e0965c8c068c6d8

"winsys/radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."

Signed-off-by: Mario Kleiner 
Cc: Marek Olšák 
---
 amdgpu/amdgpu_device.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index b50ce26..1b0cd12 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "xf86drm.h"
 #include "amdgpu_drm.h"
@@ -153,7 +154,7 @@ int amdgpu_device_initialize(int fd,
return r;
}
if ((flag_auth) && (!flag_authexist)) {
-   dev->flink_fd = fd;
+   dev->flink_fd = dup(fd);
}
*major_version = dev->major_version;
*minor_version = dev->minor_version;
@@ -183,8 +184,8 @@ int amdgpu_device_initialize(int fd,
goto cleanup;
}
 
-   dev->fd = fd;
-   dev->flink_fd = fd;
+   dev->fd = dup(fd);
+   dev->flink_fd = dev->fd;
dev->major_version = version->version_major;
dev->minor_version = version->version_minor;
drmFreeVersion(version);
@@ -213,12 +214,14 @@ int amdgpu_device_initialize(int fd,
*major_version = dev->major_version;
*minor_version = dev->minor_version;
*device_handle = dev;
-   util_hash_table_set(fd_tab, UINT_TO_PTR(fd), dev);
+   util_hash_table_set(fd_tab, UINT_TO_PTR(dev->fd), dev);
pthread_mutex_unlock(&fd_mutex);
 
return 0;
 
 cleanup:
+   if (dev->fd)
+   close(dev->fd);
free(dev);
pthread_mutex_unlock(&fd_mutex);
return r;
@@ -232,6 +235,9 @@ void amdgpu_device_free_internal(amdgpu_device_handle dev)
pthread_mutex_destroy(&dev->bo_table_mutex);
pthread_mutex_destroy(&(dev->vamgr.bo_va_mutex));
util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   close(dev->fd);
+   if (dev->fd != dev->flink_fd)
+   close(dev->flink_fd);
free(dev);
 }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] libdrm/amdgpu: Use private fd for amdgpu_device and winsys hash table to fix ZaphodHeads. (v2)

2015-07-17 Thread Mario Kleiner

The amdgpu_device for a device node needs its own dup'ed fd, instead
of using the original fd passed in for a screen, to make multi-x-screen
ZaphodHeads configurations work on amdgpu.

The original fd's lifetime differs from that of the amdgpu_device, and from the
one stored in the hash. The hash key is the fd, and in order to compare hash
entries we fstat them, so the fd must be around for as long as the amdgpu_device
is.

This patch for libdrm/amdgpu is a translation of the radeon-winsys ZaphodHeads
fix for mesa's radeon-winsys, from mesa commit 
28dda47ae4d974e3e032d60e8e0965c8c068c6d8

"winsys/radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."

Signed-off-by: Mario Kleiner 
Acked-by: Marek Olšák 
Reviewed-by: Christian König 

v2: Check for valid fd's being >= 0, because fd == 0 is in theory
a valid, although unlikely, fd and fd == -1 would denote an
invalid fd. Thanks to William Lewis for pointing this out.

Reported-by: William Lewis 
Signed-off-by: Mario Kleiner 
---
 amdgpu/amdgpu_device.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index b50ce26..499539c 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "xf86drm.h"
 #include "amdgpu_drm.h"
@@ -153,7 +154,7 @@ int amdgpu_device_initialize(int fd,
return r;
}
if ((flag_auth) && (!flag_authexist)) {
-   dev->flink_fd = fd;
+   dev->flink_fd = dup(fd);
}
*major_version = dev->major_version;
*minor_version = dev->minor_version;
@@ -168,6 +169,9 @@ int amdgpu_device_initialize(int fd,
return -ENOMEM;
}
 
+   dev->fd = -1;
+   dev->flink_fd = -1;
+
atomic_set(&dev->refcount, 1);
 
version = drmGetVersion(fd);
@@ -183,8 +187,8 @@ int amdgpu_device_initialize(int fd,
goto cleanup;
}
 
-   dev->fd = fd;
-   dev->flink_fd = fd;
+   dev->fd = dup(fd);
+   dev->flink_fd = dev->fd;
dev->major_version = version->version_major;
dev->minor_version = version->version_minor;
drmFreeVersion(version);
@@ -213,12 +217,14 @@ int amdgpu_device_initialize(int fd,
*major_version = dev->major_version;
*minor_version = dev->minor_version;
*device_handle = dev;
-   util_hash_table_set(fd_tab, UINT_TO_PTR(fd), dev);
+   util_hash_table_set(fd_tab, UINT_TO_PTR(dev->fd), dev);
pthread_mutex_unlock(&fd_mutex);
 
return 0;
 
 cleanup:
+   if (dev->fd >= 0)
+   close(dev->fd);
free(dev);
pthread_mutex_unlock(&fd_mutex);
return r;
@@ -232,6 +238,9 @@ void amdgpu_device_free_internal(amdgpu_device_handle dev)
pthread_mutex_destroy(&dev->bo_table_mutex);
pthread_mutex_destroy(&(dev->vamgr.bo_va_mutex));
util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   close(dev->fd);
+   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
+   close(dev->flink_fd);
free(dev);
 }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: don't leak the fd when it is 0

2015-07-29 Thread Mario Kleiner




On 07/29/2015 05:46 PM, Alex Deucher wrote:

On Wed, Jul 29, 2015 at 10:44 AM, Emil Velikov  wrote:

Earlier commit added an extra dup(fd) to fix a ZaphodHeads issue.
Although it did not consider the (very unlikely) case where we might end
up with the valid fd == 0.

Fixes: 28dda47ae4d(winsys/radeon: Use dup fd as key in drm-winsys hash
table to fix ZaphodHeads.)

Cc: Mario Kleiner 
Signed-off-by: Emil Velikov 


Reviewed-by: Alex Deucher 



Looks good, thanks for catching this.

Reviewed-by: Mario Kleiner 

cc stable seems safe, so can't hurt, no?

I have a similar patch for the amdgpu Zaphodheads fix floating on the 
list. v1 (just the dup(fd) fix, but not checking for fd >= 0) already 
reviewed, but v2 fixes the same problem on amdgpu you just fixed for 
radeon, but so far lacks a reviewed by - and i don't know who can push 
stuff to libdrm?


The patch was "[PATCH] libdrm/amdgpu: Use private fd for amdgpu_device 
and winsys hash table to fix ZaphodHeads. (v2)"


Maybe you can have a quick look to get all these into the next mesa release?

thanks,
-mario


---

Perhaps we should CC mesa-stable as the offending commit, yet
considering how unlikely the above case is I'm ambivalent,

-Emil

  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 41f8826..f7784fb 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -493,7 +493,7 @@ static void radeon_winsys_destroy(struct radeon_winsys *rws)
  radeon_surface_manager_free(ws->surf_man);
  }

-if (ws->fd)
+if (ws->fd >= 0)
  close(ws->fd);

  FREE(rws);
@@ -786,7 +786,7 @@ fail:
  ws->kman->destroy(ws->kman);
  if (ws->surf_man)
  radeon_surface_manager_free(ws->surf_man);
-if (ws->fd)
+if (ws->fd >= 0)
  close(ws->fd);

  FREE(ws);
--
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: don't leak the fd when it is 0

2015-07-29 Thread Mario Kleiner


On 07/29/2015 08:50 PM, Alex Deucher wrote:

On Wed, Jul 29, 2015 at 2:48 PM, Mario Kleiner
 wrote:



On 07/29/2015 05:46 PM, Alex Deucher wrote:


On Wed, Jul 29, 2015 at 10:44 AM, Emil Velikov 
wrote:


Earlier commit added an extra dup(fd) to fix a ZaphodHeads issue.
Although it did not consider the (very unlikely) case where we might end
up with the valid fd == 0.

Fixes: 28dda47ae4d(winsys/radeon: Use dup fd as key in drm-winsys hash
table to fix ZaphodHeads.)

Cc: Mario Kleiner 
Signed-off-by: Emil Velikov 



Reviewed-by: Alex Deucher 



Looks good, thanks for catching this.

Reviewed-by: Mario Kleiner 

cc stable seems safe, so can't hurt, no?

I have a similar patch for the amdgpu Zaphodheads fix floating on the list.
v1 (just the dup(fd) fix, but not checking for fd >= 0) already reviewed,
but v2 fixes the same problem on amdgpu you just fixed for radeon, but so
far lacks a reviewed by - and i don't know who can push stuff to libdrm?

The patch was "[PATCH] libdrm/amdgpu: Use private fd for amdgpu_device and
winsys hash table to fix ZaphodHeads. (v2)"

Maybe you can have a quick look to get all these into the next mesa release?


I already picked it up in my libdrm tree:
http://cgit.freedesktop.org/~agd5f/drm/commit/?h=amdgpu&id=e2b18a1300313912aeaecf5ba8dd896b5853f6d4

Alex



Ah ok. Could you get (v2) instead? It has the same fix as Emil's to make 
sure we accept fd == 0 as a valid descriptor.


thanks,
-mario









thanks,
-mario


---

Perhaps we should CC mesa-stable as the offending commit, yet
considering how unlikely the above case is I'm ambivalent,

-Emil

   src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 41f8826..f7784fb 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -493,7 +493,7 @@ static void radeon_winsys_destroy(struct
radeon_winsys *rws)
   radeon_surface_manager_free(ws->surf_man);
   }

-if (ws->fd)
+if (ws->fd >= 0)
   close(ws->fd);

   FREE(rws);
@@ -786,7 +786,7 @@ fail:
   ws->kman->destroy(ws->kman);
   if (ws->surf_man)
   radeon_surface_manager_free(ws->surf_man);
-if (ws->fd)
+if (ws->fd >= 0)
   close(ws->fd);

   FREE(ws);
--
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: don't leak the fd when it is 0

2015-07-29 Thread Mario Kleiner




On 07/29/2015 09:05 PM, Alex Deucher wrote:

On Wed, Jul 29, 2015 at 2:55 PM, Mario Kleiner
 wrote:

On 07/29/2015 08:50 PM, Alex Deucher wrote:


On Wed, Jul 29, 2015 at 2:48 PM, Mario Kleiner
 wrote:




On 07/29/2015 05:46 PM, Alex Deucher wrote:



On Wed, Jul 29, 2015 at 10:44 AM, Emil Velikov

wrote:



Earlier commit added an extra dup(fd) to fix a ZaphodHeads issue.
Although it did not consider the (very unlikely) case where we might
end
up with the valid fd == 0.

Fixes: 28dda47ae4d(winsys/radeon: Use dup fd as key in drm-winsys hash
table to fix ZaphodHeads.)

Cc: Mario Kleiner 
Signed-off-by: Emil Velikov 




Reviewed-by: Alex Deucher 



Looks good, thanks for catching this.

Reviewed-by: Mario Kleiner 

cc stable seems safe, so can't hurt, no?

I have a similar patch for the amdgpu Zaphodheads fix floating on the
list.
v1 (just the dup(fd) fix, but not checking for fd >= 0) already reviewed,
but v2 fixes the same problem on amdgpu you just fixed for radeon, but so
far lacks a reviewed by - and i don't know who can push stuff to libdrm?

The patch was "[PATCH] libdrm/amdgpu: Use private fd for amdgpu_device
and
winsys hash table to fix ZaphodHeads. (v2)"

Maybe you can have a quick look to get all these into the next mesa
release?



I already picked it up in my libdrm tree:

http://cgit.freedesktop.org/~agd5f/drm/commit/?h=amdgpu&id=e2b18a1300313912aeaecf5ba8dd896b5853f6d4

Alex



Ah ok. Could you get (v2) instead? It has the same fix as Emil's to make
sure we accept fd == 0 as a valid descriptor.


Sure.  Can you send it to me?  I can't seem to find it in my email.

Alex



Will do in a second.
-mario



thanks,
-mario









thanks,
-mario


---

Perhaps we should CC mesa-stable as the offending commit, yet
considering how unlikely the above case is I'm ambivalent,

-Emil

src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 41f8826..f7784fb 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -493,7 +493,7 @@ static void radeon_winsys_destroy(struct
radeon_winsys *rws)
radeon_surface_manager_free(ws->surf_man);
}

-if (ws->fd)
+if (ws->fd >= 0)
close(ws->fd);

FREE(rws);
@@ -786,7 +786,7 @@ fail:
ws->kman->destroy(ws->kman);
if (ws->surf_man)
radeon_surface_manager_free(ws->surf_man);
-if (ws->fd)
+if (ws->fd >= 0)
close(ws->fd);

FREE(ws);
--
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glx: Perform check for valid fbconfig against proper X-Screen.

2016-10-11 Thread Mario Kleiner

Commit cf804b4455fac9e585b3600a8318caaced9c23de
('glx: fix crash with bad fbconfig') introduced a check
in glXCreateNewContext() if the given config is a valid
fbconfig.

Unfortunately the check always checks the given config against
the fbconfigs of the DefaultScreen(dpy), instead of the
actual X-Screen specified in the config config->screen.

This leads to failure whenever a GL context is created
on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of
a multi-x-screen setup, where the default screen is
typically 0.

Fix this by using config->screen instead of DefaultScreen(dpy).

Tested to fix context creation failure on a dual-x-screen setup.

Signed-off-by: Mario Kleiner 
Cc: "11.2 12.0" 
---
 src/glx/glxcmds.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c
index b0a1cb0..6abe0b9 100644
--- a/src/glx/glxcmds.c
+++ b/src/glx/glxcmds.c
@@ -1626,7 +1626,6 @@ glXCreateNewContext(Display * dpy, GLXFBConfig fbconfig,
 int renderType, GLXContext shareList, Bool allowDirect)
 {
struct glx_config *config = (struct glx_config *) fbconfig;
-   int screen = DefaultScreen(dpy);
struct glx_config **config_list;
int list_size;
unsigned i;
@@ -1637,7 +1636,7 @@ glXCreateNewContext(Display * dpy, GLXFBConfig fbconfig,
}
 
config_list = (struct glx_config **)
-  glXGetFBConfigs(dpy, screen, &list_size);
+  glXGetFBConfigs(dpy, config->screen, &list_size);
 
for (i = 0; i < list_size; i++) {
if (config_list[i] == config)
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] glx: Perform check for valid fbconfig against proper X-Screen.

2016-10-14 Thread Mario Kleiner




On 10/14/2016 11:14 AM, Emil Velikov wrote:

On 12 October 2016 at 18:40, Emil Velikov  wrote:

On 11 October 2016 at 19:42, Mario Kleiner  wrote:



Tested to fix context creation failure on a dual-x-screen setup.

Signed-off-by: Mario Kleiner 
Cc: "11.2 12.0" 

Reviewed-by: Emil Velikov 


... and pushed to master.

Mario, can you skim through this list [1] and let me know if any the
patches is outstanding or I can "close" them.



They've been all merged, so you can close them.

thanks,
-mario


Thanks
Emil
[1] 
https://patchwork.freedesktop.org/project/mesa/patches/?submitter=14956&state=&q=&archive=&delegate=


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] r600: increase performance for DRI PRIME offloading if 2nd GPU is Evergreen+

2016-08-26 Thread Mario Kleiner

This is a direct port of Marek Olšáks patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.

It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.

Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!

This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.

All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.

The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.

Signed-off-by: Mario Kleiner 
Cc: Marek Olšák 
---
 src/gallium/drivers/r600/r600_blit.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index adf616e..8fdc51c 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -850,11 +850,30 @@ static void r600_blit(struct pipe_context *ctx,
   const struct pipe_blit_info *info)
 {
struct r600_context *rctx = (struct r600_context*)ctx;
+   struct r600_texture *rdst = (struct r600_texture *)info->dst.resource;
 
if (do_hardware_msaa_resolve(ctx, info)) {
return;
}
 
+   /* Using SDMA for copying to a linear texture in GTT is much faster.
+* This improves DRI PRIME performance.
+*
+* resource_copy_region can't do this yet, because dma_copy calls it
+* on failure (recursion).
+*/
+   if (rdst->surface.level[info->dst.level].mode ==
+   RADEON_SURF_MODE_LINEAR_ALIGNED &&
+   rctx->b.dma_copy &&
+   util_can_blit_via_copy_region(info, false)) {
+   rctx->b.dma_copy(ctx, info->dst.resource, info->dst.level,
+info->dst.box.x, info->dst.box.y,
+info->dst.box.z,
+info->src.resource, info->src.level,
+&info->src.box);
+   return;
+   }
+
assert(util_blitter_is_blit_supported(rctx->blitter, info));
 
/* The driver doesn't decompress resources automatically while
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] DRI3/Present fixes for Mesa 10.3+ (v2)

2014-12-02 Thread Mario Kleiner

A slightly updated and extended series of the dri3/present fixes for Mesa i
sent last week.

Patch 1 and 2 are same as before. Patch 3 now has signed off by Frank Binns
and reviewed by Chris Wilson. Patch 4 and 5 are additional fixes. The last
one makes INTEL_swap_events behave properly again when swap completion events
come in out of order due to skipped present requests. Before the first out
of sequence sbc event caused the INTEL_swap_events extension to become 
completely
unuseable for the rest of a session.

thanks,
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/5] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc

2014-12-02 Thread Mario Kleiner

Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successfull
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c  | 11 +++
 src/glx/dri3_priv.h |  5 -
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index b4ac278..5796491 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
 
  if (psc->show_fps_interval)
 show_fps(priv, ce->ust);
+
+ priv->ust = ce->ust;
+ priv->msc = ce->msc;
   } else {
  priv->recv_msc_serial = ce->serial;
+ priv->vblank_ust = ce->ust;
+ priv->vblank_msc = ce->msc;
   }
-  priv->ust = ce->ust;
-  priv->msc = ce->msc;
   break;
}
case XCB_PRESENT_EVENT_IDLE_NOTIFY: {
@@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   }
}
 
-   *ust = priv->ust;
-   *msc = priv->msc;
+   *ust = priv->vblank_ust;
+   *msc = priv->vblank_msc;
*sbc = priv->recv_sbc;
 
return 1;
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 8e46640..222deb0 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -182,9 +182,12 @@ struct dri3_drawable {
uint64_t send_sbc;
uint64_t recv_sbc;
 
-   /* Last received UST/MSC values */
+   /* Last received UST/MSC values for pixmap present complete */
uint64_t ust, msc;
 
+   /* Last received UST/MSC values for vblank */
+   uint64_t vblank_ust, vblank_msc;
+
/* Serial numbers for tracking wait_for_msc events */
uint32_t send_msc_serial;
uint32_t recv_msc_serial;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/5] glx/dri3: Request non-vsynced Present for swapinterval zero. (v2)

2014-12-02 Thread Mario Kleiner

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" 

v2: Add Frank Binns signed off by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilsons reviewed tag from May 2014 for that patch, ergo
also for this one.

Signed-off-by: Frank Binns 
Signed-off-by: Mario Kleiner 
Reviewed-by: Chris Wilson 
---
 src/glx/dri3_glx.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
 
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   if (target_msc == 0)
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - 
priv->recv_sbc);
 
+  if (priv->swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
   back->busy = 1;
   back->last_swap = priv->send_sbc;
   xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
  None, /* target_crtc 
*/
  None,
  back->sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
  target_msc,
  divisor,
  remainder, 0, NULL);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/5] glx: Handle out-of-sequence swap completion events correctly.

2014-12-02 Thread Mario Kleiner

The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute swaps and send out
completion events in a different order than the submission
order of the present requests. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/glxext.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/glx/glxext.c b/src/glx/glxext.c
index 68c359e..fdc24d4 100644
--- a/src/glx/glxext.c
+++ b/src/glx/glxext.c
@@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
   aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
   aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
 
-  if (awire->sbc < glxDraw->lastEventSbc)
-glxDraw->eventSbcWrap += 0x1;
+  /* Handle 32-Bit wire sbc wraparound in both directions to cope with out
+   * of sequence 64-Bit sbc's
+   */
+  if ((int64_t) awire->sbc < ((int64_t) glxDraw->lastEventSbc - 
0x4000))
+ glxDraw->eventSbcWrap += 0x1;
+  if ((int64_t) awire->sbc > ((int64_t) glxDraw->lastEventSbc + 
0x4000))
+ glxDraw->eventSbcWrap -= 0x1;
   glxDraw->lastEventSbc = awire->sbc;
   aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
   return True;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/5] glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly.

2014-12-02 Thread Mario Kleiner

targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index a9ff73b..b4ac278 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -529,7 +529,8 @@ dri3_wait_for_sbc(__GLXDRIdrawable *pdraw, int64_t 
target_sbc, int64_t *ust,
 {
struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
 
-   while (priv->recv_sbc < target_sbc) {
+   while ((target_sbc != 0 && priv->recv_sbc < target_sbc) ||
+  (target_sbc == 0 && priv->recv_sbc < priv->send_sbc)) {
   if (!dri3_wait_for_event(pdraw))
  return 0;
}
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/5] glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0)

2014-12-02 Thread Mario Kleiner

glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets
translated into target_msc=divisor=0 but remainder=1 by the mesa
api. This is done for server DRI2 where there needs to be a way
to tell the server-side DRI2ScheduleSwap implementation if a call
to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was
done. remainder = 1 was (ab)used as a flag to tell the server to
select proper semantic. The DRI3/Present backend ignored this
signalling, treated any target_msc=0 as glXSwapBuffers() request,
and called xcb_present_pixmap with invalid divisor=0, remainder=1
combo. The present extension responded kindly to this with a
BadValue error and dropped the request, but mesa's DRI3/Present
backend doesn't check for error codes. From there on stuff went
downhill quickly for the calling OpenGL client...

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index c53be1b..efff907 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1552,11 +1552,21 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   dri3_fence_reset(c, back);
 
   /* Compute when we want the frame shown by taking the last known 
successful
-   * MSC and adding in a swap interval for each outstanding swap request
+   * MSC and adding in a swap interval for each outstanding swap request.
+   * target_msc=divisor=remainder=0 means "Use glXSwapBuffers() semantic"
*/
   ++priv->send_sbc;
-  if (target_msc == 0)
+  if (target_msc == 0 && divisor == 0 && remainder == 0)
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - 
priv->recv_sbc);
+  else if (divisor == 0 && remainder > 0) {
+ /* Special case signalling: This means "glXSwapBuffersMscOML() called 
with
+  * target_msc=divisor=remainder=0. Needed to distinguish from 
glXSwapBuffers
+  * case above. Reset remainder to zero, so PresentPixmap won't bail 
on us.
+  * GLX_OML_sync_control says for divisor == 0 any remainder >= 0 is 
fine,
+  * whereas Present extension wants remainder == 0 for divisor == 0.
+  */
+ remainder = 0;
+  }
 
   if (priv->swap_interval == 0)
   options |= XCB_PRESENT_OPTION_ASYNC;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly.

2014-12-04 Thread Mario Kleiner


On 12/04/2014 11:48 AM, Axel Davy wrote:

Le 04/12/2014 11:44, Axel Davy a écrit :

On 02/12/2014 20:53, Mario Kleiner wrote :

targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
  src/glx/dri3_glx.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index a9ff73b..b4ac278 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -529,7 +529,8 @@ dri3_wait_for_sbc(__GLXDRIdrawable *pdraw, 
int64_t target_sbc, int64_t *ust,

  {
 struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
  -   while (priv->recv_sbc < target_sbc) {
+   while ((target_sbc != 0 && priv->recv_sbc < target_sbc) ||
+  (target_sbc == 0 && priv->recv_sbc < priv->send_sbc)) {
if (!dri3_wait_for_event(pdraw))
   return 0;
 }

Looks good to me.

Reviewed-by: Axel Davy 
___


Just an additional thought:

it would probably better to do instead:

if (!target_sbc)
target_sbc = priv->send_sbc;


Yep. Will change it. Will also make the other suggested changes for the 
other patches and then send out a new series.


thanks,
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] glx/dri3: Request non-vsynced Present for swapinterval zero. (v2)

2014-12-04 Thread Mario Kleiner


On 12/04/2014 11:20 AM, Axel Davy wrote:

On 02/12/2014 20:53, Mario Kleiner wrote :

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.


  
+  if (priv->swap_interval == 0)

+  options |= XCB_PRESENT_OPTION_ASYNC;
+
back->busy = 1;


Currently under DRI3 glx, you'll get triple buffering behaviour.
I agree this should not be the default, and your patches fixes that.
Ideally a user option should be added to control triple buffering.



The DRI2 implementation had a per-drawable "swap_limit" and 
corresponding server api to get and set it. I'm not sure atm. if it was 
ever exposed via mesa, but i think it wasn't. I agree i would be nice to 
have some control there.


There is one more problem i noticed, of which i'm not yet sure if it is 
related to mesa's n-buffering and the way it dynamically adapts n, or 
some driver implementation bug, or some other weirdness, because intel 
and nouveau behave differently. If a client does multiple swapbuffers 
calls without any rendering inbetween, then something in the buffer 
management seems to get confused for the rest of the session, even if 
the client later behaves "properly" ie., draw -> swap -> draw -> swap ...


Good case:

During whole session: draw -> swap -> draw -> swap ...

Bad case:

At least once: swap -> swap
For the rest of the session, even if you do regular 
draw->swap->draw->swap, at least under nouveau, you get proper display 
for a few frames, followed by some frame displayed which contains pixel 
trash, like from an uninitialized renderbuffer, then a few good frames 
-> pixel trash and so on. First i thought it's unsynchronized rendering 
to a swap pending buffer, ie., buffer presented while still rendered to. 
But now i think it is a trash filled buffer presented every couple of 
frames.


I didn't have time to track this one down, and probably won't have time 
to look into it atm. I just made sure that my application never does 
idle swaps, even if it only renders one useless pixel to keep mesa from 
freaking out.


Could be a bug. Could be some assumption that gets broken in a rather 
confusing way due to the triple/n-buffering.



If you add a comment above these lines, saying something like:

It is possible to enable triple buffering behaviour by not using
XCB_PRESENT_OPTION_ASYNC, but this should not be the default.



I'll add it literally, because anything i write in comments tends to 
become a not very well written essay ;-). Or should i add a "...but this 
triple buffering should probably not be the default in future 
implementations but be made configurable by some new api"?


thanks,
-mario


You can add my
Reviewed-by: Axel Davy 

Axel Davy


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/5] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)

2014-12-04 Thread Mario Kleiner

Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successfull
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Rename vblank_msc/ust to notify_msc/ust as suggested by
Axel Davy for better clarity.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
Reviewed-by: Axel Davy 
---
 src/glx/dri3_glx.c  | 11 +++
 src/glx/dri3_priv.h |  5 -
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 15d874d..b89cb46 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
 
  if (psc->show_fps_interval)
 show_fps(priv, ce->ust);
+
+ priv->ust = ce->ust;
+ priv->msc = ce->msc;
   } else {
  priv->recv_msc_serial = ce->serial;
+ priv->notify_ust = ce->ust;
+ priv->notify_msc = ce->msc;
   }
-  priv->ust = ce->ust;
-  priv->msc = ce->msc;
   break;
}
case XCB_PRESENT_EVENT_IDLE_NOTIFY: {
@@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   }
}
 
-   *ust = priv->ust;
-   *msc = priv->msc;
+   *ust = priv->notify_ust;
+   *msc = priv->notify_msc;
*sbc = priv->recv_sbc;
 
return 1;
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 8e46640..1604449 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -182,9 +182,12 @@ struct dri3_drawable {
uint64_t send_sbc;
uint64_t recv_sbc;
 
-   /* Last received UST/MSC values */
+   /* Last received UST/MSC values for pixmap present complete */
uint64_t ust, msc;
 
+   /* Last received UST/MSC values from present notify msc event */
+   uint64_t notify_ust, notify_msc;
+
/* Serial numbers for tracking wait_for_msc events */
uint32_t send_msc_serial;
uint32_t recv_msc_serial;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] DRI3/Present fixes for Mesa 10.3+ (v3)

2014-12-04 Thread Mario Kleiner

Ok, third iteration of the series. Incorporated all the review
comments of Axel Davy and Eric Anholt and retested for extra
paranoia. Thanks!

-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/5] glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)

2014-12-04 Thread Mario Kleiner

glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets
translated into target_msc=divisor=0 but remainder=1 by the mesa
api. This is done for server DRI2 where there needs to be a way
to tell the server-side DRI2ScheduleSwap implementation if a call
to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was
done. remainder = 1 was (ab)used as a flag to tell the server to
select proper semantic. The DRI3/Present backend ignored this
signalling, treated any target_msc=0 as glXSwapBuffers() request,
and called xcb_present_pixmap with invalid divisor=0, remainder=1
combo. The present extension responded kindly to this with a
BadValue error and dropped the request, but mesa's DRI3/Present
backend doesn't check for error codes. From there on stuff went
downhill quickly for the calling OpenGL client...

This patch fixes the problem.

v2: Change comments to be more clear, with reference to
relevant spec, as suggested by Eric Anholt.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
Reviewed-by: Axel Davy 
Reviewed-by: Eric Anholt 
---
 src/glx/dri3_glx.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 3f5e64c..1ddc723 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1560,11 +1560,24 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   dri3_fence_reset(c, back);
 
   /* Compute when we want the frame shown by taking the last known 
successful
-   * MSC and adding in a swap interval for each outstanding swap request
+   * MSC and adding in a swap interval for each outstanding swap request.
+   * target_msc=divisor=remainder=0 means "Use glXSwapBuffers() semantic"
*/
   ++priv->send_sbc;
-  if (target_msc == 0)
+  if (target_msc == 0 && divisor == 0 && remainder == 0)
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - 
priv->recv_sbc);
+  else if (divisor == 0 && remainder > 0) {
+ /* From the GLX_OML_sync_control spec:
+  *
+  * "If  = 0, the swap will occur when MSC becomes
+  *  greater than or equal to ."
+  *
+  * Note that there's no mention of the remainder.  The Present 
extension
+  * throws BadValue for remainder != 0 with divisor == 0, so just drop
+  * the passed in value.
+  */
+ remainder = 0;
+  }
 
   /* From the GLX_EXT_swap_control spec:
*
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/5] glx: Handle out-of-sequence swap completion events correctly.

2014-12-04 Thread Mario Kleiner

The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute swaps and send out
completion events in a different order than the submission
order of the present requests. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/glxext.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/glx/glxext.c b/src/glx/glxext.c
index 68c359e..fdc24d4 100644
--- a/src/glx/glxext.c
+++ b/src/glx/glxext.c
@@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
   aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
   aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
 
-  if (awire->sbc < glxDraw->lastEventSbc)
-glxDraw->eventSbcWrap += 0x1;
+  /* Handle 32-Bit wire sbc wraparound in both directions to cope with out
+   * of sequence 64-Bit sbc's
+   */
+  if ((int64_t) awire->sbc < ((int64_t) glxDraw->lastEventSbc - 
0x4000))
+ glxDraw->eventSbcWrap += 0x1;
+  if ((int64_t) awire->sbc > ((int64_t) glxDraw->lastEventSbc + 
0x4000))
+ glxDraw->eventSbcWrap -= 0x1;
   glxDraw->lastEventSbc = awire->sbc;
   aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
   return True;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/5] glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)

2014-12-04 Thread Mario Kleiner

targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Simplify as suggested by Axel Davy. Add comments proposed
by Eric Anholt.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
Reviewed-by: Axel Davy 
Reviewed-by: Eric Anholt 
---
 src/glx/dri3_glx.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index a9ff73b..15d874d 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -529,6 +529,15 @@ dri3_wait_for_sbc(__GLXDRIdrawable *pdraw, int64_t 
target_sbc, int64_t *ust,
 {
struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
 
+   /* From the GLX_OML_sync_control spec:
+*
+* "If  = 0, the function will block until all previous
+*  swaps requested with glXSwapBuffersMscOML for that window have
+*  completed."
+*/
+   if (!target_sbc)
+  target_sbc = priv->send_sbc;
+
while (priv->recv_sbc < target_sbc) {
   if (!dri3_wait_for_event(pdraw))
  return 0;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/5] glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)

2014-12-04 Thread Mario Kleiner

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" 

v2: Add Frank Binns signed off by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilsons reviewed tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns 
Signed-off-by: Mario Kleiner 
Reviewed-by: Chris Wilson 
Reviewed-by: Axel Davy 
Reviewed-by: Eric Anholt 
---
 src/glx/dri3_glx.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index b89cb46..3f5e64c 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1526,6 +1526,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
 
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1565,6 +1566,17 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   if (target_msc == 0)
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - 
priv->recv_sbc);
 
+  /* From the GLX_EXT_swap_control spec:
+   *
+   * "If  is set to a value of 0, buffer swaps are not
+   *  synchronized to a video frame."
+   *
+   * Implementation note: It is possible to enable triple buffering 
behaviour
+   * by not using XCB_PRESENT_OPTION_ASYNC, but this should not be the 
default.
+   */
+  if (priv->swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
   back->busy = 1;
   back->last_swap = priv->send_sbc;
   xcb_present_pixmap(c,
@@ -1578,7 +1590,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
  None, /* target_crtc 
*/
  None,
  back->sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
  target_msc,
  divisor,
  remainder, 0, NULL);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] DRI3/Present fixes for Mesa 10.3+ (v2)

2014-12-05 Thread Mario Kleiner

On 12/05/2014 03:41 AM, Eric Anholt wrote:

Mario Kleiner writes:

A slightly updated and extended series of the dri3/present fixes for Mesa i
sent last week.

Patch 1 and 2 are same as before. Patch 3 now has signed off by Frank Binns
and reviewed by Chris Wilson. Patch 4 and 5 are additional fixes. The last
one makes INTEL_swap_events behave properly again when swap completion events
come in out of order due to skipped present requests. Before the first out
of sequence sbc event caused the INTEL_swap_events extension to become
completely
unuseable for the rest of a session.

Sent out review for 3 of them (time for me to head out instead of
reviewing more code), and thanks for working on this. 2/5 it's not
clear to me from my first read of the spec that you shouldn't expect to
get the most recent values from either cause in your return. 5/5 my
first thoughts were "I hate wraparound logic, I'll review this later.
Also, should we even be saving off msc/ust for an out-of-order sbc
event? Needs more thought."

I'll try to think more about these later, but I wouldn't block on me.

Wrt. 2/5: It's a bit ambiguous how to read that bit of the spec, and i
agree that one could read it in a way that the current mesa dri3
behaviour is not (completely) violating the spec. When we implemented
the DRI2 version, we understood it in the way i want to restore with
2/5. The first reason is because the DRI2 / patch 2/5 interpretation
makes the OML_sync_control extension very useful and robust for swap
timing, whereas the alternative reading makes the extension borderline
useless / its use somewhat fragile due to the race described in the
commit. The second reason is backwards compatibility: It would be
awesome not to break timing sensitive apps written against DRI2,
especially given that users of those apps will certainly not understand
why a simple distribution upgrade or graphics stack update pushed by the
distro can suddenly cause such regressions.

In new app releases one could sort of work around such breakage by use
of the INTEL_swap_events extension assuming everything would be nicely
bug free. Unfortunately there were quite a few bugs and omissions and
limitations in various ddx and kms drivers even until very recently
which require funky workarounds, which require funky use of the
different bits of the OML_sync_control extension, in ways that would be
difficult to do without 2/5 restoring the DRI2 semantic.

As an example see
https://github.com/kleinerm/Psychtoolbox-3/blob/master/PsychSourceGL/Source/Linux/Screen/PsychWindowGlue.c#L1372
for my own toolkits backend implementation. It's a nice horror-show of
all the relevant bugs in DRI2/DRI3 since about 2010 with the collection
of workarounds needed to keep stuff working reasonably well on currently
shipping distros.

Wrt. 5/5: As a little helper, attached a little C test thingy for the
wraparound logic, running through a large range of simulated swap
events, exercising the logic for both overflow and underflow wraparound,
which i used to convince myself that i didn't screw up there with
integer overflows etc., apart from testing on the actual server for
normal use. In practice i don't think the wraparound logic in its
revised form will ever trigger. It takes a lot of time for a running app
to accumulate 2^32 bufferswaps.

As far as saving mst/ust for out-of-order sbc events: Yes, imho! The
Present extension allows to complete swaps in a order different from
their submission, e.g., you could submit a present request for target
msc 1, then for 9000, then for 8000..., and Present would complete
them in order 8000, 9000, 1 - this would cause the completion events
to come in in reverse order with decrementing sbc, instead of
incrementing. A valid case for dri3/present, and as INTEL_swap_events is
not restricted to ascending order, it should be able to handle that
case. Another case where one can see smaller inversions of the order is
when switching between vsynced and non-vsynced swaps, possibly when the
Present extension skips presents. I think a client could get mightily
confused if it would try to collect swap events for submitted swaps and
they don't show up.

However, what would be useful would be to extend INTEL_swap_events with
new enums for completion type. Something like GLX_ERROR_INTEL for a
present/swap that failed due to some error, e.g., out of memory, gpu
wedged, whatever, and then GLX_SKIPPED_INTEL for a skipped present.
Currently PresentCompleteModeSkip gets mapped to
GLX_COPY_COMPLETE_INTEL, losing that information, which would be
valuable for client applications like mine, which really need to know
for sure if specific content made it to the actual display unharmed, and
in which order.

Another thought i had: It would be good if we could expose somewhere if
mesa is using DRI2 or DRI3/Present in a given session, e.g., in th

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-15 Thread Mario Kleiner


On 12/15/2014 06:46 AM, Keith Packard wrote:

Mario Kleiner  writes:


Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Hrm. I'd love for this to be controlled by the GLX_EXT_swap_control_tear
extension, but that one uses negative interval values to indicate
tearing, instead of providing a new API, and we can't tell the
difference between 0 and -0.

Are you sure you don't want GLX_EXT_swap_control_tear and an interval of
-1 instead of an interval of 0 with tearing?



Yes. GLX_EXT_swap_control_tear It's a different use case. Useful for 
games which want to avoid tearing if possible, but don't want to get in 
a "tremor" of switching frequently between say 60 fps and 30 fps if they 
only almost manage to do 60 fps, but not with reliable headroom. Would 
be also nice to have support for that, but only as an addition for 
swapinterval -1, not a replacement.


The 0 case is good for benchmarking.

In my specific case i always want vsync'ed swap for actual visual 
stimulation in neuroscience/medical settings, with no frame skipped 
ever. The bonus use for me, except for benchmarking how fast the system 
can go, is if one has a multi-display setup, e.g., dual-display for 
stereoscopic stimulation - one display per eye, or some CAVE like setup 
for VR with more than 2 displays. You want display updates and scanout 
on all of them synchronized, so the scene stays coherent. One simple way 
for visually testing multi-display sync is to intentionally swap all of 
them without vsync, e.g., timed to swap in the middle of the scanout. If 
the tear-lines on all displays are roughly at the same vertical position 
and stay there then that's a good visual test if stuff works. There are 
other ways to do it, but this is the one method that seems to work 
cross-platform, without lots of mental context switching depending on 
what os/gpu/server/driver combo with what settings one uses, and much 
more easy to grasp for scientists with no graphics background. You can 
see at a glance if stuff is roughly correct or not.


-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner


On 12/16/2014 09:23 AM, Keith Packard wrote:

Mario Kleiner  writes:


The 0 case is good for benchmarking.

Sure, but the current code does benchmarking just fine. In fact, because
it doesn't copy queued frames that aren't the most recent before the
vblank, benchmarks tend to run *faster* as a result, and people
generally like that aspect of it...



Hmm. For benchmarking i think i'd consider that a mild form of cheating. 
You get higher fps because you skip processing like the whole gpu blit 
overhead and host processing overhead for queuing / validating / 
processing the copy command in the command stream, so the benchmark 
numbers don't translate very well anymore in how the system would behave 
in a non-benchmark situation?


... but read on below ...


In my specific case i always want vsync'ed swap for actual visual
stimulation in neuroscience/medical settings, with no frame skipped
ever. The bonus use for me, except for benchmarking how fast the system
can go, is if one has a multi-display setup, e.g., dual-display for
stereoscopic stimulation - one display per eye, or some CAVE like setup
for VR with more than 2 displays. You want display updates and scanout
on all of them synchronized, so the scene stays coherent. One simple way
for visually testing multi-display sync is to intentionally swap all of
them without vsync, e.g., timed to swap in the middle of the scanout. If
the tear-lines on all displays are roughly at the same vertical position
and stay there then that's a good visual test if stuff works. There are
other ways to do it, but this is the one method that seems to work
cross-platform, without lots of mental context switching depending on
what os/gpu/server/driver combo with what settings one uses, and much
more easy to grasp for scientists with no graphics background. You can
see at a glance if stuff is roughly correct or not.

It seems like you want something that the GL API doesn't express
precisely; my reading of the  GL spec definitely lets Present work the
way it does today, and as you avoid tearing *and* improve performance in
the vblank_mode=0 case, I'm very reluctant to change it.


From GLX_EXT_swap_control and MESA_swap_control:

"If  is set to a value of 0, buffer swaps are not
synchronized to a video frame."


It depends on how you interpret the "not synchronized to a video frame"? 
Can you explain your interpretation?


I don't think the spec says anywhere that dropping old "not most recent 
at vblank time" frames is allowed, like Present does atm.? And the 
current Present implementation does synchronize the "buffer swap" of its 
most recent received Pixmap to the [onset of] a video frame, so while 
preventing tearing it goes a bit slower than it could go without vsync.


Every past/other implementation than DRI3/Present that i have experience 
with interpreted the spec the way that patch tries to restore, at least 
everything i know of from OSX/Windows/Linux proprietary/DRI2, so my 
interpretation is certainly a valid interpretation, and it is the one 
that provides consistency and therefore the least surprise to 
implementers of GL clients and end users.



Present could trivially offer a new bit to force tearing; I'm not sure
how you'd get at that from GL though.



It does already with PresentOptionAsync? It just needs to be used in 
accordance with the mainstream interpretation of the _swap_control spec, 
like this patch suggests.


I'm not trying to claim here that the current behaviour of Mesa+Present 
isn't useful for some types of applications like games. I'm just saying 
it shouldn't be the default behaviour for swapinterval 0 or > 0. As far 
as i understand the meaning, intention and origin of the 
EXT_swap_control_tear extension, the current Present implementation 
would implement a useful approximation of EXT_swap_control_tear for a 
swapinterval of < 0. Not an exact implementation, but at least following 
the spirit of that extension.


So i'm arguing for restoring the default behaviour any other 
implementation has with that patch, but providing the current behaviour 
via sync_control_tear? Or maybe even some new sync_control_tear2 to 
cover the difference between the current method and sync_control_tear.


When we are at the topic, i can also send you my christmas wish list 
with proposals for future mesa/server releases:


1 - Another thing i'd love to have, which would require a new option 
"PresentOptionDontSkip" is the ability to not skip present requests 
which are late. That would allow to take advantage of mesas triple/quad 
buffering to queue frames for animations ahead of time for playing 
animations or videos and be still certain that every queued frame was 
shown at least for one video refresh cycle. I'd love to take advantage 
of the new triple-buffering behaviour, or maybe even use Pre

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner


On 12/17/2014 04:17 AM, Keith Packard wrote:

Mario Kleiner  writes:


Hmm. For benchmarking i think i'd consider that a mild form of cheating.
You get higher fps because you skip processing like the whole gpu blit
overhead and host processing overhead for queuing / validating /
processing the copy command in the command stream, so the benchmark
numbers don't translate very well anymore in how the system would behave
in a non-benchmark situation?

It's still a very useful mode -- imagine wanting the lowest possible
latency between the user and the display; normally, you process input
and generate a frame just after vblank, then (if the rendering is really
quick), end up waiting most of the frame time not doing anything before
finally updating it at the next vblank.

With vblank_mode=0 and DRI3, you have the ability to try and generate
another frame before vblank comes; if you manage, then you get that data
on the screen, rather than the older version.

So, it offers latency closer to the tearing vblank_mode=0 but without
any tearing.

That's why I did it; the fact that it offers a small performance benefit
for benchmarks is an unintentional bonus feature.



It is a useful mode for some applications, no disagreement here. I can 
think of use cases where i would exactly want to have that, e.g., VR 
stuff with head-tracking -> rendering -> Occulus Rift style display. I 
actually want to tinker with something like that early next year and see 
if i can make use of it.


It's just that i need access to both, the old behaviour i described, and 
the new "drop frame" behaviour, and i need a way to select what i want 
at runtime via api without the need for easily overwhelmed and confused 
users to change config files or environment variables. I also always 
need meaningful and trustworthy feedback, at least for page-flipped 
presents, about what really happened for a presented frame - was it 
flipped, copied, exchanged, skipped, or did some error happen?


That's why i'd like to have an extension to INTEL_swap_events to also 
report some new completion type "skipped" and "error" and that one patch 
5/5 of mine for mesa reviewed and included, to make sure the swap_events 
don't fall apart so easily.


That's why i think we could (ab)use the swap_control_tear extension to 
implement this behaviour:


swapinterval > 0 - sync to vblank, swap only at most every 
swapinterval'th refresh.
swapinterval = 0 - Don't sync to vblank (PresentOptionAsync), swap 
immediately.


swapinterval < 0 - Do what your current implementation does for 
swapinterval = abs(swapinterval);


The first two would restore consistent behaviour with all other 
implementations.


The last one would implement something at least close to the spirit of 
of swap_control_tear. Although it would be even better to think of some 
extension to the extension so one can select if one really wants 
"tearing if late" as swap_control_tear defines, to squeeze out a few 
more msecs, or rather your tear free "skip old frames" method.


Essentially i'd love to have more control over how this stuff is done, 
and a default behaviour close to what all other implementations do, to 
reduce confusion.


As some kind of stop gap measure one could also think about defining a 
new vblank_mode to enable the new behaviour instead of the old one.


-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-16 Thread Mario Kleiner


On 12/17/2014 05:49 AM, Keith Packard wrote:

Mario Kleiner  writes:


It's just that i need access to both, the old behaviour i described, and
the new "drop frame" behaviour, and i need a way to select what i want
at runtime via api without the need for easily overwhelmed and confused
users to change config files or environment variables. I also always
need meaningful and trustworthy feedback, at least for page-flipped
presents, about what really happened for a presented frame - was it
flipped, copied, exchanged, skipped, or did some error happen?

Present reports precisely what it did with each frame; flipped, copied,
or skipped.


That's why i'd like to have an extension to INTEL_swap_events to also
report some new completion type "skipped" and "error" and that one patch
5/5 of mine for mesa reviewed and included, to make sure the swap_events
don't fall apart so easily.

You can use Present events on the target drawable; they're generated to
whoever requests them, so you don't need to rely on the intel swap
events alone.


Never thought about that. Could you show me some short example snippet 
of XLib/GLX code how i reliably detect at runtime if Present is present, 
and then enable this? That would probably do for the moment and at the 
same time solve the problem that i don't know how to reliably detect at 
runtime if i'm on DRI2 or DRI3/Present. Making good use of this will 
require separate code-path and a way to select the right one.


It's still important to fix that wraparound handling bug from my patch 
5/5 for INTEL_swap_events.



As some kind of stop gap measure one could also think about defining a
new vblank_mode to enable the new behaviour instead of the old one.

I really don't have a good suggestion here, given that we have such a
limited API available.



I thought i just made a suggestion how we could wiggle through it within 
existing api?

And we can define new one for future extensions?

Need to sleep now, thanks
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-12-17 Thread Mario Kleiner


On 12/17/2014 12:45 PM, Eero Tamminen wrote:

Hi,

On 12/16/2014 08:30 PM, Mario Kleiner wrote:

On 12/16/2014 09:23 AM, Keith Packard wrote:

Mario Kleiner  writes:


The 0 case is good for benchmarking.
Sure, but the current code does benchmarking just fine. In fact, 
because

it doesn't copy queued frames that aren't the most recent before the
vblank, benchmarks tend to run *faster* as a result, and people
generally like that aspect of it...



Hmm. For benchmarking i think i'd consider that a mild form of cheating.
You get higher fps because you skip processing like the whole gpu blit
overhead and host processing overhead for queuing / validating /
processing the copy command in the command stream, so the benchmark
numbers don't translate very well anymore in how the system would behave
in a non-benchmark situation?mesa-dev


From performance numbers on Windows it's clear that Windows doesn't
copy frames that happen faster than monitor update frequency.



Under desktop composition, yes. As far as i observed (and i think also 
read somewhere), their compositor wakes up once at the beginning of a 
refresh cycle and composites everything that needs composition and was 
submitted in the previous frame, then pageflips at the next vsync, so 
you'll always have at least 1 frame lag, while at the same time skipping 
frames if the client renders too fast. It's somewhat different for 
unredirected windows, but the rules of when a window gets unredirected 
are somewhat weird and also inconsistently (buggy) implemented on many 
drivers in my experience. OSX ditto. One reason why many of my timing 
sensitive users are fleeing to Linux...



This isn't cheating.  It makes numbers more relevant as it minimizes
Windowing system impact/distortion on application rendering performance.

The problem with Windowing system doing extra copies is that it doesn't
affect all applications equally.  If application is memory bandwidth
limited, the windowing system impact is directly proportional to FPS.
If application isn't memory bandwidth limited, it has no impact. And
whether application is memory bandwidth limited, is HW dependent, which
distorts also HW comparisons.


I can agree that benchmarking is hard and it depends on what you want to 
measure ;-) - so many benchmarks to pick from...


But my main concern is not the benchmarking, but the other use cases i 
mentioned for non-vsynced operation, and some consistency between 
implementations. It's difficult to explain to "normal" users why they 
should follow totally different procedures depending on 
XOrg-Version/Mesa-Version/Type and version of ddx/Which kernel they 
use/What distro and distro version they use/If its DRI2 or DRI3/what 
acceleration api is in use/whatever. Especially when i myself as someone 
who can read source code and does a lot of testing have huge trouble 
remembering all the special cases.


I just like api's to behave somewhat predictable over time and rather 
have more extra api for fine grained control and introducing new 
functionality than less api, so that i can make decisions about what 
tradeoffs to choose automatically in my app, instead of exposing tons of 
configuration howtos to my users.


-mario





- Eero


... but read on below ...


In my specific case i always want vsync'ed swap for actual visual
stimulation in neuroscience/medical settings, with no frame skipped
ever. The bonus use for me, except for benchmarking how fast the 
system

can go, is if one has a multi-display setup, e.g., dual-display for
stereoscopic stimulation - one display per eye, or some CAVE like 
setup

for VR with more than 2 displays. You want display updates and scanout
on all of them synchronized, so the scene stays coherent. One 
simple way
for visually testing multi-display sync is to intentionally swap 
all of
them without vsync, e.g., timed to swap in the middle of the 
scanout. If
the tear-lines on all displays are roughly at the same vertical 
position
and stay there then that's a good visual test if stuff works. There 
are

other ways to do it, but this is the one method that seems to work
cross-platform, without lots of mental context switching depending on
what os/gpu/server/driver combo with what settings one uses, and much
more easy to grasp for scientists with no graphics background. You can
see at a glance if stuff is roughly correct or not.

It seems like you want something that the GL API doesn't express
precisely; my reading of the  GL spec definitely lets Present work the
way it does today, and as you avoid tearing *and* improve 
performance in

the vblank_mode=0 case, I'm very reluctant to change it.


 From GLX_EXT_swap_control and MESA_swap_control:

"If  is set to a value of 0, buffer swaps are not
 synchronized to a video frame."


It depends on how you interpret the "not synchronized to a video frame"?
Can

[Mesa-dev] Fwd: Review for last remaining Mesa wayland-drm depth 30 bits?

2019-01-29 Thread Mario Kleiner

Hi,

could i get some review of the last two missing patches of mine for
depth 30 support in Mesa's egl/wayland wl-drm backend? They are over
six months old now, well-tested at time of original submission:

https://patchwork.freedesktop.org/project/mesa/list/?submitter=14956

Would be good to get this into Mesa 19 before new color formats are
added. Should be useful for new formats as well.

Thanks,
-mario
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Fwd: Review for last remaining Mesa wayland-drm depth 30 bits?

2019-01-29 Thread Mario Kleiner

Thanks for the quick review!

Patchwork didn't pick the r-b's up though. Do i need to do anything /
ping somebody to get those merged to master ideally before the close
19.0 branchpoint and not overlooked?

thanks,
-mario

On Tue, Jan 29, 2019 at 5:13 PM Daniel Stone  wrote:
>
> Hi,
>
> On Tue, 29 Jan 2019 at 16:05, Adam Jackson  wrote:
> > On Tue, 2019-01-29 at 14:45 +0100, Mario Kleiner wrote:
> > > could i get some review of the last two missing patches of mine for
> > > depth 30 support in Mesa's egl/wayland wl-drm backend? They are over
> > > six months old now, well-tested at time of original submission:
> > >
> > > https://patchwork.freedesktop.org/project/mesa/list/?submitter=14956
> > >
> > > Would be good to get this into Mesa 19 before new color formats are
> > > added. Should be useful for new formats as well.
> >
> > 4 and 5 are:
> >
> > Reviewed-by: Adam Jackson 
>
> And they are also:
> Reviewed-by: Daniel Stone 
>
> Thanks for chasing this up!
>
> Cheers,
> Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] drirc: Add sddm-greeter to adaptive_sync blacklist.

2019-02-18 Thread Mario Kleiner

This is the sddm login screen.

Fixes: a9c36dbf9c56 ("drirc: Initial blacklist for adaptive sync")
Signed-off-by: Mario Kleiner 
Cc: 19.0 
---
 src/util/00-mesa-defaults.conf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/util/00-mesa-defaults.conf b/src/util/00-mesa-defaults.conf
index cb0e6e659e2..43fe95b8810 100644
--- a/src/util/00-mesa-defaults.conf
+++ b/src/util/00-mesa-defaults.conf
@@ -346,6 +346,9 @@ TODO: document the other workarounds.
 
 
 
+
+
+
 
 
 
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [xf86-video-nouveau] dri2: Enable BufferAge support

2015-05-08 Thread Mario Kleiner




On 01/19/2015 12:00 PM, Chris Wilson wrote:

For enable BufferAge support, we just have to be not using the
DRI2Buffer->flags field for any purpose (i.e. it is always expected to
be 0, as it is now) and to be sure to swap the flags field whenever we
exchange buffers. As nouveau does not exactly support TripleBuffer, we
don't have to worry about setting the copying the flags field when
injecting the third buffer.

Signed-off-by: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Mario Kleiner 
---
  src/nouveau_dri2.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/nouveau_dri2.c b/src/nouveau_dri2.c
index e3445b2..428ef92 100644
--- a/src/nouveau_dri2.c
+++ b/src/nouveau_dri2.c
@@ -711,6 +711,7 @@ nouveau_dri2_finish_swap(DrawablePtr draw, unsigned int 
frame,
}

SWAP(s->dst->name, s->src->name);
+   SWAP(s->dst->flags, s->src->flags);
SWAP(nouveau_pixmap(dst_pix)->bo, nouveau_pixmap(src_pix)->bo);

DamageRegionProcessPending(draw);
@@ -1003,6 +1004,12 @@ nouveau_dri2_init(ScreenPtr pScreen)
dri2.DestroyBuffer2 = nouveau_dri2_destroy_buffer2;
dri2.CopyRegion2 = nouveau_dri2_copy_region2;
  #endif
+
+#if DRI2INFOREC_VERSION >= 10
+   dri2.version = 10;
+   dri2.bufferAge = 1;
+#endif
+
return DRI2ScreenInit(pScreen, &dri2);
  }




Seems ok to me.

Reviewed-by: Mario Kleiner 

-mario
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-05 Thread Mario Kleiner

The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
   nouveau_screen is created for the shared device node of both
   screens, but the original fd of x-screen 0 is used as identifying
   key in the hash table, instead of the dup()ed fd of x-screen 0
   which is owned by the nouveau_screen. nouveau_screen's refcount
   is now 2.

3. Regular rendering happens by the client plugin, then the plugin
   gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
   nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
   code then closes the fd of x-screen 0, so now the fd which is
   used as key in the hash table is invalid. x-screen 1 gets
   destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
   the nouveau_screen gets destroyed, but removal of its entry
   in the hash table fails, because the invalid fd in the hash
   table no longer matches anything (fstat() on the fd is used
   for hashing and key comparison, but fstat() on an already closed
   fd fails and returns bogus results). x-screen 1 closes its fd.

   Now all fd's are closed, the nouveau_screen destroyed, but
   there is a dangling reference to the nouveau_screen in the
   hash table.

5. Some OpenGL client plugin gets loaded again and calls
   glXQueryVersion. Step 2 above repeats, but because a
   dangling reference with a matching fd is found in the winsys
   hash table, no new nouveau_screen is created this time. Instead
   the invalid pointer to the old nouveau_screen is recycled,
   which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.

Cc: "10.3 10.4 10.5 10.6" 

Signed-off-by: Mario Kleiner 
Cc: Ilia Mirkin 
---
 src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c 
b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
index 0635246..dbc3cae 100644
--- a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
+++ b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
@@ -120,7 +120,8 @@ nouveau_drm_screen_create(int fd)
if (!screen)
goto err;
 
-   util_hash_table_set(fd_tab, intptr_to_pointer(fd), screen);
+   /* Use dupfd in hash table, to avoid crashes in ZaphodHeads configs */
+   util_hash_table_set(fd_tab, intptr_to_pointer(dupfd), screen);
screen->refcount = 1;
pipe_mutex_unlock(nouveau_screen_mutex);
return &screen->base;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

2015-06-05 Thread Mario Kleiner


On 06/05/2015 03:50 PM, Ilia Mirkin wrote:

This scheme is copied from radeon, does it need a similar fix? I'm away
from computers for another week or so, will be able to look then.



For some reason, no. Testing on Radeon multi-x-screen ZaphodHeads never 
showed such problems over multiple years and many mesa versions. 
Apparently it does something slightly different, although i saw the 
hashing logic with fstat() is the same. Maybe correct by slightly 
different design, or maybe we just get lucky because the same fd's 
aren't recycled on successive glXQueryVersion() calls?


thanks,
-mario



On Jun 5, 2015 4:37 PM, "Mario Kleiner" mailto:mario.kleiner...@gmail.com>> wrote:

The dup'ed fd owned by the nouveau_screen for a device node
must also be used as key for the winsys hash table, instead
of using the original fd passed in for a screen, to make
multi-x-screen ZaphodHeads configurations work on nouveau.

This prevents the following crash scenario that was observed
when a dynamically loaded rendering plugin used OpenGL on a
ZaphodHeads setup, e.g., on a dual x-screen setup. At first
load the plugin worked, but after unloading and reloading it,
the next rendering operation crashed:

1. Client, e.g., a plugin, calls glXQueryVersion.

2. DRI screens for the x-screens 0 and 1 are created, one shared
nouveau_screen is created for the shared device node of both
screens, but the original fd of x-screen 0 is used as identifying
key in the hash table, instead of the dup()ed fd of x-screen 0
which is owned by the nouveau_screen. nouveau_screen's refcount
is now 2.

3. Regular rendering happens by the client plugin, then the plugin
gets unloaded.

4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
code then closes the fd of x-screen 0, so now the fd which is
used as key in the hash table is invalid. x-screen 1 gets
destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
the nouveau_screen gets destroyed, but removal of its entry
in the hash table fails, because the invalid fd in the hash
table no longer matches anything (fstat() on the fd is used
for hashing and key comparison, but fstat() on an already closed
fd fails and returns bogus results). x-screen 1 closes its fd.

Now all fd's are closed, the nouveau_screen destroyed, but
there is a dangling reference to the nouveau_screen in the
hash table.

5. Some OpenGL client plugin gets loaded again and calls
glXQueryVersion. Step 2 above repeats, but because a
dangling reference with a matching fd is found in the winsys
hash table, no new nouveau_screen is created this time. Instead
the invalid pointer to the old nouveau_screen is recycled,
which points to nirvana -> Crash.

This problem is avoided by use of the dup()ed fd which is
owned by the nouveau_screen and has the same lifetime as
the nouveau_screen itself.

Cc: "10.3 10.4 10.5 10.6" mailto:mesa-sta...@lists.freedesktop.org>>

Signed-off-by: Mario Kleiner mailto:mario.kleiner...@gmail.com>>
Cc: Ilia Mirkin mailto:imir...@alum.mit.edu>>
---
  src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
index 0635246..dbc3cae 100644
--- a/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
+++ b/src/gallium/winsys/nouveau/drm/nouveau_drm_winsys.c
@@ -120,7 +120,8 @@ nouveau_drm_screen_create(int fd)
 if (!screen)
 goto err;

-   util_hash_table_set(fd_tab, intptr_to_pointer(fd), screen);
+   /* Use dupfd in hash table, to avoid crashes in ZaphodHeads
configs */
+   util_hash_table_set(fd_tab, intptr_to_pointer(dupfd), screen);
 screen->refcount = 1;
 pipe_mutex_unlock(nouveau_screen_mutex);
 return &screen->base;
--
2.1.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly.

2014-11-24 Thread Mario Kleiner

targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index a9ff73b..b4ac278 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -529,7 +529,8 @@ dri3_wait_for_sbc(__GLXDRIdrawable *pdraw, int64_t 
target_sbc, int64_t *ust,
 {
struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
 
-   while (priv->recv_sbc < target_sbc) {
+   while ((target_sbc != 0 && priv->recv_sbc < target_sbc) ||
+  (target_sbc == 0 && priv->recv_sbc < priv->send_sbc)) {
   if (!dri3_wait_for_event(pdraw))
  return 0;
}
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc

2014-11-24 Thread Mario Kleiner

Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successfull
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c  | 11 +++
 src/glx/dri3_priv.h |  5 -
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index b4ac278..5796491 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
 
  if (psc->show_fps_interval)
 show_fps(priv, ce->ust);
+
+ priv->ust = ce->ust;
+ priv->msc = ce->msc;
   } else {
  priv->recv_msc_serial = ce->serial;
+ priv->vblank_ust = ce->ust;
+ priv->vblank_msc = ce->msc;
   }
-  priv->ust = ce->ust;
-  priv->msc = ce->msc;
   break;
}
case XCB_PRESENT_EVENT_IDLE_NOTIFY: {
@@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   }
}
 
-   *ust = priv->ust;
-   *msc = priv->msc;
+   *ust = priv->vblank_ust;
+   *msc = priv->vblank_msc;
*sbc = priv->recv_sbc;
 
return 1;
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 8e46640..222deb0 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -182,9 +182,12 @@ struct dri3_drawable {
uint64_t send_sbc;
uint64_t recv_sbc;
 
-   /* Last received UST/MSC values */
+   /* Last received UST/MSC values for pixmap present complete */
uint64_t ust, msc;
 
+   /* Last received UST/MSC values for vblank */
+   uint64_t vblank_ust, vblank_msc;
+
/* Serial numbers for tracking wait_for_msc events */
uint32_t send_msc_serial;
uint32_t recv_msc_serial;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] DRI3/Present fixes for Mesa 10.3+

2014-11-24 Thread Mario Kleiner

Hi

Here three patches against mesa to fix use of the OML_sync_control
extension under DRI3/Present and restore behaviour compatible to
the DRI2 implementation, so applications like mine, which were written
and tested against DRI2, don't fail miserably under the new backend.

Tested on Intel HD Ironlake, Intel HD-4000 Ivybridge and nouveau with
NVidia GeForce 330M, on single and dual-display, also with special
timing test equipment to confirm proper behaviour wrt. swap scheduling
and swap completion timestamping.

Together with XOrg-1.16.2 + a set of patches with fixes for the X-Server
i sent out earlier, these patches make users of OML_sync_control and
INTEL_swap_events mostly work correctly under DRI3/Present, at least
with the sna and uxa backends on Intel-ddx and the new dri3/present exa
backend on nouveau-ddx.

Could these patches please also get included into a Mesa 10.3.x stable
update?

Thanks,
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-24 Thread Mario Kleiner

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
 src/glx/dri3_glx.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
 
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
   if (target_msc == 0)
  target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - 
priv->recv_sbc);
 
+  if (priv->swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
   back->busy = 1;
   back->last_swap = priv->send_sbc;
   xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
  None, /* target_crtc 
*/
  None,
  back->sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
  target_msc,
  divisor,
  remainder, 0, NULL);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-30 Thread Mario Kleiner


On 11/25/2014 07:52 AM, Axel Davy wrote:

Hi,



Sorry for the late reply, seems my e-mail filters hate me - i was 
already worried about the lack of replies.


This patch removes the tripple buffering behaviour that the GLX 
implementation

has with DRI3. I understand your concern for Medical softwares,
but perhaps this would be better handled with an user option.

Axel Davy



I don't think it does that?

For swapinterval > 0 all the triple-buffering etc. is left active as 
usual, and swaps are supposed to be vsynced and tear-free. swapinterval 
is 1 by default, so this is the case by default.


For a client or user selected swapinterval = 0, the code in Mesa 
schedules glXSwapBuffers calls to occur immediately, and the Present 
implementation in the server schedules them in a way so they are 
supposed to occur immediately/asap - assuming no vsync, otherwise the 
logic there wouldn't make much sense.


Due to what i'm pretty sure is a bug (see my other 2 patches on top of 
the xserver dri3/present implementation) the PresentOptionAsync isn't 
handled correctly by the server for pagelfips as it should be, and as 
result vsync is always on on intel-ddx, and always off on nouveau-ddx 
(or any future driver which exposes PresentCapabilityAsync as nouveau 
does), which causes massive tearing on nouveau all the time under 
dri3/present.


On any os/graphics-stack i'm familiar with (Windows and OSX all versions 
and drivers, Linux proprietary drivers, DRI2 with intel-ddx and 
nouveau-ddx) a swapinterval zero meant to swap immediately without 
vsync, so this just restores what probably any OpenGL client expects to 
happen. The MESA_swap_control spec even states that "If  is 
set to a value of 0, buffer swaps are not synchronized to a video frame."


For medical or neuro-science applications like mine, or multi-display 
stereo/VR applications it is crucial to have this for users being able 
to easily diagnose multi-display synchronization issues without the need 
to learn how to use git and compilers and how graphics stacks work. For 
gamers and gaming benchmarks it's the "ego-shooter mode" and "squeeze 
out max fps" they like so much.


Btw., digging deeper into the ddx'es i realized that my posted patch for 
the x-server bug is both incomplete and too complex. I've written a new 
patch which is only a simple one-liner but solves more problems than the 
old one. I'll send it out after giving it some testing.


thanks,
-mario


On 25/11/2014 04:00, Mario Kleiner wrote :

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
  src/glx/dri3_glx.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

 xcb_connection_t *c = XGetXCBConnection(dpy);
 struct dri3_buffer *back;
 int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
   unsigned flags = __DRI2_FLUSH_DRAWABLE;
 if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

if (target_msc == 0)
   target_msc = priv->msc + priv->swap_interval * 
(priv->send_sbc - priv->recv_sbc);

  +  if (priv->swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
back->busy = 1;
back->last_swap = priv->send_sbc;
xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

   None, /* target_crtc */
   None,
   back->sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
   target_msc,
   divisor,
   remainder, 0, NULL);




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-30 Thread Mario Kleiner


On 11/25/2014 09:31 AM, Frank Binns wrote:

Hi,

I sent exactly the same patch back in April. Despite getting a review-by
it was never merged. See:

http://lists.freedesktop.org/archives/mesa-dev/2014-April/058347.html
http://lists.freedesktop.org/archives/mesa-dev/2014-May/060432.html

Thanks
Frank



Indeed, exactly the same patch - minus white-space differences - already 
reviewed. So i guess i can add your additional signed-off-by and chris 
wilsons reviewed-by to this one?


I have a few more patches coming for this series, which fix two more 
mesa bugs related to dri3/present...


thanks,
-mario


On 25/11/14 03:00, Mario Kleiner wrote:

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" 
Signed-off-by: Mario Kleiner 
---
  src/glx/dri3_glx.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

 xcb_connection_t *c = XGetXCBConnection(dpy);
 struct dri3_buffer *back;
 int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
   unsigned flags = __DRI2_FLUSH_DRAWABLE;
 if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

if (target_msc == 0)
   target_msc = priv->msc + priv->swap_interval * 
(priv->send_sbc - priv->recv_sbc);

  +  if (priv->swap_interval == 0)
+  options |= XCB_PRESENT_OPTION_ASYNC;
+
back->busy = 1;
back->last_swap = priv->send_sbc;
xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, 
int64_t target_msc, int64_t divisor,

   None, /* target_crtc */
   None,
   back->sync_fence,
- XCB_PRESENT_OPTION_NONE,
+ options,
   target_msc,
   divisor,
   remainder, 0, NULL);




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glx: Handle out-of-sequence swap completion events correctly. (v2)

2015-03-12 Thread Mario Kleiner

The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n =  2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory contraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, correponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glXDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.

v2: Explain the reason for this patch and the new wraparound handling
much more extensive in commit message, no code change wrt. initial
version.

Cc: "10.3 10.4 10.5" 
Signed-off-by: Mario Kleiner 
---
 src/glx/glxext.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/glx/glxext.c b/src/glx/glxext.c
index 68c359e..fdc24d4 100644
--- a/src/glx/glxext.c
+++ b/src/glx/glxext.c
@@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
   aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
   aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
 
-  if (awire->sbc < glxDraw->lastEventSbc)
-glxDraw->eventSbcWrap +=

[Mesa-dev] Resend [PATCH] glx: Handle out-of-sequence swap completion events correctly.

2015-03-12 Thread Mario Kleiner

Hi all,

a respin of the bugfix for INTEL_swap_events + DRI3/Present.

The code in the patch itself is identical to the one queued
for Mesa 10.5.1, the one apparently nobody wants to review
- I hate wraparound handling too...

The only difference is a much longer commit message which
explains why we need this for DRI3/Present, and why this new
wraparound handling is supposed to work.

If nobody wants to review this, or can come up with a better
solution, i could also submit a patch that simply removes the
broken wraparound handling of current Mesa, which would still
be an improvement over what we have now, given that few OpenGL
clients will submit over 2^32 swapbuffers in a single session
for a single drawable, as that would require a runtime without
interruption of over 13 months at 120 fps redraw rate.

thanks,
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mapi: Make private copies of name strings provided by client.

2015-03-12 Thread Mario Kleiner

glXGetProcAddress("glFoo") ends up in stub_add_dynamic() to
create dynamic stubs for dynamic functions. stub_add_dynamic()
doesn't store the caller provided name string "Foo" in a mesa
private copy, but just stores a pointer to the "glFoo" string
passed to glXGetProcAddress - a pointer into arbitrary memory
outside mesa's control.

If the caller passes some dynamically allocated/changing
memory buffer to glXGetProcAddress(), or the caller gets unmapped
from memory, e.g., some dynamically loaded application
plugin which uses OpenGL, this ends badly - with a dangling
pointer.

strdup() the name string provided by the client to avoid
this problem.

Cc: "10.3 10.4 10.5" 
Signed-off-by: Mario Kleiner 
---
 src/mapi/stub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mapi/stub.c b/src/mapi/stub.c
index dfadbe1..45ccc6a 100644
--- a/src/mapi/stub.c
+++ b/src/mapi/stub.c
@@ -110,7 +110,7 @@ stub_add_dynamic(const char *name)
if (!stub->addr)
   return NULL;
 
-   stub->name = (const void *) name;
+   stub->name = (const void *) strdup(name);
/* to be fixed later */
stub->slot = -1;
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 02/15] i965/screen: Add basic support for rendering 10 bpc/depth 30 framebuffers. (v2)

2017-10-18 Thread Mario Kleiner


On 10/10/2017 12:51 PM, Tapani Pälli wrote:

On 10/06/2017 07:11 PM, Mario Kleiner wrote:

Expose formats which are supported at least back to Gen 5 Ironlake,
possibly further. Allow creation of 10 bpc winsys buffers for drawables.

glxinfo now lists new RGBA 10 10 10 2/0 formats.
Works correctly under DRI2 without compositing.

v2: Move the BGRA/BGRX1010102 formats before the RGBA/RGBX
 32 bit formats, as the code comments require. Thanks Emil!
 Update num_formats from 3 to 5, to keep the special Android
 handling intact.

Signed-off-by: Mario Kleiner 
---
  src/mesa/drivers/dri/i965/intel_screen.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c

index 712cd40..9c0539d 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1567,7 +1567,13 @@ intelCreateBuffer(__DRIscreen *dri_screen,
    fb->Visual.samples = num_samples;
 }
-   if (mesaVis->redBits == 5) {
+   if (mesaVis->redBits == 10 && mesaVis->alphaBits > 0) {
+  rgbFormat = mesaVis->redMask == 0x3ff0 ? 
MESA_FORMAT_B10G10R10A2_UNORM
+ : 
MESA_FORMAT_R10G10B10A2_UNORM;

+   } else if (mesaVis->redBits == 10) {
+  rgbFormat = mesaVis->redMask == 0x3ff0 ? 
MESA_FORMAT_B10G10R10X2_UNORM
+ : 
MESA_FORMAT_R10G10B10X2_UNORM;

+   } else if (mesaVis->redBits == 5) {
    rgbFormat = mesaVis->redMask == 0x1f ? MESA_FORMAT_R5G6B5_UNORM
 : MESA_FORMAT_B5G6R5_UNORM;
 } else if (mesaVis->sRGBCapable) {
@@ -1949,6 +1955,10 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
    MESA_FORMAT_B8G8R8A8_UNORM,
    MESA_FORMAT_B8G8R8X8_UNORM,
+  /* For 10 bpc, 30 bit depth framebuffers. */
+  MESA_FORMAT_B10G10R10A2_UNORM,
+  MESA_FORMAT_B10G10R10X2_UNORM,
+
    /* The 32-bit RGBA format must not precede the 32-bit BGRA 
format.
 * Likewise for RGBX and BGRX.  Otherwise, the GLX client and 
the GLX
 * server may disagree on which format the GLXFBConfig 
represents,

@@ -1988,7 +1998,7 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
 if (intel_loader_get_cap(dri_screen, DRI_LOADER_CAP_RGBA_ORDERING))
    num_formats = ARRAY_SIZE(formats);
 else
-  num_formats = 3;
+  num_formats = 5;


How about following:
num_formats = ARRAY_SIZE(formats) - 2; /* all - RGBA_ORDERING formats */

(I'm suggesting it here https://patchwork.freedesktop.org/patch/177410/)

Not a big deal though, with or without this patch looks correct.

Reviewed-by: Tapani Pälli 

// Tapani


Good idea. I've changed that patch accordingly. Thanks for the review.
-mario

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] 10-bit Mesa/Gallium support

2017-11-23 Thread Mario Kleiner


On 11/23/2017 06:45 PM, Ilia Mirkin wrote:

On Thu, Nov 23, 2017 at 12:35 PM, Marek Olšák  wrote:

Hi everybody,

Mario, feel free to push your patches if you haven't yet. (except the
workaround)



Hi,

just started 10 minutes ago with rebasing my current patchset against 
mesa master. Will need some adjustments and retesting against i965.


I was also just "sort of done" with a mesa/gallium 10 bit version. I 
think i'll submit rev 3 later today or tomorrow and maybe we'll need to 
sort this out then, what goes where. I'll compare with Mareks branch...


The current state of my series for AMD here is that radeon-kms + ati-ddx 
works nicely under exa (and with a slightly patched weston), but the 
ati-ddx also needed some small patches which i have to send out. On 
amdgpu-kms i know it works under my patched weston branch.


What is completely missing is glamor support, ergo support for at least 
amdgpu-ddx and modesetting-ddx -- and xwayland.



For AMD, I applied Mario's patches (except Wayland - that didn't
apply) and added initial Gallium support:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=10bit

What's the status of Glamor?

Do we have patches for xf86-video-amdgpu? The closed should have
10-bit support, meaning we should have DDX patches already somewhere,
right?


Somewhere there must be some, as the amdgpu-pro driver with the 
proprietary libGL supported depth 30 at least in some version i tested 
earlier this year?




I'd like to test this out with nouveau as well... do I understand
correctly that I shouldn't need anything special to check if it
basically works? i.e. I apply the patches, start Xorg in bpp=30 mode,
and then if glxgears works then I'm done? Is there a good way that I'm
really in 30bpp mode as far as all the software is concerned? (I don't
have a colorimeter or whatever fancy hw to *really* tell the
difference, although I do have a "deep color" TV.) If used with a
24bpp display, is the hw supposed to dither somehow?x

   -ilia



nouveau is quite a bit work to do and not so clear how to proceed.

My current series does do proper xrgb2101010 / argb2101010 rendering 
under gallium on both nv50 and nvc0 (Tested under GeForce 9600 for 
tesla, GTX 970 and 1050 for maxwell and pascal). I used PRIME render 
offload under both DRI3/Present and Wayland/Weston with both intel and 
amd as display gpus, so i know the drivers work together properly and 
nouveau-gallium renders correctly.


The display side for native scanout on Nvidia is somewhat broken atm.:

1. Since Linux 4.10 with the switch of nouveau-kms to atomic 
modesetting, using drmAddFB() with depth/bpp 30/32 maps to xrgb2101010 
format, but nouveau-kms doesn't support xrgb2101010, so setting Xorg to 
depth 30 will end in a server-abort with modesetting failure. nouveau 
before Linux 4.10 mapped 30/32 to xbgr2101010 which seems to be 
supported since nv50. If i boot with a < 4.10 kernel i get a picture at 
least on the old GeForce 9600 and GT330M.


If i hack nouveau-ddx to use a xrgb2101010 color channel mask (red in 
msb's, blue in lsbs) instead of the correct xbgr2101010 mask, then i can 
get nouveau-gallium to render 10 bits, but of course with swapped red 
and blue channels. Switching dithering on via xrandr allows to get 
rendered 10 bit images to get to a 8 bpc display, as confirmed via 
colorimeter. I hope a deep color TV might work without dithering.


According to

https://github.com/envytools/envytools/blob/master/rnndb/display/nv_evo.xml

gpu's since kepler gk104 support xrgb2101010 scanout. With a hacked 
nouveau-kms i can get the maxwell and pascal cards to accept 
xrgb2101010, but the display is beyond weird. So far i couldn't make 
much sense of the pixeltrash -- some of it remotely resembles a desktop, 
but something is going wrong badly. Also the xbgr2101010 mode doesn't 
work correct. The same is true for Wayland+Weston and even if i run 
Weston with pixman, keeping Mesa out of the picture. So nouveau-kms 
needs some work for all modern nvidia gpu's. Gamma table handling 
changed quite a bit, so maybe something is wrong there.


2. We might also need some work for exa on nvc0+, but it's not clear 
what problems are caused by kernel side, and what in exa.


3. In principle the clean solution for nouveau would be to upgrade the 
ddx to drmAddFB2 ioctl, and use xbgr2101010 scanout to support 
everything back to nv50+, but everything we have in X or Wayland is 
meant for xrgb2101010 not xbgr2101010. And we run into ambiguities of 
what, e.g., a depth 30 pixmap means in some extensions like 
glx_texture_form_pixmap. And the GLX extension generally seems to have 
unresolved problems with ARGB formats instead of ABGR formats, which is 
why Mesa doesn't expose ARGB by default -- only on Android atm.


So on nouveau everything except the gallium bits is quite a bit messy at 
the moment, but the gallium bits work according to my testing.


-mario
___
mesa-dev mailing list
mes

Re: [Mesa-dev] 10-bit Mesa/Gallium support

2017-11-23 Thread Mario Kleiner


On 11/23/2017 07:44 PM, Ilia Mirkin wrote:

On Thu, Nov 23, 2017 at 1:31 PM, Mario Kleiner
 wrote:

On 11/23/2017 06:45 PM, Ilia Mirkin wrote:


On Thu, Nov 23, 2017 at 12:35 PM, Marek Olšák  wrote:


Hi everybody,

Mario, feel free to push your patches if you haven't yet. (except the
workaround)



Hi,

just started 10 minutes ago with rebasing my current patchset against mesa
master. Will need some adjustments and retesting against i965.

I was also just "sort of done" with a mesa/gallium 10 bit version. I think
i'll submit rev 3 later today or tomorrow and maybe we'll need to sort this
out then, what goes where. I'll compare with Mareks branch...

The current state of my series for AMD here is that radeon-kms + ati-ddx
works nicely under exa (and with a slightly patched weston), but the ati-ddx
also needed some small patches which i have to send out. On amdgpu-kms i
know it works under my patched weston branch.

What is completely missing is glamor support, ergo support for at least
amdgpu-ddx and modesetting-ddx -- and xwayland.


For AMD, I applied Mario's patches (except Wayland - that didn't
apply) and added initial Gallium support:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=10bit

What's the status of Glamor?

Do we have patches for xf86-video-amdgpu? The closed should have
10-bit support, meaning we should have DDX patches already somewhere,
right?



Somewhere there must be some, as the amdgpu-pro driver with the proprietary
libGL supported depth 30 at least in some version i tested earlier this
year?



I'd like to test this out with nouveau as well... do I understand
correctly that I shouldn't need anything special to check if it
basically works? i.e. I apply the patches, start Xorg in bpp=30 mode,
and then if glxgears works then I'm done? Is there a good way that I'm
really in 30bpp mode as far as all the software is concerned? (I don't
have a colorimeter or whatever fancy hw to *really* tell the
difference, although I do have a "deep color" TV.) If used with a
24bpp display, is the hw supposed to dither somehow?x

-ilia



nouveau is quite a bit work to do and not so clear how to proceed.

My current series does do proper xrgb2101010 / argb2101010 rendering under
gallium on both nv50 and nvc0 (Tested under GeForce 9600 for tesla, GTX 970
and 1050 for maxwell and pascal). I used PRIME render offload under both
DRI3/Present and Wayland/Weston with both intel and amd as display gpus, so
i know the drivers work together properly and nouveau-gallium renders
correctly.

The display side for native scanout on Nvidia is somewhat broken atm.:

1. Since Linux 4.10 with the switch of nouveau-kms to atomic modesetting,
using drmAddFB() with depth/bpp 30/32 maps to xrgb2101010 format, but
nouveau-kms doesn't support xrgb2101010, so setting Xorg to depth 30 will
end in a server-abort with modesetting failure. nouveau before Linux 4.10
mapped 30/32 to xbgr2101010 which seems to be supported since nv50. If i
boot with a < 4.10 kernel i get a picture at least on the old GeForce 9600
and GT330M.

If i hack nouveau-ddx to use a xrgb2101010 color channel mask (red in msb's,
blue in lsbs) instead of the correct xbgr2101010 mask, then i can get
nouveau-gallium to render 10 bits, but of course with swapped red and blue
channels. Switching dithering on via xrandr allows to get rendered 10 bit
images to get to a 8 bpc display, as confirmed via colorimeter. I hope a
deep color TV might work without dithering.

According to

https://github.com/envytools/envytools/blob/master/rnndb/display/nv_evo.xml

gpu's since kepler gk104 support xrgb2101010 scanout. With a hacked
nouveau-kms i can get the maxwell and pascal cards to accept xrgb2101010,
but the display is beyond weird. So far i couldn't make much sense of the
pixeltrash -- some of it remotely resembles a desktop, but something is
going wrong badly. Also the xbgr2101010 mode doesn't work correct. The same
is true for Wayland+Weston and even if i run Weston with pixman, keeping
Mesa out of the picture. So nouveau-kms needs some work for all modern
nvidia gpu's. Gamma table handling changed quite a bit, so maybe something
is wrong there.

2. We might also need some work for exa on nvc0+, but it's not clear what
problems are caused by kernel side, and what in exa.

3. In principle the clean solution for nouveau would be to upgrade the ddx
to drmAddFB2 ioctl, and use xbgr2101010 scanout to support everything back
to nv50+, but everything we have in X or Wayland is meant for xrgb2101010
not xbgr2101010. And we run into ambiguities of what, e.g., a depth 30
pixmap means in some extensions like glx_texture_form_pixmap. And the GLX
extension generally seems to have unresolved problems with ARGB formats
instead of ABGR formats, which is why Mesa doesn't expose ARGB by default --
only on Android atm.


Mistyped, i actually meant BGRA/BGRX is

1 2 3 >

1 - 100 of 203 matches

Mail list logo