date:20210427

On Thu, Apr 22, 2021 at 07:02:43PM -0700, Joseph Kogut wrote:
> Remove usage of legacy dma-api abstraction in preparation for removal
> 
> Signed-off-by: Joseph Kogut 
> ---
> Checkpatch warns here that r128 is marked obsolete, and asks for no
> unnecessary modifications.
> 
> This series aims to address the FIXME in drivers/gpu/drm/drm_pci.c
> explaining that drm_pci_alloc/free is a needless abstraction of the
> dma-api, and it should be removed. Unfortunately, doing this requires
> removing the usage from an obsolete driver as well.
> 
> If this patch is rejected for modifying an obsolete driver, would it be
> appropriate to follow up removing the FIXME from drm_pci?

Feels like a good enough reason, both patches queued up in drm-misc-next
for 5.14. Thanks a lot for doing them.
-Daniel
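
For readers following the conversion, the pattern the patch open-codes
(allocate a small handle, then the coherent buffer, and undo both on
teardown) can be sketched in userspace. This is only a model under
stated assumptions: struct mock_dma_handle, mock_alloc_coherent(),
mock_handle_alloc() and mock_handle_free() are hypothetical stand-ins
built on malloc/calloc, not the kernel's drm_dma_handle_t or
dma_alloc_coherent().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a DMA handle (not the kernel type). */
struct mock_dma_handle {
	size_t size;       /* size of the buffer in bytes */
	void *vaddr;       /* CPU address of the buffer */
	uintptr_t busaddr; /* fake "bus" address, here just the pointer value */
};

/* Stand-in for a coherent allocator: zeroed buffer plus a fake bus address. */
static void *mock_alloc_coherent(size_t size, uintptr_t *busaddr)
{
	void *vaddr = calloc(1, size);

	if (vaddr)
		*busaddr = (uintptr_t)vaddr;
	return vaddr;
}

/* Two-step allocation: handle first, then buffer; undo the first on failure. */
static struct mock_dma_handle *mock_handle_alloc(size_t size)
{
	struct mock_dma_handle *dmah = malloc(sizeof(*dmah));

	if (!dmah)
		return NULL;

	dmah->size = size;
	dmah->vaddr = mock_alloc_coherent(size, &dmah->busaddr);
	if (!dmah->vaddr) {
		free(dmah);
		return NULL;
	}
	return dmah;
}

/* Teardown mirrors allocation: release the buffer, then the handle itself. */
static void mock_handle_free(struct mock_dma_handle *dmah)
{
	if (!dmah)
		return;
	free(dmah->vaddr);
	free(dmah);
}
```

The point of the sketch is only the pairing: every path that releases
the buffer also has to release the handle that described it.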

> 
>  drivers/gpu/drm/drm_bufs.c | 19 ---
>  drivers/gpu/drm/drm_dma.c  |  8 +++-
>  drivers/gpu/drm/r128/ati_pcigart.c | 22 ++
>  3 files changed, 41 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
> index e3d77dfefb0a..94bc1f6049c9 100644
> --- a/drivers/gpu/drm/drm_bufs.c
> +++ b/drivers/gpu/drm/drm_bufs.c
> @@ -674,12 +674,17 @@ int drm_legacy_rmmap_ioctl(struct drm_device *dev, void 
> *data,
>  static void drm_cleanup_buf_error(struct drm_device *dev,
> struct drm_buf_entry *entry)
>  {
> + drm_dma_handle_t *dmah;
>   int i;
>  
>   if (entry->seg_count) {
>   for (i = 0; i < entry->seg_count; i++) {
>   if (entry->seglist[i]) {
> - drm_pci_free(dev, entry->seglist[i]);
> + dmah = entry->seglist[i];
> + dma_free_coherent(dev->dev,
> +   dmah->size,
> +   dmah->vaddr,
> +   dmah->busaddr);
>   }
>   }
>   kfree(entry->seglist);
> @@ -978,10 +983,18 @@ int drm_legacy_addbufs_pci(struct drm_device *dev,
>   page_count = 0;
>  
>   while (entry->buf_count < count) {
> + dmah = kmalloc(sizeof(drm_dma_handle_t), GFP_KERNEL);
> + if (!dmah)
> + return -ENOMEM;
>  
> - dmah = drm_pci_alloc(dev, PAGE_SIZE << page_order, 0x1000);
> + dmah->size = total;
> + dmah->vaddr = dma_alloc_coherent(dev->dev,
> +  dmah->size,
> +  &dmah->busaddr,
> +  GFP_KERNEL);
> + if (!dmah->vaddr) {
> + kfree(dmah);
>  
> - if (!dmah) {
>   /* Set count correctly so we free the proper amount. */
>   entry->buf_count = count;
>   entry->seg_count = count;
> diff --git a/drivers/gpu/drm/drm_dma.c b/drivers/gpu/drm/drm_dma.c
> index d07ba54ec945..eb6b741a6f99 100644
> --- a/drivers/gpu/drm/drm_dma.c
> +++ b/drivers/gpu/drm/drm_dma.c
> @@ -81,6 +81,7 @@ int drm_legacy_dma_setup(struct drm_device *dev)
>  void drm_legacy_dma_takedown(struct drm_device *dev)
>  {
>   struct drm_device_dma *dma = dev->dma;
> + drm_dma_handle_t *dmah;
>   int i, j;
>  
>   if (!drm_core_check_feature(dev, DRIVER_HAVE_DMA) ||
> @@ -100,7 +101,12 @@ void drm_legacy_dma_takedown(struct drm_device *dev)
> dma->bufs[i].seg_count);
>   for (j = 0; j < dma->bufs[i].seg_count; j++) {
>   if (dma->bufs[i].seglist[j]) {
> - drm_pci_free(dev, 
> dma->bufs[i].seglist[j]);
> + dmah = dma->bufs[i].seglist[j];
> + dma_free_coherent(dev->dev,
> +   dmah->size,
> +   dmah->vaddr,
> +   dmah->busaddr);
> + kfree(dmah);
>   }
>   }
>   kfree(dma->bufs[i].seglist);
> diff --git a/drivers/gpu/drm/r128/ati_pcigart.c 
> b/drivers/gpu/drm/r128/ati_pcigart.c
> index 1234ec60c0af..fbb0cfd79758 100644
> --- a/drivers/gpu/drm/r128/ati_pcigart.c
> +++ b/drivers/gpu/drm/r128/ati_pcigart.c
> @@ -45,18 +45,32 @@
>  static int drm_ati_alloc_pcigart_table(struct drm_device *dev,
>  struct drm_ati_pcigart_info *gart_info)
>  {
> - gart_info->table_handle = drm_pci_alloc(dev, gart_info->table_size,
> - PAGE_SIZE);
> - if (gart_info->table_handle == NULL)
> + drm_dma_handle_t *dmah = kmalloc(sizeof(drm_dma

Re: [PATCH v2 1/1] drm/doc: document drm_mode_get_plane

2021-04-27 Thread Pekka Paalanen

On Mon, 26 Apr 2021 14:30:53 -0300
Leandro Ribeiro  wrote:

> On 4/26/21 7:58 AM, Simon Ser wrote:
> > On Monday, April 26th, 2021 at 9:36 AM, Pekka Paalanen 
> >  wrote:
> >   
> >>>> This should probably explain what the bits in the mask correspond to.
> >>>> As in, which CRTC does bit 0 refer to, and so on.
> >>>
> >>> What about:
> >>>
> >>> "possible_crtcs: Bitmask of CRTCs compatible with the plane. CRTCs are
> >>> created and they receive an index, which corresponds to their position
> >>> in the bitmask. CRTC with index 0 will be in bit 0, and so on."
> >>
> >> This would still need to explain where I can find this index.
> >   
> 
> What do you mean?
> 
> > This closed merge request had some docs about possible CRTCs:
> > 
> > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/102
> >   
> I'm afraid I don't know exactly what you expect to be documented here
> that is still missing. Could you please elaborate?
> 
> Thanks a lot!

The documentation you add is talking about "CRTC index". What defines a
CRTC object's index? How do I determine what index a CRTC object has?

The answer is, AFAIK, that the index is never stored explicitly
anywhere. You have to get the DRM resources structure, which has an
array for CRTC IDs. The index is the index to that array, IIRC. So if
one does not already know this, it is going to be really hard to figure
out what the "index" is. It might even be confused with the object ID,
which it is not but the ID might by complete accident be less than 32
so it would look ok at first glance.
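
To make that concrete, here is a minimal sketch, assuming (as recalled
above, not verified here) that the index is simply the position of the
CRTC ID in the CRTC ID array of the DRM resources structure. Plain
arrays stand in for that structure, and the helper names
crtc_index_from_id() and plane_can_use_crtc() are made up for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the position of crtc_id in the CRTC ID
 * array (the "index"), or -1 if the ID is not in the array.
 */
static int crtc_index_from_id(const uint32_t *crtc_ids, int count,
			      uint32_t crtc_id)
{
	for (int i = 0; i < count; i++) {
		if (crtc_ids[i] == crtc_id)
			return i;
	}
	return -1;
}

/*
 * Hypothetical helper: test the bit for a CRTC index in a 32-bit
 * possible_crtcs mask. Bit 0 corresponds to index 0, and so on.
 */
static bool plane_can_use_crtc(uint32_t possible_crtcs, int crtc_index)
{
	if (crtc_index < 0 || crtc_index >= 32)
		return false;
	return (possible_crtcs >> crtc_index) & 1u;
}
```

With CRTC IDs { 41, 52, 63 }, the object with ID 52 has index 1, so it
is bit 1 of the mask that matters, not bit 52.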

If the index is already explained somewhere else, a reference to that
documentation would be enough.

Thanks,
pq

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 00/12] Remove vfio_mdev.c, mdev_parent_ops and more

2021-04-27 Thread Christian Borntraeger





On 26.04.21 19:42, Jason Gunthorpe wrote:
> On Mon, Apr 26, 2021 at 06:43:14PM +0200, Christian Borntraeger wrote:
>> On 24.04.21 01:02, Jason Gunthorpe wrote:
>>> Prologue
>>>
>>> This is series #3 in part of a larger work that arose from the minor
>>> remark that the mdev_parent_ops indirection shim is useless and
>>> complicates things.
>>>
>>> It applies on top of Alex's current tree and requires the prior two
>>> series.
>>
>> Do you have a tree somewhere?
>
> [..]
>>> A preview of the future series's is here:
>>> https://github.com/jgunthorpe/linux/pull/3/commits
>
> Has everything, you'll want to go to:
>    cover-letter: Remove vfio_mdev.c, mdev_parent_ops and more
>
> As there are additional WIPs in that tree.

I gave this a quick spin on s390x vfio-ap and it seems to work ok.
This is really just a quick test, but no obvious problem.

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> Thanks everybody. The initial proposal is dead. Here are some thoughts on
> how to do it differently.
> 
> I think we can have direct command submission from userspace via
> memory-mapped queues ("user queues") without changing window systems.
> 
> The memory management doesn't have to use GPU page faults like HMM.
> Instead, it can wait for user queues of a specific process to go idle and
> then unmap the queues, so that userspace can't submit anything. Buffer
> evictions, pinning, etc. can be executed when all queues are unmapped
> (suspended). Thus, no BO fences and page faults are needed.
> 
> Inter-process synchronization can use timeline semaphores. Userspace will
> query the wait and signal value for a shared buffer from the kernel. The
> kernel will keep a history of those queries to know which process is
> responsible for signalling which buffer. There is only the wait-timeout
> issue and how to identify the culprit. One of the solutions is to have the
> GPU send all GPU signal commands and all timed out wait commands via an
> interrupt to the kernel driver to monitor and validate userspace behavior.
> With that, it can be identified whether the culprit is the waiting process
> or the signalling process and which one. Invalid signal/wait parameters can
> also be detected. The kernel can force-signal only the semaphores that time
> out, and punish the processes which caused the timeout or used invalid
> signal/wait parameters.
> 
> The question is whether this synchronization solution is robust enough for
> dma_fence and whatever the kernel and window systems need.
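
The timeline-semaphore idea quoted above can be sketched as a counter
in shared memory that only moves forward. This is a minimal userspace
model using C11 atomics, not any driver's implementation: the names
struct timeline_sem, timeline_signal() and timeline_is_signaled() are
hypothetical, the ">=" wait condition is an assumption, and nothing
here models timeouts, GPU interrupts or kernel validation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeline semaphore: a monotonically increasing 64-bit value. */
struct timeline_sem {
	_Atomic uint64_t value;
};

/* Signal a point on the timeline. The value never moves backwards. */
static void timeline_signal(struct timeline_sem *sem, uint64_t point)
{
	uint64_t cur = atomic_load_explicit(&sem->value, memory_order_relaxed);

	/* Retry until we either publish 'point' or see a value already >= it. */
	while (cur < point &&
	       !atomic_compare_exchange_weak_explicit(&sem->value, &cur, point,
						      memory_order_release,
						      memory_order_relaxed))
		;
}

/* A wait on 'point' is satisfied once the timeline has reached it. */
static bool timeline_is_signaled(struct timeline_sem *sem, uint64_t point)
{
	return atomic_load_explicit(&sem->value, memory_order_acquire) >= point;
}
```

Nothing in this model forces the signaller to ever make progress, so a
waiter can wait indefinitely.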

The proper model here is the preempt-ctx dma_fence that amdkfd uses
(without page faults). That means dma_fence for synchronization is doa, at
least as-is, and we're back to figuring out the winsys problem.

"We'll solve it with timeouts" is very tempting, but doesn't work. It's
akin to saying that we're solving deadlock issues in a locking design by
doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
avoids having to reach the reset button, but that's about it.

And the fundamental problem is that once you throw in userspace command
submission (and syncing, at least within the userspace driver, otherwise
there's kinda no point if you still need the kernel for cross-engine sync),
you get deadlocks if you still use dma_fence for sync under perfectly
legit use-cases. We've discussed that one ad nauseam last summer:

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences

See silly diagram at the bottom.

Now I think all isn't lost, because imo the first step to getting to this
brave new world is rebuilding the driver on top of userspace fences, and
with the adjusted cmd submit model. You probably don't want to use amdkfd,
but port that as a context flag or similar to render nodes for gl/vk. Of
course that means you can only use this mode in headless, without
glx/wayland winsys support, but it's a start.
-Daniel

> 
> Marek
> 
> On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone  wrote:
> 
> > Hi,
> >
> > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter  wrote:
> >
> >> The thing is, you can't do this in drm/scheduler. At least not without
> >> splitting up the dma_fence in the kernel into separate memory fences
> >> and sync fences
> >
> >
> > I'm starting to think this thread needs its own glossary ...
> >
> > I propose we use 'residency fence' for execution fences which enact
> > memory-residency operations, e.g. faulting in a page ultimately depending
> > on GPU work retiring.
> >
> > And 'value fence' for the pure-userspace model suggested by timeline
> > semaphores, i.e. fences being (*addr == val) rather than being able to look
> > at ctx seqno.
> >
> > Cheers,
> > Daniel
> > ___
> > mesa-dev mailing list
> > mesa-...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm: i915: fix build when ACPI is disabled and BACKLIGHT=m

On Mon, 26 Apr 2021, Randy Dunlap  wrote:
> When CONFIG_DRM_I915=y, CONFIG_ACPI is not set, and
> CONFIG_BACKLIGHT_CLASS_DEVICE=m, not due to I915 config,
> there are build errors trying to reference backlight_device_{un}register().
>
> Changing the use of IS_ENABLED() to IS_REACHABLE() in intel_panel.[ch]
> fixes this.

I feel like a broken record...

CONFIG_DRM_I915=y and CONFIG_BACKLIGHT_CLASS_DEVICE=m is an invalid
configuration. The patch at hand just silently hides the problem,
leaving you without backlight.

i915 should *depend* on backlight, not select it. It would express the
dependency without chances for invalid configuration.

However, i915 alone can't depend on backlight, all users of backlight
should depend on backlight, not select it. Otherwise, you end up with
other configuration problems, circular dependencies and
whatnot. Everyone should change. See also (*) why select is not a good
idea here.

I've sent patches to this effect before, got rejected, and the same
thing gets repeated ad infinitum.

Accepting this patch would stop the inflow of these reports and similar
patches, but it does not fix the root cause. It just sweeps the problem
under the rug.
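
To illustrate why the IS_REACHABLE() change hides the problem rather
than fixing it, here is a small standalone model of the two checks. It
is a sketch, not the kernel's kconfig.h: enum tristate,
model_is_enabled() and model_is_reachable() are hypothetical names, and
the caller_is_module flag stands in for whether the code using the
symbol is itself built as a module.

```c
#include <stdbool.h>

/* Hypothetical model of a Kconfig tristate symbol: n, m or y. */
enum tristate { TRI_N, TRI_M, TRI_Y };

/* Models IS_ENABLED(): true when the symbol is built-in or modular. */
static bool model_is_enabled(enum tristate sym)
{
	return sym != TRI_N;
}

/*
 * Models IS_REACHABLE(): true when the symbol is built-in, or when it is
 * modular and the code asking is itself part of a module.
 */
static bool model_is_reachable(enum tristate sym, bool caller_is_module)
{
	return sym == TRI_Y || (sym == TRI_M && caller_is_module);
}
```

For a built-in caller and a modular backlight class,
model_is_enabled() is true (the code references symbols that cannot be
linked) while model_is_reachable() is false (the stubs are compiled
instead and the backlight support is silently dropped).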


BR,
Jani.

(*) Documentation/kbuild/kconfig-language.rst:

select should be used with care. select will force
a symbol to a value without visiting the dependencies.
By abusing select you are able to select a symbol FOO even
if FOO depends on BAR that is not set.
In general use select only for non-visible symbols
(no prompts anywhere) and for symbols with no dependencies.
That will limit the usefulness but on the other hand avoid
the illegal configurations all over.


>
> ld: drivers/gpu/drm/i915/display/intel_panel.o: in function 
> `intel_backlight_device_register':
> intel_panel.c:(.text+0x2ec1): undefined reference to 
> `backlight_device_register'
> ld: drivers/gpu/drm/i915/display/intel_panel.o: in function 
> `intel_backlight_device_unregister':
> intel_panel.c:(.text+0x2f93): undefined reference to 
> `backlight_device_unregister'
>
> Fixes: 912e8b12eedb ("drm/i915: register backlight device also when backlight 
> class is a module")
> Fixes: 44c1220a441c ("drm/i915: extract intel_panel.h from intel_drv.h")
> Signed-off-by: Randy Dunlap 
> Cc: Ville Syrjälä 
> Cc: Jani Nikula 
> Cc: Damien Lespiau 
> Cc: Daniel Vetter 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: intel-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> ---
> Found in linux-next but applies to mainline (5.12).
>
>  drivers/gpu/drm/i915/display/intel_panel.c |2 +-
>  drivers/gpu/drm/i915/display/intel_panel.h |2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> --- linux-next-20210426.orig/drivers/gpu/drm/i915/display/intel_panel.c
> +++ linux-next-20210426/drivers/gpu/drm/i915/display/intel_panel.c
> @@ -1254,7 +1254,7 @@ void intel_panel_enable_backlight(const
>   mutex_unlock(&dev_priv->backlight_lock);
>  }
>  
> -#if IS_ENABLED(CONFIG_BACKLIGHT_CLASS_DEVICE)
> +#if IS_REACHABLE(CONFIG_BACKLIGHT_CLASS_DEVICE)
>  static u32 intel_panel_get_backlight(struct intel_connector *connector)
>  {
>   struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> --- linux-next-20210426.orig/drivers/gpu/drm/i915/display/intel_panel.h
> +++ linux-next-20210426/drivers/gpu/drm/i915/display/intel_panel.h
> @@ -54,7 +54,7 @@ u32 intel_panel_invert_pwm_level(struct
>  u32 intel_panel_backlight_level_to_pwm(struct intel_connector *connector, 
> u32 level);
>  u32 intel_panel_backlight_level_from_pwm(struct intel_connector *connector, 
> u32 val);
>  
> -#if IS_ENABLED(CONFIG_BACKLIGHT_CLASS_DEVICE)
> +#if IS_REACHABLE(CONFIG_BACKLIGHT_CLASS_DEVICE)
>  int intel_backlight_device_register(struct intel_connector *connector);
>  void intel_backlight_device_unregister(struct intel_connector *connector);
>  #else /* CONFIG_BACKLIGHT_CLASS_DEVICE */

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH v4 3/4] drm/vkms: add XRGB planes composition

2021-04-27 Thread Pekka Paalanen

On Mon, 26 Apr 2021 14:31:28 -0300
Melissa Wen  wrote:

> On 04/26, Daniel Vetter wrote:
> > On Mon, Apr 26, 2021 at 11:03:15AM +0300, Pekka Paalanen wrote:  
> > > On Sat, 24 Apr 2021 05:25:31 -0300
> > > Melissa Wen  wrote:
> > >   
> > > > Add support for composing XRGB888 planes in addition to the ARGB
> > > > format. In the case of an XRGB plane at the top, the composition 
> > > > consists
> > > > of copying the RGB values of a pixel from src to dst and clearing alpha
> > > > channel, without the need for alpha blending operations for each pixel.
> > > > 
> > > > Blend equations assume a completely opaque background, i.e., primary 
> > > > plane
> > > > is not cleared before pixel blending but alpha channel is explicitly
> > > > opaque (a = 0xff). Also, there is room for performance evaluation in
> > > > switching pixel blend operation according to the plane format.
> > > > 
> > > > v4:
> > > > - clear alpha channel (0xff) after blend color values by pixel
> > > > - improve comments on blend ops to reflect the current state
> > > > - describe in the commit message future improvements for plane 
> > > > composition
> > > > 
> > > > Signed-off-by: Melissa Wen 
> > > > Reviewed-by: Daniel Vetter 
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_composer.c | 56 ++--
> > > >  drivers/gpu/drm/vkms/vkms_plane.c|  7 ++--
> > > >  2 files changed, 48 insertions(+), 15 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
> > > > b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > index 02642801735d..7e01bc39d2a1 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > @@ -4,6 +4,7 @@
> > > >  
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > @@ -64,7 +65,17 @@ static u8 blend_channel(u8 src, u8 dst, u8 alpha)
> > > > return new_color;
> > > >  }
> > > >  
> > > > -static void alpha_blending(const u8 *argb_src, u8 *argb_dst)
> > > > +/**
> > > > + * alpha_blend - alpha blending equation
> > > > + * @argb_src: src pixel on premultiplied alpha mode
> > > > + * @argb_dst: dst pixel completely opaque
> > > > + *
> > > > + * blend pixels using premultiplied blend formula. The current DRM 
> > > > assumption
> > > > + * is that pixel color values have been already pre-multiplied with 
> > > > the alpha
> > > > + * channel values. See more drm_plane_create_blend_mode_property(). 
> > > > Also, this
> > > > + * formula assumes a completely opaque background.
> > > > + */
> > > > +static void alpha_blend(const u8 *argb_src, u8 *argb_dst)
> > > >  {
> > > > u8 alpha;
> > > >  
> > > > @@ -72,8 +83,16 @@ static void alpha_blending(const u8 *argb_src, u8 
> > > > *argb_dst)
> > > > argb_dst[0] = blend_channel(argb_src[0], argb_dst[0], alpha);
> > > > argb_dst[1] = blend_channel(argb_src[1], argb_dst[1], alpha);
> > > > argb_dst[2] = blend_channel(argb_src[2], argb_dst[2], alpha);
> > > > -   /* Opaque primary */
> > > > -   argb_dst[3] = 0xFF;
> > > > +}
> > > > +
> > > > +/**
> > > > + * x_blend - blending equation that ignores the pixel alpha
> > > > + *
> > > > + * overwrites RGB color value from src pixel to dst pixel.
> > > > + */
> > > > +static void x_blend(const u8 *xrgb_src, u8 *xrgb_dst)
> > > > +{
> > > > +   memcpy(xrgb_dst, xrgb_src, sizeof(u8) * 3);  
> > > 
> > > Hi,
> > > 
> > > this function very clearly assumes a very specific pixel format on both
> > > source and destination. I think it would be good if the code comments
> > > called out exactly which DRM_FORMAT_* they assume. This would be good
> > > to do on almost every function that makes such assumptions. I believe that
> > > would help code readability, and also point out explicitly which things
> > > need to be fixed when you add support for even more pixel formats.
> > > 
> > > "xrgb" and "argb" are IMO too vague. You might be referring to
> > > DRM_FORMAT_XRGB* and DRM_FORMAT_ARGB*, or maybe you are referring to any
> > > pixel format that happens to have or not have an alpha channel in
> > > addition to the three RGB channels in some order and width.
> > > 
> > > Being explicit that these refer to specific DRM_FORMAT_* should also
> > > help understanding how things work on big-endian CPUs. My current
> > > understanding is that this memcpy is correct also on big-endian, given
> > > DRM_FORMAT_XRGB.  
> 
> This endianness issue seems a little tricky to me. I remember we have
> already discussed something similar when introducing alpha blend ops.  I
> took little endian as default by a code comment on
> include/drm/drm_fourcc.h: DRM formats are little endian. But also, I am
> not sure if I got it well.
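
The byte-order question can be made concrete with a small sketch. It
assumes the format under discussion is DRM_FORMAT_XRGB8888, which
drm_fourcc.h describes as a 32-bit little-endian word laid out
x:R:G:B, so the bytes in memory are B, G, R, X at offsets 0 to 3
regardless of the CPU. The helpers store_le32() and copy_rgb_bytes()
are hypothetical names for illustration, not vkms code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Store a 32-bit pixel value in little-endian byte order. Writing the
 * bytes one by one gives the same memory contents on any host CPU.
 */
static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);         /* B for an x:R:G:B word */
	p[1] = (uint8_t)((v >> 8) & 0xff);  /* G */
	p[2] = (uint8_t)((v >> 16) & 0xff); /* R */
	p[3] = (uint8_t)((v >> 24) & 0xff); /* X (unused) */
}

/*
 * Copy only the three colour bytes at offsets 0..2 from src to dst and
 * leave the byte at offset 3 in dst untouched.
 */
static void copy_rgb_bytes(const uint8_t *src, uint8_t *dst)
{
	memcpy(dst, src, 3);
}
```

Because the copy works on bytes in memory rather than on a CPU-native
32-bit value, it does not depend on host endianness as long as both
buffers really use this layout.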

DRM format *definitions* are written on a little-endian CPU. When you
have a big-endian CPU, the byte-to-byte memory contents still remain
the same. That means if you have a uint32_t pixel in a certain
DRM_FOR

Re: [PATCH 3/7] drm/i915/gtt: map the PD up front

2021-04-27 Thread Tvrtko Ursulin



On 26/04/2021 17:18, Matthew Auld wrote:
> On 26/04/2021 16:20, Tvrtko Ursulin wrote:
>>
>> On 26/04/2021 11:18, Matthew Auld wrote:
>>> We need to general our accessor for the page directories and tables from
>>
>> Generalise?
>>
>>> using the simple kmap_atomic to support local memory, and this setup
>>> must be done on acquisition of the backing storage prior to entering
>>> fence execution contexts. Here we replace the kmap with the object
>>> mapping code that for simple single page shmemfs object will return a
>>> plain kmap, that is then kept for the lifetime of the page directory.
>>
>> How big are the address spaces used for mapping types? On 32-bit?
>>
>> Do we have a mechanism to free something up if there is address space
>> pressure in there and is that a concern if we do not?
>
> It's a concern yes, since while the vma is pinned the mapping remains
> there for the PDs underneath, or at least until the used_count reaches
> zero, at which point we can safely destroy the mapping.
>
> For the 32bit concern, this was brought up in some earlier review[1]
> also. AFAIK the conclusion was just to not care about 32b for modern
> platforms, so platforms with full ppGTT support where this patch
> actually matters the most.
>
> [1] https://patchwork.freedesktop.org/patch/404460/?series=84344&rev=1

Okay thanks.

Can I suggest to capture at least the gist of this discussion in the
commit message?

Regards,

Tvrtko

>> Regards,
>>
>> Tvrtko


v2: (Thomas) Rebase on dma_resv and obj->mm.lock removal.

Signed-off-by: Matthew Auld 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  .../drm/i915/gem/selftests/i915_gem_context.c | 11 +
  drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 11 ++---
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 26 --
  drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 +-
  drivers/gpu/drm/i915/gt/intel_gtt.c   | 48 +--
  drivers/gpu/drm/i915/gt/intel_gtt.h   | 11 +++--
  drivers/gpu/drm/i915/gt/intel_ppgtt.c |  7 ++-
  drivers/gpu/drm/i915/i915_vma.c   |  3 +-
  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
  drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
  10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c

index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct 
i915_gem_context *ctx,

  static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
  {
  struct i915_address_space *vm;
-    struct page *page;
  u32 *vaddr;
  int err = 0;
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct 
i915_gem_context *ctx, u32 *out)

  if (!vm)
  return -ENODEV;
-    page = __px_page(vm->scratch[0]);
-    if (!page) {
+    if (!vm->scratch[0]) {
  pr_err("No scratch page!\n");
  return -EINVAL;
  }
-    vaddr = kmap(page);
-    if (!vaddr) {
-    pr_err("No (mappable) scratch page!\n");
-    return -EINVAL;
-    }
+    vaddr = __px_vaddr(vm->scratch[0]);
  memcpy(out, vaddr, sizeof(*out));
  if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
  pr_err("Inconsistent initial state of scratch page!\n");
  err = -EINVAL;
  }
-    kunmap(page);
  return err;
  }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c

index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct 
i915_address_space *vm,

   * entries back to scratch.
   */
-    vaddr = kmap_atomic_px(pt);
+    vaddr = px_vaddr(pt);
  memset32(vaddr + pte, scratch_pte, count);
-    kunmap_atomic(vaddr);
  pte = 0;
  }
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,

  GEM_BUG_ON(!pd->entry[act_pt]);
-    vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+    vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
  do {
  GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
  vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,

  }
  if (++act_pte == GEN6_PTES) {
-    kunmap_atomic(vaddr);
-    vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+    vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
  act_pte = 0;
  }
  } while (1);
-    kunmap_atomic(vaddr);
  vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
  }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct 
gen6_ppgtt *ppgtt)

  goto err_scratch0;
  }
-    ret = pin_pt_dma(vm, vm->scratch[1]);
+    ret = map_pt_dma(vm, vm->scratch[1]

Re: [PATCH v4 3/4] drm/vkms: add XRGB planes composition

2021-04-27 Thread Melissa Wen

On 04/27, Pekka Paalanen wrote:
> On Mon, 26 Apr 2021 14:31:28 -0300
> Melissa Wen  wrote:
> 
> > On 04/26, Daniel Vetter wrote:
> > > On Mon, Apr 26, 2021 at 11:03:15AM +0300, Pekka Paalanen wrote:  
> > > > On Sat, 24 Apr 2021 05:25:31 -0300
> > > > Melissa Wen  wrote:
> > > >   
> > > > > Add support for composing XRGB888 planes in addition to the ARGB
> > > > > format. In the case of an XRGB plane at the top, the composition 
> > > > > consists
> > > > > of copying the RGB values of a pixel from src to dst and clearing 
> > > > > alpha
> > > > > channel, without the need for alpha blending operations for each 
> > > > > pixel.
> > > > > 
> > > > > Blend equations assume a completely opaque background, i.e., primary 
> > > > > plane
> > > > > is not cleared before pixel blending but alpha channel is explicitly
> > > > > opaque (a = 0xff). Also, there is room for performance evaluation in
> > > > > switching pixel blend operation according to the plane format.
> > > > > 
> > > > > v4:
> > > > > - clear alpha channel (0xff) after blend color values by pixel
> > > > > - improve comments on blend ops to reflect the current state
> > > > > - describe in the commit message future improvements for plane 
> > > > > composition
> > > > > 
> > > > > Signed-off-by: Melissa Wen 
> > > > > Reviewed-by: Daniel Vetter 
> > > > > ---
> > > > >  drivers/gpu/drm/vkms/vkms_composer.c | 56 
> > > > > ++--
> > > > >  drivers/gpu/drm/vkms/vkms_plane.c|  7 ++--
> > > > >  2 files changed, 48 insertions(+), 15 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
> > > > > b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > index 02642801735d..7e01bc39d2a1 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > @@ -4,6 +4,7 @@
> > > > >  
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > @@ -64,7 +65,17 @@ static u8 blend_channel(u8 src, u8 dst, u8 alpha)
> > > > >   return new_color;
> > > > >  }
> > > > >  
> > > > > -static void alpha_blending(const u8 *argb_src, u8 *argb_dst)
> > > > > +/**
> > > > > + * alpha_blend - alpha blending equation
> > > > > + * @argb_src: src pixel on premultiplied alpha mode
> > > > > + * @argb_dst: dst pixel completely opaque
> > > > > + *
> > > > > + * blend pixels using premultiplied blend formula. The current DRM 
> > > > > assumption
> > > > > + * is that pixel color values have been already pre-multiplied with 
> > > > > the alpha
> > > > > + * channel values. See more drm_plane_create_blend_mode_property(). 
> > > > > Also, this
> > > > > + * formula assumes a completely opaque background.
> > > > > + */
> > > > > +static void alpha_blend(const u8 *argb_src, u8 *argb_dst)
> > > > >  {
> > > > >   u8 alpha;
> > > > >  
> > > > > @@ -72,8 +83,16 @@ static void alpha_blending(const u8 *argb_src, u8 
> > > > > *argb_dst)
> > > > >   argb_dst[0] = blend_channel(argb_src[0], argb_dst[0], alpha);
> > > > >   argb_dst[1] = blend_channel(argb_src[1], argb_dst[1], alpha);
> > > > >   argb_dst[2] = blend_channel(argb_src[2], argb_dst[2], alpha);
> > > > > - /* Opaque primary */
> > > > > - argb_dst[3] = 0xFF;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * x_blend - blending equation that ignores the pixel alpha
> > > > > + *
> > > > > + * overwrites RGB color value from src pixel to dst pixel.
> > > > > + */
> > > > > +static void x_blend(const u8 *xrgb_src, u8 *xrgb_dst)
> > > > > +{
> > > > > + memcpy(xrgb_dst, xrgb_src, sizeof(u8) * 3);  
> > > > 
> > > > Hi,
> > > > 
> > > > this function very clearly assumes a very specific pixel format on both
> > > > source and destination. I think it would be good if the code comments
> > > > called out exactly which DRM_FORMAT_* they assume. This would be good
> > > > to do on almost every function that makes such assumptions. I believe 
> > > > that
> > > > would help code readability, and also point out explicitly which things
> > > > need to be fixed when you add support for even more pixel formats.
> > > > 
> > > > "xrgb" and "argb" are IMO too vague. You might be referring to
> > > > DRM_FORMAT_XRGB* and DRM_FORMAT_ARGB*, or maybe you are referring to any
> > > > pixel format that happens to have or not have an alpha channel in
> > > > addition to the three RGB channels in some order and width.
> > > > 
> > > > Being explicit that these refer to specific DRM_FORMAT_* should also
> > > > help understanding how things work on big-endian CPUs. My current
> > > > understanding is that this memcpy is correct also on big-endian, given
> > > > DRM_FORMAT_XRGB.  
> > 
> > This endianness issue seems a little tricky to me. I remember we have
> > already discussed something similar when introducing alpha blend ops.  I
> > took little endian as default by a code comment on
> > include/drm/drm_fourcc.h: DRM forma

[PULL] drm-intel-next-fixes for the merge window


Hi Dave & Daniel -

Some fixes to the drm-next feature pull.

drm-intel-next-fixes-2021-04-27:
drm/i915 fixes for v5.13-rc1:
- Several fixes to GLK handling in recent display refactoring (Ville)
- Rare watchdog timer race fix (Tvrtko)
- Cppcheck redundant condition fix (José)
- Overlay error code propagation fix (Dan Carpenter)
- Documentation fix (Maarten)

Seems I forgot to mention GVT fixes in the annotated tag, copy-pasting
here from their pull:

gvt-next-fixes-2021-04-21

- Remove one unused function warning (Jiapeng)
- Fix intel_gvt_init_device() return type (Dan)
- Remove one duplicated register accessible check (Zhenyu)


BR,
Jani.

The following changes since commit af8352f1ff54c4fecf84e36315fd1928809a580b:

  Merge tag 'drm-msm-next-2021-04-11' of https://gitlab.freedesktop.org/drm/msm 
into drm-next (2021-04-13 23:35:54 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel 
tags/drm-intel-next-fixes-2021-04-27

for you to fetch changes up to 270e3cc5aa382f63ea20b93c3d20162a891dc638:

  drm/i915: Fix docbook descriptions for i915_gem_shrinker (2021-04-26 11:54:33 
+0300)


drm/i915 fixes for v5.13-rc1:
- Several fixes to GLK handling in recent display refactoring (Ville)
- Rare watchdog timer race fix (Tvrtko)
- Cppcheck redundant condition fix (José)
- Overlay error code propagation fix (Dan Carpenter)
- Documentation fix (Maarten)


Dan Carpenter (2):
  drm/i915/gvt: Fix error code in intel_gvt_init_device()
  drm/i915: fix an error code in intel_overlay_do_put_image()

Jani Nikula (1):
  Merge tag 'gvt-next-fixes-2021-04-21' of 
https://github.com/intel/gvt-linux into drm-intel-next-fixes

Jiapeng Chong (1):
  drm/i915/gvt: remove useless function

José Roberto de Souza (1):
  drm/i915/display/psr: Fix cppcheck warnings

Maarten Lankhorst (1):
  drm/i915: Fix docbook descriptions for i915_gem_shrinker

Tvrtko Ursulin (1):
  drm/i915: Take request reference before arming the watchdog timer

Ville Syrjälä (3):
  drm/i915: Restore lost glk FBC 16bpp w/a
  drm/i915: Restore lost glk ccs w/a
  drm/i915: Disable LTTPR detection on GLK once again

Zhenyu Wang (1):
  drm/i915/gvt: Remove duplicated register accessible check

 drivers/gpu/drm/i915/display/intel_display.c  | 3 ++-
 drivers/gpu/drm/i915/display/intel_dp_link_training.c | 2 +-
 drivers/gpu/drm/i915/display/intel_fbc.c  | 2 +-
 drivers/gpu/drm/i915/display/intel_overlay.c  | 4 +++-
 drivers/gpu/drm/i915/display/intel_psr.c  | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 1 +
 drivers/gpu/drm/i915/gvt/cmd_parser.c | 5 -
 drivers/gpu/drm/i915/gvt/gtt.c| 6 --
 drivers/gpu/drm/i915/gvt/gvt.c| 8 
 drivers/gpu/drm/i915/i915_request.c   | 3 ++-
 10 files changed, 15 insertions(+), 22 deletions(-)

-- 
Jani Nikula, Intel Open Source Graphics Center
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/omap: Fix issue with clocks left on after resume

2021-04-27 Thread Tomi Valkeinen


Hi Tony,

On 26/04/2021 17:12, Tony Lindgren wrote:

On resume, dispc pm_runtime_force_resume() is not enabling the hardware
as we pass the pm_runtime_need_not_resume() test as the device is suspended
with no child devices.

As the resume continues, omap_atomic_commit_tail() calls dispc_runtime_get()
that calls rpm_resume() enabling the hardware, and increasing child_count
for its parent device.

But at this point device_complete() has not yet been called for dispc. So
when omap_atomic_commit_tail() calls dispc_runtime_get(), it won't idle


Is that supposed to be dispc_runtime_put()?


the hardware, and the clocks are left on after resume.

This can be easily seen for example after suspending Beagleboard-X15 with
no displays connected, and by reading the CM_DSS_DSS_CLKCTRL register at
0x4a009120 after resume. After a suspend and resume cycle, it shows a
value of 0x00040102 instead of 0x0007 like it should.

Let's fix the issue by calling dispc_runtime_suspend() and
dispc_runtime_resume() directly from dispc_suspend() and dispc_resume().
This leaves out the PM runtime related issues for system suspend.

See also earlier commit 88d26136a256 ("PM: Prevent runtime suspend during
system resume") and commit ca8199f13498 ("drm/msm/dpu: ensure device
suspend happens during PM sleep") for more information.

Fixes: ecfdedd7da5d ("drm/omap: force runtime PM suspend on system suspend")
Signed-off-by: Tony Lindgren 


Why is this only needed for dispc, and not the other dss submodules 
which were handled in ecfdedd7da5d?


I have to say I'm pretty confused (maybe partly because it's been a 
while since I debugged this =). Aren't the 
pm_runtime_force_suspend/resume made explicitly for this use case? At 
least that is how I read the documentation.


If I understand right, this is only an issue when the dss was not 
enabled before the system suspend? And as the dispc is not enabled at 
suspend, pm_runtime_force_suspend and pm_runtime_force_resume don't 
really do anything. At resume, the DRM resume functionality causes 
omapdrm to call pm_runtime_get and put, and this somehow causes the dss 
to stay enabled.


I think I'm missing something here, but this patch feels like a hack 
fix. But continuing with the hack mindset, as the PM apparently needs 
DSS to be enabled at suspend for it to work correctly, let's give that to
the PM. This seems to work also:


diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c 
b/drivers/gpu/drm/omapdrm/omap_drv.c

index 28bbad1353ee..0fd9d80d3e12 100644
--- a/drivers/gpu/drm/omapdrm/omap_drv.c
+++ b/drivers/gpu/drm/omapdrm/omap_drv.c
@@ -695,6 +695,8 @@ static int omap_drm_suspend(struct device *dev)
struct omap_drm_private *priv = dev_get_drvdata(dev);
struct drm_device *drm_dev = priv->ddev;

+   dispc_runtime_get(priv->dispc);
+
return drm_mode_config_helper_suspend(drm_dev);
 }

@@ -705,6 +707,8 @@ static int omap_drm_resume(struct device *dev)

drm_mode_config_helper_resume(drm_dev);

+   dispc_runtime_put(priv->dispc);
+
return omap_gem_resume(drm_dev);
 }
 #endif

But I don't think that helps with the other dss submodules either.

 Tomi

[PATCH v2 1/7] drm/i915/dg1: Fix mapping type for default state object

From: Venkata Ramana Nayana 

Use I915_MAP_WC when default state object is allocated in LMEM.

Signed-off-by: Venkata Ramana Nayana 
Reviewed-by: Matthew Auld 
Signed-off-by: Matthew Auld 
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c 
b/drivers/gpu/drm/i915/gt/shmem_utils.c
index f8f02aab842b..0683b27a3890 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -8,6 +8,7 @@
 #include 
 
 #include "gem/i915_gem_object.h"
+#include "gem/i915_gem_lmem.h"
 #include "shmem_utils.h"
 
 struct file *shmem_create_from_data(const char *name, void *data, size_t len)
@@ -39,7 +40,8 @@ struct file *shmem_create_from_object(struct 
drm_i915_gem_object *obj)
return file;
}
 
-   ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+   ptr = i915_gem_object_pin_map_unlocked(obj, i915_gem_object_is_lmem(obj) ?
+   I915_MAP_WC : I915_MAP_WB);
if (IS_ERR(ptr))
return ERR_CAST(ptr);
 
-- 
2.26.3

[PATCH v2 3/7] drm/i915/gtt: map the PD up front

We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code that for simple single page shmemfs object will return a
plain kmap, that is then kept for the lifetime of the page directory.

Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.

Signed-off-by: Matthew Auld 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 11 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 26 --
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 48 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 11 +++--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  7 ++-
 drivers/gpu/drm/i915/i915_vma.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_perf.c|  3 +-
 10 files changed, 54 insertions(+), 78 deletions(-)
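
The commit message above describes mapping each page table once, when its backing storage is acquired, and keeping that mapping for the lifetime of the object instead of doing a kmap_atomic()/kunmap_atomic() pair around every access. A minimal userspace sketch of that pattern follows. All names here (struct pt_object, pt_alloc_and_map(), pt_vaddr(), pt_clear_range(), pt_free()) are hypothetical stand-ins, and calloc() stands in for pinning a kernel mapping; this is not the i915 code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PT_SIZE 4096u

/* Hypothetical stand-in for a page-table backing object. */
struct pt_object {
	void *vaddr;	/* CPU mapping, established once and then kept */
};

/* Allocate the backing storage and map it up front. */
static struct pt_object *pt_alloc_and_map(void)
{
	struct pt_object *pt = calloc(1, sizeof(*pt));

	if (!pt)
		return NULL;

	pt->vaddr = calloc(1, PT_SIZE);	/* stands in for pinning a mapping */
	if (!pt->vaddr) {
		free(pt);
		return NULL;
	}
	return pt;
}

/* Accessor: no map/unmap pair, just hand back the cached pointer. */
static uint32_t *pt_vaddr(struct pt_object *pt)
{
	return pt->vaddr;
}

/* Point a range of 32-bit entries back at a scratch value. */
static void pt_clear_range(struct pt_object *pt, unsigned int first,
			   unsigned int count, uint32_t scratch)
{
	uint32_t *vaddr = pt_vaddr(pt);

	assert(first + count <= PT_SIZE / sizeof(uint32_t));
	for (unsigned int i = 0; i < count; i++)
		vaddr[first + i] = scratch;
}

/* The mapping is only torn down together with the object. */
static void pt_free(struct pt_object *pt)
{
	free(pt->vaddr);
	free(pt);
}
```

The trade-off is the one the commit message names: the mapping stays around for as long as the object does, which costs address space but means the accessor can be called from contexts where setting up a mapping is not allowed.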

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 {
struct i915_address_space *vm;
-   struct page *page;
u32 *vaddr;
int err = 0;
 
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context 
*ctx, u32 *out)
if (!vm)
return -ENODEV;
 
-   page = __px_page(vm->scratch[0]);
-   if (!page) {
+   if (!vm->scratch[0]) {
pr_err("No scratch page!\n");
return -EINVAL;
}
 
-   vaddr = kmap(page);
-   if (!vaddr) {
-   pr_err("No (mappable) scratch page!\n");
-   return -EINVAL;
-   }
+   vaddr = __px_vaddr(vm->scratch[0]);
 
memcpy(out, vaddr, sizeof(*out));
if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
pr_err("Inconsistent initial state of scratch page!\n");
err = -EINVAL;
}
-   kunmap(page);
 
return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space 
*vm,
 * entries back to scratch.
 */
 
-   vaddr = kmap_atomic_px(pt);
+   vaddr = px_vaddr(pt);
memset32(vaddr + pte, scratch_pte, count);
-   kunmap_atomic(vaddr);
 
pte = 0;
}
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
 
GEM_BUG_ON(!pd->entry[act_pt]);
 
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+   vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
do {
GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
}
 
if (++act_pte == GEN6_PTES) {
-   kunmap_atomic(vaddr);
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+   vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
act_pte = 0;
}
} while (1);
-   kunmap_atomic(vaddr);
 
vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
goto err_scratch0;
}
 
-   ret = pin_pt_dma(vm, vm->scratch[1]);
+   ret = map_pt_dma(vm, vm->scratch[1]);
if (ret)
goto err_scratch1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 176c19633412..f83496836f0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -242,11 +242,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_sp

[PATCH v2 7/7] drm/i915: Return error value when bo not in LMEM for discrete

From: Mohammed Khajapasha 

Return -EREMOTE when the framebuffer object is not backed by LMEM on
discrete. If local memory is supported by the hardware, the framebuffer's
backing GEM objects should come from local memory.

Signed-off-by: Mohammed Khajapasha 
Signed-off-by: Matthew Auld 
Reviewed-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/display/intel_display.c | 10 ++
 1 file changed, 10 insertions(+)
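
The placement rule in this patch can be sketched outside the driver as a single predicate. The types struct fake_device and struct fake_object and the function check_fb_placement() are hypothetical names used only for illustration; the real check uses HAS_LMEM() and i915_gem_object_is_lmem() as shown in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver state. */
struct fake_device { bool has_lmem; };
struct fake_object { bool in_lmem; };

/*
 * Returns 0 if the object may back a framebuffer, or -EREMOTE if the
 * device has local memory but the object does not live there.
 */
static int check_fb_placement(const struct fake_device *dev,
			      const struct fake_object *obj)
{
	if (dev->has_lmem && !obj->in_lmem)
		return -EREMOTE;
	return 0;
}
```

Devices without local memory are unaffected: the check only rejects an object when the device has LMEM and the object is placed elsewhere.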

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index e246e5cf7586..6280ba7f4c17 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -63,6 +63,7 @@
 #include "display/intel_vdsc.h"
 #include "display/intel_vrr.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_object.h"
 
 #include "gt/intel_rps.h"
@@ -11278,11 +11279,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
struct drm_framebuffer *fb;
struct drm_i915_gem_object *obj;
struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
+   struct drm_i915_private *i915;
 
obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
if (!obj)
return ERR_PTR(-ENOENT);
 
+   /* object is backed with LMEM for discrete */
+   i915 = to_i915(obj->base.dev);
+   if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+   /* object is "remote", not in local memory */
+   i915_gem_object_put(obj);
+   return ERR_PTR(-EREMOTE);
+   }
+
fb = intel_framebuffer_create(obj, &mode_cmd);
i915_gem_object_put(obj);
 
-- 
2.26.3

[PATCH v2 6/7] drm/i915/lmem: Bypass aperture when lmem is available

From: Anusha Srivatsa 

In the scenario where local memory is available, we have to
rely on CPU access via lmem directly instead of the aperture.

v2:
gmch is only relevant for much older hw, therefore we can drop the
has_aperture check since it should always be present on such platforms.
(Chris)

Cc: Ville Syrjälä 
Cc: Dhinakaran Pandiyan 
Cc: Maarten Lankhorst 
Cc: Chris P Wilson 
Cc: Daniel Vetter 
Cc: Joonas Lahtinen 
Cc: Daniele Ceraolo Spurio 
Cc: CQ Tang 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Matthew Auld 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 22 +--
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 15 +
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  5 +
 drivers/gpu/drm/i915/i915_vma.c| 25 --
 4 files changed, 54 insertions(+), 13 deletions(-)
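
The new i915_gem_object_lmem_io_map() helper in this patch turns a device address into an offset inside the region's IO mapping by subtracting the region start. A minimal sketch of just that arithmetic follows. struct fake_region and lmem_io_offset() are hypothetical names, and the bounds checks are an addition for illustration that the patch itself does not contain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a local-memory region. */
struct fake_region {
	uint64_t start;	/* device address where the region begins */
	uint64_t size;	/* size of the region in bytes */
};

/*
 * Offset of a device address inside its region. The patch passes an
 * offset computed this way to io_mapping_map_wc().
 */
static uint64_t lmem_io_offset(const struct fake_region *mr, uint64_t dma_addr)
{
	assert(dma_addr >= mr->start);
	assert(dma_addr - mr->start < mr->size);

	return dma_addr - mr->start;
}
```

For a region starting at 0x100000, a device address of 0x123000 gives an offset of 0x23000 into the mapping.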

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 2b37959da747..4af40229f5ec 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -139,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
size = mode_cmd.pitches[0] * mode_cmd.height;
size = PAGE_ALIGN(size);
 
-   /* If the FB is too big, just don't use it since fbdev is not very
-* important and we should probably use that space with FBC or other
-* features. */
obj = ERR_PTR(-ENODEV);
-   if (size * 2 < dev_priv->stolen_usable_size)
-   obj = i915_gem_object_create_stolen(dev_priv, size);
-   if (IS_ERR(obj))
-   obj = i915_gem_object_create_shmem(dev_priv, size);
+   if (HAS_LMEM(dev_priv)) {
+   obj = i915_gem_object_create_lmem(dev_priv, size,
+ I915_BO_ALLOC_CONTIGUOUS);
+   } else {
+   /*
+* If the FB is too big, just don't use it since fbdev is not very
+* important and we should probably use that space with FBC or other
+* features.
+*/
+   if (size * 2 < dev_priv->stolen_usable_size)
+   obj = i915_gem_object_create_stolen(dev_priv, size);
+   if (IS_ERR(obj))
+   obj = i915_gem_object_create_shmem(dev_priv, size);
+   }
+
if (IS_ERR(obj)) {
drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
return PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 017db8f71130..f44bdd08f7cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -17,6 +17,21 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = 
{
.release = i915_gem_object_release_memory_region,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+   unsigned long n,
+   unsigned long size)
+{
+   resource_size_t offset;
+
+   GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
+
+   offset = i915_gem_object_get_dma_address(obj, n);
+   offset -= obj->mm.region->region.start;
+
+   return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
struct intel_memory_region *mr = obj->mm.region;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 036d53c01de9..fac6bc5a5ebb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,11 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+   unsigned long n,
+   unsigned long size);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index eb01899ac6b7..468317e3b477 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -27,6 +27,7 @@
 
 #include "display/intel_frontbuffer.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
void __iomem *ptr;
int err;
 
-   if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
-   err = -ENODEV;
-   goto err;
+   if (!i915_gem_object_is_lmem(vma->obj)) {
+   if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
+   err = -ENODEV;
+   goto err;
+   }
}
 
GEM_BUG_ON(!i915_vma_is_ggtt(vma));

[PATCH v2 2/7] drm/i915: Update the helper to set correct mapping

From: Venkata Sandeep Dhanalakota 

Determine the possible coherent map type based on object location,
and if target has llc or if user requires an always coherent
mapping.

Cc: Matthew Auld 
Cc: CQ Tang 
Suggested-by: Michal Wajdeczko 
Signed-off-by: Venkata Sandeep Dhanalakota 
Signed-off-by: Matthew Auld 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +++-
 drivers/gpu/drm/i915/gt/intel_ring.c |  9 ++---
 drivers/gpu/drm/i915/gt/selftest_context.c   |  3 ++-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c   |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c   |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c   |  4 +++-
 drivers/gpu/drm/i915/i915_drv.h  | 11 +--
 drivers/gpu/drm/i915/selftests/igt_spinner.c |  4 ++--
 10 files changed, 34 insertions(+), 15 deletions(-)
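
The hunk that changes the helper body in i915_drv.h is not part of this excerpt, so the following is only a sketch of the decision the commit message describes: pick the mapping type from the object's location, whether the target has LLC, and whether the caller needs an always-coherent mapping. The enum map_type, the function coherent_map_type() and its flattened boolean parameters are hypothetical, and the order of the checks is an assumption consistent with patch 1/7 choosing I915_MAP_WC for LMEM objects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two mapping types involved. */
enum map_type {
	MAP_WB,	/* write-back (cached) */
	MAP_WC,	/* write-combining */
};

/*
 * Assumed ordering: local memory is always mapped write-combining;
 * otherwise use write-back when the platform has LLC or the caller
 * requires a coherent mapping, and write-combining as the fallback.
 */
static enum map_type coherent_map_type(bool obj_in_lmem, bool has_llc,
				       bool always_coherent)
{
	if (obj_in_lmem)
		return MAP_WC;
	if (has_llc || always_coherent)
		return MAP_WB;
	return MAP_WC;
}
```

Under these assumptions, the third argument only matters for system-memory objects on platforms without LLC, which matches the call sites below passing true or false depending on the caller's needs.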

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 7c9af86fdb1e..47f4397095e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -23,7 +23,7 @@ static void dbg_poison_ce(struct intel_context *ce)
 
if (ce->state) {
struct drm_i915_gem_object *obj = ce->state->obj;
-   int type = i915_coherent_map_type(ce->engine->i915);
+   int type = i915_coherent_map_type(ce->engine->i915, obj, true);
void *map;
 
if (!i915_gem_object_trylock(obj))
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e86897cde984..aafe2a4df496 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -903,7 +903,9 @@ lrc_pre_pin(struct intel_context *ce,
GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
 
*vaddr = i915_gem_object_pin_map(ce->state->obj,
-   i915_coherent_map_type(ce->engine->i915) |
+   i915_coherent_map_type(ce->engine->i915,
+   ce->state->obj,
+   false) |
 I915_MAP_OVERRIDE);
 
return PTR_ERR_OR_ZERO(*vaddr);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
b/drivers/gpu/drm/i915/gt/intel_ring.c
index aee0a77c77e0..3cf6c7e68108 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -53,9 +53,12 @@ int intel_ring_pin(struct intel_ring *ring, struct 
i915_gem_ww_ctx *ww)
 
if (i915_vma_is_map_and_fenceable(vma))
addr = (void __force *)i915_vma_pin_iomap(vma);
-   else
-   addr = i915_gem_object_pin_map(vma->obj,
-  i915_coherent_map_type(vma->vm->i915));
+   else {
+   int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
+
+   addr = i915_gem_object_pin_map(vma->obj, type);
+   }
+
if (IS_ERR(addr)) {
ret = PTR_ERR(addr);
goto err_ring;
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
b/drivers/gpu/drm/i915/gt/selftest_context.c
index b9bdd1d23243..26685b927169 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -88,7 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
goto err;
 
vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
-   i915_coherent_map_type(engine->i915));
+   i915_coherent_map_type(engine->i915,
+   ce->state->obj, false));
if (IS_ERR(vaddr)) {
err = PTR_ERR(vaddr);
intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 746985971c3a..5b63d4df8c93 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -69,7 +69,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
 
vaddr = i915_gem_object_pin_map_unlocked(h->obj,
-   i915_coherent_map_type(gt->i915));
+   i915_coherent_map_type(gt->i915, h->obj, false));
if (IS_ERR(vaddr)) {
err = PTR_ERR(vaddr);
goto err_unpin_hws;
@@ -130,7 +130,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs 
*engine)
return ERR_CAST(obj);
}
 
-   vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));

[PATCH v2 5/7] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete

From: Mohammed Khajapasha 

Use the local memory IO BAR address for fbdev's fb_mmap() operation on
discrete, since fbdev uses the physical address of our framebuffer for its
fb_mmap() function.

Signed-off-by: Mohammed Khajapasha 
Reviewed-by: Matthew Auld 
Signed-off-by: Matthew Auld 
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 29 +-
 1 file changed, 23 insertions(+), 6 deletions(-)
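
The address that ends up in info->fix.smem_start is a base plus an offset in both branches of this patch; only the source of the two values changes. A small sketch of that selection follows, with a hypothetical struct fb_placement whose fields are named after the expressions used in the diff below. It models the arithmetic only, not the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bundle of the values the patch reads from the driver. */
struct fb_placement {
	bool     is_lmem;		/* i915_gem_object_is_lmem(obj) */
	uint64_t lmem_io_start;		/* mem->io_start */
	uint64_t obj_dma_addr;		/* i915_gem_object_get_dma_address(obj, 0) */
	uint64_t gmadr_start;		/* ggtt->gmadr.start */
	uint64_t vma_node_start;	/* vma->node.start */
};

/* Physical start address that fbdev's fb_mmap() would use. */
static uint64_t fb_smem_start(const struct fb_placement *p)
{
	if (p->is_lmem)
		return p->lmem_io_start + p->obj_dma_addr;

	return p->gmadr_start + p->vma_node_start;
}
```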

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
b/drivers/gpu/drm/i915/display/intel_fbdev.c
index ccd00e65a5fe..2b37959da747 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -41,6 +41,8 @@
 #include 
 #include 
 
+#include "gem/i915_gem_lmem.h"
+
 #include "i915_drv.h"
 #include "intel_display_types.h"
 #include "intel_fbdev.h"
@@ -178,6 +180,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
unsigned long flags = 0;
bool prealloc = false;
void __iomem *vaddr;
+   struct drm_i915_gem_object *obj;
int ret;
 
if (intel_fb &&
@@ -232,13 +235,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
info->fbops = &intelfb_ops;
 
/* setup aperture base/size for vesafb takeover */
-   info->apertures->ranges[0].base = ggtt->gmadr.start;
-   info->apertures->ranges[0].size = ggtt->mappable_end;
+   obj = intel_fb_obj(&intel_fb->base);
+   if (i915_gem_object_is_lmem(obj)) {
+   struct intel_memory_region *mem = obj->mm.region;
+
+   info->apertures->ranges[0].base = mem->io_start;
+   info->apertures->ranges[0].size = mem->total;
+
+   /* Use fbdev's framebuffer from lmem for discrete */
+   info->fix.smem_start =
+   (unsigned long)(mem->io_start +
+   i915_gem_object_get_dma_address(obj, 0));
+   info->fix.smem_len = obj->base.size;
+   } else {
+   info->apertures->ranges[0].base = ggtt->gmadr.start;
+   info->apertures->ranges[0].size = ggtt->mappable_end;
 
-   /* Our framebuffer is the entirety of fbdev's system memory */
-   info->fix.smem_start =
-   (unsigned long)(ggtt->gmadr.start + vma->node.start);
-   info->fix.smem_len = vma->node.size;
+   /* Our framebuffer is the entirety of fbdev's system memory */
+   info->fix.smem_start =
+   (unsigned long)(ggtt->gmadr.start + vma->node.start);
+   info->fix.smem_len = vma->node.size;
+   }
 
vaddr = i915_vma_pin_iomap(vma);
if (IS_ERR(vaddr)) {
-- 
2.26.3

[PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM

It's a requirement that for dgfx we place all the paging structures in
device local-memory.

v2: use i915_coherent_map_type()
v3: improve the shared dma-resv object comment

Signed-off-by: Matthew Auld 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 -
 drivers/gpu/drm/i915/gt/intel_gtt.c  | 30 +---
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f83496836f0f..11fb5df45a0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 */
ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-   ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+   if (HAS_LMEM(gt->i915))
+   ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
+   else
+   ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
err = gen8_init_scratch(&ppgtt->vm);
if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index d386b89e2758..061c39d2ad51 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,26 @@
 
 #include 
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_trace.h"
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
+{
+   struct drm_i915_gem_object *obj;
+
+   obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+   /*
+* Ensure all paging structures for this vm share the same dma-resv
+* object underneath, with the idea that one object_lock() will lock
+* them all at once.
+*/
+   if (!IS_ERR(obj))
+   obj->base.resv = &vm->resv;
+   return obj;
+}
+
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
struct drm_i915_gem_object *obj;
@@ -19,7 +35,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct 
i915_address_space *vm, int sz)
i915_gem_shrink_all(vm->i915);
 
obj = i915_gem_object_create_internal(vm->i915, sz);
-   /* ensure all dma objects have the same reservation class */
+   /*
+* Ensure all paging structures for this vm share the same dma-resv
+* object underneath, with the idea that one object_lock() will lock
+* them all at once.
+*/
if (!IS_ERR(obj))
obj->base.resv = &vm->resv;
return obj;
@@ -27,9 +47,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct 
i915_address_space *vm, int sz)
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+   enum i915_map_type type;
void *vaddr;
 
-   vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+   type = i915_coherent_map_type(vm->i915, obj, true);
+   vaddr = i915_gem_object_pin_map_unlocked(obj, type);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
 
@@ -39,9 +61,11 @@ int map_pt_dma(struct i915_address_space *vm, struct 
drm_i915_gem_object *obj)
 
 int map_pt_dma_locked(struct i915_address_space *vm, struct 
drm_i915_gem_object *obj)
 {
+   enum i915_map_type type;
void *vaddr;
 
-   vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+   type = i915_coherent_map_type(vm->i915, obj, true);
+   vaddr = i915_gem_object_pin_map(obj, type);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 40e486704558..44ce27c51631 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
-- 
2.26.3

Re: [Intel-gfx] [PATCH 4/4] drm/i915: Rewrite CL/CTG L-shaped memory detection

On Mon, Apr 26, 2021 at 08:18:39PM +0300, Ville Syrjälä wrote:
> On Mon, Apr 26, 2021 at 06:08:59PM +0200, Daniel Vetter wrote:
> > On Thu, Apr 22, 2021 at 04:11:22PM +0300, Ville Syrjälä wrote:
> > > On Thu, Apr 22, 2021 at 11:49:43AM +0200, Daniel Vetter wrote:
> > > > On Wed, Apr 21, 2021 at 06:34:01PM +0300, Ville Syrjala wrote:
> > > > > From: Ville Syrjälä 
> > > > > 
> > > > > > Currently we try to detect symmetric memory configurations
> > > > > using a magic DCC2_MODIFIED_ENHANCED_DISABLE bit. That bit is
> > > > > either only set on a very specific subset of machines or it
> > > > > just does not exist (it's not mentioned in any public chipset
> > > > > datasheets I've found). As it happens my CL/CTG machines never
> > > > > set said bit, even if I populate the channels with identical
> > > > > sticks.
> > > > > 
> > > > > So let's do the L-shaped memory detection the same way as the
> > > > > desktop variants, ie. just look at the DRAM rank boundary
> > > > > registers to see if both channels have an identical size.
> > > > > 
> > > > > With this my CL/CTG no longer claim L-shaped memory when I use
> > > > > identical sticks. Also tested with non-matching sticks just to
> > > > > make sure the L-shaped memory is still properly detected.
> > > > > 
> > > > > And for completeness let's update the debugfs code to dump
> > > > > the correct set of registers on each platform.
> > > > > 
> > > > > Cc: Chris Wilson 
> > > > > Signed-off-by: Ville Syrjälä 
> > > > 
> > > > Did you check this with the swapping igt? I have some vague memories of
> > > > > bug reports where somehow the machine was acting like it's L-shaped memory
> > > > despite that banks were populated equally. I've iirc tried all kinds of
> > > > tricks to figure it out, all to absolutely no avail.
> > > 
> > > Did you have a specific test in mind? I ran a bunch of things
> > > that seemed swizzle related. All passed just fine.
> > 
> > gem_tiled_swapping should be the one. It tries to cycle your entire system
> > memory through tiled buffers into swap and out of it.
> 
> Passes with symmetric config, fails with L-shaped config (if I hack
> out the L-shape detection of course). So seems pretty solid.
> 
> A kernel based self test that looks at the physical address would
> still be nice I suppose. Though depending on the size of your RAM
> sticks figuring out where exactly the switchover from two channels
> to one channels happens probably requires a bit of work due to
> the PCI hole/etc.
> 
> Both my cl and ctg report this btw:
>  bit6 swizzle for X-tiling = bit9/bit10/bit11
>  bit6 swizzle for Y-tiling = bit9/bit11
> so unfortunately can't be sure the other swizzle modes would be
> correctly detected.

I think testing-wise this is as good as it gets.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v2 1/1] drm/doc: document drm_mode_get_plane

On Tue, Apr 27, 2021 at 10:40:24AM +0300, Pekka Paalanen wrote:
> On Mon, 26 Apr 2021 14:30:53 -0300
> Leandro Ribeiro  wrote:
> 
> > On 4/26/21 7:58 AM, Simon Ser wrote:
> > > On Monday, April 26th, 2021 at 9:36 AM, Pekka Paalanen wrote:
> > >   
> >  This should probably explain what the bits in the mask correspond to.
> >  As in, which CRTC does bit 0 refer to, and so on.  
> > >>>
> > >>> What about:
> > >>>
> > >>> "possible_crtcs: Bitmask of CRTCs compatible with the plane. CRTCs are
> > >>> created and they receive an index, which corresponds to their position
> > >>> in the bitmask. CRTC with index 0 will be in bit 0, and so on."  
> > >>
> > >> This would still need to explain where can I find this index.  
> > >   
> > 
> > What do you mean?
> > 
> > > This closed merge request had some docs about possible CRTCs:
> > > 
> > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/102
> > >   
> > I'm afraid I don't know exactly what you expect to be documented here
> > that is still missing. Could you please elaborate?
> > 
> > Thanks a lot!
> 
> The documentation you add is talking about "CRTC index". What defines a
> CRTC object's index? How do I determine what index a CRTC object has?
> 
> The answer is, AFAIK, that the index is never stored explicitly
> anywhere. You have to get the DRM resources structure, which has an
> array for CRTC IDs. The index is the index to that array, IIRC. So if
> one does not already know this, it is going to be really hard to figure
> out what the "index" is. It might even be confused with the object ID,
> which it is not but the ID might by complete accident be less than 32
> so it would look ok at first glance.
> 
> If the index is already explained somewhere else, a reference to that
> documentation would be enough.

I think if we do this we should have a DOC: section in the drm_mode.h uapi
header which explains how the index is computed, and then we reference
that everywhere. Because otherwise there's going to be a _lot_ of
duplication of this all over. Kernel-internally we solve this by just
referencing drm_foo_index() family of functions, but for the uapi there's
really nothing, so needs text.
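
To make the index concrete, here is a minimal userspace-side sketch. It
assumes, as Pekka describes above, that a CRTC's index is its position in
the CRTC ID array of the DRM resources structure. The helper names
(crtc_index, plane_can_use_crtc) are made up for illustration; a real
client would take the ID array from drmModeGetResources() and the mask
from the possible_crtcs field returned by drmModeGetPlane().

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the position of crtc_id in the CRTC ID
 * array, or -1 if it is not listed. Under the assumption stated above,
 * this position is the "CRTC index".
 */
static int crtc_index(const uint32_t *crtc_ids, int count, uint32_t crtc_id)
{
	for (int i = 0; i < count; i++)
		if (crtc_ids[i] == crtc_id)
			return i;
	return -1;
}

/*
 * Hypothetical helper: bit N of possible_crtcs refers to the CRTC at
 * index N, not to the CRTC whose object ID is N.
 */
static int plane_can_use_crtc(uint32_t possible_crtcs, int index)
{
	if (index < 0 || index >= 32)
		return 0;
	return (possible_crtcs >> index) & 1u;
}
```

With an ID array of { 41, 52, 63 } and possible_crtcs = 0x5, the plane
would be usable on the CRTCs with IDs 41 and 63 (indices 0 and 2). The
object IDs never enter the bit test, which is exactly the confusion the
documentation needs to rule out.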

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v4 3/4] drm/vkms: add XRGB planes composition

On Tue, Apr 27, 2021 at 11:10:59AM +0300, Pekka Paalanen wrote:
> On Mon, 26 Apr 2021 14:31:28 -0300
> Melissa Wen  wrote:
> 
> > On 04/26, Daniel Vetter wrote:
> > > On Mon, Apr 26, 2021 at 11:03:15AM +0300, Pekka Paalanen wrote:  
> > > > On Sat, 24 Apr 2021 05:25:31 -0300
> > > > Melissa Wen  wrote:
> > > >   
> > > > > > Add support for composing XRGB888 planes in addition to the ARGB
> > > > > > format. In the case of an XRGB plane at the top, the composition
> > > > > > consists of copying the RGB values of a pixel from src to dst and
> > > > > > clearing alpha channel, without the need for alpha blending
> > > > > > operations for each pixel.
> > > > > > 
> > > > > > Blend equations assume a completely opaque background, i.e., primary
> > > > > > plane is not cleared before pixel blending but alpha channel is
> > > > > > explicitly opaque (a = 0xff). Also, there is room for performance
> > > > > > evaluation in switching pixel blend operation according to the plane
> > > > > > format.
> > > > > 
> > > > > v4:
> > > > > - clear alpha channel (0xff) after blend color values by pixel
> > > > > - improve comments on blend ops to reflect the current state
> > > > > > - describe in the commit message future improvements for plane composition
> > > > > 
> > > > > Signed-off-by: Melissa Wen 
> > > > > Reviewed-by: Daniel Vetter 
> > > > > ---
> > > > > >  drivers/gpu/drm/vkms/vkms_composer.c | 56 ++--
> > > > > >  drivers/gpu/drm/vkms/vkms_plane.c    |  7 ++--
> > > > > >  2 files changed, 48 insertions(+), 15 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > index 02642801735d..7e01bc39d2a1 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> > > > > @@ -4,6 +4,7 @@
> > > > >  
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > @@ -64,7 +65,17 @@ static u8 blend_channel(u8 src, u8 dst, u8 alpha)
> > > > >   return new_color;
> > > > >  }
> > > > >  
> > > > > -static void alpha_blending(const u8 *argb_src, u8 *argb_dst)
> > > > > +/**
> > > > > + * alpha_blend - alpha blending equation
> > > > > + * @argb_src: src pixel on premultiplied alpha mode
> > > > > + * @argb_dst: dst pixel completely opaque
> > > > > + *
> > > > > > + * blend pixels using premultiplied blend formula. The current DRM assumption
> > > > > > + * is that pixel color values have been already pre-multiplied with the alpha
> > > > > > + * channel values. See more drm_plane_create_blend_mode_property(). Also, this
> > > > > > + * formula assumes a completely opaque background.
> > > > > + */
> > > > > +static void alpha_blend(const u8 *argb_src, u8 *argb_dst)
> > > > >  {
> > > > >   u8 alpha;
> > > > >  
> > > > > @@ -72,8 +83,16 @@ static void alpha_blending(const u8 *argb_src, u8 
> > > > > *argb_dst)
> > > > >   argb_dst[0] = blend_channel(argb_src[0], argb_dst[0], alpha);
> > > > >   argb_dst[1] = blend_channel(argb_src[1], argb_dst[1], alpha);
> > > > >   argb_dst[2] = blend_channel(argb_src[2], argb_dst[2], alpha);
> > > > > - /* Opaque primary */
> > > > > - argb_dst[3] = 0xFF;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * x_blend - blending equation that ignores the pixel alpha
> > > > > + *
> > > > > + * overwrites RGB color value from src pixel to dst pixel.
> > > > > + */
> > > > > +static void x_blend(const u8 *xrgb_src, u8 *xrgb_dst)
> > > > > +{
> > > > > + memcpy(xrgb_dst, xrgb_src, sizeof(u8) * 3);  
> > > > 
> > > > Hi,
> > > > 
> > > > this function very clearly assumes a very specific pixel format on both
> > > > source and destination. I think it would be good if the code comments
> > > > called out exactly which DRM_FORMAT_* they assume. This would be good
> > > > to do on almost every function that makes such assumptions. I believe
> > > > that would help code readability, and also point out explicitly which
> > > > things need to be fixed when you add support for even more pixel formats.
> > > > 
> > > > "xrgb" and "argb" are IMO too vague. You might be referring to
> > > > DRM_FORMAT_XRGB* and DRM_FORMAT_ARGB*, or maybe you are referring to any
> > > > pixel format that happens to have or not have an alpha channel in
> > > > addition to the three RGB channels in some order and width.
> > > > 
> > > > Being explicit that these refer to specific DRM_FORMAT_* should also
> > > > help understanding how things work on big-endian CPUs. My current
> > > > understanding is that this memcpy is correct also on big-endian, given
> > > > DRM_FORMAT_XRGB.  
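
A byte-wise sketch of the two blend paths may help here. It assumes the
formats in question are DRM_FORMAT_ARGB8888 and DRM_FORMAT_XRGB8888, whose
in-memory byte order is B, G, R, A/X regardless of CPU endianness. The
function names are illustrative, and the channel math uses a plain rounded
division rather than whatever vkms itself does.

```c
#include <stdint.h>
#include <string.h>

/*
 * Premultiplied "source over" for one 8-bit channel, assuming an opaque
 * destination: out = src + dst * (255 - alpha) / 255, rounded and clamped.
 */
static uint8_t blend_channel_sketch(uint8_t src, uint8_t dst, uint8_t alpha)
{
	uint32_t out = src + (dst * (255u - alpha) + 127u) / 255u;

	return out > 255u ? 255u : (uint8_t)out;
}

/* Assumed layout: byte 0 = B, 1 = G, 2 = R, 3 = A (ARGB8888 in memory). */
static void alpha_blend_sketch(const uint8_t *argb_src, uint8_t *argb_dst)
{
	uint8_t alpha = argb_src[3];

	for (int i = 0; i < 3; i++)
		argb_dst[i] = blend_channel_sketch(argb_src[i], argb_dst[i], alpha);
}

/* Assumed layout: byte 0 = B, 1 = G, 2 = R, 3 = X (XRGB8888 in memory). */
static void x_blend_sketch(const uint8_t *xrgb_src, uint8_t *xrgb_dst)
{
	/* Copies B, G and R; byte 3 of the destination is left untouched. */
	memcpy(xrgb_dst, xrgb_src, 3);
}
```

Because both helpers address the pixel one byte at a time, the three-byte
copy picks the same B, G, R bytes on little-endian and big-endian CPUs,
which is the property discussed above. Code that loaded the pixel as a
32-bit word and shifted would not have that property for free.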
> > 
> > > This endianness issue seems a little tricky to me. I remember we have
> > already discussed something similar when introducing alpha blend ops.  I
> > took little endian as default by a code comment on
> > inc

Re: [RFC PATCH 0/3] A drm_plane API to support HDR planes

On Mon, Apr 26, 2021 at 01:38:49PM -0400, Harry Wentland wrote:
> 
> ## Introduction
> 
> We are looking to enable HDR support for a couple of single-plane and
> multi-plane scenarios. To do this effectively we recommend new
> interfaces to drm_plane. Below I'll give a bit of background on HDR and
> why we propose these interfaces.

I think this is one of the topics that would tremendously benefit from the
uapi rfc process, with lots of compositor people involved.

https://dri.freedesktop.org/docs/drm/gpu/rfc/

Also for this I think we really do need a pretty solid understanding of
the involved compositor protocols, otherwise the kernel uapi is going to be
for naught.
-Daniel

> 
> 
> ## Defining a pixel's luminance
> 
> Currently the luminance space of pixels in a framebuffer/plane presented to 
> the display is not well defined. It's usually assumed to be in a 2.2 or 2.4 
> gamma space and has no mapping to an absolute luminance value but is 
> interpreted in relative terms.
> 
> Luminance can be measured and described in absolute terms as candela per
> meter squared, or cd/m2, or nits. Even though a pixel value can be mapped to
> luminance in a linear fashion, doing so without losing a lot of detail
> requires 16-bpc color depth. The reason for this is that human perception can
> distinguish a luminance delta of roughly 0.5-1%. A linear representation
> is suboptimal, wasting precision in the highlights and losing precision in
> the shadows.
> 
> A gamma curve is a decent approximation to a human's perception of luminance, 
> but the PQ (perceptual quantizer) function [1] improves on it. It also 
> defines the luminance values in absolute terms, with the highest value being 
> 10,000 nits and the lowest 0.0005 nits.
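
For reference, the PQ curve can be written out explicitly. The following is
a sketch of the SMPTE ST 2084 EOTF with the constants as they are commonly
given, not taken from this mail, so they should be checked against the
reference [1] before being relied on. E' is the non-linear signal value
normalized to [0, 1] and L is the resulting luminance in nits:

```latex
L = 10000 \cdot \left(
      \frac{\max\left(E'^{\,1/m_2} - c_1,\ 0\right)}
           {c_2 - c_3\, E'^{\,1/m_2}}
    \right)^{1/m_1}

m_1 = \frac{2610}{16384}, \quad
m_2 = 128 \cdot \frac{2523}{4096}, \quad
c_1 = \frac{3424}{4096}, \quad
c_2 = 32 \cdot \frac{2413}{4096}, \quad
c_3 = 32 \cdot \frac{2392}{4096}
```

With E' = 1 the fraction becomes (1 - c_1) / (c_2 - c_3) = 1, which gives
the 10,000 nit peak; with E' = 0 the max() clamps the numerator and L = 0.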
> 
> Using content that's defined in PQ space we can approximate the real world
> in a much better way.
> 
> Here are some examples of real-life objects and their approximate luminance 
> values:
> 
> | Object            | Luminance in nits |
> | ----------------- | ----------------- |
> | Sun               | 1.6 million       |
> | Fluorescent light | 10,000            |
> | Highlights        | 1,000 - sunlight  |
> | White Objects     | 250 - 1,000       |
> | Typical objects   | 1 - 250           |
> | Shadows           | 0.01 - 1          |
> | Ultra Blacks      | 0 - 0.0005        |
> 
> 
> ## Describing the luminance space
> 
> **We propose a new drm_plane property to describe the Electro-Optical Transfer
> Function (EOTF) with which its framebuffer was composed.** Examples of EOTF
> are:
> 
> | EOTF      | Description                                                               |
> | --------- |:------------------------------------------------------------------------- |
> | Gamma 2.2 | a simple 2.2 gamma                                                        |
> | sRGB      | 2.4 gamma with small initial linear section                               |
> | PQ 2084   | SMPTE ST 2084; used for HDR video and allows for up to 10,000 nit support |
> | Linear    | Linear relationship between pixel value and luminance value               |
> 
> 
> ## Mastering Luminances
> 
> Now we are able to use the PQ 2084 EOTF to define the luminance of pixels in 
> absolute terms. Unfortunately we're again presented with physical limitations 
> of the display technologies on the market today. Here are a few examples of 
> luminance ranges of displays.
> 
> | Display                  | Luminance range in nits |
> | ------------------------ | ----------------------- |
> | Typical PC display       | 0.3 - 200               |
> | Excellent LCD HDTV       | 0.3 - 400               |
> | HDR LCD w/ local dimming | 0.05 - 1,500            |
> 
> Since no display can currently show the full 0.0005 to 10,000 nits luminance
> range, the display will need to tonemap the HDR content, i.e. to fit the
> content within a display's capabilities. To assist with tonemapping, HDR
> content is usually accompanied by metadata that describes (among other
> things) the minimum and maximum mastering luminance, i.e. the minimum and
> maximum luminance of the display that was used to master the HDR content.
> 
> The HDR metadata is currently defined on the drm_connector via the 
> hdr_output_metadata blob property.
> 
> It might be useful to define per-plane hdr metadata, as different planes 
> might have been mastered differently.
> 
> 
> ## SDR Luminance
> 
> Since SDR covers a smaller luminance range than HDR, an SDR plane might look 
> dark when blended with HDR content. Since the max HDR luminance can be quite 
> variable (200-1,500 nits on actual displays) it is best to make the SDR 
> maximum luminance value configurable.
> 
> **We propose a drm_plane property to specify the desired maximum luminance of
> the SDR plane in nits.** This allows us to map the SDR content predictably 
> into HDR's absolute luminance space.
> 
> 
> ## Let There Be Color
> 
> So far we've only talked about lum

Re: [PATCH v2] drm/drm_file.c: Define drm_send_event_helper() as 'static'

On Mon, Apr 26, 2021 at 10:00:51PM +0200, Fabio M. De Francesco wrote:
> drm_send_event_helper() has no prototype, it has internal linkage and
> therefore it should be defined with storage class 'static'.
> 
> Signed-off-by: Fabio M. De Francesco 
> ---
> 
> Changes from v1: As suggested by Daniel Vetter, removed unnecessary
> kernel-doc comments.
> 
>  drivers/gpu/drm/drm_file.c | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 7efbccffc2ea..a32e0d4f3604 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -774,19 +774,15 @@ void drm_event_cancel_free(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_event_cancel_free);
>  
> -/**
> +/*
>   * drm_send_event_helper - send DRM event to file descriptor
> - * @dev: DRM device
> - * @e: DRM event to deliver
> - * @timestamp: timestamp to set for the fence event in kernel's CLOCK_MONOTONIC
> - * time domain
>   *
> - * This helper function sends the event @e, initialized with
> + * This helper function sends the event e, initialized with

Sorry I wasn't clear, I don't think there's anything useful at all in this
comment, so best to entirely remove it. Not just the kerneldoc header. Can
you pls respin?
-Daniel

>   * drm_event_reserve_init(), to its associated userspace DRM file.
>   * The timestamp variant of dma_fence_signal is used when the caller
>   * sends a valid timestamp.
>   */
> -void drm_send_event_helper(struct drm_device *dev,
> +static void drm_send_event_helper(struct drm_device *dev,
>  struct drm_pending_event *e, ktime_t timestamp)
>  {
>   assert_spin_locked(&dev->event_lock);
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH 1/8] drm/arm: Don't set allow_fb_modifiers explicitly

Since

commit 890880ddfdbe256083170866e49c87618b706ac7
Author: Paul Kocialkowski 
Date:   Fri Jan 4 09:56:10 2019 +0100

drm: Auto-set allow_fb_modifiers when given modifiers at plane init

this is done automatically as part of plane init, if drivers set the
modifier list correctly. Which is the case here for both komeda and
malidp.

Signed-off-by: Daniel Vetter 
Cc: "James (Qian) Wang" 
Cc: Liviu Dudau 
Cc: Mihail Atanassov 
Cc: Brian Starkey 
---
 drivers/gpu/drm/arm/display/komeda/komeda_kms.c | 1 -
 drivers/gpu/drm/arm/malidp_drv.c| 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
index aeda4e5ec4f4..ff45f23f3d56 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
@@ -247,7 +247,6 @@ static void komeda_kms_mode_config_init(struct komeda_kms_dev *kms,
config->min_height  = 0;
config->max_width   = 4096;
config->max_height  = 4096;
-   config->allow_fb_modifiers = true;
 
config->funcs = &komeda_mode_config_funcs;
config->helper_private = &komeda_mode_config_helpers;
diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
index d83c7366b348..de59f3302516 100644
--- a/drivers/gpu/drm/arm/malidp_drv.c
+++ b/drivers/gpu/drm/arm/malidp_drv.c
@@ -403,7 +403,6 @@ static int malidp_init(struct drm_device *drm)
drm->mode_config.max_height = hwdev->max_line_size;
drm->mode_config.funcs = &malidp_mode_config_funcs;
drm->mode_config.helper_private = &malidp_mode_config_helpers;
-   drm->mode_config.allow_fb_modifiers = true;
 
ret = malidp_crtc_init(drm);
if (ret)
-- 
2.31.0


[PATCH 2/8] drm/arm/malidp: Always list modifiers

Even when all we support is linear, make that explicit. Otherwise the
uapi is rather confusing.

Cc: sta...@vger.kernel.org
Cc: Pekka Paalanen 
Cc: Liviu Dudau 
Cc: Brian Starkey 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/arm/malidp_planes.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/arm/malidp_planes.c b/drivers/gpu/drm/arm/malidp_planes.c
index ddbba67f0283..8c2ab3d653b7 100644
--- a/drivers/gpu/drm/arm/malidp_planes.c
+++ b/drivers/gpu/drm/arm/malidp_planes.c
@@ -927,6 +927,11 @@ static const struct drm_plane_helper_funcs malidp_de_plane_helper_funcs = {
.atomic_disable = malidp_de_plane_disable,
 };
 
+static const uint64_t linear_only_modifiers[] = {
+   DRM_FORMAT_MOD_LINEAR,
+   DRM_FORMAT_MOD_INVALID
+};
+
 int malidp_de_planes_init(struct drm_device *drm)
 {
struct malidp_drm *malidp = drm->dev_private;
@@ -990,8 +995,8 @@ int malidp_de_planes_init(struct drm_device *drm)
 */
ret = drm_universal_plane_init(drm, &plane->base, crtcs,
&malidp_de_plane_funcs, formats, n,
-   (id == DE_SMART) ? NULL : modifiers, plane_type,
-   NULL);
+   (id == DE_SMART) ? linear_only_modifiers : modifiers,
+   plane_type, NULL);
 
if (ret < 0)
goto cleanup;
-- 
2.31.0


[PATCH 3/8] drm/i915: Don't set allow_fb_modifiers explicitly

Since

commit 890880ddfdbe256083170866e49c87618b706ac7
Author: Paul Kocialkowski 
Date:   Fri Jan 4 09:56:10 2019 +0100

drm: Auto-set allow_fb_modifiers when given modifiers at plane init

this is done automatically as part of plane init, if drivers set the
modifier list correctly. Which is the case here.

Signed-off-by: Daniel Vetter 
Cc: "Ville Syrjälä" 
Cc: Manasi Navare 
Cc: Jani Nikula 
Cc: "José Roberto de Souza" 
Cc: Chris Wilson 
Cc: Imre Deak 
Cc: Dave Airlie 
Cc: Maarten Lankhorst 
Cc: Karthik B S 
Cc: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 4bbc81d8d649..d2c6959190ab 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11731,8 +11731,6 @@ static void intel_mode_config_init(struct drm_i915_private *i915)
mode_config->preferred_depth = 24;
mode_config->prefer_shadow = 1;
 
-   mode_config->allow_fb_modifiers = true;
-
mode_config->funcs = &intel_mode_funcs;
 
mode_config->async_page_flip = has_async_flips(i915);
-- 
2.31.0


[PATCH 5/8] drm/msm/mdp4: Fix modifier support enabling

Setting the cap without the modifier list is very confusing to
userspace. Fix that by listing the ones we support explicitly.

Stable backport so that userspace can rely on this working in a
reasonable way, i.e. that the cap set implies IN_FORMATS is available.

Cc: sta...@vger.kernel.org
Cc: Pekka Paalanen 
Cc: Rob Clark 
Cc: Jordan Crouse 
Cc: Emil Velikov 
Cc: Sam Ravnborg 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c   | 2 --
 drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 8 +++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
index 3d729270bde1..4a5b518288b0 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
@@ -88,8 +88,6 @@ static int mdp4_hw_init(struct msm_kms *kms)
if (mdp4_kms->rev > 1)
mdp4_write(mdp4_kms, REG_MDP4_RESET_STATUS, 1);
 
-   dev->mode_config.allow_fb_modifiers = true;
-
 out:
pm_runtime_put_sync(dev->dev);
 
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
index 9aecca919f24..49bdabea8ed5 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c
@@ -349,6 +349,12 @@ enum mdp4_pipe mdp4_plane_pipe(struct drm_plane *plane)
return mdp4_plane->pipe;
 }
 
+static const uint64_t supported_format_modifiers[] = {
+   DRM_FORMAT_MOD_SAMSUNG_64_32_TILE,
+   DRM_FORMAT_MOD_LINEAR,
+   DRM_FORMAT_MOD_INVALID
+};
+
 /* initialize plane */
 struct drm_plane *mdp4_plane_init(struct drm_device *dev,
enum mdp4_pipe pipe_id, bool private_plane)
@@ -377,7 +383,7 @@ struct drm_plane *mdp4_plane_init(struct drm_device *dev,
type = private_plane ? DRM_PLANE_TYPE_PRIMARY : DRM_PLANE_TYPE_OVERLAY;
ret = drm_universal_plane_init(dev, plane, 0xff, &mdp4_plane_funcs,
 mdp4_plane->formats, mdp4_plane->nformats,
-NULL, type, NULL);
+supported_format_modifiers, type, NULL);
if (ret)
goto fail;
 
-- 
2.31.0


[PATCH 4/8] drm/msm/dpu1: Don't set allow_fb_modifiers explicitly

Since

commit 890880ddfdbe256083170866e49c87618b706ac7
Author: Paul Kocialkowski 
Date:   Fri Jan 4 09:56:10 2019 +0100

drm: Auto-set allow_fb_modifiers when given modifiers at plane init

this is done automatically as part of plane init, if drivers set the
modifier list correctly. Which is the case here.

v2: Rebase.

Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Kalyan Thota 
Cc: Jordan Crouse 
Cc: Eric Anholt 
Cc: Tanmay Shah 
Cc: Rajendra Nayak 
Cc: Jeykumar Sankaran 
Cc: Qinglang Miao 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 88e9cc38c13b..93bc3575bf53 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1020,11 +1020,6 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
dpu_kms->catalog->caps->max_mixer_width * 2;
dev->mode_config.max_height = 4096;
 
-   /*
-* Support format modifiers for compression etc.
-*/
-   dev->mode_config.allow_fb_modifiers = true;
-
dev->max_vblank_count = 0x;
/* Disable vblank irqs aggressively for power-saving */
dev->vblank_disable_immediate = true;
-- 
2.31.0


[PATCH 6/8] drm/nouveau: Don't set allow_fb_modifiers explicitly

Since

commit 890880ddfdbe256083170866e49c87618b706ac7
Author: Paul Kocialkowski 
Date:   Fri Jan 4 09:56:10 2019 +0100

drm: Auto-set allow_fb_modifiers when given modifiers at plane init

this is done automatically as part of plane init, if drivers set the
modifier list correctly. Which is the case here.

Note that this fixes an inconsistency: We've set the cap everywhere,
but only nv50+ supports modifiers. Hence cc stable, but not further
back than the patch from Paul.

Cc: sta...@vger.kernel.org # v5.1 +
Cc: Pekka Paalanen 
Signed-off-by: Daniel Vetter 
Cc: Ben Skeggs 
Cc: nouv...@lists.freedesktop.org
---
 drivers/gpu/drm/nouveau/nouveau_display.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 14101bd2a0ff..929de41c281f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -697,7 +697,6 @@ nouveau_display_create(struct drm_device *dev)
 
dev->mode_config.preferred_depth = 24;
dev->mode_config.prefer_shadow = 1;
-   dev->mode_config.allow_fb_modifiers = true;
 
if (drm->client.device.info.chipset < 0x11)
dev->mode_config.async_page_flip = false;
-- 
2.31.0


[PATCH 8/8] drm/modifiers: Enforce consistency between the cap and IN_FORMATS

It's very confusing for userspace to have to deal with inconsistencies
here, and some drivers screwed this up a bit. Most just omitted the
format list when they meant to say that only linear modifier is
allowed, but some also meant that only implied modifiers are
acceptable (because actually none of the planes registered supported
modifiers).

Now that this is all done consistently across all drivers, document
the rules and enforce it in the drm core.

v2:
- Make the capability a link (Simon)
- Note that all is lost before 5.1.

Acked-by: Maxime Ripard 
Cc: Simon Ser 
Reviewed-by: Lucas Stach 
Cc: Pekka Paalanen 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_plane.c   | 18 +-
 include/drm/drm_mode_config.h |  2 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 0dd43882fe7c..20c7a1665414 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -128,6 +128,13 @@
  * pairs supported by this plane. The blob is a struct
  * drm_format_modifier_blob. Without this property the plane doesn't
  * support buffers with modifiers. Userspace cannot change this property.
+ *
+ * Note that userspace can check the &DRM_CAP_ADDFB2_MODIFIERS driver
+ * capability for general modifier support. If this flag is set then every
+ * plane will have the IN_FORMATS property, even when it only supports
+ * DRM_FORMAT_MOD_LINEAR. Before linux kernel release v5.1 there have been
+ * various bugs in this area with inconsistencies between the capability
+ * flag and per-plane properties.
  */
 
 static unsigned int drm_num_planes(struct drm_device *dev)
@@ -277,8 +284,14 @@ static int __drm_universal_plane_init(struct drm_device *dev,
format_modifier_count++;
}
 
-   if (format_modifier_count)
+   /* autoset the cap and check for consistency across all planes */
+   if (format_modifier_count) {
+   WARN_ON(!config->allow_fb_modifiers &&
+   !list_empty(&config->plane_list));
config->allow_fb_modifiers = true;
+   } else {
+   WARN_ON(config->allow_fb_modifiers);
+   }
 
plane->modifier_count = format_modifier_count;
plane->modifiers = kmalloc_array(format_modifier_count,
@@ -360,6 +373,9 @@ static int __drm_universal_plane_init(struct drm_device *dev,
  * drm_universal_plane_init() to let the DRM managed resource infrastructure
  * take care of cleanup and deallocation.
  *
+ * Drivers supporting modifiers must set @format_modifiers on all their planes,
+ * even those that only support DRM_FORMAT_MOD_LINEAR.
+ *
  * Returns:
  * Zero on success, error code on failure.
  */
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index ab424ddd7665..1ddf7783fdf7 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -909,6 +909,8 @@ struct drm_mode_config {
 * @allow_fb_modifiers:
 *
 * Whether the driver supports fb modifiers in the ADDFB2.1 ioctl call.
+* Note that drivers should not set this directly, it is automatically
+* set in drm_universal_plane_init().
 *
 * IMPORTANT:
 *
-- 
2.31.0
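
The rule this patch enforces can be sketched outside the kernel as follows.
This is a simplified model under stated assumptions, not the drm core code:
struct config_sketch stands in for struct drm_mode_config, the two
SKETCH_MOD_* values mirror DRM_FORMAT_MOD_LINEAR and DRM_FORMAT_MOD_INVALID
from drm_fourcc.h, and a counter replaces WARN_ON().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for DRM_FORMAT_MOD_LINEAR and DRM_FORMAT_MOD_INVALID. */
#define SKETCH_MOD_LINEAR  0x0ULL
#define SKETCH_MOD_INVALID 0x00ffffffffffffffULL

/* Hypothetical, cut-down stand-in for struct drm_mode_config. */
struct config_sketch {
	bool allow_fb_modifiers;
	unsigned int num_planes;
	unsigned int warnings;	/* incremented where the kernel would WARN_ON() */
};

/* Modifier lists are terminated by the INVALID sentinel; NULL means "none". */
static unsigned int count_modifiers(const uint64_t *mods)
{
	unsigned int n = 0;

	if (mods)
		while (mods[n] != SKETCH_MOD_INVALID)
			n++;
	return n;
}

/* Models the autoset-and-check step of plane init for one plane. */
static void plane_init_sketch(struct config_sketch *cfg, const uint64_t *mods)
{
	if (count_modifiers(mods)) {
		/* An earlier plane was registered without a modifier list. */
		if (!cfg->allow_fb_modifiers && cfg->num_planes)
			cfg->warnings++;
		cfg->allow_fb_modifiers = true;
	} else if (cfg->allow_fb_modifiers) {
		/* The cap is set, but this plane has no modifier list. */
		cfg->warnings++;
	}
	cfg->num_planes++;
}
```

A driver that passes { SKETCH_MOD_LINEAR, SKETCH_MOD_INVALID } for every
plane never trips a warning in this model, while mixing a listed plane with
a NULL one does, in either order. That is the reason the earlier patches in
the series add explicit linear-only lists.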


[PATCH 7/8] drm/stm: Don't set allow_fb_modifiers explicitly

Since

commit 890880ddfdbe256083170866e49c87618b706ac7
Author: Paul Kocialkowski 
Date:   Fri Jan 4 09:56:10 2019 +0100

drm: Auto-set allow_fb_modifiers when given modifiers at plane init

this is done automatically as part of plane init, if drivers set the
modifier list correctly. Which is the case here.

Signed-off-by: Daniel Vetter 
Cc: Yannick Fertre 
Cc: Philippe Cornu 
Cc: Benjamin Gaignard 
Cc: Maxime Coquelin 
Cc: Alexandre Torgue 
Cc: linux-st...@st-md-mailman.stormreply.com
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/gpu/drm/stm/ltdc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 65c3c79ad1d5..e99771b947b6 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -1326,8 +1326,6 @@ int ltdc_load(struct drm_device *ddev)
goto err;
}
 
-   ddev->mode_config.allow_fb_modifiers = true;
-
ret = ltdc_crtc_init(ddev, crtc);
if (ret) {
DRM_ERROR("Failed to init crtc\n");
-- 
2.31.0


Re: [PATCH 8/8] drm/modifiers: Enforce consistency between the cap and IN_FORMATS

2021-04-27 Thread Simon Ser

Reviewed-by: Simon Ser 

Re: [PATCH 00/57] Rid W=1 warnings from Staging

2021-04-27 Thread Greg Kroah-Hartman

On Wed, Apr 14, 2021 at 07:10:32PM +0100, Lee Jones wrote:
> This set is part of a larger effort attempting to clean-up W=1
> kernel builds, which are currently overwhelmingly riddled with
> niggly little warnings.
> 
> Lee Jones (57):

44 of these applied to my tree, I'll keep them in my "testing" branch
for now until -rc1 comes out.  Feel free to rebase your series on that
and fix up the remaining ones and resend.

Note, the comedi drivers have moved to drivers/comedi/ so those patches
need to be sent as a different series, if you still want to make those
changes based on the review comments.

thanks,

greg k-h

Re: [PATCH 01/21] drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE

On Fri, Apr 23, 2021 at 05:31:11PM -0500, Jason Ekstrand wrote:
> This reverts commit 88be76cdafc7 ("drm/i915: Allow userspace to specify
> ringsize on construction").  This API was originally added for OpenCL
> but the compute-runtime PR has sat open for a year without action so we
> can still pull it out if we want.  I argue we should drop it for three
> reasons:
> 
>  1. If the compute-runtime PR has sat open for a year, this clearly
> isn't that important.
> 
>  2. It's a very leaky API.  Ring size is an implementation detail of the
> current execlist scheduler and really only makes sense there.  It
> can't apply to the older ring-buffer scheduler on pre-execlist
> hardware because that's shared across all contexts and it won't
> apply to the GuC scheduler that's in the pipeline.
> 
>  3. Having userspace set a ring size in bytes is a bad solution to the
> problem of having too small a ring.  There is no way that userspace
> has the information to know how to properly set the ring size so
> it's just going to detect the feature and always set it to the
> maximum of 512K.  This is what the compute-runtime PR does.  The
> scheduler in i915, on the other hand, does have the information to
> make an informed choice.  It could detect if the ring size is a
> problem and grow it itself.  Or, if that's too hard, we could just
> increase the default size from 16K to 32K or even 64K instead of
> relying on userspace to do it.
> 
> Let's drop this API for now and, if someone decides they really care
> about solving this problem, they can do it properly.
> 
> Signed-off-by: Jason Ekstrand 

Two things:
- I'm assuming you have an igt change to make sure we get EINVAL for both
  set and getparam now? Just to make sure.

- intel_context->ring is either a ring pointer when CONTEXT_ALLOC_BIT is
  set in ce->flags, or the size of the ring stored in the pointer if not.
  I'm seriously hoping you get rid of this complexity with your
  proto-context series, and also delete __intel_context_ring_size() in the
  end. That function has no business existing imo.

  If not, please make sure that's the case.
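
  To illustrate the complexity meant here, a standalone sketch of the
  "pointer or size" pattern follows. All names are hypothetical and the
  structs are cut down to the minimum; only the idea of reusing one
  pointer field for a size until an "allocated" flag is set comes from
  the description above.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALLOC_BIT 0x1ul	/* stand-in for CONTEXT_ALLOC_BIT */

struct ring_sketch {
	unsigned long size;
};

struct context_sketch {
	unsigned long flags;
	/* Real pointer once allocated; before that, the requested size. */
	struct ring_sketch *ring;
};

/* Only valid before allocation: stash the size in the pointer field. */
static int set_ring_size_sketch(struct context_sketch *ce, unsigned long sz)
{
	if (ce->flags & SKETCH_ALLOC_BIT)
		return -1;
	ce->ring = (struct ring_sketch *)(uintptr_t)sz;
	return 0;
}

/* Every reader has to check the flag before touching the field. */
static unsigned long get_ring_size_sketch(const struct context_sketch *ce)
{
	if (ce->flags & SKETCH_ALLOC_BIT)
		return ce->ring->size;
	return (unsigned long)(uintptr_t)ce->ring;
}

/* Turns the stashed size into a real allocation and flips the flag. */
static int alloc_ring_sketch(struct context_sketch *ce)
{
	struct ring_sketch *ring = malloc(sizeof(*ring));

	if (!ring)
		return -1;
	ring->size = get_ring_size_sketch(ce);
	ce->ring = ring;
	ce->flags |= SKETCH_ALLOC_BIT;
	return 0;
}
```

  Any access that forgets the flag check either dereferences a size or
  reports a pointer value as a size, which is why dropping the encoding
  together with the uapi would be a welcome cleanup.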

Aside from these, the patch looks good.

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/Makefile |  1 -
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 85 +--
>  drivers/gpu/drm/i915/gt/intel_context_param.c | 63 --
>  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 -
>  include/uapi/drm/i915_drm.h   | 20 +
>  5 files changed, 4 insertions(+), 168 deletions(-)
>  delete mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index d0d936d9137bc..afa22338fa343 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -88,7 +88,6 @@ gt-y += \
>   gt/gen8_ppgtt.o \
>   gt/intel_breadcrumbs.o \
>   gt/intel_context.o \
> - gt/intel_context_param.o \
>   gt/intel_context_sseu.o \
>   gt/intel_engine_cs.o \
>   gt/intel_engine_heartbeat.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index fd8ee52e17a47..e52b85b8f923d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1335,63 +1335,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
>   return err;
>  }
>  
> -static int __apply_ringsize(struct intel_context *ce, void *sz)
> -{
> - return intel_context_set_ring_size(ce, (unsigned long)sz);
> -}
> -
> -static int set_ringsize(struct i915_gem_context *ctx,
> - struct drm_i915_gem_context_param *args)
> -{
> - if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> - return -ENODEV;
> -
> - if (args->size)
> - return -EINVAL;
> -
> - if (!IS_ALIGNED(args->value, I915_GTT_PAGE_SIZE))
> - return -EINVAL;
> -
> - if (args->value < I915_GTT_PAGE_SIZE)
> - return -EINVAL;
> -
> - if (args->value > 128 * I915_GTT_PAGE_SIZE)
> - return -EINVAL;
> -
> - return context_apply_all(ctx,
> -  __apply_ringsize,
> -  __intel_context_ring_size(args->value));
> -}
> -
> -static int __get_ringsize(struct intel_context *ce, void *arg)
> -{
> - long sz;
> -
> - sz = intel_context_get_ring_size(ce);
> - GEM_BUG_ON(sz > INT_MAX);
> -
> - return sz; /* stop on first engine */
> -}
> -
> -static int get_ringsize(struct i915_gem_context *ctx,
> - struct drm_i915_gem_context_param *args)
> -{
> - int sz;
> -
> - if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> - return -ENODEV;
> -
> - if (args->size)
> - return -EINVAL;
> -
> - sz = context_apply_all(ctx, __get_ringsize, NULL);
> - if (sz < 0)
> - ret

Re: [Intel-gfx] [PATCH 02/21] drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP

On Fri, Apr 23, 2021 at 05:31:12PM -0500, Jason Ekstrand wrote:
> The idea behind this param is to support OpenCL drivers with relocations
> because OpenCL reserves 0x0 for NULL and, if we placed memory there, it
> would confuse CL kernels.  It was originally sent out as part of a patch
> series including libdrm [1] and Beignet [2] support.  However, the
> libdrm and Beignet patches never landed in their respective upstream
> projects so this API has never been used.  It's never been used in Mesa
> or any other driver, either.
> 
> Dropping this API allows us to delete a small bit of code.
> 
> [1]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067030.html
> [2]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067031.html
> 
> Signed-off-by: Jason Ekstrand 

Same thing about an igt making sure we reject these. Maybe an entire
wash-up igt which validates all the new restrictions on get/setparam
(including that after execbuf it's even more strict).
-Daniel
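
To make the suggested check concrete, here is a minimal userspace sketch of a table-driven test that every dropped param is rejected. It is not IGT code: the `toy_` names and the enum values are hypothetical stand-ins, and the dispatcher only mimics how the removed cases now fall through to `default` and return -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical parameter IDs for this sketch; not the real uAPI values. */
enum toy_param {
	TOY_PARAM_BAN_PERIOD,
	TOY_PARAM_NO_ZEROMAP,
	TOY_PARAM_RINGSIZE,
	TOY_PARAM_PERSISTENCE,
};

/* Toy stand-in for the setparam switch: removed params end in -EINVAL. */
int toy_ctx_setparam(enum toy_param param)
{
	switch (param) {
	case TOY_PARAM_PERSISTENCE:
		return 0;
	case TOY_PARAM_NO_ZEROMAP:
	case TOY_PARAM_BAN_PERIOD:
	case TOY_PARAM_RINGSIZE:
	default:
		return -EINVAL;
	}
}

/* Returns how many removed params were wrongly accepted. */
int toy_count_accepted_removed_params(void)
{
	static const enum toy_param removed[] = {
		TOY_PARAM_BAN_PERIOD,
		TOY_PARAM_NO_ZEROMAP,
		TOY_PARAM_RINGSIZE,
	};
	int bad = 0;
	size_t i;

	for (i = 0; i < sizeof(removed) / sizeof(removed[0]); i++)
		if (toy_ctx_setparam(removed[i]) != -EINVAL)
			bad++;

	return bad;
}
```

A real test would issue the setparam and getparam ioctls instead of calling a local function, but the shape (one table of removed params, one loop, one expected errno) would presumably stay the same.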

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c  | 16 ++--
>  .../gpu/drm/i915/gem/i915_gem_context_types.h|  1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   |  8 
>  include/uapi/drm/i915_drm.h  |  4 
>  4 files changed, 6 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e52b85b8f923d..35bcdeddfbf3f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1922,15 +1922,6 @@ static int ctx_setparam(struct drm_i915_file_private 
> *fpriv,
>   int ret = 0;
>  
>   switch (args->param) {
> - case I915_CONTEXT_PARAM_NO_ZEROMAP:
> - if (args->size)
> - ret = -EINVAL;
> - else if (args->value)
> - set_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> - else
> - clear_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> - break;
> -
>   case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
>   if (args->size)
>   ret = -EINVAL;
> @@ -1980,6 +1971,7 @@ static int ctx_setparam(struct drm_i915_file_private 
> *fpriv,
>   ret = set_persistence(ctx, args);
>   break;
>  
> + case I915_CONTEXT_PARAM_NO_ZEROMAP:
>   case I915_CONTEXT_PARAM_BAN_PERIOD:
>   case I915_CONTEXT_PARAM_RINGSIZE:
>   default:
> @@ -2360,11 +2352,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
> *dev, void *data,
>   return -ENOENT;
>  
>   switch (args->param) {
> - case I915_CONTEXT_PARAM_NO_ZEROMAP:
> - args->size = 0;
> - args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> - break;
> -
>   case I915_CONTEXT_PARAM_GTT_SIZE:
>   args->size = 0;
>   rcu_read_lock();
> @@ -2412,6 +2399,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
> *dev, void *data,
>   args->value = i915_gem_context_is_persistent(ctx);
>   break;
>  
> + case I915_CONTEXT_PARAM_NO_ZEROMAP:
>   case I915_CONTEXT_PARAM_BAN_PERIOD:
>   case I915_CONTEXT_PARAM_RINGSIZE:
>   default:
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 340473aa70de0..5ae71ec936f7c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -129,7 +129,6 @@ struct i915_gem_context {
>* @user_flags: small set of booleans controlled by the user
>*/
>   unsigned long user_flags;
> -#define UCONTEXT_NO_ZEROMAP  0
>  #define UCONTEXT_NO_ERROR_CAPTURE1
>  #define UCONTEXT_BANNABLE2
>  #define UCONTEXT_RECOVERABLE 3
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 297143511f99b..b812f313422a9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -290,7 +290,6 @@ struct i915_execbuffer {
>   struct intel_context *reloc_context;
>  
>   u64 invalid_flags; /** Set of execobj.flags that are invalid */
> - u32 context_flags; /** Set of execobj.flags to insert from the ctx */
>  
>   u64 batch_len; /** Length of batch within object */
>   u32 batch_start_offset; /** Location within object of batch */
> @@ -541,9 +540,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
>   entry->flags |= EXEC_OBJECT_NEEDS_GTT | 
> __EXEC_OBJECT_NEEDS_MAP;
>   }
>  
> - if (!(entry->flags & EXEC_OBJECT_PINNED))
> - entry->flags |= eb->context_flags;
> -
>   return 0;
>  }
>  
> @@ -750,10 +746,6 @@ static int eb_select_context(struct i915_execbuffer *eb)
>   if (rcu_access_pointer(ctx->vm))
>   eb->in

Re: [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem

On Fri, Apr 23, 2021 at 05:31:13PM -0500, Jason Ekstrand wrote:
> Instead of handling it like a context param, unconditionally set it when
> intel_contexts are created.  This doesn't fix anything but does simplify
> the code a bit.
> 
> Signed-off-by: Jason Ekstrand 

So the idea here is that for years we've had a watchdog uapi floating
about. The aim was media, so that they could set very tight deadlines for
their transcode jobs, so that if you have a corrupt bitstream (especially
for decoding) you don't hang your desktop unnecessarily long.

But it's been stuck in limbo since forever, plus I get how this gets a bit
in the way of the proto ctx work, so makes sense to remove this prep work
again.

Maybe include the above in the commit message for a notch more context.

Reviewed-by: Daniel Vetter 
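
As a side note on the new hunk in intel_context_set_gem(): the (u64) cast makes the ms-to-us multiplication happen in 64 bits. The sketch below shows the difference in plain userspace C, assuming a 32-bit unsigned timeout as declared in the patch; the `toy_` function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert ms to us; widening first avoids 32-bit wraparound. */
uint64_t toy_timeout_ms_to_us(uint32_t timeout_ms)
{
	return (uint64_t)timeout_ms * 1000;
}

/* Same conversion done entirely in 32-bit arithmetic, for contrast. */
uint32_t toy_timeout_ms_to_us_narrow(uint32_t timeout_ms)
{
	return timeout_ms * 1000u;
}
```

The two variants only diverge once the timeout exceeds roughly 4294967 ms (about 71 minutes), so this matters for unusually large module parameter values rather than for typical settings.
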
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>  3 files changed, 6 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 35bcdeddfbf3f..1091cc04a242a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context 
> *ce,
>   intel_engine_has_timeslices(ce->engine))
>   __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>  
> - intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> + if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> + ctx->i915->params.request_timeout_ms) {
> + unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> + intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> + }
>  }
>  
>  static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context 
> *ctx,
>   context_apply_all(ctx, __apply_timeline, timeline);
>  }
>  
> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> -{
> - return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> -}
> -
> -static int
> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> -{
> - int ret;
> -
> - ret = context_apply_all(ctx, __apply_watchdog,
> - (void *)(uintptr_t)timeout_us);
> - if (!ret)
> - ctx->watchdog.timeout_us = timeout_us;
> -
> - return ret;
> -}
> -
> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> -{
> - struct drm_i915_private *i915 = ctx->i915;
> - int ret;
> -
> - if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> - !i915->params.request_timeout_ms)
> - return;
> -
> - /* Default expiry for user fences. */
> - ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> - if (ret)
> - drm_notice(&i915->drm,
> -"Failed to configure default fence expiry! (%d)",
> -ret);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  {
> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, 
> unsigned int flags)
>   intel_timeline_put(timeline);
>   }
>  
> - __set_default_fence_expiry(ctx);
> -
>   trace_i915_context_create(ctx);
>  
>   return ctx;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 5ae71ec936f7c..676592e27e7d2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -153,10 +153,6 @@ struct i915_gem_context {
>*/
>   atomic_t active_count;
>  
> - struct {
> - u64 timeout_us;
> - } watchdog;
> -
>   /**
>* @hang_timestamp: The last time(s) this context caused a GPU hang
>*/
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h 
> b/drivers/gpu/drm/i915/gt/intel_context_param.h
> index dffedd983693d..0c69cb42d075c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> @@ -10,11 +10,10 @@
>  
>  #include "intel_context.h"
>  
> -static inline int
> +static inline void
>  intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>  {
>   ce->watchdog.timeout_us = timeout_us;
> - return 0;
>  }
>  
>  #endif /* INTEL_CONTEXT_PARAM_H */
> -- 
> 2.31.1
> 
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp

On Mon, 26 Apr 2021 at 13:44, Lv Yunlong  wrote:
>
> Our code analyzer reported a double free bug.
>
> In gen8_preallocate_top_level_pdp, pde and pde->pt.base are allocated
> via alloc_pd(vm) with one reference. If pin_pt_dma() failed, pde->pt.base
> is freed by i915_gem_object_put() with a reference dropped. Then free_pd
> calls free_px() defined in intel_ppgtt.c, which calls i915_gem_object_put()
> to put pde->pt.base again.
>
> As pde->pt.base is protected by refcount, so the second put will not free
> pde->pt.base actually. But, maybe it is better to remove the first put?
>
> Fixes: 82adf901138cc ("drm/i915/gt: Shrink i915_page_directory's slab bucket")
> Signed-off-by: Lv Yunlong 

Yes, it looks like this fixes a potential use-after-free. Thanks for the patch,
Reviewed-by: Matthew Auld 

Pushed to drm-intel-gt-next.
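
For readers following along, a minimal sketch of why the extra put matters. This is a toy refcount in userspace C, not the i915 object code: `toy_obj` and its helpers are hypothetical, and "freed" is a flag rather than a real free() so the state can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy refcounted object; "freed" is a flag instead of a real free(). */
struct toy_obj {
	int refcount;
	bool freed;
};

void toy_obj_init(struct toy_obj *obj)
{
	obj->refcount = 1;	/* the allocator hands back one reference */
	obj->freed = false;
}

/* Drop one reference; release the object on the last put. */
void toy_obj_put(struct toy_obj *obj)
{
	assert(obj->refcount > 0);	/* a put with no reference left is the bug */
	if (--obj->refcount == 0)
		obj->freed = true;
}

/* Error path that leaves the single put to the common teardown helper. */
void toy_teardown(struct toy_obj *obj)
{
	toy_obj_put(obj);
}
```

With a single reference, the first put already releases the object, so a second put in the teardown helper would touch released memory. Leaving the put to the teardown helper alone, as the patch does, keeps gets and puts balanced.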

Re: [PATCH 04/21] drm/i915/gem: Return void from context_apply_all

On Fri, Apr 23, 2021 at 05:31:14PM -0500, Jason Ekstrand wrote:
> None of the callbacks we use with it return an error code anymore; they
> all return 0 unconditionally.
> 
> Signed-off-by: Jason Ekstrand 

Nice.

Reviewed-by: Daniel Vetter 
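
The resulting shape, sketched in userspace C with hypothetical `toy_` names (an array stands in for the engines iterator): once no callback can fail, the iterator needs neither an error variable nor an early break.

```c
#include <assert.h>
#include <stddef.h>

struct toy_engine {
	int priority;
};

/* Visit every engine; the callback cannot fail, so nothing is returned. */
void toy_apply_all(struct toy_engine *engines, size_t count,
		   void (*fn)(struct toy_engine *engine, void *data),
		   void *data)
{
	size_t i;

	for (i = 0; i < count; i++)
		fn(&engines[i], data);
}

/* Example callback: copy a priority value into each engine. */
void toy_apply_priority(struct toy_engine *engine, void *data)
{
	engine->priority = *(int *)data;
}
```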

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 26 +++--
>  1 file changed, 8 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 1091cc04a242a..8a77855123cec 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -718,32 +718,25 @@ __context_engines_await(const struct i915_gem_context 
> *ctx,
>   return engines;
>  }
>  
> -static int
> +static void
>  context_apply_all(struct i915_gem_context *ctx,
> -   int (*fn)(struct intel_context *ce, void *data),
> +   void (*fn)(struct intel_context *ce, void *data),
> void *data)
>  {
>   struct i915_gem_engines_iter it;
>   struct i915_gem_engines *e;
>   struct intel_context *ce;
> - int err = 0;
>  
>   e = __context_engines_await(ctx, NULL);
> - for_each_gem_engine(ce, e, it) {
> - err = fn(ce, data);
> - if (err)
> - break;
> - }
> + for_each_gem_engine(ce, e, it)
> + fn(ce, data);
>   i915_sw_fence_complete(&e->fence);
> -
> - return err;
>  }
>  
> -static int __apply_ppgtt(struct intel_context *ce, void *vm)
> +static void __apply_ppgtt(struct intel_context *ce, void *vm)
>  {
>   i915_vm_put(ce->vm);
>   ce->vm = i915_vm_get(vm);
> - return 0;
>  }
>  
>  static struct i915_address_space *
> @@ -783,10 +776,9 @@ static void __set_timeline(struct intel_timeline **dst,
>   intel_timeline_put(old);
>  }
>  
> -static int __apply_timeline(struct intel_context *ce, void *timeline)
> +static void __apply_timeline(struct intel_context *ce, void *timeline)
>  {
>   __set_timeline(&ce->timeline, timeline);
> - return 0;
>  }
>  
>  static void __assign_timeline(struct i915_gem_context *ctx,
> @@ -1842,19 +1834,17 @@ set_persistence(struct i915_gem_context *ctx,
>   return __context_set_persistence(ctx, args->value);
>  }
>  
> -static int __apply_priority(struct intel_context *ce, void *arg)
> +static void __apply_priority(struct intel_context *ce, void *arg)
>  {
>   struct i915_gem_context *ctx = arg;
>  
>   if (!intel_engine_has_timeslices(ce->engine))
> - return 0;
> + return;
>  
>   if (ctx->sched.priority >= I915_PRIORITY_NORMAL)
>   intel_context_set_use_semaphores(ce);
>   else
>   intel_context_clear_use_semaphores(ce);
> -
> - return 0;
>  }
>  
>  static int set_priority(struct i915_gem_context *ctx,
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH v4] drm/amd/amdgpu/amdgpu_drv.c: Replace drm_modeset_lock_all with drm_modeset_lock

2021-04-27 Thread Fabio M. De Francesco

drm_modeset_lock_all() is not needed here, so it is replaced with
drm_modeset_lock(). The crtc list around which we are looping never
changes, therefore the only lock we need is to protect access to
crtc->state.

Suggested-by: Daniel Vetter 
Suggested-by: Matthew Wilcox 
Signed-off-by: Fabio M. De Francesco 
Reviewed-by: Matthew Wilcox (Oracle) 
---

Changes from v3: CC'ed more (previously missing) maintainers.
Changes from v2: Drop file name from the Subject. Cc'ed all maintainers.
Changes from v1: Removed unnecessary braces around single statement
block.

 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 80130c1c0c68..39204dbc168b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1595,17 +1595,15 @@ static int amdgpu_pmops_runtime_idle(struct device *dev)
 	if (amdgpu_device_has_dc_support(adev)) {
 		struct drm_crtc *crtc;
 
-		drm_modeset_lock_all(drm_dev);
-
 		drm_for_each_crtc(crtc, drm_dev) {
-			if (crtc->state->active) {
+			drm_modeset_lock(&crtc->mutex, NULL);
+			if (crtc->state->active)
 				ret = -EBUSY;
+			drm_modeset_unlock(&crtc->mutex);
+			if (ret < 0)
 				break;
-			}
 		}
 
-		drm_modeset_unlock_all(drm_dev);
-
 	} else {
 		struct drm_connector *list_connector;
 		struct drm_connector_list_iter iter;
-- 
2.31.1
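
The locking pattern in the patch above can be sketched outside the kernel as follows. This is only an analogy under stated assumptions: pthread mutexes stand in for drm_modeset_lock(), a fixed array stands in for the CRTC list (which the commit message says never changes), and the `toy_` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_crtc {
	pthread_mutex_t mutex;	/* protects "active" only */
	bool active;
};

/* Returns -EBUSY if any CRTC is active, 0 otherwise. */
int toy_runtime_idle_check(struct toy_crtc *crtcs, size_t count)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		/* Take only the lock of the entry being inspected. */
		pthread_mutex_lock(&crtcs[i].mutex);
		if (crtcs[i].active)
			ret = -EBUSY;
		pthread_mutex_unlock(&crtcs[i].mutex);

		/* Leave the loop only after the lock has been dropped. */
		if (ret < 0)
			break;
	}

	return ret;
}
```

Checking ret after the unlock, rather than breaking inside the locked region, is what keeps every lock paired with its unlock on the early-exit path.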


Re: [Intel-gfx] [PATCH 05/21] drm/i915: Drop the CONTEXT_CLONE API

On Fri, Apr 23, 2021 at 05:31:15PM -0500, Jason Ekstrand wrote:
> This API allows one context to grab bits out of another context upon
> creation.  It can be used as a short-cut for setparam(getparam()) for
> things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
> real userspace.  It's used by a few IGT tests and that's it.  Since it
> doesn't add any real value (most of the stuff you can CLONE you can copy
> in other ways), drop it.
> 
> There is one thing that this API allows you to clone which you cannot
> clone via getparam/setparam: timelines.  However, timelines are an
> implementation detail of i915 and not really something that needs to be
> exposed to userspace.  Also, sharing timelines between contexts isn't
> obviously useful and supporting it has the potential to complicate i915
> internally.  It also doesn't add any functionality that the client can't
> get in other ways.  If a client really wants a shared timeline, they can
> use a syncobj and set it as an in and out fence on every submit.
> 
> Signed-off-by: Jason Ekstrand 
> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 199 +---
>  include/uapi/drm/i915_drm.h |  16 +-
>  2 files changed, 6 insertions(+), 209 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8a77855123cec..2c2fefa912805 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1958,207 +1958,14 @@ static int create_setparam(struct 
> i915_user_extension __user *ext, void *data)
>   return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
>  }
>  
> -static int clone_engines(struct i915_gem_context *dst,
> -  struct i915_gem_context *src)
> +static int invalid_ext(struct i915_user_extension __user *ext, void *data)
>  {
> - struct i915_gem_engines *clone, *e;
> - bool user_engines;
> - unsigned long n;
> -
> - e = __context_engines_await(src, &user_engines);
> - if (!e)
> - return -ENOENT;
> -
> - clone = alloc_engines(e->num_engines);
> - if (!clone)
> - goto err_unlock;
> -
> - for (n = 0; n < e->num_engines; n++) {
> - struct intel_engine_cs *engine;
> -
> - if (!e->engines[n]) {
> - clone->engines[n] = NULL;
> - continue;
> - }
> - engine = e->engines[n]->engine;
> -
> - /*
> -  * Virtual engines are singletons; they can only exist
> -  * inside a single context, because they embed their
> -  * HW context... As each virtual context implies a single
> -  * timeline (each engine can only dequeue a single request
> -  * at any time), it would be surprising for two contexts
> -  * to use the same engine. So let's create a copy of
> -  * the virtual engine instead.
> -  */
> - if (intel_engine_is_virtual(engine))
> - clone->engines[n] =
> - intel_execlists_clone_virtual(engine);

You forgot to gc this function here ^^

> - else
> - clone->engines[n] = intel_context_create(engine);
> - if (IS_ERR_OR_NULL(clone->engines[n])) {
> - __free_engines(clone, n);
> - goto err_unlock;
> - }
> -
> - intel_context_set_gem(clone->engines[n], dst);

Not peeked ahead, but I'm really hoping intel_context_set_gem gets removed
eventually too ...

> - }
> - clone->num_engines = n;
> - i915_sw_fence_complete(&e->fence);
> -
> - /* Serialised by constructor */
> - engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
> - if (user_engines)
> - i915_gem_context_set_user_engines(dst);
> - else
> - i915_gem_context_clear_user_engines(dst);
> - return 0;
> -
> -err_unlock:
> - i915_sw_fence_complete(&e->fence);
> - return -ENOMEM;
> -}
> -
> -static int clone_flags(struct i915_gem_context *dst,
> -struct i915_gem_context *src)
> -{
> - dst->user_flags = src->user_flags;
> - return 0;
> -}
> -
> -static int clone_schedattr(struct i915_gem_context *dst,
> -struct i915_gem_context *src)
> -{
> - dst->sched = src->sched;
> - return 0;
> -}
> -
> -static int clone_sseu(struct i915_gem_context *dst,
> -   struct i915_gem_context *src)
> -{
> - struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
> - struct i915_gem_engines *clone;
> - unsigned long n;
> - int err;
> -
> - /* no locking required; sole access under constructor*/
> - clone = __context_engines_static(dst);
> - if (e->num_engines != clone->num_engines) {
> - err = -EINVAL;
> -

Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)

On Fri, Apr 23, 2021 at 05:31:16PM -0500, Jason Ekstrand wrote:
> This API is entirely unnecessary and I'd love to get rid of it.  If
> userspace wants a single timeline across multiple contexts, they can
> either use implicit synchronization or a syncobj, both of which existed
> at the time this feature landed.  The justification given at the time
> was that it would help GL drivers which are inherently single-timeline.
> However, neither of our GL drivers actually wanted the feature.  i965
> was already in maintenance mode at the time and iris uses syncobj for
> everything.
> 
> Unfortunately, as much as I'd love to get rid of it, it is used by the
> media driver so we can't do that.  We can, however, do the next-best
> thing which is to embed a syncobj in the context and do exactly what
> we'd expect from userspace internally.  This isn't an entirely identical
> implementation because it's no longer atomic if userspace races with
> itself by calling execbuffer2 twice simultaneously from different
> threads.  It won't crash in that case; it just doesn't guarantee any
> ordering between those two submits.
> 
> Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> advantages beyond mere annoyance.  One is that intel_timeline is no
> longer an api-visible object and can remain entirely an implementation
> detail.  This may be advantageous as we make scheduler changes going
> forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> we should now have a 1:1 mapping between intel_context and
> intel_timeline which may help us reduce locking.
> 
> v2 (Jason Ekstrand):
>  - Update the comment on i915_gem_context::syncobj to mention that it's
>an emulation and the possible race if userspace calls execbuffer2
>twice on the same context concurrently.
>  - Wrap the checks for eb.gem_context->syncobj in unlikely()
>  - Drop the dma_fence reference
>  - Improved commit message
> 
> v3 (Jason Ekstrand):
>  - Move the dma_fence_put() to before the error exit
> 
> Signed-off-by: Jason Ekstrand 
> Cc: Maarten Lankhorst 
> Cc: Matthew Brost 

Reviewed-by: Daniel Vetter 

I'm assuming that igt coverage is good enough. Otoh if CI didn't catch
that racing execbuf are now unsynced maybe it wasn't good enough, but
whatever :-)
-Daniel
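
A toy model of the emulation described in the commit message, in userspace C. Everything here is a hypothetical stand-in (fences are plain sequence numbers, `toy_syncobj` holds only the latest one): each submit waits on whatever fence the syncobj currently holds and then installs its own out-fence.

```c
#include <assert.h>
#include <stdint.h>

/* Toy fence: just a sequence number; 0 means "no fence". */
typedef uint64_t toy_fence;

struct toy_syncobj {
	toy_fence fence;	/* out-fence of the most recent submit */
};

struct toy_submit {
	toy_fence wait_on;	/* in-fence taken from the syncobj */
	toy_fence signals;	/* out-fence installed into the syncobj */
};

/* Emulated single-timeline submit: read the old fence, install the new one. */
struct toy_submit toy_execbuf(struct toy_syncobj *so, toy_fence out)
{
	struct toy_submit s;

	s.wait_on = so->fence;	/* step 1: use the current fence as in-fence */
	s.signals = out;
	so->fence = out;	/* step 2: replace it with our out-fence */

	return s;
}
```

The read and the replace are two separate steps with no lock around them. That matches the caveat in the commit message: two threads submitting concurrently could both read the same old fence, so those two submits would not be ordered against each other.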


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +--
>  .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 16 ++
>  3 files changed, 40 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 2c2fefa912805..a72c9b256723b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -67,6 +67,8 @@
>  #include 
>  #include 
>  
> +#include 
> +
>  #include "gt/gen6_ppgtt.h"
>  #include "gt/intel_context.h"
>  #include "gt/intel_context_param.h"
> @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context 
> *ce,
>   ce->vm = vm;
>   }
>  
> - GEM_BUG_ON(ce->timeline);
> - if (ctx->timeline)
> - ce->timeline = intel_timeline_get(ctx->timeline);
> -
>   if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
>   intel_engine_has_timeslices(ce->engine))
>   __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
>   mutex_destroy(&ctx->engines_mutex);
>   mutex_destroy(&ctx->lut_mutex);
>  
> - if (ctx->timeline)
> - intel_timeline_put(ctx->timeline);
> -
>   put_pid(ctx->pid);
>   mutex_destroy(&ctx->mutex);
>  
> @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
>   if (vm)
>   i915_vm_close(vm);
>  
> + if (ctx->syncobj)
> + drm_syncobj_put(ctx->syncobj);
> +
>   ctx->file_priv = ERR_PTR(-EBADF);
>  
>   /*
> @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
>   i915_vm_close(vm);
>  }
>  
> -static void __set_timeline(struct intel_timeline **dst,
> -struct intel_timeline *src)
> -{
> - struct intel_timeline *old = *dst;
> -
> - *dst = src ? intel_timeline_get(src) : NULL;
> -
> - if (old)
> - intel_timeline_put(old);
> -}
> -
> -static void __apply_timeline(struct intel_context *ce, void *timeline)
> -{
> - __set_timeline(&ce->timeline, timeline);
> -}
> -
> -static void __assign_timeline(struct i915_gem_context *ctx,
> -   struct intel_timeline *timeline)
> -{
> - __set_timeline(&ctx->timeline, timeline);
> - context_apply_all(ctx, __apply_timeline, timeline);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  {
>   struct i915_gem_context *ctx;
> + int ret;
>  
>   if (flags & I915_C

Re: [PATCH v2] drm/bochs: Add screen blanking support

2021-04-27 Thread Gerd Hoffmann

> > I'm fine to change in any better way, of course, so feel free to
> > modify the patch.
> 
> If no one objects, I'll merge it as-is. It's somewhat wrong wrt to VGA, but
> apparently what qemu wants.

No objections.

Acked-by: Gerd Hoffmann 

FYI: cirrus is in the same situation, the modesetting works with qemu
but is possibly incomplete and might not work on cirrus real hardware
(it only binds to the qemu subsystem id for that reason).

take care,
  Gerd


Re: [PATCH 07/21] drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES

On Fri, Apr 23, 2021 at 05:31:17PM -0500, Jason Ekstrand wrote:
> This has never been used by any userspace except IGT and provides no
> real functionality beyond parroting back parameters userspace passed in
> as part of context creation or via setparam.  If the context is in
> legacy mode (where you use I915_EXEC_RENDER and friends), it returns
> success with zero data so it's not useful for discovering what engines
> are in the context.  It's also not a replacement for the recently
> removed I915_CONTEXT_CLONE_ENGINES because it doesn't return any of the
> balancing or bonding information.
> 
> Signed-off-by: Jason Ekstrand 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 77 +
>  1 file changed, 1 insertion(+), 76 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index a72c9b256723b..e8179918fa306 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1725,78 +1725,6 @@ set_engines(struct i915_gem_context *ctx,
>   return 0;
>  }
>  
> -static int
> -get_engines(struct i915_gem_context *ctx,
> - struct drm_i915_gem_context_param *args)
> -{
> - struct i915_context_param_engines __user *user;
> - struct i915_gem_engines *e;
> - size_t n, count, size;
> - bool user_engines;
> - int err = 0;
> -
> - e = __context_engines_await(ctx, &user_engines);
> - if (!e)
> - return -ENOENT;
> -
> - if (!user_engines) {
> - i915_sw_fence_complete(&e->fence);
> - args->size = 0;
> - return 0;
> - }
> -
> - count = e->num_engines;
> -
> - /* Be paranoid in case we have an impedance mismatch */
> - if (!check_struct_size(user, engines, count, &size)) {
> - err = -EINVAL;
> - goto err_free;
> - }
> - if (overflows_type(size, args->size)) {
> - err = -EINVAL;
> - goto err_free;
> - }
> -
> - if (!args->size) {
> - args->size = size;
> - goto err_free;
> - }
> -
> - if (args->size < size) {
> - err = -EINVAL;
> - goto err_free;
> - }
> -
> - user = u64_to_user_ptr(args->value);
> - if (put_user(0, &user->extensions)) {
> - err = -EFAULT;
> - goto err_free;
> - }
> -
> - for (n = 0; n < count; n++) {
> - struct i915_engine_class_instance ci = {
> - .engine_class = I915_ENGINE_CLASS_INVALID,
> - .engine_instance = I915_ENGINE_CLASS_INVALID_NONE,
> - };
> -
> - if (e->engines[n]) {
> - ci.engine_class = e->engines[n]->engine->uabi_class;
> - ci.engine_instance = 
> e->engines[n]->engine->uabi_instance;
> - }
> -
> - if (copy_to_user(&user->engines[n], &ci, sizeof(ci))) {
> - err = -EFAULT;
> - goto err_free;
> - }
> - }
> -
> - args->size = size;
> -
> -err_free:
> - i915_sw_fence_complete(&e->fence);
> - return err;
> -}
> -
>  static int
>  set_persistence(struct i915_gem_context *ctx,
>   const struct drm_i915_gem_context_param *args)
> @@ -2127,10 +2055,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
> *dev, void *data,
>   ret = get_ppgtt(file_priv, ctx, args);
>   break;
>  
> - case I915_CONTEXT_PARAM_ENGINES:
> - ret = get_engines(ctx, args);
> - break;
> -
>   case I915_CONTEXT_PARAM_PERSISTENCE:
>   args->size = 0;
>   args->value = i915_gem_context_is_persistent(ctx);
> @@ -2138,6 +2062,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
> *dev, void *data,
>  
>   case I915_CONTEXT_PARAM_NO_ZEROMAP:
>   case I915_CONTEXT_PARAM_BAN_PERIOD:
> + case I915_CONTEXT_PARAM_ENGINES:
>   case I915_CONTEXT_PARAM_RINGSIZE:

I like how this list keeps growing. Same thing as usual about "pls check
igt coverage".

Reviewed-by: Daniel Vetter 

>   default:
>   ret = -EINVAL;
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm/omap: Fix issue with clocks left on after resume

2021-04-27 Thread Tony Lindgren

Hi,

* Tomi Valkeinen  [210427 08:47]:
> Hi Tony,
> 
> On 26/04/2021 17:12, Tony Lindgren wrote:
> > On resume, dispc pm_runtime_force_resume() is not enabling the hardware
> > as we pass the pm_runtime_need_not_resume() test as the device is suspended
> > with no child devices.
> > 
> > As the resume continues, omap_atomic_commit_tail() calls dispc_runtime_get()
> > that calls rpm_resume() enabling the hardware, and increasing child_count
> > for its parent device.
> > 
> > But at this point device_complete() has not yet been called for dispc. So
> > when omap_atomic_commit_tail() calls dispc_runtime_get(), it won't idle
> 
> Is that supposed to be dispc_runtime_put()?

Oops that's right, yes it should have dispc_runtime_put() there.

> > the hardware, and the clocks are left on after resume.
> > 
> > This can be easily seen for example after suspending Beagleboard-X15 with
> > no displays connected, and by reading the CM_DSS_DSS_CLKCTRL register at
> > 0x4a009120 after resume. After a suspend and resume cycle, it shows a
> > value of 0x00040102 instead of 0x0007 like it should.
> > 
> > Let's fix the issue by calling dispc_runtime_suspend() and
> > dispc_runtime_resume() directly from dispc_suspend() and dispc_resume().
> > This leaves out the PM runtime related issues for system suspend.
> > 
> > See also earlier commit 88d26136a256 ("PM: Prevent runtime suspend during
> > system resume") and commit ca8199f13498 ("drm/msm/dpu: ensure device
> > suspend happens during PM sleep") for more information.
> > 
> > Fixes: ecfdedd7da5d ("drm/omap: force runtime PM suspend on system suspend")
> > Signed-off-by: Tony Lindgren 
> 
> Why is this only needed for dispc, and not the other dss submodules which
> were handled in ecfdedd7da5d?

For dispc, the other components are also calling dispc_runtime_get() and
dispc_runtime_put(). I don't think the other parts have such dss cross
component calls happening during system suspend and resume.

> I have to say I'm pretty confused (maybe partly because it's been a while
> since I debugged this =). Aren't the pm_runtime_force_suspend/resume made
> explicitly for this use case? At least that is how I read the documentation.

IMO using pm_runtime_force_suspend() and pm_runtime_force_resume() in the
system suspend and resume path is a big hack. In the long run it's best to
just use the same device internal functions for both PM runtime and system
suspend and leave out the unnecessary PM runtime calls for system suspend.

> If I understand right, this is only an issue when the dss was not enabled
> before the system suspend? And as the dispc is not enabled at suspend,
> pm_runtime_force_suspend and pm_runtime_force_resume don't really do
> anything. At resume, the DRM resume functionality causes omapdrm to call
> pm_runtime_get and put, and this somehow causes the dss to stay enabled.

We do have dss enabled at system suspend from omap_atomic_commit_tail()
until pm_runtime_force_suspend(). Then we have pm_runtime_force_resume()
enable it.

Then on resume, PM runtime prevents disabling the hardware on the resume path
until after device_complete(). Until then rpm_suspend() returns -EBUSY, and
so the parent child_count is not going to get decreased. Something would
have to handle the -EBUSY error here, it seems.

> I think I'm missing something here, but this patch feels like a hack fix.

Probably the lack of handling for rpm_suspend() -EBUSY is the missing
part :)

> But continuing with the hack mindset, as the PM apparently needs DSS to be
> enabled at suspend for it to work correctly, lets give that to the PM. This
> seems to work also:
> 
> diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c
> index 28bbad1353ee..0fd9d80d3e12 100644
> --- a/drivers/gpu/drm/omapdrm/omap_drv.c
> +++ b/drivers/gpu/drm/omapdrm/omap_drv.c
> @@ -695,6 +695,8 @@ static int omap_drm_suspend(struct device *dev)
> struct omap_drm_private *priv = dev_get_drvdata(dev);
> struct drm_device *drm_dev = priv->ddev;
> 
> +   dispc_runtime_get(priv->dispc);
> +
> return drm_mode_config_helper_suspend(drm_dev);
>  }
> 
> @@ -705,6 +707,8 @@ static int omap_drm_resume(struct device *dev)
> 
> drm_mode_config_helper_resume(drm_dev);
> 
> +   dispc_runtime_put(priv->dispc);
> +
> return omap_gem_resume(drm_dev);
>  }
>  #endif

Yeah sure this works too.. That is if you want to keep the dependency
around for pm_runtime_force_suspend(), cause some extra PM runtime
calls, and also add a few more dss cross-component calls to dispc PM
runtime while saving a few lines of code :)

Naturally removing the dependency on pm_runtime_force_suspend() can
be done separately later on too, up to you.

> But I don't think that helps with the other dss submodules either.

I don't think we have other dss components call other components'
PM runtime functions during system suspend and resume like we have
for dispc. But yeah similar issues could be lu

[PATCH] drm/i915: Simplify userptr locking

2021-04-27 Thread Thomas Hellström

Use an rwlock instead of spinlock for the global notifier lock
to reduce risk of contention in execbuf.

Protect object state with the object lock whenever possible rather
than with the global notifier lock

Don't take an explicit page_ref in userptr_submit_init() but rather
call get_pages() after obtaining the page list so that
get_pages() holds the page_ref. This means we don't need to call
userptr_submit_fini(), which is needed to avoid awkward locking
in our upcoming VM_BIND code.

Cc: Maarten Lankhorst 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: dri-devel@lists.freedesktop.org
Cc: Intel Graphics Development 
Signed-off-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 21 +++---
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  2 -
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 72 ++-
 drivers/gpu/drm/i915/i915_drv.h   |  2 +-
 4 files changed, 31 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 297143511f99..407ad0f223e3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -994,7 +994,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle)
}
 }
 
-static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr)
+static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
 {
const unsigned int count = eb->buffer_count;
unsigned int i;
@@ -1008,11 +1008,6 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release
 
eb_unreserve_vma(ev);
 
-   if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) {
-   ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT;
-   i915_gem_object_userptr_submit_fini(vma->obj);
-   }
-
if (final)
i915_vma_put(vma);
}
@@ -1990,7 +1985,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
}
 
/* We may process another execbuffer during the unlock... */
-   eb_release_vmas(eb, false, true);
+   eb_release_vmas(eb, false);
i915_gem_ww_ctx_fini(&eb->ww);
 
if (rq) {
@@ -2094,7 +2089,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 
 err:
if (err == -EDEADLK) {
-   eb_release_vmas(eb, false, false);
+   eb_release_vmas(eb, false);
err = i915_gem_ww_ctx_backoff(&eb->ww);
if (!err)
goto repeat_validate;
@@ -2191,7 +2186,7 @@ static int eb_relocate_parse(struct i915_execbuffer *eb)
 
 err:
if (err == -EDEADLK) {
-   eb_release_vmas(eb, false, false);
+   eb_release_vmas(eb, false);
err = i915_gem_ww_ctx_backoff(&eb->ww);
if (!err)
goto retry;
@@ -2268,7 +2263,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 
 #ifdef CONFIG_MMU_NOTIFIER
if (!err && (eb->args->flags & __EXEC_USERPTR_USED)) {
-   spin_lock(&eb->i915->mm.notifier_lock);
+   read_lock(&eb->i915->mm.notifier_lock);
 
/*
 * count is always at least 1, otherwise __EXEC_USERPTR_USED
@@ -2286,7 +2281,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
break;
}
 
-   spin_unlock(&eb->i915->mm.notifier_lock);
+   read_unlock(&eb->i915->mm.notifier_lock);
}
 #endif
 
@@ -3435,7 +3430,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
err = eb_lookup_vmas(&eb);
if (err) {
-   eb_release_vmas(&eb, true, true);
+   eb_release_vmas(&eb, true);
goto err_engine;
}
 
@@ -3528,7 +3523,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
i915_request_put(eb.request);
 
 err_vma:
-   eb_release_vmas(&eb, true, true);
+   eb_release_vmas(&eb, true);
if (eb.trampoline)
i915_vma_unpin(eb.trampoline);
WARN_ON(err == -EDEADLK);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 4a7388ce472e..b6dbafd88a2e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -606,14 +606,12 @@ i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
 
 int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj);
 int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj);
-void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj);
 int i915_gem_object_userptr_validate(struct drm_i915_gem_object *obj);
 #else
 static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { return false; }
 
 static inline int i915_gem_object_us

Re: [PATCH] drm/omap: Fix issue with clocks left on after resume

2021-04-27 Thread Tony Lindgren

* Tony Lindgren  [210427 10:12]:
> * Tomi Valkeinen  [210427 08:47]:
> > If I understand right, this is only an issue when the dss was not enabled
> > before the system suspend? And as the dispc is not enabled at suspend,
> > pm_runtime_force_suspend and pm_runtime_force_resume don't really do
> > anything. At resume, the DRM resume functionality causes omapdrm to call
> > pm_runtime_get and put, and this somehow causes the dss to stay enabled.
> 
> We do have dss enabled at system suspend from omap_atomic_commit_tail()
> until pm_runtime_force_suspend(). Then we have pm_runtime_force_resume()
> enable it.

Sorry I already forgot that pm_runtime_force_resume() is not enabling
it because pm_runtime_need_not_resume().. It's the omapdrm calling
pm_runtime_get() that enables the hardware on resume.

> Then on resume, PM runtime prevents disabling the hardware on the resume path
> until after device_complete(). Until then rpm_suspend() returns -EBUSY, and
> so the parent child_count is not going to get decreased. Something would
> have to handle the -EBUSY error here, it seems.

Regards,

Tony
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v3] drm/drm_file.c: Define drm_send_event_helper() as 'static'

2021-04-27 Thread Fabio M. De Francesco

drm_send_event_helper() has no prototype and is only used within this file,
therefore it should be defined with storage class 'static'.

Signed-off-by: Fabio M. De Francesco 
---

Changes from v2: Removed all the other lines in function comment.
Changes from v1: As suggested by Daniel Vetter, removed unnecessary
kernel-doc comments.

 drivers/gpu/drm/drm_file.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 7efbccffc2ea..d4f0bac6f8f8 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -774,19 +774,7 @@ void drm_event_cancel_free(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_event_cancel_free);
 
-/**
- * drm_send_event_helper - send DRM event to file descriptor
- * @dev: DRM device
- * @e: DRM event to deliver
- * @timestamp: timestamp to set for the fence event in kernel's CLOCK_MONOTONIC
- * time domain
- *
- * This helper function sends the event @e, initialized with
- * drm_event_reserve_init(), to its associated userspace DRM file.
- * The timestamp variant of dma_fence_signal is used when the caller
- * sends a valid timestamp.
- */
-void drm_send_event_helper(struct drm_device *dev,
+static void drm_send_event_helper(struct drm_device *dev,
   struct drm_pending_event *e, ktime_t timestamp)
 {
assert_spin_locked(&dev->event_lock);
-- 
2.31.1


Re: [PATCH v5] drm/ast: Fixed CVE for DP501


Hi

Am 21.04.21 um 10:58 schrieb KuoHsiang Chou:

[Bug][DP501]
If the ASPEED P2A (PCI to AHB) bridge is disabled and disallowed for
CVE_2019_6260 item3, the monitor's EDID cannot be read through the
Parade DP501.
The reason is that the DP501's FW is mapped into the BMC address space
rather than the host address space.
The resolution is to use "pci_iomap_range()" to map the DP501's FW, which
is stored at the end of the FB (Frame Buffer).
In this case, the frame buffer reserves its last 2MB for the image of the
DP501.
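
As a worked example of the "last 2MB" layout, the sketch below computes where
the firmware image would start inside the frame buffer BAR. It assumes 2MB
means 0x200000 bytes and uses a hypothetical 16 MiB BAR; the names are made up
for illustration and are not taken from the ast driver. The resulting offset
and the 2MB length are the kind of values one would pass as the offset and
maxlen arguments of pci_iomap_range().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the region reserved for the DP501 image: 2 MiB. */
#define MODEL_DP501_FW_SIZE (2ull * 1024 * 1024)

/*
 * Offset of the reserved firmware region from the start of the frame
 * buffer BAR, given the BAR length in bytes. The same value is also the
 * amount of frame buffer memory left for scanout.
 */
uint64_t model_dp501_fw_offset(uint64_t fb_bar_len)
{
	/* The BAR must be larger than the reserved tail. */
	assert(fb_bar_len > MODEL_DP501_FW_SIZE);
	return fb_bar_len - MODEL_DP501_FW_SIZE;
}
```

For a 16 MiB BAR (0x1000000) this yields 0xe00000, so the firmware image
occupies 0xe00000..0xffffff and 14 MiB remain usable as frame buffer.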



Your patches are missing a short changelog, so that reviewers can see 
what changed between versions. Anyway, I merged your patch into 
drm-misc-next now. Thanks for the fix.



More generally speaking, the DP501 code needs a major refactoring. It's 
currently bolted onto the regular VGA connector code. It should rather 
be a separate connector or a DRM bridge. I always wanted to work on 
this, but don't have a device for testing. If I'd provide patches, would 
you be in a position to test them?


Best regards
Thomas



Signed-off-by: KuoHsiang Chou 
Reported-by: kernel test robot 
---
  drivers/gpu/drm/ast/ast_dp501.c | 139 +++-
  drivers/gpu/drm/ast/ast_drv.h   |  12 +++
  drivers/gpu/drm/ast/ast_main.c  |  11 ++-
  3 files changed, 125 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index 88121c0e0..cd93c44f2 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -189,6 +189,9 @@ bool ast_backup_fw(struct drm_device *dev, u8 *addr, u32 size)
u32 i, data;
u32 boot_address;

+   if (ast->config_mode != ast_use_p2a)
+   return false;
+
data = ast_mindwm(ast, 0x1e6e2100) & 0x01;
if (data) {
boot_address = get_fw_base(ast);
@@ -207,6 +210,9 @@ static bool ast_launch_m68k(struct drm_device *dev)
u8 *fw_addr = NULL;
u8 jreg;

+   if (ast->config_mode != ast_use_p2a)
+   return false;
+
data = ast_mindwm(ast, 0x1e6e2100) & 0x01;
if (!data) {

@@ -271,25 +277,55 @@ u8 ast_get_dp501_max_clk(struct drm_device *dev)
struct ast_private *ast = to_ast_private(dev);
u32 boot_address, offset, data;
u8 linkcap[4], linkrate, linklanes, maxclk = 0xff;
+   u32 *plinkcap;

-   boot_address = get_fw_base(ast);
-
-   /* validate FW version */
-   offset = 0xf000;
-   data = ast_mindwm(ast, boot_address + offset);
-   if ((data & 0xf0) != 0x10) /* version: 1x */
-   return maxclk;
-
-   /* Read Link Capability */
-   offset  = 0xf014;
-   *(u32 *)linkcap = ast_mindwm(ast, boot_address + offset);
-   if (linkcap[2] == 0) {
-   linkrate = linkcap[0];
-   linklanes = linkcap[1];
-   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * linklanes);
-   if (data > 0xff)
-   data = 0xff;
-   maxclk = (u8)data;
+   if (ast->config_mode == ast_use_p2a) {
+   boot_address = get_fw_base(ast);
+
+   /* validate FW version */
+   offset = AST_DP501_GBL_VERSION;
+   data = ast_mindwm(ast, boot_address + offset);
+   if ((data & AST_DP501_FW_VERSION_MASK) != AST_DP501_FW_VERSION_1) /* version: 1x */
+   return maxclk;
+
+   /* Read Link Capability */
+   offset  = AST_DP501_LINKRATE;
+   plinkcap = (u32 *)linkcap;
+   *plinkcap  = ast_mindwm(ast, boot_address + offset);
+   if (linkcap[2] == 0) {
+   linkrate = linkcap[0];
+   linklanes = linkcap[1];
+   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * linklanes);
+   if (data > 0xff)
+   data = 0xff;
+   maxclk = (u8)data;
+   }
+   } else {
+   if (!ast->dp501_fw_buf)
+   return AST_DP501_DEFAULT_DCLK;  /* 1024x768 as default */
+
+   /* dummy read */
+   offset = 0x;
+   data = readl(ast->dp501_fw_buf + offset);
+
+   /* validate FW version */
+   offset = AST_DP501_GBL_VERSION;
+   data = readl(ast->dp501_fw_buf + offset);
+   if ((data & AST_DP501_FW_VERSION_MASK) != AST_DP501_FW_VERSION_1) /* version: 1x */
+   return maxclk;
+
+   /* Read Link Capability */
+   offset = AST_DP501_LINKRATE;
+   plinkcap = (u32 *)linkcap;
+   *plinkcap = readl(ast->dp501_fw_buf + offset);
+   if (linkcap[2] == 0) {
+   linkrate = linkcap[0];
+   linklanes = linkcap[1];
+   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * linklanes);
+   if (data > 0xff)
+

Re: [PATCH v2 01/13] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE

2021-04-27 Thread Cornelia Huck

On Mon, 26 Apr 2021 17:00:03 -0300
Jason Gunthorpe  wrote:

> For some reason the vfio_mdev shim mdev_driver has its own module and
> kconfig. As the next patch requires access to it from mdev.ko merge the
> two modules together and remove VFIO_MDEV_DEVICE.
> 
> A later patch deletes this driver entirely.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  Documentation/s390/vfio-ap.rst   |  1 -
>  arch/s390/Kconfig|  2 +-
>  drivers/gpu/drm/i915/Kconfig |  2 +-
>  drivers/vfio/mdev/Kconfig|  7 ---
>  drivers/vfio/mdev/Makefile   |  3 +--
>  drivers/vfio/mdev/mdev_core.c| 16 ++--
>  drivers/vfio/mdev/mdev_private.h |  2 ++
>  drivers/vfio/mdev/vfio_mdev.c| 24 +---
>  samples/Kconfig  |  6 +++---
>  9 files changed, 23 insertions(+), 40 deletions(-)

This also fixes the dependencies for vfio-ccw, which never depended on
VFIO_MDEV_DEVICE directly...

Reviewed-by: Cornelia Huck 


Re: [PATCH v2 1/4] fbtft: Replace custom ->reset() with generic one

2021-04-27 Thread Greg Kroah-Hartman

On Fri, Apr 16, 2021 at 05:20:41PM +0300, Andy Shevchenko wrote:
> The custom ->reset() repeats the generic one, replace it.
> 
> Note, in newer kernels the context of the function is a sleeping one,
> it's fine to switch over to the sleeping functions. Keeping the reset
> line asserted longer than 20 microseconds is also okay, it's an idling
> state of the hardware.
> 
> Fixes: b2ebd4be6fa1 ("staging: fbtft: add fb_agm1264k-fl driver")

What does this "fix"?  A bug or just a "it shouldn't have been done this
way"?

And as others pointed out, if you could put "staging: fbtft:" as a
prefix here, that would be much better.

thanks,

greg k-h

[PATCH] drm/i915/gem: Remove reference to struct drm_device.pdev

References to struct drm_device.pdev should not be used any longer, as
the field will be moved into the struct's legacy section. Add a fix
for the respective commit.

Signed-off-by: Thomas Zimmermann 
Fixes: d57d4a1daf5e ("drm/i915: Create stolen memory region from local memory")
Cc: CQ Tang 
Cc: Matthew Auld 
Cc: Tvrtko Ursulin 
Cc: Xinyun Liu 
Cc: Tvrtko Ursulin 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: "Gustavo A. R. Silva" 
Cc: Dan Carpenter 
Cc: intel-...@lists.freedesktop.org
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index c5b64b2400e8..e1a32672bbe8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -773,7 +773,7 @@ struct intel_memory_region *
 i915_gem_stolen_lmem_setup(struct drm_i915_private *i915)
 {
struct intel_uncore *uncore = &i915->uncore;
-   struct pci_dev *pdev = i915->drm.pdev;
+   struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct intel_memory_region *mem;
resource_size_t io_start;
resource_size_t lmem_size;
-- 
2.31.1


[PATCH v7 0/4] drm: Move struct drm_device.pdev to legacy

V7 of the patchset fixes some bitrot in the intel driver.

The pdev field in struct drm_device points to a PCI device structure and
goes back to UMS-only days when all DRM drivers were for PCI devices.
Meanwhile we also support USB, SPI and platform devices. Each of those
uses the generic device stored in struct drm_device.dev.

To reduce duplication and remove the special case of PCI, this patchset
converts all modesetting drivers from pdev to dev and makes pdev a field
for legacy UMS drivers.

For PCI devices, the pointer in struct drm_device.dev can be upcast to
struct pci_dev, or tested for PCI with dev_is_pci(). In several places
the code can use the dev field directly.
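
The upcast works because the generic device is embedded inside the PCI device
structure, so the outer structure can be recovered from a pointer to the
member. Below is a minimal user-space model of that pattern; every model_*
name is hypothetical, and unlike the kernel's to_pci_dev(), which performs no
check, this sketch folds the bus test into the upcast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the driver model; not the kernel structures. */
struct model_bus { const char *name; };

struct model_device {
	const struct model_bus *bus;   /* which bus the device sits on */
};

struct model_pci_dev {
	unsigned short vendor;
	struct model_device dev;       /* embedded generic device */
};

static const struct model_bus model_pci_bus = { "pci" };

/* Recover the enclosing structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Counterpart of a dev_is_pci()-style test in this model. */
bool model_dev_is_pci(const struct model_device *dev)
{
	assert(dev != NULL);
	return dev->bus == &model_pci_bus;
}

/* Upcast that returns NULL for non-PCI devices (an added safety check). */
struct model_pci_dev *model_to_pci_dev(struct model_device *dev)
{
	if (!model_dev_is_pci(dev))
		return NULL;
	return model_container_of(dev, struct model_pci_dev, dev);
}
```

Because the upcast is pure pointer arithmetic, keeping a separate pdev pointer
next to dev stores the same information twice.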

After converting all drivers and the DRM core, the pdev field becomes
relevant only for legacy drivers. In a later patchset, we may want to
convert these as well and remove pdev entirely.

v7:
* fix instances of pdev that have been added under i915/
v6:
* also remove assignment in i915/selftests in later patch (Chris)
v5:
* remove assignment in later patch (Chris)
v4:
* merged several patches
* moved core changes into separate patch
* vmwgfx build fix
v3:
* merged several patches
* fix one pdev reference in nouveau (Jeremy)
* rebases
v2:
* move whitespace fixes into separate patches (Alex, Sam)
* move i915 gt/ and gvt/ changes into separate patches (Joonas)

Thomas Zimmermann (4):
  drm/i915/gt: Remove reference to struct drm_device.pdev
  drm/i915: Remove reference to struct drm_device.pdev
  drm/i915: Don't assign to struct drm_device.pdev
  drm: Move struct drm_device.pdev to legacy section

 drivers/gpu/drm/i915/gt/intel_region_lmem.c  | 2 +-
 drivers/gpu/drm/i915/i915_drv.c  | 1 -
 drivers/gpu/drm/i915/intel_runtime_pm.h  | 2 +-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 1 -
 include/drm/drm_device.h | 6 +++---
 5 files changed, 5 insertions(+), 7 deletions(-)

--
2.31.1


[PATCH v7 3/4] drm/i915: Don't assign to struct drm_device.pdev

Using struct drm_device.pdev is deprecated. Don't assign it. Users
should upcast from struct drm_device.dev.

v6:
* also fix the assignment in selftests in this patch (Chris)

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Chris Wilson 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_drv.c  | 1 -
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 785dcf20c77b..db513f93f0f5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -758,7 +758,6 @@ i915_driver_create(struct pci_dev *pdev, const struct pci_device_id *ent)
if (IS_ERR(i915))
return i915;
 
-   i915->drm.pdev = pdev;
pci_set_drvdata(pdev, i915);
 
/* Device parameters start as a copy of module parameters. */
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 2ffc763fe90d..cf40004bc92a 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -146,7 +146,6 @@ struct drm_i915_private *mock_gem_device(void)
}
 
pci_set_drvdata(pdev, i915);
-   i915->drm.pdev = pdev;
 
dev_pm_domain_set(&pdev->dev, &pm_domain);
pm_runtime_enable(&pdev->dev);
-- 
2.31.1


[PATCH v7 1/4] drm/i915/gt: Remove reference to struct drm_device.pdev

References to struct drm_device.pdev should not be used any longer, as
the field will be moved into the struct's legacy section. Add a fix
for the respective commit.

Signed-off-by: Thomas Zimmermann 
Fixes: a50ca39fbd01 ("drm/i915: setup the LMEM region")
Cc: Lucas De Marchi 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Matthew Auld 
Cc: Jani Nikula 
Cc: Chris Wilson 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Cc: Mika Kuoppala 
Cc: Maarten Lankhorst 
Cc: Venkata Sandeep Dhanalakota 
Cc: "Michał Winiarski" 
---
 drivers/gpu/drm/i915/gt/intel_region_lmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index be6f2c8f5184..73fceb0c25fc 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -177,7 +177,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
 {
struct drm_i915_private *i915 = gt->i915;
struct intel_uncore *uncore = gt->uncore;
-   struct pci_dev *pdev = i915->drm.pdev;
+   struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct intel_memory_region *mem;
resource_size_t io_start;
resource_size_t lmem_size;
-- 
2.31.1


[PATCH v7 2/4] drm/i915: Remove reference to struct drm_device.pdev

References to struct drm_device.pdev should not be used any longer, as
the field will be moved into the struct's legacy section. Fix the
respective comment.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/i915/intel_runtime_pm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.h b/drivers/gpu/drm/i915/intel_runtime_pm.h
index 1e4ddd11c12b..183ea2b187fe 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.h
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.h
@@ -49,7 +49,7 @@ enum i915_drm_suspend_mode {
  */
 struct intel_runtime_pm {
atomic_t wakeref_count;
-   struct device *kdev; /* points to i915->drm.pdev->dev */
+   struct device *kdev; /* points to i915->drm.dev */
bool available;
bool suspended;
bool irqs_enabled;
-- 
2.31.1


[PATCH v7 4/4] drm: Move struct drm_device.pdev to legacy section

Struct drm_device.pdev is being moved to legacy status as only legacy
DRM drivers use it. A possible follow-up patchset could remove pdev
entirely.

v4:
* rebased

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Chris Wilson 
Acked-by: Sam Ravnborg 
---
 include/drm/drm_device.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index d647223e8390..c5a195676e8f 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -279,9 +279,6 @@ struct drm_device {
/** @agp: AGP data */
struct drm_agp_head *agp;
 
-   /** @pdev: PCI device structure */
-   struct pci_dev *pdev;
-
/** @num_crtcs: Number of CRTCs on this device */
unsigned int num_crtcs;
 
@@ -324,6 +321,9 @@ struct drm_device {
/* List of devices per driver for stealth attach cleanup */
struct list_head legacy_dev_list;
 
+   /* PCI device structure */
+   struct pci_dev *pdev;
+
 #ifdef __alpha__
/** @hose: PCI hose, only used on ALPHA platforms. */
struct pci_controller *hose;
-- 
2.31.1


Re: [PATCH] drm/gud: cleanup coding style a bit

2021-04-27 Thread Noralf Trønnes




Den 02.04.2021 10.55, skrev Bernard Zhao:
> Fix coccicheck warning:
> drivers/gpu/drm/gud/gud_internal.h:89:2-3: Unneeded semicolon
> drivers/gpu/drm/gud/gud_internal.h:107:2-3: Unneeded semicolon
> 
> Signed-off-by: Bernard Zhao 
> ---

Applied to drm-misc-next.

Thanks,
Noralf.

Re: [PATCH v1 1/7] drm/st7735r: Avoid spamming logs if probe is deferred

2021-04-27 Thread Noralf Trønnes




Den 21.04.2021 18.31, skrev Andy Shevchenko:
> The GPIO request can fail and probe may be deferred. Thus,
> the error message may be printed again and again. Avoid
> this by replacing DRM_DEV_ERROR() by dev_err_probe().
> 
> Signed-off-by: Andy Shevchenko 
> ---

Series is applied to drm-misc-next.

Thanks,
Noralf.

[PATCH v2] drm/amd/amdgpu: Fix errors in documentation of function parameters

2021-04-27 Thread Fabio M. De Francesco

In the documentation of functions, removed excess parameters, described
undocumented ones, and fixed syntax errors.

Signed-off-by: Fabio M. De Francesco 
---

Changes from v1: Cc'ed all the maintainers.

 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c  | 12 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c  |  4 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  8 
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 2e9b16fb3fcd..bf2939b6eb43 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -76,7 +76,7 @@ struct amdgpu_atif {
 /**
  * amdgpu_atif_call - call an ATIF method
  *
- * @handle: acpi handle
+ * @atif: acpi handle
  * @function: the ATIF function to execute
  * @params: ATIF function params
  *
@@ -166,7 +166,6 @@ static void amdgpu_atif_parse_functions(struct amdgpu_atif_functions *f, u32 mas
 /**
  * amdgpu_atif_verify_interface - verify ATIF
  *
- * @handle: acpi handle
  * @atif: amdgpu atif struct
  *
  * Execute the ATIF_FUNCTION_VERIFY_INTERFACE ATIF function
@@ -240,8 +239,7 @@ static acpi_handle amdgpu_atif_probe_handle(acpi_handle dhandle)
 /**
  * amdgpu_atif_get_notification_params - determine notify configuration
  *
- * @handle: acpi handle
- * @n: atif notification configuration struct
+ * @atif: acpi handle
  *
  * Execute the ATIF_FUNCTION_GET_SYSTEM_PARAMETERS ATIF function
  * to determine if a notifier is used and if so which one
@@ -304,7 +302,7 @@ static int amdgpu_atif_get_notification_params(struct amdgpu_atif *atif)
 /**
  * amdgpu_atif_query_backlight_caps - get min and max backlight input signal
  *
- * @handle: acpi handle
+ * @atif: acpi handle
  *
  * Execute the QUERY_BRIGHTNESS_TRANSFER_CHARACTERISTICS ATIF function
  * to determine the acceptable range of backlight values
@@ -363,7 +361,7 @@ static int amdgpu_atif_query_backlight_caps(struct amdgpu_atif *atif)
 /**
  * amdgpu_atif_get_sbios_requests - get requested sbios event
  *
- * @handle: acpi handle
+ * @atif: acpi handle
  * @req: atif sbios request struct
  *
  * Execute the ATIF_FUNCTION_GET_SYSTEM_BIOS_REQUESTS ATIF function
@@ -899,6 +897,8 @@ void amdgpu_acpi_fini(struct amdgpu_device *adev)
 /**
  * amdgpu_acpi_is_s0ix_supported
  *
+ * @adev: amdgpu_device_pointer
+ *
  * returns true if supported, false if not.
  */
 bool amdgpu_acpi_is_s0ix_supported(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
index 5af464933976..98d31ebad9ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
@@ -111,6 +111,8 @@ static const char *amdkfd_fence_get_timeline_name(struct dma_fence *f)
  *  a KFD BO and schedules a job to move the BO.
  *  If fence is already signaled return true.
  *  If fence is not signaled schedule a evict KFD process work item.
+ *
+ *  @f: dma_fence
  */
 static bool amdkfd_fence_enable_signaling(struct dma_fence *f)
 {
@@ -131,7 +133,7 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f)
 /**
  * amdkfd_fence_release - callback that fence can be freed
  *
- * @fence: fence
+ * @f: dma_fence
  *
  * This function is called when the reference count becomes zero.
  * Drops the mm_struct reference and RCU schedules freeing up the fence.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index b43e68fc1378..ed3014fbb563 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -719,7 +719,7 @@ static void unlock_spi_csq_mutexes(struct amdgpu_device *adev)
 }
 
 /**
- * @get_wave_count: Read device registers to get number of waves in flight for
+ * get_wave_count: Read device registers to get number of waves in flight for
  * a particular queue. The method also returns the VMID associated with the
  * queue.
  *
@@ -755,19 +755,19 @@ static void get_wave_count(struct amdgpu_device *adev, int queue_idx,
 }
 
 /**
- * @kgd_gfx_v9_get_cu_occupancy: Reads relevant registers associated with each
+ * kgd_gfx_v9_get_cu_occupancy: Reads relevant registers associated with each
  * shader engine and aggregates the number of waves that are in flight for the
  * process whose pasid is provided as a parameter. The process could have ZERO
  * or more queues running and submitting waves to compute units.
  *
  * @kgd: Handle of device from which to get number of waves in flight
  * @pasid: Identifies the process for which this query call is invoked
- * @wave_cnt: Output parameter updated with number of waves in flight that
+ * @pasid_wave_cnt: Output parameter updated with number of waves in flight that
  * belong to process with given pasid
  * @max_waves_per_cu: Output parameter updated with maximum number of waves

Re: [PATCH 8/8] drm/modifiers: Enforce consistency between the cap and IN_FORMATS

2021-04-27 Thread Emil Velikov

Hi Daniel,

On Tue, 27 Apr 2021 at 10:20, Daniel Vetter  wrote:

> @@ -360,6 +373,9 @@ static int __drm_universal_plane_init(struct drm_device *dev,
>   * drm_universal_plane_init() to let the DRM managed resource infrastructure
>   * take care of cleanup and deallocation.
>   *
> + * Drivers supporting modifiers must set @format_modifiers on all their planes,
> + * even those that only support DRM_FORMAT_MOD_LINEAR.
> + *
The comment says "must", yet we have an "if (format_modifiers)" in the codebase.
Shouldn't we add a WARN_ON() + return -EINVAL (or similar) so people
can see and fix their drivers?

As a follow-up one could even go a step further, by erroring out when
the driver hasn't provided valid modifier(s) and even removing
config::allow_fb_modifiers altogether.

Although for stable - this series + WARN_ON (no return since it might
break buggy drivers) sounds good.
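
A rough user-space sketch of the stricter variant suggested above, assuming
the rule is "if any plane advertises modifiers, every plane must". The
model_* names are hypothetical and this is not the actual DRM core code; in
the kernel the check would be paired with a WARN_ON() so that driver authors
notice it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plane description: NULL means "no modifier list given". */
struct model_plane {
	const uint64_t *format_modifiers;
};

/*
 * Returns 0 when either all planes or no planes provide a modifier list,
 * and -EINVAL when the driver is inconsistent.
 */
int model_validate_planes(const struct model_plane *planes, size_t count)
{
	size_t i, with_modifiers = 0;

	assert(planes != NULL || count == 0);

	for (i = 0; i < count; i++)
		if (planes[i].format_modifiers)
			with_modifiers++;

	/* A mix of planes with and without modifiers is a driver bug. */
	if (with_modifiers != 0 && with_modifiers != count)
		return -EINVAL;

	return 0;
}
```

For stable kernels the same check would only warn and still return 0, as
suggested above, so that buggy drivers keep working.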

> @@ -909,6 +909,8 @@ struct drm_mode_config {
>  * @allow_fb_modifiers:
>  *
>  * Whether the driver supports fb modifiers in the ADDFB2.1 ioctl call.
> +* Note that drivers should not set this directly, it is automatically
> +* set in drm_universal_plane_init().
>  *
>  * IMPORTANT:
>  *
The new note and the existing IMPORTANT are in a weird mix.
Quoting the latter since it doesn't show in the diff.

If this is set the driver must fill out the full implicit modifier
information in their &drm_mode_config_funcs.fb_create hook for legacy
userspace which does not set modifiers. Otherwise the GETFB2 ioctl is
broken for modifier aware userspace.

In particular:
The new note says "don't set it" while the existing one says "if
it's set", yet no drivers do "if (config->allow_fb_modifiers)".

Sadly, nothing comes to mind atm wrt alternative wording.

With the WARN_ON() added or s/must/should/ in the documentation, the series is:
Reviewed-by: Emil Velikov 

HTH
-Emil
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

If we don't use future fences for DMA fences at all, e.g. we don't use them
for memory management, it can work, right? Memory management can suspend
user queues anytime. It doesn't need to use DMA fences. There might be
something that I'm missing here.

What would we lose without DMA fences? Just inter-device synchronization? I
think that might be acceptable.

The only case when the kernel will wait on a future fence is before a page
flip. Everything today already depends on userspace not hanging the gpu,
which makes everything a future fence.

Marek

On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:

> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> > Thanks everybody. The initial proposal is dead. Here are some thoughts on
> > how to do it differently.
> >
> > I think we can have direct command submission from userspace via
> > memory-mapped queues ("user queues") without changing window systems.
> >
> > The memory management doesn't have to use GPU page faults like HMM.
> > Instead, it can wait for user queues of a specific process to go idle and
> > then unmap the queues, so that userspace can't submit anything. Buffer
> > evictions, pinning, etc. can be executed when all queues are unmapped
> > (suspended). Thus, no BO fences and page faults are needed.
> >
> > Inter-process synchronization can use timeline semaphores. Userspace will
> > query the wait and signal value for a shared buffer from the kernel. The
> > kernel will keep a history of those queries to know which process is
> > responsible for signalling which buffer. There is only the wait-timeout
> > issue and how to identify the culprit. One of the solutions is to have
> the
> > GPU send all GPU signal commands and all timed out wait commands via an
> > interrupt to the kernel driver to monitor and validate userspace
> behavior.
> > With that, it can be identified whether the culprit is the waiting
> process
> > or the signalling process and which one. Invalid signal/wait parameters
> can
> > also be detected. The kernel can force-signal only the semaphores that
> time
> > out, and punish the processes which caused the timeout or used invalid
> > signal/wait parameters.
> >
> > The question is whether this synchronization solution is robust enough
> for
> > dma_fence and whatever the kernel and window systems need.
>
> The proper model here is the preempt-ctx dma_fence that amdkfd uses
> (without page faults). That means dma_fence for synchronization is doa, at
> least as-is, and we're back to figuring out the winsys problem.
>
> "We'll solve it with timeouts" is very tempting, but doesn't work. It's
> akin to saying that we're solving deadlock issues in a locking design by
> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
> avoids having to reach the reset button, but that's about it.
>
> And the fundamental problem is that once you throw in userspace command
> submission (and syncing, at least within the userspace driver, otherwise
> there's kinda no point if you still need the kernel for cross-engine sync)
> means you get deadlocks if you still use dma_fence for sync under
> perfectly legit use-case. We've discussed that one ad nauseam last summer:
>
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>
> See silly diagram at the bottom.
>
> Now I think all isn't lost, because imo the first step to getting to this
> brave new world is rebuilding the driver on top of userspace fences, and
> with the adjusted cmd submit model. You probably don't want to use amdkfd,
> but port that as a context flag or similar to render nodes for gl/vk. Of
> course that means you can only use this mode in headless, without
> glx/wayland winsys support, but it's a start.
> -Daniel
>
> >
> > Marek
> >
> > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone 
> wrote:
> >
> > > Hi,
> > >
> > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter  wrote:
> > >
> > >> The thing is, you can't do this in drm/scheduler. At least not without
> > >> splitting up the dma_fence in the kernel into separate memory fences
> > >> and sync fences
> > >
> > >
> > > I'm starting to think this thread needs its own glossary ...
> > >
> > > I propose we use 'residency fence' for execution fences which enact
> > > memory-residency operations, e.g. faulting in a page ultimately
> depending
> > > on GPU work retiring.
> > >
> > > And 'value fence' for the pure-userspace model suggested by timeline
> > > semaphores, i.e. fences being (*addr == val) rather than being able to
> look
> > > at ctx seqno.
> > >
> > > Cheers,
> > > Daniel
> > > ___
> > > mesa-dev mailing list
> > > mesa-...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>

Re: [PATCH v7 0/4] drm: Move struct drm_device.pdev to legacy

On Tue, 27 Apr 2021, Thomas Zimmermann  wrote:
> V7 of the patchset fixes some bitrot in the intel driver.
>
> The pdev field in struct drm_device points to a PCI device structure and
> goes back to UMS-only days when all DRM drivers were for PCI devices.
> Meanwhile we also support USB, SPI and platform devices. Each of those
> uses the generic device stored in struct drm_device.dev.
>
> To reduce duplication and remove the special case of PCI, this patchset
> converts all modesetting drivers from pdev to dev and makes pdev a field
> for legacy UMS drivers.
>
> For PCI devices, the pointer in struct drm_device.dev can be upcast to
> struct pci_dev; or tested for PCI with dev_is_pci(). In several places
> the code can use the dev field directly.
>
> After converting all drivers and the DRM core, the pdev field becomes
> only relevant for legacy drivers. In a later patchset, we may want to
> convert these as well and remove pdev entirely.

On the series,

Reviewed-by: Jani Nikula 

How should we merge these?



>
> v7:
>   * fix instances of pdev that have been added under i915/
> v6:
>   * also remove assignment in i915/selftests in later patch (Chris)
> v5:
>   * remove assignment in later patch (Chris)
> v4:
>   * merged several patches
>   * moved core changes into separate patch
>   * vmwgfx build fix
> v3:
>   * merged several patches
>   * fix one pdev reference in nouveau (Jeremy)
>   * rebases
> v2:
>   * move whitespace fixes into separate patches (Alex, Sam)
>   * move i915 gt/ and gvt/ changes into separate patches (Joonas)
>
> Thomas Zimmermann (4):
>   drm/i915/gt: Remove reference to struct drm_device.pdev
>   drm/i915: Remove reference to struct drm_device.pdev
>   drm/i915: Don't assign to struct drm_device.pdev
>   drm: Move struct drm_device.pdev to legacy section
>
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c  | 2 +-
>  drivers/gpu/drm/i915/i915_drv.c  | 1 -
>  drivers/gpu/drm/i915/intel_runtime_pm.h  | 2 +-
>  drivers/gpu/drm/i915/selftests/mock_gem_device.c | 1 -
>  include/drm/drm_device.h | 6 +++---
>  5 files changed, 5 insertions(+), 7 deletions(-)
>
> --
> 2.31.1
>

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

Correct, we wouldn't have synchronization between devices with and
without user queues any more.

That could only be a problem for A+I Laptops.

Memory management will just work with preemption fences which pause the 
user queues of a process before evicting something. That will be a 
dma_fence, but also a well known approach.

Christian.

On 27.04.21 at 13:49, Marek Olšák wrote:
> If we don't use future fences for DMA fences at all, e.g. we don't use
> them for memory management, it can work, right? Memory management can
> suspend user queues anytime. It doesn't need to use DMA fences. There
> might be something that I'm missing here.
>
> What would we lose without DMA fences? Just inter-device
> synchronization? I think that might be acceptable.
>
> The only case when the kernel will wait on a future fence is before a
> page flip. Everything today already depends on userspace not hanging
> the gpu, which makes everything a future fence.
>
> Marek
>
> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter wrote:

> > On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> > > Thanks everybody. The initial proposal is dead. Here are some thoughts on
> > > how to do it differently.
> > >
> > > I think we can have direct command submission from userspace via
> > > memory-mapped queues ("user queues") without changing window systems.
> > >
> > > The memory management doesn't have to use GPU page faults like HMM.
> > > Instead, it can wait for user queues of a specific process to go idle and
> > > then unmap the queues, so that userspace can't submit anything. Buffer
> > > evictions, pinning, etc. can be executed when all queues are unmapped
> > > (suspended). Thus, no BO fences and page faults are needed.
> > >
> > > Inter-process synchronization can use timeline semaphores. Userspace will
> > > query the wait and signal value for a shared buffer from the kernel. The
> > > kernel will keep a history of those queries to know which process is
> > > responsible for signalling which buffer. There is only the wait-timeout
> > > issue and how to identify the culprit. One of the solutions is to have the
> > > GPU send all GPU signal commands and all timed out wait commands via an
> > > interrupt to the kernel driver to monitor and validate userspace behavior.
> > > With that, it can be identified whether the culprit is the waiting process
> > > or the signalling process and which one. Invalid signal/wait parameters can
> > > also be detected. The kernel can force-signal only the semaphores that time
> > > out, and punish the processes which caused the timeout or used invalid
> > > signal/wait parameters.
> > >
> > > The question is whether this synchronization solution is robust enough for
> > > dma_fence and whatever the kernel and window systems need.

> > The proper model here is the preempt-ctx dma_fence that amdkfd uses
> > (without page faults). That means dma_fence for synchronization is doa, at
> > least as-is, and we're back to figuring out the winsys problem.
> >
> > "We'll solve it with timeouts" is very tempting, but doesn't work. It's
> > akin to saying that we're solving deadlock issues in a locking design by
> > doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
> > avoids having to reach the reset button, but that's about it.
> >
> > And the fundamental problem is that once you throw in userspace command
> > submission (and syncing, at least within the userspace driver, otherwise
> > there's kinda no point if you still need the kernel for cross-engine sync)
> > means you get deadlocks if you still use dma_fence for sync under
> > perfectly legit use-case. We've discussed that one ad nauseam last summer:
> >
> > https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
> >
> > See silly diagram at the bottom.
> >
> > Now I think all isn't lost, because imo the first step to getting to this
> > brave new world is rebuilding the driver on top of userspace fences, and
> > with the adjusted cmd submit model. You probably don't want to use amdkfd,
> > but port that as a context flag or similar to render nodes for gl/vk. Of
> > course that means you can only use this mode in headless, without
> > glx/wayland winsys support, but it's a start.
> > -Daniel

> > >
> > > Marek
> > >
> > > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone wrote:
> > >
> > > > Hi,
> > > >
> > > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter wrote:
> > > >
> > > >> The thing is, you can't do this in drm/scheduler. At least not without
> > > >> splitting up the dma_fence in the kernel into separate memory fences
> > > >> and sync fences
> > > >
> > > >
> > > > I'm starting to think this thread need

Re: [PATCH v2 1/4] fbtft: Replace custom ->reset() with generic one

2021-04-27 Thread Andy Shevchenko

On Tue, Apr 27, 2021 at 2:09 PM Greg Kroah-Hartman
 wrote:
> On Fri, Apr 16, 2021 at 05:20:41PM +0300, Andy Shevchenko wrote:
> > The custom ->reset() repeats the generic one, replace it.
> >
> > Note, in newer kernels the context of the function is a sleeping one,
> > it's fine to switch over to the sleeping functions. Keeping the reset
> > line asserted longer than 20 microseconds is also okay, it's an idling
> > state of the hardware.
> >
> > Fixes: b2ebd4be6fa1 ("staging: fbtft: add fb_agm1264k-fl driver")
>
> What does this "fix"?  A bug or just a "it shouldn't have been done this
> way"?

There is nothing to fix actually, it's rather a pointer where this
change has been missed for some reason. I'll remove the tag.

> And as others pointed out, if you could put "staging: fbtft:" as a
> prefix here, that would be much better.

Got it, thanks!

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH v7 0/4] drm: Move struct drm_device.pdev to legacy


Hi Jani

On 27.04.21 at 14:04, Jani Nikula wrote:
> On Tue, 27 Apr 2021, Thomas Zimmermann wrote:
>> V7 of the patchset fixes some bitrot in the intel driver.
>>
>> The pdev field in struct drm_device points to a PCI device structure and
>> goes back to UMS-only days when all DRM drivers were for PCI devices.
>> Meanwhile we also support USB, SPI and platform devices. Each of those
>> uses the generic device stored in struct drm_device.dev.
>>
>> To reduce duplication and remove the special case of PCI, this patchset
>> converts all modesetting drivers from pdev to dev and makes pdev a field
>> for legacy UMS drivers.
>>
>> For PCI devices, the pointer in struct drm_device.dev can be upcast to
>> struct pci_dev; or tested for PCI with dev_is_pci(). In several places
>> the code can use the dev field directly.
>>
>> After converting all drivers and the DRM core, the pdev field becomes
>> only relevant for legacy drivers. In a later patchset, we may want to
>> convert these as well and remove pdev entirely.
>
> On the series,
>
> Reviewed-by: Jani Nikula
>
> How should we merge these?


Thanks for the quick reply.

There is another pdev patch that I just sent out. [1] It has to go into 
the intel tree. Once it has landed, I want to get this patchset into
drm-misc-next ASAP. Otherwise, drm-tip would stop building.


This should fix things in the correct order and finally remove pdev for 
current drivers.
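
For drivers that still need the PCI device, the conversion boils down to the pattern below. This is only a userspace sketch: the struct layouts are minimal stand-ins, model_get_pdev() is a made-up helper, and dev_is_pci()/to_pci_dev() are written as functions here although the kernel provides them as macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-ins; the real definitions live in the kernel. */
struct bus_type { const char *name; };
struct device { const struct bus_type *bus; };
struct pci_dev { unsigned short vendor; struct device dev; };

static const struct bus_type pci_bus_type = { "pci" };

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A device is a PCI device if it sits on the PCI bus. */
static bool dev_is_pci(const struct device *dev)
{
	return dev->bus == &pci_bus_type;
}

/* Upcast from the embedded generic device to the enclosing PCI device. */
static struct pci_dev *to_pci_dev(struct device *dev)
{
	return container_of(dev, struct pci_dev, dev);
}

/* Hypothetical helper: what a driver does instead of reading drm_device.pdev. */
static struct pci_dev *model_get_pdev(struct device *dev)
{
	return dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
}
```

Drivers that are known to be PCI-only can skip the dev_is_pci() test and upcast directly.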


Best regards
Thomas

[1] 
https://lore.kernel.org/dri-devel/20210427110747.2065-1-tzimmerm...@suse.de/T/#u








>> v7:
>>   * fix instances of pdev that have been added under i915/
>> v6:
>>   * also remove assignment in i915/selftests in later patch (Chris)
>> v5:
>>   * remove assignment in later patch (Chris)
>> v4:
>>   * merged several patches
>>   * moved core changes into separate patch
>>   * vmwgfx build fix
>> v3:
>>   * merged several patches
>>   * fix one pdev reference in nouveau (Jeremy)
>>   * rebases
>> v2:
>>   * move whitespace fixes into separate patches (Alex, Sam)
>>   * move i915 gt/ and gvt/ changes into separate patches (Joonas)
>>
>> Thomas Zimmermann (4):
>>   drm/i915/gt: Remove reference to struct drm_device.pdev
>>   drm/i915: Remove reference to struct drm_device.pdev
>>   drm/i915: Don't assign to struct drm_device.pdev
>>   drm: Move struct drm_device.pdev to legacy section
>>
>>  drivers/gpu/drm/i915/gt/intel_region_lmem.c  | 2 +-
>>  drivers/gpu/drm/i915/i915_drv.c  | 1 -
>>  drivers/gpu/drm/i915/intel_runtime_pm.h  | 2 +-
>>  drivers/gpu/drm/i915/selftests/mock_gem_device.c | 1 -
>>  include/drm/drm_device.h | 6 +++---
>>  5 files changed, 5 insertions(+), 7 deletions(-)
>>
>> --
>> 2.31.1




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer




Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

Ok. I'll interpret this as "yes, it will work, let's do it".

Marek

On Tue., Apr. 27, 2021, 08:06 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Correct, we wouldn't have synchronization between devices with and without
> user queues any more.
>
> That could only be a problem for A+I Laptops.
>
> Memory management will just work with preemption fences which pause the
> user queues of a process before evicting something. That will be a
> dma_fence, but also a well known approach.
>
> Christian.
>
> On 27.04.21 at 13:49, Marek Olšák wrote:
>
> If we don't use future fences for DMA fences at all, e.g. we don't use
> them for memory management, it can work, right? Memory management can
> suspend user queues anytime. It doesn't need to use DMA fences. There might
> be something that I'm missing here.
>
> What would we lose without DMA fences? Just inter-device synchronization?
> I think that might be acceptable.
>
> The only case when the kernel will wait on a future fence is before a page
> flip. Everything today already depends on userspace not hanging the gpu,
> which makes everything a future fence.
>
> Marek
>
> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:
>
>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>> > Thanks everybody. The initial proposal is dead. Here are some thoughts
>> on
>> > how to do it differently.
>> >
>> > I think we can have direct command submission from userspace via
>> > memory-mapped queues ("user queues") without changing window systems.
>> >
>> > The memory management doesn't have to use GPU page faults like HMM.
>> > Instead, it can wait for user queues of a specific process to go idle
>> and
>> > then unmap the queues, so that userspace can't submit anything. Buffer
>> > evictions, pinning, etc. can be executed when all queues are unmapped
>> > (suspended). Thus, no BO fences and page faults are needed.
>> >
>> > Inter-process synchronization can use timeline semaphores. Userspace
>> will
>> > query the wait and signal value for a shared buffer from the kernel. The
>> > kernel will keep a history of those queries to know which process is
>> > responsible for signalling which buffer. There is only the wait-timeout
>> > issue and how to identify the culprit. One of the solutions is to have
>> the
>> > GPU send all GPU signal commands and all timed out wait commands via an
>> > interrupt to the kernel driver to monitor and validate userspace
>> behavior.
>> > With that, it can be identified whether the culprit is the waiting
>> process
>> > or the signalling process and which one. Invalid signal/wait parameters
>> can
>> > also be detected. The kernel can force-signal only the semaphores that
>> time
>> > out, and punish the processes which caused the timeout or used invalid
>> > signal/wait parameters.
>> >
>> > The question is whether this synchronization solution is robust enough
>> for
>> > dma_fence and whatever the kernel and window systems need.
>>
>> The proper model here is the preempt-ctx dma_fence that amdkfd uses
>> (without page faults). That means dma_fence for synchronization is doa, at
>> least as-is, and we're back to figuring out the winsys problem.
>>
>> "We'll solve it with timeouts" is very tempting, but doesn't work. It's
>> akin to saying that we're solving deadlock issues in a locking design by
>> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
>> avoids having to reach the reset button, but that's about it.
>>
>> And the fundamental problem is that once you throw in userspace command
>> submission (and syncing, at least within the userspace driver, otherwise
>> there's kinda no point if you still need the kernel for cross-engine sync)
>> means you get deadlocks if you still use dma_fence for sync under
>> perfectly legit use-case. We've discussed that one ad nauseam last summer:
>>
>>
>> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>>
>> See silly diagram at the bottom.
>>
>> Now I think all isn't lost, because imo the first step to getting to this
>> brave new world is rebuilding the driver on top of userspace fences, and
>> with the adjusted cmd submit model. You probably don't want to use amdkfd,
>> but port that as a context flag or similar to render nodes for gl/vk. Of
>> course that means you can only use this mode in headless, without
>> glx/wayland winsys support, but it's a start.
>> -Daniel
>>
>> >
>> > Marek
>> >
>> > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone 
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter  wrote:
>> > >
>> > >> The thing is, you can't do this in drm/scheduler. At least not
>> without
>> > >> splitting up the dma_fence in the kernel into separate memory fences
>> > >> and sync fences
>> > >
>> > >
>> > > I'm starting to think this thread needs its own glossary ...
>> > >
>> > > I propose we use 'residency fence' for execution fences which enact
>> > > memory-residency

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

On Tue, Apr 27, 2021 at 1:49 PM Marek Olšák  wrote:
>
> If we don't use future fences for DMA fences at all, e.g. we don't use them 
> for memory management, it can work, right? Memory management can suspend user 
> queues anytime. It doesn't need to use DMA fences. There might be something 
> that I'm missing here.

Other drivers use dma_fence for their memory management. So unless
you've converted them all over to the dma_fence/memory fence split,
dma_fence fences stay memory fences. In theory this is possible, but
maybe not if you want to complete the job this decade :-)

> What would we lose without DMA fences? Just inter-device synchronization? I 
> think that might be acceptable.
>
> The only case when the kernel will wait on a future fence is before a page 
> flip. Everything today already depends on userspace not hanging the gpu, 
> which makes everything a future fence.

That's not quite what we defined as future fences, because tdr
guarantees those complete, even if userspace hangs. It's when you put
userspace fence waits into the cs buffer you've submitted to the
kernel (or directly to hw) where the "real" future fences kick in.
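
To make the distinction concrete, a userspace fence wait is roughly the following. Sketch only: struct user_fence and both helpers are made-up names, and the ">=" comparison is the usual timeline convention, assumed here rather than defined anywhere in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace fence: a 64-bit timeline value in shared memory. */
struct user_fence {
	_Atomic uint64_t *addr;	/* written by whoever signals */
	uint64_t wait_value;	/* signalled once *addr >= wait_value */
};

/* Non-blocking check of the timeline value against the wait value. */
static bool user_fence_signalled(const struct user_fence *f)
{
	return atomic_load_explicit(f->addr, memory_order_acquire) >=
	       f->wait_value;
}

/*
 * Bounded poll. Nothing here guarantees that the signaller ever runs,
 * so the only way out of a missing signal is the retry limit.
 */
static bool user_fence_wait(const struct user_fence *f, unsigned long max_polls)
{
	while (max_polls--) {
		if (user_fence_signalled(f))
			return true;
	}
	return false;
}
```

The kernel has no job behind addr that tdr could kill to force completion, which is what makes this a "real" future fence.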
-Daniel

>
> Marek
>
> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:
>>
>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>> > Thanks everybody. The initial proposal is dead. Here are some thoughts on
>> > how to do it differently.
>> >
>> > I think we can have direct command submission from userspace via
>> > memory-mapped queues ("user queues") without changing window systems.
>> >
>> > The memory management doesn't have to use GPU page faults like HMM.
>> > Instead, it can wait for user queues of a specific process to go idle and
>> > then unmap the queues, so that userspace can't submit anything. Buffer
>> > evictions, pinning, etc. can be executed when all queues are unmapped
>> > (suspended). Thus, no BO fences and page faults are needed.
>> >
>> > Inter-process synchronization can use timeline semaphores. Userspace will
>> > query the wait and signal value for a shared buffer from the kernel. The
>> > kernel will keep a history of those queries to know which process is
>> > responsible for signalling which buffer. There is only the wait-timeout
>> > issue and how to identify the culprit. One of the solutions is to have the
>> > GPU send all GPU signal commands and all timed out wait commands via an
>> > interrupt to the kernel driver to monitor and validate userspace behavior.
>> > With that, it can be identified whether the culprit is the waiting process
>> > or the signalling process and which one. Invalid signal/wait parameters can
>> > also be detected. The kernel can force-signal only the semaphores that time
>> > out, and punish the processes which caused the timeout or used invalid
>> > signal/wait parameters.
>> >
>> > The question is whether this synchronization solution is robust enough for
>> > dma_fence and whatever the kernel and window systems need.
>>
>> The proper model here is the preempt-ctx dma_fence that amdkfd uses
>> (without page faults). That means dma_fence for synchronization is doa, at
>> least as-is, and we're back to figuring out the winsys problem.
>>
>> "We'll solve it with timeouts" is very tempting, but doesn't work. It's
>> akin to saying that we're solving deadlock issues in a locking design by
>> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
>> avoids having to reach the reset button, but that's about it.
>>
>> And the fundamental problem is that once you throw in userspace command
>> submission (and syncing, at least within the userspace driver, otherwise
>> there's kinda no point if you still need the kernel for cross-engine sync)
>> means you get deadlocks if you still use dma_fence for sync under
>> perfectly legit use-case. We've discussed that one ad nauseam last summer:
>>
>> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>>
>> See silly diagram at the bottom.
>>
>> Now I think all isn't lost, because imo the first step to getting to this
>> brave new world is rebuilding the driver on top of userspace fences, and
>> with the adjusted cmd submit model. You probably don't want to use amdkfd,
>> but port that as a context flag or similar to render nodes for gl/vk. Of
>> course that means you can only use this mode in headless, without
>> glx/wayland winsys support, but it's a start.
>> -Daniel
>>
>> >
>> > Marek
>> >
>> > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone  wrote:
>> >
>> > > Hi,
>> > >
>> > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter  wrote:
>> > >
>> > >> The thing is, you can't do this in drm/scheduler. At least not without
>> > >> splitting up the dma_fence in the kernel into separate memory fences
>> > >> and sync fences
>> > >
>> > >
>> > > I'm starting to think this thread needs its own glossary ...
>> > >
>> > > I propose we use 'residency fence' for execution fences which enact
>> > > memory-residency op

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák  wrote:
> Ok. I'll interpret this as "yes, it will work, let's do it".

It works if all you care about is drm/amdgpu. I'm not sure that's a
reasonable approach for upstream, but it definitely is an approach :-)

We've already gone somewhat through the pain of drm/amdgpu redefining
how implicit sync works without sufficiently talking with other
people, maybe we should avoid a repeat of this ...
-Daniel

>
> Marek
>
> On Tue., Apr. 27, 2021, 08:06 Christian König, 
>  wrote:
>>
>> Correct, we wouldn't have synchronization between devices with and without
>> user queues any more.
>>
>> That could only be a problem for A+I Laptops.
>>
>> Memory management will just work with preemption fences which pause the user 
>> queues of a process before evicting something. That will be a dma_fence, but 
>> also a well known approach.
>>
>> Christian.
>>
>> On 27.04.21 at 13:49, Marek Olšák wrote:
>>
>> If we don't use future fences for DMA fences at all, e.g. we don't use them 
>> for memory management, it can work, right? Memory management can suspend 
>> user queues anytime. It doesn't need to use DMA fences. There might be 
>> something that I'm missing here.
>>
>> What would we lose without DMA fences? Just inter-device synchronization? I 
>> think that might be acceptable.
>>
>> The only case when the kernel will wait on a future fence is before a page 
>> flip. Everything today already depends on userspace not hanging the gpu, 
>> which makes everything a future fence.
>>
>> Marek
>>
>> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:
>>>
>>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>>> > Thanks everybody. The initial proposal is dead. Here are some thoughts on
>>> > how to do it differently.
>>> >
>>> > I think we can have direct command submission from userspace via
>>> > memory-mapped queues ("user queues") without changing window systems.
>>> >
>>> > The memory management doesn't have to use GPU page faults like HMM.
>>> > Instead, it can wait for user queues of a specific process to go idle and
>>> > then unmap the queues, so that userspace can't submit anything. Buffer
>>> > evictions, pinning, etc. can be executed when all queues are unmapped
>>> > (suspended). Thus, no BO fences and page faults are needed.
>>> >
>>> > Inter-process synchronization can use timeline semaphores. Userspace will
>>> > query the wait and signal value for a shared buffer from the kernel. The
>>> > kernel will keep a history of those queries to know which process is
>>> > responsible for signalling which buffer. There is only the wait-timeout
>>> > issue and how to identify the culprit. One of the solutions is to have the
>>> > GPU send all GPU signal commands and all timed out wait commands via an
>>> > interrupt to the kernel driver to monitor and validate userspace behavior.
>>> > With that, it can be identified whether the culprit is the waiting process
>>> > or the signalling process and which one. Invalid signal/wait parameters 
>>> > can
>>> > also be detected. The kernel can force-signal only the semaphores that 
>>> > time
>>> > out, and punish the processes which caused the timeout or used invalid
>>> > signal/wait parameters.
>>> >
>>> > The question is whether this synchronization solution is robust enough for
>>> > dma_fence and whatever the kernel and window systems need.
>>>
>>> The proper model here is the preempt-ctx dma_fence that amdkfd uses
>>> (without page faults). That means dma_fence for synchronization is doa, at
>>> least as-is, and we're back to figuring out the winsys problem.
>>>
>>> "We'll solve it with timeouts" is very tempting, but doesn't work. It's
>>> akin to saying that we're solving deadlock issues in a locking design by
>>> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
>>> avoids having to reach the reset button, but that's about it.
>>>
>>> And the fundamental problem is that once you throw in userspace command
>>> submission (and syncing, at least within the userspace driver, otherwise
>>> there's kinda no point if you still need the kernel for cross-engine sync)
>>> means you get deadlocks if you still use dma_fence for sync under
>>> perfectly legit use-case. We've discussed that one ad nauseam last summer:
>>>
>>> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>>>
>>> See silly diagram at the bottom.
>>>
>>> Now I think all isn't lost, because imo the first step to getting to this
>>> brave new world is rebuilding the driver on top of userspace fences, and
>>> with the adjusted cmd submit model. You probably don't want to use amdkfd,
>>> but port that as a context flag or similar to render nodes for gl/vk. Of
>>> course that means you can only use this mode in headless, without
>>> glx/wayland winsys support, but it's a start.
>>> -Daniel
>>>
>>> >
>>> > Marek
>>> >
>>> > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone  wrote:
>>> >
>>> > > H

Re: [PATCH v2] drm/bochs: Add screen blanking support




On 27.04.21 at 11:56, Gerd Hoffmann wrote:
>>> I'm fine to change in any better way, of course, so feel free to
>>> modify the patch.
>>
>> If no one objects, I'll merge it as-is. It's somewhat wrong wrt VGA, but
>> apparently what qemu wants.
>
> No objections.
>
> Acked-by: Gerd Hoffmann

Great. Merged now. Thanks everyone.

> FYI: cirrus is in the same situation, the modesetting works with qemu
> but is possibly incomplete and might not work on real cirrus hardware
> (it only binds to the qemu subsystem id for that reason).
>
> take care,
>   Gerd

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Felix Imendörffer




Re: [PATCH 8/8] drm/modifiers: Enforce consistency between the cap an IN_FORMATS

On Tue, Apr 27, 2021 at 12:32:19PM +0100, Emil Velikov wrote:
> Hi Daniel,
> 
> On Tue, 27 Apr 2021 at 10:20, Daniel Vetter  wrote:
> 
> > @@ -360,6 +373,9 @@ static int __drm_universal_plane_init(struct drm_device *dev,
> >   * drm_universal_plane_init() to let the DRM managed resource infrastructure
> >   * take care of cleanup and deallocation.
> >   *
> > + * Drivers supporting modifiers must set @format_modifiers on all their planes,
> > + * even those that only support DRM_FORMAT_MOD_LINEAR.
> > + *
> The comment says "must", yet we have an "if (format_modifiers)" in the codebase.
> Shouldn't we add a WARN_ON() + return -EINVAL (or similar) so people
> can see and fix their drivers?

This is a must only for drivers supporting modifiers, not all drivers.
Hence the check in the if. I did add WARN_ON for the combos that get stuff
wrong though (like only supply one side of the modifier info, not both).
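
As a rough illustration of the "all planes are the same" validation mentioned
here, the following is a minimal userspace sketch, not the code from the patch.
struct plane_sketch and planes_modifiers_consistent() are hypothetical names
invented for this example, and a NULL list is assumed to mean "this plane
supplies no modifiers".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified plane: NULL means "no modifier list supplied". */
struct plane_sketch {
	const uint64_t *format_modifiers;
};

/*
 * Returns true when either every plane supplies a modifier list or none
 * does. A mixed setup is the inconsistent case that would deserve a warning.
 */
static bool planes_modifiers_consistent(const struct plane_sketch *planes,
					size_t n)
{
	size_t with_modifiers = 0;

	assert(planes != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (planes[i].format_modifiers)
			with_modifiers++;

	return with_modifiers == 0 || with_modifiers == n;
}
```

A driver without modifier support passes trivially (no plane has a list), which
matches the point that the "must" only applies to drivers supporting modifiers.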

> As a follow-up one could even go a step further, by erroring out when
> the driver hasn't provided valid modifier(s) and even removing
> config::allow_fb_modifiers altogether.

Well that currently only exists to avoid walking the plane list (which we
need to do for validation that all planes are the same). It's quite tricky
code for tiny benefit, so I don't think it's worth it trying to remove
allow_fb_modifiers completely.

> Although for stable - this series + WARN_ON (no return since it might
> break buggy drivers) sounds good.
> 
> > @@ -909,6 +909,8 @@ struct drm_mode_config {
> >  * @allow_fb_modifiers:
> >  *
> >  * Whether the driver supports fb modifiers in the ADDFB2.1 ioctl call.
> > +* Note that drivers should not set this directly, it is automatically
> > +* set in drm_universal_plane_init().
> >  *
> >  * IMPORTANT:
> >  *
> The new note and the existing IMPORTANT are in a weird mix.
> Quoting the latter since it doesn't show in the diff.
> 
> If this is set the driver must fill out the full implicit modifier
> information in their &drm_mode_config_funcs.fb_create hook for legacy
> userspace which does not set modifiers. Otherwise the GETFB2 ioctl is
> broken for modifier aware userspace.
> 
> In particular:
> As the new note says "don't set it" and the existing one says "if
> it's set". Yet no drivers do "if (config->allow_fb_modifiers)".
> 
> Sadly, nothing comes to mind atm wrt alternative wording.

Yeah it's a bit disappointing.

> With the WARN_ON() added or s/must/should/ in the documentation, the series is:

With my clarification, can you please recheck whether the patch isn't
already correct as-is?

Thanks, Daniel

> Reviewed-by: Emil Velikov 
> 
> HTH
> -Emil

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v3] drm/drm_file.c: Define drm_send_event_helper() as 'static'

On Tue, Apr 27, 2021 at 12:55:03PM +0200, Fabio M. De Francesco wrote:
> drm_send_event_helper() has no prototype and is only used within this
> file, therefore it should be defined with storage class 'static'.
> 
> Signed-off-by: Fabio M. De Francesco 

Applied to drm-misc-next for 5.14, thanks for your patch.
-Daniel

> ---
> 
> Changes from v2: Removed all the other lines in function comment.
> Changes from v1: As suggested by Daniel Vetter, removed unnecessary
> kernel-doc comments.
> 
>  drivers/gpu/drm/drm_file.c | 14 +-
>  1 file changed, 1 insertion(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 7efbccffc2ea..d4f0bac6f8f8 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -774,19 +774,7 @@ void drm_event_cancel_free(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_event_cancel_free);
>  
> -/**
> - * drm_send_event_helper - send DRM event to file descriptor
> - * @dev: DRM device
> - * @e: DRM event to deliver
> - * @timestamp: timestamp to set for the fence event in kernel's CLOCK_MONOTONIC
> - * time domain
> - *
> - * This helper function sends the event @e, initialized with
> - * drm_event_reserve_init(), to its associated userspace DRM file.
> - * The timestamp variant of dma_fence_signal is used when the caller
> - * sends a valid timestamp.
> - */
> -void drm_send_event_helper(struct drm_device *dev,
> +static void drm_send_event_helper(struct drm_device *dev,
>  struct drm_pending_event *e, ktime_t timestamp)
>  {
>   assert_spin_locked(&dev->event_lock);
> -- 
> 2.31.1
> 
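
For readers less familiar with the linkage argument in the commit message, here
is a minimal userspace sketch of the same pattern. The names pending_event,
send_event_helper(), send_event() and send_event_timestamp() are hypothetical
stand-ins made up for this example; they are not the DRM functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; not the kernel's struct drm_pending_event. */
struct pending_event {
	int64_t timestamp_ns;	/* 0 means "no timestamp supplied" */
	int delivered;
};

/* Prototypes for the public entry points; these would live in a header. */
void send_event(struct pending_event *e);
void send_event_timestamp(struct pending_event *e, int64_t timestamp_ns);

/*
 * File-local helper: 'static' gives it internal linkage, so it needs no
 * prototype in a header and cannot clash with symbols in other files.
 */
static void send_event_helper(struct pending_event *e, int64_t timestamp_ns)
{
	assert(e != NULL);
	e->timestamp_ns = timestamp_ns;
	e->delivered = 1;
}

/* Public entry points (external linkage) that share the helper. */
void send_event(struct pending_event *e)
{
	send_event_helper(e, 0);
}

void send_event_timestamp(struct pending_event *e, int64_t timestamp_ns)
{
	send_event_helper(e, timestamp_ns);
}
```

Without 'static', a compiler run with -Wmissing-prototypes would typically flag
the helper, which is the kind of warning this sort of patch silences.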

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

On 27.04.21 at 14:15, Daniel Vetter wrote:

On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák wrote:

Ok. I'll interpret this as "yes, it will work, let's do it".

It works if all you care about is drm/amdgpu. I'm not sure that's a
reasonable approach for upstream, but it definitely is an approach :-)

We've already gone somewhat through the pain of drm/amdgpu redefining
how implicit sync works without sufficiently talking with other
people, maybe we should avoid a repeat of this ...

BTW: This is coming up again for the plan here.

We once more need to think about the "other" fences which don't
participate in the implicit sync here.

Christian.

-Daniel

Marek

On Tue., Apr. 27, 2021, 08:06 Christian König,
wrote:

Correct, we wouldn't have synchronization between devices with and without user
queues any more.

That could only be a problem for A+I Laptops.

Memory management will just work with preemption fences which pause the user
queues of a process before evicting something. That will be a dma_fence, but
also a well known approach.

Christian.

On 27.04.21 at 13:49, Marek Olšák wrote:

If we don't use future fences for DMA fences at all, e.g. we don't use them for
memory management, it can work, right? Memory management can suspend user
queues anytime. It doesn't need to use DMA fences. There might be something
that I'm missing here.

What would we lose without DMA fences? Just inter-device synchronization? I
think that might be acceptable.

The only case when the kernel will wait on a future fence is before a page
flip. Everything today already depends on userspace not hanging the gpu, which
makes everything a future fence.

Marek

On Tue., Apr. 27, 2021, 04:02 Daniel Vetter, wrote:

On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:

Thanks everybody. The initial proposal is dead. Here are some thoughts on
how to do it differently.

I think we can have direct command submission from userspace via
memory-mapped queues ("user queues") without changing window systems.

The memory management doesn't have to use GPU page faults like HMM.
Instead, it can wait for user queues of a specific process to go idle and
then unmap the queues, so that userspace can't submit anything. Buffer
evictions, pinning, etc. can be executed when all queues are unmapped
(suspended). Thus, no BO fences and page faults are needed.

Inter-process synchronization can use timeline semaphores. Userspace will
query the wait and signal value for a shared buffer from the kernel. The
kernel will keep a history of those queries to know which process is
responsible for signalling which buffer. There is only the wait-timeout
issue and how to identify the culprit. One of the solutions is to have the
GPU send all GPU signal commands and all timed out wait commands via an
interrupt to the kernel driver to monitor and validate userspace behavior.
With that, it can be identified whether the culprit is the waiting process
or the signalling process and which one. Invalid signal/wait parameters can
also be detected. The kernel can force-signal only the semaphores that time
out, and punish the processes which caused the timeout or used invalid
signal/wait parameters.

The question is whether this synchronization solution is robust enough for
dma_fence and whatever the kernel and window systems need.

The proper model here is the preempt-ctx dma_fence that amdkfd uses
(without page faults). That means dma_fence for synchronization is doa, at
least as-is, and we're back to figuring out the winsys problem.

"We'll solve it with timeouts" is very tempting, but doesn't work. It's
akin to saying that we're solving deadlock issues in a locking design by
doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
avoids having to reach the reset button, but that's about it.

And the fundamental problem is that once you throw in userspace command
submission (and syncing, at least within the userspace driver, otherwise
there's kinda no point if you still need the kernel for cross-engine sync)
means you get deadlocks if you still use dma_fence for sync under
perfectly legit use-case. We've discussed that one ad nauseam last summer:

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences

See silly diagram at the bottom.

Now I think all isn't lost, because imo the first step to getting to this
brave new world is rebuilding the driver on top of userspace fences, and
with the adjusted cmd submit model. You probably don't want to use amdkfd,
but port that as a context flag or similar to render nodes for gl/vk. Of
course that means you can only use this mode in headless, without
glx/wayland winsys support, but it's a start.
-Daniel

Marek

On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone wrote:

Hi,

On Tue, 20 Apr 2021 at 20:30, Daniel Vetter wrote:

The thing is, you can't do this in drm/scheduler. At least not without
splitting up the dma_fence in the kernel into separate mem

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

I'll defer to Christian and Alex to decide whether dropping sync with
non-amd devices (GPUs, cameras etc.) is acceptable.

Rewriting those drivers to this new sync model could be done on a case by
case basis.

For now, would we only lose the "amd -> external" dependency? Or the
"external -> amd" dependency too?

Marek

On Tue., Apr. 27, 2021, 08:15 Daniel Vetter,  wrote:

> On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák  wrote:
> > Ok. I'll interpret this as "yes, it will work, let's do it".
>
> It works if all you care about is drm/amdgpu. I'm not sure that's a
> reasonable approach for upstream, but it definitely is an approach :-)
>
> We've already gone somewhat through the pain of drm/amdgpu redefining
> how implicit sync works without sufficiently talking with other
> people, maybe we should avoid a repeat of this ...
> -Daniel
>
> >
> > Marek
> >
> > On Tue., Apr. 27, 2021, 08:06 Christian König, <
> ckoenig.leichtzumer...@gmail.com> wrote:
> >>
> >> Correct, we wouldn't have synchronization between device with and
> without user queues any more.
> >>
> >> That could only be a problem for A+I Laptops.
> >>
> >> Memory management will just work with preemption fences which pause the
> user queues of a process before evicting something. That will be a
> dma_fence, but also a well known approach.
> >>
> >> Christian.
> >>
> >> On 27.04.21 at 13:49, Marek Olšák wrote:
> >>
> >> If we don't use future fences for DMA fences at all, e.g. we don't use
> them for memory management, it can work, right? Memory management can
> suspend user queues anytime. It doesn't need to use DMA fences. There might
> be something that I'm missing here.
> >>
> >> What would we lose without DMA fences? Just inter-device
> synchronization? I think that might be acceptable.
> >>
> >> The only case when the kernel will wait on a future fence is before a
> page flip. Everything today already depends on userspace not hanging the
> gpu, which makes everything a future fence.
> >>
> >> Marek
> >>
> >> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:
> >>>
> >>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> >>> > Thanks everybody. The initial proposal is dead. Here are some
> thoughts on
> >>> > how to do it differently.
> >>> >
> >>> > I think we can have direct command submission from userspace via
> >>> > memory-mapped queues ("user queues") without changing window systems.
> >>> >
> >>> > The memory management doesn't have to use GPU page faults like HMM.
> >>> > Instead, it can wait for user queues of a specific process to go
> idle and
> >>> > then unmap the queues, so that userspace can't submit anything.
> Buffer
> >>> > evictions, pinning, etc. can be executed when all queues are unmapped
> >>> > (suspended). Thus, no BO fences and page faults are needed.
> >>> >
> >>> > Inter-process synchronization can use timeline semaphores. Userspace
> will
> >>> > query the wait and signal value for a shared buffer from the kernel.
> The
> >>> > kernel will keep a history of those queries to know which process is
> >>> > responsible for signalling which buffer. There is only the
> wait-timeout
> >>> > issue and how to identify the culprit. One of the solutions is to
> have the
> >>> > GPU send all GPU signal commands and all timed out wait commands via
> an
> >>> > interrupt to the kernel driver to monitor and validate userspace
> behavior.
> >>> > With that, it can be identified whether the culprit is the waiting
> process
> >>> > or the signalling process and which one. Invalid signal/wait
> parameters can
> >>> > also be detected. The kernel can force-signal only the semaphores
> that time
> >>> > out, and punish the processes which caused the timeout or used
> invalid
> >>> > signal/wait parameters.
> >>> >
> >>> > The question is whether this synchronization solution is robust
> enough for
> >>> > dma_fence and whatever the kernel and window systems need.
> >>>
> >>> The proper model here is the preempt-ctx dma_fence that amdkfd uses
> >>> (without page faults). That means dma_fence for synchronization is
> doa, at
> >>> least as-is, and we're back to figuring out the winsys problem.
> >>>
> >>> "We'll solve it with timeouts" is very tempting, but doesn't work. It's
> >>> akin to saying that we're solving deadlock issues in a locking design
> by
> >>> doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
> >>> avoids having to reach the reset button, but that's about it.
> >>>
> >>> And the fundamental problem is that once you throw in userspace command
> >>> submission (and syncing, at least within the userspace driver,
> otherwise
> >>> there's kinda no point if you still need the kernel for cross-engine
> sync)
> >>> means you get deadlocks if you still use dma_fence for sync under
> >>> perfectly legit use-case. We've discussed that one ad nauseam last
> summer:
> >>>
> >>>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
> >>>
> >>> See

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal


Only amd -> external.

We can easily install something in a user queue which waits for a 
dma_fence in the kernel.


But we can't easily wait for a user queue as a dependency of a dma_fence.

The good thing is we have this wait-before-signal case on Vulkan 
timeline semaphores which have the same problem in the kernel.


The good news is I think we can relatively easily convert i915 and older 
amdgpu devices to something which is compatible with user fences.


So yes, getting that fixed case by case should work.

Christian

On 27.04.21 at 14:46, Marek Olšák wrote:
I'll defer to Christian and Alex to decide whether dropping sync with 
non-amd devices (GPUs, cameras etc.) is acceptable.


Rewriting those drivers to this new sync model could be done on a case 
by case basis.


For now, would we only lose the "amd -> external" dependency? Or the 
"external -> amd" dependency too?


Marek

On Tue., Apr. 27, 2021, 08:15 Daniel Vetter wrote:


On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák <mar...@gmail.com> wrote:
> Ok. I'll interpret this as "yes, it will work, let's do it".

It works if all you care about is drm/amdgpu. I'm not sure that's a
reasonable approach for upstream, but it definitely is an approach :-)

We've already gone somewhat through the pain of drm/amdgpu redefining
how implicit sync works without sufficiently talking with other
people, maybe we should avoid a repeat of this ...
-Daniel

>
> Marek
>
> On Tue., Apr. 27, 2021, 08:06 Christian König <ckoenig.leichtzumer...@gmail.com> wrote:
>>
>> Correct, we wouldn't have synchronization between device with
and without user queues any more.
>>
>> That could only be a problem for A+I Laptops.
>>
>> Memory management will just work with preemption fences which
pause the user queues of a process before evicting something. That
will be a dma_fence, but also a well known approach.
>>
>> Christian.
>>
>> On 27.04.21 at 13:49, Marek Olšák wrote:
>>
>> If we don't use future fences for DMA fences at all, e.g. we
don't use them for memory management, it can work, right? Memory
management can suspend user queues anytime. It doesn't need to use
DMA fences. There might be something that I'm missing here.
>>
>> What would we lose without DMA fences? Just inter-device
synchronization? I think that might be acceptable.
>>
>> The only case when the kernel will wait on a future fence is
before a page flip. Everything today already depends on userspace
not hanging the gpu, which makes everything a future fence.
>>
>> Marek
>>
>> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter <dan...@ffwll.ch> wrote:
>>>
>>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>>> > Thanks everybody. The initial proposal is dead. Here are
some thoughts on
>>> > how to do it differently.
>>> >
>>> > I think we can have direct command submission from userspace via
>>> > memory-mapped queues ("user queues") without changing window
systems.
>>> >
>>> > The memory management doesn't have to use GPU page faults
like HMM.
>>> > Instead, it can wait for user queues of a specific process
to go idle and
>>> > then unmap the queues, so that userspace can't submit
anything. Buffer
>>> > evictions, pinning, etc. can be executed when all queues are
unmapped
>>> > (suspended). Thus, no BO fences and page faults are needed.
>>> >
>>> > Inter-process synchronization can use timeline semaphores.
Userspace will
>>> > query the wait and signal value for a shared buffer from the
kernel. The
>>> > kernel will keep a history of those queries to know which
process is
>>> > responsible for signalling which buffer. There is only the
wait-timeout
>>> > issue and how to identify the culprit. One of the solutions
is to have the
>>> > GPU send all GPU signal commands and all timed out wait
commands via an
>>> > interrupt to the kernel driver to monitor and validate
userspace behavior.
>>> > With that, it can be identified whether the culprit is the
waiting process
>>> > or the signalling process and which one. Invalid signal/wait
parameters can
>>> > also be detected. The kernel can force-signal only the
semaphores that time
>>> > out, and punish the processes which caused the timeout or
used invalid
>>> > signal/wait parameters.
>>> >
>>> > The question is whether this synchronization solution is
robust enough for
>>> > dma_fence and whatever the kernel and window systems need.
>>>
>>> The proper model here is the preempt-ctx dma_fence that amdkfd
uses
>>> (without page faults). That means dma_fence for
synchronization is doa, at
>>> least as-is, and we're back to figuring out the winsy

Re: [PATCH] drm/i915/gem: Remove reference to struct drm_device.pdev

On Tue, 27 Apr 2021, Thomas Zimmermann  wrote:
> References to struct drm_device.pdev should not be used any longer as
> the field will be moved into the struct's legacy section. Add a fix
> for the respective commit.
>
> Signed-off-by: Thomas Zimmermann 
> Fixes: d57d4a1daf5e ("drm/i915: Create stolen memory region from local memory")
> Cc: CQ Tang 
> Cc: Matthew Auld 
> Cc: Tvrtko Ursulin 
> Cc: Xinyun Liu 
> Cc: Tvrtko Ursulin 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: Chris Wilson 
> Cc: Mika Kuoppala 
> Cc: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: "Thomas Hellström" 
> Cc: "Gustavo A. R. Silva" 
> Cc: Dan Carpenter 
> Cc: intel-...@lists.freedesktop.org

Reviewed-by: Jani Nikula 

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index c5b64b2400e8..e1a32672bbe8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -773,7 +773,7 @@ struct intel_memory_region *
>  i915_gem_stolen_lmem_setup(struct drm_i915_private *i915)
>  {
>   struct intel_uncore *uncore = &i915->uncore;
> - struct pci_dev *pdev = i915->drm.pdev;
> + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>   struct intel_memory_region *mem;
>   resource_size_t io_start;
>   resource_size_t lmem_size;
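
The one-line change above relies on the PCI device structure embedding the
generic device, so the generic pointer can be converted back. A minimal
userspace sketch of that idea follows; struct device_sketch, struct
pci_dev_sketch and to_pci_dev_sketch() are simplified, hypothetical stand-ins
for the kernel types and helper, and the macro mirrors the usual container_of()
idiom rather than reproducing the kernel's exact definition.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's generic struct device. */
struct device_sketch {
	const char *bus_name;
};

/* Simplified stand-in for struct pci_dev: it embeds the generic device. */
struct pci_dev_sketch {
	unsigned short vendor;
	struct device_sketch dev;
};

/* Userspace version of the container_of() idiom. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing PCI structure from its embedded generic device. */
static struct pci_dev_sketch *to_pci_dev_sketch(struct device_sketch *d)
{
	assert(d != NULL);
	return container_of_sketch(d, struct pci_dev_sketch, dev);
}
```

The conversion is only valid when the generic device really is embedded in a
PCI device, which is why the cover letter for the pdev series also mentions
testing with dev_is_pci().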

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH v7 0/4] drm: Move struct drm_device.pdev to legacy

On Tue, 27 Apr 2021, Thomas Zimmermann  wrote:
> Hi Jani
>
> On 27.04.21 at 14:04, Jani Nikula wrote:
>> On Tue, 27 Apr 2021, Thomas Zimmermann  wrote:
>>> V7 of the patchset fixes some bitrot in the intel driver.
>>>
>>> The pdev field in struct drm_device points to a PCI device structure and
>>> goes back to UMS-only days when all DRM drivers were for PCI devices.
>>> Meanwhile we also support USB, SPI and platform devices. Each of those
>>> uses the generic device stored in struct drm_device.dev.
>>>
>>> To reduce duplication and remove the special case of PCI, this patchset
>>> converts all modesetting drivers from pdev to dev and makes pdev a field
>>> for legacy UMS drivers.
>>>
>>> For PCI devices, the pointer in struct drm_device.dev can be upcast to
>>> struct pci_dev; or tested for PCI with dev_is_pci(). In several places
>>> the code can use the dev field directly.
>>>
>>> After converting all drivers and the DRM core, the pdev field becomes
>>> only relevant for legacy drivers. In a later patchset, we may want to
>>> convert these as well and remove pdev entirely.
>> 
>> On the series,
>> 
>> Reviewed-by: Jani Nikula 
>> 
>> How should we merge these?
>
> Thanks for the quick reply.
>
> There is another pdev patch that I just sent out. [1] It has to go into 
> the intel tree. After it landed, I want to get this patchset into 
> drm-misc-next ASAP. Otherwise, drm-tip would stop building.

On merging the series via drm-misc-next,

Acked-by: Jani Nikula 

>
> This should fix things in the correct order and finally remove pdev for 
> current drivers.
>
> Best regards
> Thomas
>
> [1] 
> https://lore.kernel.org/dri-devel/20210427110747.2065-1-tzimmerm...@suse.de/T/#u
>
>> 
>> 
>> 
>>>
>>> v7:
>>> * fix instances of pdev that have been added under i915/
>>> v6:
>>> * also remove assignment in i915/selftests in later patch (Chris)
>>> v5:
>>> * remove assignment in later patch (Chris)
>>> v4:
>>> * merged several patches
>>> * moved core changes into separate patch
>>> * vmwgfx build fix
>>> v3:
>>> * merged several patches
>>> * fix one pdev reference in nouveau (Jeremy)
>>> * rebases
>>> v2:
>>> * move whitespace fixes into separate patches (Alex, Sam)
>>> * move i915 gt/ and gvt/ changes into separate patches (Joonas)
>>>
>>> Thomas Zimmermann (4):
>>>drm/i915/gt: Remove reference to struct drm_device.pdev
>>>drm/i915: Remove reference to struct drm_device.pdev
>>>drm/i915: Don't assign to struct drm_device.pdev
>>>drm: Move struct drm_device.pdev to legacy section
>>>
>>>   drivers/gpu/drm/i915/gt/intel_region_lmem.c  | 2 +-
>>>   drivers/gpu/drm/i915/i915_drv.c  | 1 -
>>>   drivers/gpu/drm/i915/intel_runtime_pm.h  | 2 +-
>>>   drivers/gpu/drm/i915/selftests/mock_gem_device.c | 1 -
>>>   include/drm/drm_device.h | 6 +++---
>>>   5 files changed, 5 insertions(+), 7 deletions(-)
>>>
>>> --
>>> 2.31.1
>>>
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

Ok. So that would only make the following use cases broken for now:
- amd render -> external gpu
- amd video encode -> network device

What about the case when we get a buffer from an external device and we're
supposed to make it "busy" when we are using it, and the external device
wants to wait until we stop using it? Is it something that can happen, thus
turning "external -> amd" into "external <-> amd"?

Marek

On Tue., Apr. 27, 2021, 08:50 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Only amd -> external.
>
> We can easily install something in an user queue which waits for a
> dma_fence in the kernel.
>
> But we can't easily wait for an user queue as dependency of a dma_fence.
>
> The good thing is we have this wait before signal case on Vulkan timeline
> semaphores which have the same problem in the kernel.
>
> The good news is I think we can relatively easily convert i915 and older
> amdgpu device to something which is compatible with user fences.
>
> So yes, getting that fixed case by case should work.
>
> Christian
>
> On 27.04.21 at 14:46, Marek Olšák wrote:
>
> I'll defer to Christian and Alex to decide whether dropping sync with
> non-amd devices (GPUs, cameras etc.) is acceptable.
>
> Rewriting those drivers to this new sync model could be done on a case by
> case basis.
>
> For now, would we only lose the "amd -> external" dependency? Or the
> "external -> amd" dependency too?
>
> Marek
>
> On Tue., Apr. 27, 2021, 08:15 Daniel Vetter,  wrote:
>
>> On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák  wrote:
>> > Ok. I'll interpret this as "yes, it will work, let's do it".
>>
>> It works if all you care about is drm/amdgpu. I'm not sure that's a
>> reasonable approach for upstream, but it definitely is an approach :-)
>>
>> We've already gone somewhat through the pain of drm/amdgpu redefining
>> how implicit sync works without sufficiently talking with other
>> people, maybe we should avoid a repeat of this ...
>> -Daniel
>>
>> >
>> > Marek
>> >
>> > On Tue., Apr. 27, 2021, 08:06 Christian König, <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>> >>
>> >> Correct, we wouldn't have synchronization between device with and
>> without user queues any more.
>> >>
>> >> That could only be a problem for A+I Laptops.
>> >>
>> >> Memory management will just work with preemption fences which pause
>> the user queues of a process before evicting something. That will be a
>> dma_fence, but also a well known approach.
>> >>
>> >> Christian.
>> >>
>> >> On 27.04.21 at 13:49, Marek Olšák wrote:
>> >>
>> >> If we don't use future fences for DMA fences at all, e.g. we don't use
>> them for memory management, it can work, right? Memory management can
>> suspend user queues anytime. It doesn't need to use DMA fences. There might
>> be something that I'm missing here.
>> >>
>> >> What would we lose without DMA fences? Just inter-device
>> synchronization? I think that might be acceptable.
>> >>
>> >> The only case when the kernel will wait on a future fence is before a
>> page flip. Everything today already depends on userspace not hanging the
>> gpu, which makes everything a future fence.
>> >>
>> >> Marek
>> >>
>> >> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,  wrote:
>> >>>
>> >>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>> >>> > Thanks everybody. The initial proposal is dead. Here are some
>> thoughts on
>> >>> > how to do it differently.
>> >>> >
>> >>> > I think we can have direct command submission from userspace via
>> >>> > memory-mapped queues ("user queues") without changing window
>> systems.
>> >>> >
>> >>> > The memory management doesn't have to use GPU page faults like HMM.
>> >>> > Instead, it can wait for user queues of a specific process to go
>> idle and
>> >>> > then unmap the queues, so that userspace can't submit anything.
>> Buffer
>> >>> > evictions, pinning, etc. can be executed when all queues are
>> unmapped
>> >>> > (suspended). Thus, no BO fences and page faults are needed.
>> >>> >
>> >>> > Inter-process synchronization can use timeline semaphores.
>> Userspace will
>> >>> > query the wait and signal value for a shared buffer from the
>> kernel. The
>> >>> > kernel will keep a history of those queries to know which process is
>> >>> > responsible for signalling which buffer. There is only the
>> wait-timeout
>> >>> > issue and how to identify the culprit. One of the solutions is to
>> have the
>> >>> > GPU send all GPU signal commands and all timed out wait commands
>> via an
>> >>> > interrupt to the kernel driver to monitor and validate userspace
>> behavior.
>> >>> > With that, it can be identified whether the culprit is the waiting
>> process
>> >>> > or the signalling process and which one. Invalid signal/wait
>> parameters can
>> >>> > also be detected. The kernel can force-signal only the semaphores
>> that time
>> >>> > out, and punish the processes which caused the timeout or used
>> invalid
>> >>> > signal/wait parameters.
>> >>> >
>> >>> > The

RE: [Intel-gfx] [PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM

2021-04-27 Thread Tang, CQ




> -Original Message-
> From: Intel-gfx  On Behalf Of
> Matthew Auld
> Sent: Tuesday, April 27, 2021 1:54 AM
> To: intel-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM
> 
> It's a requirement that for dgfx we place all the paging structures in device
> local-memory.
> 
> v2: use i915_coherent_map_type()
> v3: improve the shared dma-resv object comment
> 
> Signed-off-by: Matthew Auld 
> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 -
> drivers/gpu/drm/i915/gt/intel_gtt.c  | 30 +---
> drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
>  3 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index f83496836f0f..11fb5df45a0f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt
> *gt)
>*/
>   ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
> 
> - ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
> + if (HAS_LMEM(gt->i915))
> + ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;

Here we might want to allocate lmem from the 'gt' passed in as the argument;
however, inside alloc_pt_lmem() below, we always allocate lmem on tile0.
Is this desired?

--CQ

> + else
> + ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
> 
>   err = gen8_init_scratch(&ppgtt->vm);
>   if (err)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c
> b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index d386b89e2758..061c39d2ad51 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -7,10 +7,26 @@
> 
>  #include 
> 
> +#include "gem/i915_gem_lmem.h"
>  #include "i915_trace.h"
>  #include "intel_gt.h"
>  #include "intel_gtt.h"
> 
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space
> +*vm, int sz) {
> + struct drm_i915_gem_object *obj;
> +
> + obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
> + /*
> +  * Ensure all paging structures for this vm share the same dma-resv
> +  * object underneath, with the idea that one object_lock() will lock
> +  * them all at once.
> +  */
> + if (!IS_ERR(obj))
> + obj->base.resv = &vm->resv;
> + return obj;
> +}
> +
>  struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm,
> int sz)  {
>   struct drm_i915_gem_object *obj;
> @@ -19,7 +35,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct
> i915_address_space *vm, int sz)
>   i915_gem_shrink_all(vm->i915);
> 
>   obj = i915_gem_object_create_internal(vm->i915, sz);
> - /* ensure all dma objects have the same reservation class */
> + /*
> +  * Ensure all paging structures for this vm share the same dma-resv
> +  * object underneath, with the idea that one object_lock() will lock
> +  * them all at once.
> +  */
>   if (!IS_ERR(obj))
>   obj->base.resv = &vm->resv;
>   return obj;
> @@ -27,9 +47,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct
> i915_address_space *vm, int sz)
> 
>  int map_pt_dma(struct i915_address_space *vm, struct
> drm_i915_gem_object *obj)  {
> + enum i915_map_type type;
>   void *vaddr;
> 
> - vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> + type = i915_coherent_map_type(vm->i915, obj, true);
> + vaddr = i915_gem_object_pin_map_unlocked(obj, type);
>   if (IS_ERR(vaddr))
>   return PTR_ERR(vaddr);
> 
> @@ -39,9 +61,11 @@ int map_pt_dma(struct i915_address_space *vm,
> struct drm_i915_gem_object *obj)
> 
>  int map_pt_dma_locked(struct i915_address_space *vm, struct
> drm_i915_gem_object *obj)  {
> + enum i915_map_type type;
>   void *vaddr;
> 
> - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> + type = i915_coherent_map_type(vm->i915, obj, true);
> + vaddr = i915_gem_object_pin_map(obj, type);
>   if (IS_ERR(vaddr))
>   return PTR_ERR(vaddr);
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h
> b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 40e486704558..44ce27c51631 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space
> *vm);  void free_scratch(struct i915_address_space *vm);
> 
>  struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm,
> int sz);
> +struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space
> +*vm, int sz);
>  struct i915_page_table *alloc_pt(struct i915_address_space *vm);  struct
> i915_page_directory *alloc_pd(struct i915_address_space *vm);  struct
> i915_page_directory *__alloc_pd(int npde);
> --
> 2.26.3
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.

Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines

2021-04-27 Thread Jason Ekstrand

On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand  wrote:
>
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.
>
> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   2 +-
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
>  .../drm/i915/gt/intel_execlists_submission.c  | 100 
>  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 --
>  6 files changed, 7 insertions(+), 353 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user 
> *base, void *data)
> }
> virtual = set->engines->engines[idx]->engine;
>
> +   if (intel_engine_is_virtual(virtual)) {
> +   drm_dbg(&i915->drm,
> +   "Bonding with virtual engines not allowed\n");
> +   return -EINVAL;
> +   }
> +
> err = check_user_mbz(&ext->flags);
> if (err)
> return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user 
> *base, void *data)
> n, ci.engine_class, ci.engine_instance);
> return -EINVAL;
> }
> -
> -   /*
> -* A non-virtual engine has no siblings to choose between; and
> -* a submit fence will always be directed to the one engine.
> -*/
> -   if (intel_engine_is_virtual(virtual)) {
> -   err = intel_virtual_engine_attach_bond(virtual,
> -  master,
> -  bond);
> -   if (err)
> -   return err;
> -   }
> }
>
> return 0;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d640bba6ad9ab..efb2fa3522a42 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> if (args->flags & I915_EXEC_FENCE_SUBMIT)
> err = i915_request_await_execution(eb.request,
>in_fence,
> -  
> eb.engine->bond_execute);
> +  NULL);
> else
> err = i915_request_await_dma_fence(eb.request,
>in_fence);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
> b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 883bafc449024..68cfe5080325c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -446,13 +446,6 @@ struct intel_engine_cs {
>  */
> void(*submit_request)(struct i915_request *rq);
>
> -   /*
> -* Called on signaling of a SUBMIT_FENCE, passing along the signaling
> -* request down to the bonded pairs.
> -*/
> -   void(*bond_execute)(struct i915_request *rq,
> -   struct dma_fence *signal);
> -
> /*
>  * Call when the priority on a request has changed and it and its
>  * dependencies may need rescheduling. Note the request itself may
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de124870af44d..b6e2b59f133b7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -181,18 +181,6 @@ struct virtual_engine {
> int prio;
> } nodes[I915_NUM_ENGINES];
>
> -   /*
> -* Keep track of bonded pairs -- restrictions upon on our selection
> -* of physical engines any particular request may be submitted to.
> -* If we receive a submit-fe

Re: [Intel-gfx] [PATCH 08/20] drm/i915/gem: Disallow bonding of virtual engines (v2)

On Mon, Apr 26, 2021 at 06:43:30PM -0500, Jason Ekstrand wrote:
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.

Have you triple-checked this by running the media stack with bonding? Also,
this needs acks from the media side, please Cc Carl and Tony.

I think you should also explain in a bit more detail why exactly the bonded
submit thing is a no-op and what the implications are, since it took me a
while to get that. Plus you missed the entire SUBMIT_FENCE entertainment,
so obviously this isn't very obvious :-)

> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
> 
> v2 (Jason Ekstrand):
>  - Don't delete quite as much code.  Some of it was necessary.

Please explain the details here, after all this is rather tricky ...

> Signed-off-by: Jason Ekstrand 

So this just stops the uapi and immediate things. But since I've looked
around in how this works I think it'd be worth it to throw a backend
cleanup task on top. Not the entire thing, but just the most egregious
detail:

One thing the submit fence does, aside from holding up the subsequent
batches until the first one is scheduled, is limit the set of engines to
the right pair - which we know once the engine is selected for the first
batch. That's done with some lockless trickery in the await fence callback
(iirc, would need to double-check) with cmpxchg. If we can delete that in
a follow-up, assuming it's really not pulling in an entire string of
things, I think that would be rather nice clarification on what's possible
or not possible wrt execlist backend scheduling.
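To make the "lockless trickery" above a bit more concrete, here is a minimal
user-space sketch of narrowing a set of eligible engines with a
compare-and-exchange loop. It is not the i915 code: struct model_request,
narrow_engines() and the mask layout are hypothetical names invented for
illustration, and C11 atomics stand in for the kernel's cmpxchg().

```c
/* Hypothetical user-space model; none of these names are i915 symbols. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t engine_mask_t;

struct model_request {
	/* Bit n set: physical engine n may run this request. */
	_Atomic engine_mask_t allowed;
};

/*
 * Restrict rq->allowed to (allowed & bond_mask) without taking a lock.
 * Returns the mask in effect afterwards. If the intersection would be
 * empty, the request is left untouched and the old mask is returned.
 */
static engine_mask_t narrow_engines(struct model_request *rq,
				    engine_mask_t bond_mask)
{
	engine_mask_t old = atomic_load(&rq->allowed);
	engine_mask_t next;

	assert(bond_mask != 0);

	do {
		next = old & bond_mask;
		if (!next)
			return old;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&rq->allowed, &old, next));

	return next;
}
```

Starting from a mask of 0xf, narrowing with 0x6 leaves 0x6; a later attempt
with a disjoint mask such as 0x8 changes nothing. The loop retries only when
another thread updated the mask between the load and the exchange.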

I'd like to do this now because unlike all the rcu stuff it's a lot harder
to find it again and realize it's all dead code now. With the rcu/locking
stuff I'm much less worried about leaving complexity behind that we don't
realize isn't needed anymore.

Also we really need to make sure we can get away with this before we
commit to anything I think ...

Code itself looks reasonable, but I'll wait for r-b stamping until the
commit message is more polished.
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>  .../drm/i915/gt/intel_execlists_submission.c  |  83 ---
>  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 --
>  4 files changed, 6 insertions(+), 328 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user 
> *base, void *data)
>   }
>   virtual = set->engines->engines[idx]->engine;
>  
> + if (intel_engine_is_virtual(virtual)) {
> + drm_dbg(&i915->drm,
> + "Bonding with virtual engines not allowed\n");
> + return -EINVAL;
> + }
> +
>   err = check_user_mbz(&ext->flags);
>   if (err)
>   return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user 
> *base, void *data)
>   n, ci.engine_class, ci.engine_instance);
>   return -EINVAL;
>   }
> -
> - /*
> -  * A non-virtual engine has no siblings to choose between; and
> -  * a submit fence will always be directed to the one engine.
> -  */
> - if (intel_engine_is_virtual(virtual)) {
> - err = intel_virtual_engine_attach_bond(virtual,
> -master,
> -bond);
> - if (err)
> - return err;
> - }
>   }
>  
>   return 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de124870af44d..a6204c60b59cb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -181,18 +181,6 @@ struct virtual_engine {
>   int prio;
>   } nodes[I915_NUM_ENGINES];
>  
> - /*
> -  * Keep track of bonded pairs -- restrictions upon on our selection
> -  * of physical engines any particular request may be submitted to.
> -  * If we receive a submit-fence from a master engine, we will only
> -  * use one of sibling_mask physical engines.

Re: [PATCH] drm: bridge: add missing word in Analogix help text

2021-04-27 Thread Neil Armstrong

On 26/04/2021 10:59, Neil Armstrong wrote:
> On 26/04/2021 09:42, Robert Foss wrote:
>>
>>
>> On Mon, Apr 26, 2021, 09:15 Neil Armstrong > > wrote:
>>
>>
>>
>> Le 24/04/2021 à 08:18, Randy Dunlap a écrit :
>> > Insert a missing word "power" in Kconfig help text.
>> >
>> > Fixes: 6aa192698089 ("drm/bridge: Add Analogix anx6345 support")
>> > Signed-off-by: Randy Dunlap > >
>> > Cc: Andrzej Hajda mailto:a.ha...@samsung.com>>
>> > Cc: Neil Armstrong > >
>> > Cc: Robert Foss > >
>> > Cc: David Airlie mailto:airl...@linux.ie>>
>> > Cc: Daniel Vetter mailto:dan...@ffwll.ch>>
>> > Cc: dri-devel@lists.freedesktop.org 
>> 
>> > Cc: Icenowy Zheng mailto:icen...@aosc.io>>
>> > Cc: Vasily Khoruzhick mailto:anars...@gmail.com>>
>> > Cc: Torsten Duwe mailto:d...@suse.de>>
>> > Cc: Maxime Ripard 
>> > ---
>> >  drivers/gpu/drm/bridge/analogix/Kconfig |    2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > --- linux-next-20210423.orig/drivers/gpu/drm/bridge/analogix/Kconfig
>> > +++ linux-next-20210423/drivers/gpu/drm/bridge/analogix/Kconfig
>> > @@ -6,7 +6,7 @@ config DRM_ANALOGIX_ANX6345
>> >       select DRM_KMS_HELPER
>> >       select REGMAP_I2C
>> >       help
>> > -       ANX6345 is an ultra-low Full-HD DisplayPort/eDP
>> > +       ANX6345 is an ultra-low power Full-HD DisplayPort/eDP
>> >         transmitter designed for portable devices. The
>> >         ANX6345 transforms the LVTTL RGB output of an
>> >         application processor to eDP or DisplayPort.
>> >
>>
>> Reviewed-by: Neil Armstrong > >
>>
>>
>> I think a typo in the email snuck in ;)
>>
> 
> Ah ah indeed !
> 
> Reviewed-by: Neil Armstrong 
> 
Wow, twice the same error... Monday was a bad day for me

Reviewed-by: Neil Armstrong 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/bridge: anx7625: Fix power on delay

2021-04-27 Thread Neil Armstrong

On 27/04/2021 07:53, Hsin-Yi Wang wrote:
> From anx7625 spec, the delay between powering on power supplies and gpio
> should be larger than 10ms.
> 
> Fixes: 6c744983004e ("drm/bridge: anx7625: disable regulators when power off")
> Signed-off-by: Hsin-Yi Wang 
> ---
>  drivers/gpu/drm/bridge/analogix/anx7625.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
> b/drivers/gpu/drm/bridge/analogix/anx7625.c
> index 23283ba0c4f9..0a8db745cfd5 100644
> --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> @@ -893,7 +893,7 @@ static void anx7625_power_on(struct anx7625_data *ctx)
>   usleep_range(2000, 2100);
>   }
>  
> - usleep_range(4000, 4100);
> + usleep_range(1, 11000);
>  
>   /* Power on pin enable */
>   gpiod_set_value(ctx->pdata.gpio_p_on, 1);
> 

Reviewed-by: Neil Armstrong 

Re: [PATCH v2 08/10] drm/amdgpu: Add DMA mapping of GTT BOs

2021-04-27 Thread Zeng, Oak

Regards,
Oak 

On 2021-04-26, 11:56 PM, "Kuehling, Felix"  wrote:

Am 2021-04-26 um 8:35 p.m. schrieb Zeng, Oak:
> Regards,
> Oak 
>
>  
>
> On 2021-04-21, 9:31 PM, "amd-gfx on behalf of Felix Kuehling" 

wrote:
>
> Use DMABufs with dynamic attachment to DMA-map GTT BOs on other GPUs.
>
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 76 
++-
>  2 files changed, 77 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 63668433f5a6..b706e5a54782 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -41,6 +41,7 @@ struct amdgpu_device;
>  enum kfd_mem_attachment_type {
>   KFD_MEM_ATT_SHARED, /* Share kgd_mem->bo or another 
attachment's */
>   KFD_MEM_ATT_USERPTR,/* SG bo to DMA map pages from a 
userptr bo */
> + KFD_MEM_ATT_DMABUF, /* DMAbuf to DMA map TTM BOs */
>  };
>
>  struct kfd_mem_attachment {
> @@ -56,6 +57,7 @@ struct kfd_mem_attachment {
>  struct kgd_mem {
>   struct mutex lock;
>   struct amdgpu_bo *bo;
> + struct dma_buf *dmabuf;
>   struct list_head attachments;
>   /* protected by amdkfd_process_info.lock */
>   struct ttm_validate_buffer validate_list;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 9eeedd0c7920..18a1f9222a59 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -524,6 +524,16 @@ kfd_mem_dmamap_userptr(struct kgd_mem *mem,
>   return ret;
>  }
>
> +static int
> +kfd_mem_dmamap_dmabuf(struct kfd_mem_attachment *attachment)
> +{
> + struct ttm_operation_ctx ctx = {.interruptible = true};
> + struct amdgpu_bo *bo = attachment->bo_va->base.bo;
> +
> + amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
> + return ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> How does this work? The function name says this is dma mapping a buffer 
but from the implementation, it is just a placement and validation

Conceptually, calling ttm_bo_validate ensures that the BO is in the
specified domain, in this case GTT. Before calling validate, it can be
in the CPU domain, which means it may be swapped to disk so it's not GPU
accessible. For a DMABuf attachment, the CPU domain means that the
DMABuf is not attached because the underlying memory object may be on
the move or swapped out.

The actual implementation of the dmabuf attachment is currently in
amdgpu_ttm_populate/unpopulate. This is incorrect. Patch 10 in this
series fixes that to move the actual dmabuf attachment into
amdgpu_ttm_backend_bind/unbind, which is called from amdgpu_bo_move when
a BO is moved between the CPU and GTT domains.
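As a rough illustration of the flow described above (validate requests a
domain, a move between CPU and GTT triggers bind/unbind, and the DMA mapping
follows the bind state), here is a small user-space model. All model_* names
are hypothetical and only mirror the roles of ttm_bo_validate, amdgpu_bo_move
and amdgpu_ttm_backend_bind/unbind; this is a sketch of the idea, not how TTM
or amdgpu implement it.

```c
/* Hypothetical user-space model; not TTM or amdgpu code. */
#include <assert.h>
#include <stdbool.h>

enum model_domain { MODEL_DOMAIN_CPU, MODEL_DOMAIN_GTT };

struct model_bo {
	enum model_domain domain;
	bool dma_mapped;	/* stands in for "DMABuf attached and mapped" */
	int bind_calls;		/* counts how often the mapping was set up */
};

static void model_bind(struct model_bo *bo)
{
	bo->dma_mapped = true;
	bo->bind_calls++;
}

static void model_unbind(struct model_bo *bo)
{
	bo->dma_mapped = false;
}

/* Moving between domains is where the mapping is created or torn down. */
static void model_move(struct model_bo *bo, enum model_domain to)
{
	if (bo->domain == to)
		return;

	if (to == MODEL_DOMAIN_GTT)
		model_bind(bo);
	else
		model_unbind(bo);
	bo->domain = to;
}

/* In this model, validating a BO already in the requested domain is a no-op. */
static int model_validate(struct model_bo *bo, enum model_domain want)
{
	model_move(bo, want);
	return 0;
}
```

In this model a second model_validate() for GTT does not bind again, because
the BO is already in the requested domain. Whether the real code path behaves
the same way is exactly the kind of thing worth confirming in the series.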

Thanks for the explanation. One more thing I don't quite understand: before
this series, GTT memory should already have been validated somewhere before it
is mapped to the GPU. You added GTT memory validation here - will this
validation be duplicated?

The function name kfd_mem_dmamap_dmabuf is still confusing, since it seems to
me that it only does some preparation work before GTT memory is dynamically
dma-mapped. But I understand that, from this series' perspective, compared to
userptr (where you actually do the dma-mapping in kfd_mem_dmamap_userptr), for
GTT memory you leveraged the amdgpu TTM code for dynamic dma-mapping. So maybe
the naming here makes sense from that perspective.

Another thing, related but not directly part of this series: GTT memory is
dma-mapped when it is allocated. See ttm_populate_and_map_pages calling
dma_map_page. The question is, will GTT memory first be dma-unmapped before
it is mapped again in amdgpu_ttm_backend_bind? That is existing code, not from
your series. Maybe there is no issue, but I just want to make sure while we
are looking at this area.

Regards,
  Felix

> +}
> +
>  static int
>  kfd_mem_dmamap_attachment(struct kgd_mem *mem,
> struct kfd_mem_attachment *attachment)
> @@ -533,6 +543,8 @@ kfd_mem_dmamap_attachment(struct kgd_mem *mem,
>   return 0;
>   case KFD_MEM_ATT_USERPTR:
>   return kfd_mem_dmamap_userptr(mem, attachment);
> + case KFD_MEM_ATT_DMABUF:
> + return kfd_mem_dmamap_dmabuf(atta

RE: [PATCH] drm/i915/gem: Remove reference to struct drm_device.pdev

2021-04-27 Thread Ruhl, Michael J


>-Original Message-
>From: dri-devel  On Behalf Of
>Thomas Zimmermann
>Sent: Tuesday, April 27, 2021 7:08 AM
>To: jani.nik...@linux.intel.com; joonas.lahti...@linux.intel.com; Vivi, Rodrigo
>; airl...@linux.ie; dan...@ffwll.ch; Auld, Matthew
>
>Cc: Tvrtko Ursulin ; Ursulin, Tvrtko
>; Mika Kuoppala
>; intel-...@lists.freedesktop.org; Gustavo
>A. R. Silva ; dri-devel@lists.freedesktop.org; Chris
>Wilson ; Tang, CQ ; Hellstrom,
>Thomas ; Thomas Zimmermann
>; Daniel Vetter ; Liu,
>Xinyun ; Dan Carpenter 
>Subject: [PATCH] drm/i915/gem: Remove reference to struct
>drm_device.pdev
>
>References to struct drm_device.pdev should be used any longer as

should not be used
 ^^^
?

m

>the field will be moved into the struct's legacy section. Add a fix
>for the rsp commit.
>
>Signed-off-by: Thomas Zimmermann 
>Fixes: d57d4a1daf5e ("drm/i915: Create stolen memory region from local
>memory")
>Cc: CQ Tang 
>Cc: Matthew Auld 
>Cc: Tvrtko Ursulin 
>Cc: Xinyun Liu 
>Cc: Tvrtko Ursulin 
>Cc: Jani Nikula 
>Cc: Joonas Lahtinen 
>Cc: Rodrigo Vivi 
>Cc: Chris Wilson 
>Cc: Mika Kuoppala 
>Cc: Daniel Vetter 
>Cc: Maarten Lankhorst 
>Cc: "Thomas Hellström" 
>Cc: "Gustavo A. R. Silva" 
>Cc: Dan Carpenter 
>Cc: intel-...@lists.freedesktop.org
>---
> drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>index c5b64b2400e8..e1a32672bbe8 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>@@ -773,7 +773,7 @@ struct intel_memory_region *
> i915_gem_stolen_lmem_setup(struct drm_i915_private *i915)
> {
>   struct intel_uncore *uncore = &i915->uncore;
>-  struct pci_dev *pdev = i915->drm.pdev;
>+  struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>   struct intel_memory_region *mem;
>   resource_size_t io_start;
>   resource_size_t lmem_size;
>--
>2.31.1
>
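For readers unfamiliar with the to_pci_dev() pattern used in the fix above:
the generic device is embedded inside the PCI device structure, so the PCI
device can be recovered from a pointer to the embedded member. The sketch
below models that with a local container_of-style macro. The model_* types
are hypothetical stand-ins, not the kernel's struct device, struct pci_dev or
struct drm_device.

```c
/* Hypothetical user-space model of the container_of pattern. */
#include <assert.h>
#include <stddef.h>

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_device {
	int id;
};

struct model_pci_dev {
	unsigned short vendor;
	struct model_device dev;	/* generic device embedded in the PCI device */
};

struct model_drm_device {
	struct model_device *dev;	/* only the generic pointer is stored */
};

/* Recover the enclosing PCI device from the embedded generic device. */
static struct model_pci_dev *model_to_pci_dev(struct model_device *dev)
{
	return model_container_of(dev, struct model_pci_dev, dev);
}
```

This only works when the generic device really is embedded in a PCI device,
which is why the conversion is safe in a PCI-only driver but would not be for
an arbitrary device pointer.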

Re: [PATCH v2 3/7] drm/i915/gtt: map the PD up front

2021-04-27 Thread Tvrtko Ursulin




On 27/04/2021 09:54, Matthew Auld wrote:

We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code that for simple single page shmemfs object will return a
plain kmap, that is then kept for the lifetime of the page directory.

Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.
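The change in access pattern described in the commit message can be sketched
in user space as follows: the mapping is created once when the backing
storage is acquired, the hot path only dereferences the stored pointer, and
the mapping goes away when the use count drops to zero. All names here
(struct model_pt, model_pt_map(), and so on) are hypothetical, and calloc()
merely stands in for pinning and mapping a real object.

```c
/* Hypothetical user-space model; not the i915 page-table code. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PT_SIZE 4096

struct model_pt {
	void *backing;		/* stands in for the backing object */
	uint32_t *vaddr;	/* persistent CPU mapping, NULL until mapped */
	int used_count;
};

/* Acquisition: may allocate, so it runs before any fence-like hot path. */
static int model_pt_map(struct model_pt *pt)
{
	if (!pt->vaddr) {
		pt->backing = calloc(1, MODEL_PT_SIZE);
		if (!pt->backing)
			return -1;
		pt->vaddr = pt->backing;
	}
	pt->used_count++;
	return 0;
}

/* Hot path: no map/unmap pair, just use the pointer kept from acquisition. */
static void model_pt_clear_range(struct model_pt *pt, unsigned int first,
				 unsigned int count, uint32_t scratch_pte)
{
	assert(pt->vaddr);
	for (unsigned int i = 0; i < count; i++)
		pt->vaddr[first + i] = scratch_pte;
}

/* The mapping is destroyed only when the last user goes away. */
static void model_pt_put(struct model_pt *pt)
{
	assert(pt->used_count > 0);
	if (--pt->used_count == 0) {
		free(pt->backing);
		pt->backing = NULL;
		pt->vaddr = NULL;
	}
}
```

The cost noted in the commit message is visible here too: as long as
used_count is non-zero the mapping stays alive, which is the address-space
concern raised for 32b.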


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


Signed-off-by: Matthew Auld 
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  .../drm/i915/gem/selftests/i915_gem_context.c | 11 +
  drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 11 ++---
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 26 --
  drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 +-
  drivers/gpu/drm/i915/gt/intel_gtt.c   | 48 +--
  drivers/gpu/drm/i915/gt/intel_gtt.h   | 11 +++--
  drivers/gpu/drm/i915/gt/intel_ppgtt.c |  7 ++-
  drivers/gpu/drm/i915/i915_vma.c   |  3 +-
  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
  drivers/gpu/drm/i915/selftests/i915_perf.c|  3 +-
  10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
  static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
  {
struct i915_address_space *vm;
-   struct page *page;
u32 *vaddr;
int err = 0;
  
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)

if (!vm)
return -ENODEV;
  
-	page = __px_page(vm->scratch[0]);

-   if (!page) {
+   if (!vm->scratch[0]) {
pr_err("No scratch page!\n");
return -EINVAL;
}
  
-	vaddr = kmap(page);

-   if (!vaddr) {
-   pr_err("No (mappable) scratch page!\n");
-   return -EINVAL;
-   }
+   vaddr = __px_vaddr(vm->scratch[0]);
  
  	memcpy(out, vaddr, sizeof(*out));

if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
pr_err("Inconsistent initial state of scratch page!\n");
err = -EINVAL;
}
-   kunmap(page);
  
  	return err;

  }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index e08dff376339..21b1085769be 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -96,9 +96,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space 
*vm,
 * entries back to scratch.
 */
  
-		vaddr = kmap_atomic_px(pt);

+   vaddr = px_vaddr(pt);
memset32(vaddr + pte, scratch_pte, count);
-   kunmap_atomic(vaddr);
  
  		pte = 0;

}
@@ -120,7 +119,7 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
  
  	GEM_BUG_ON(!pd->entry[act_pt]);
  
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));

+   vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
do {
GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -136,12 +135,10 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
}
  
  		if (++act_pte == GEN6_PTES) {

-   kunmap_atomic(vaddr);
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+   vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
act_pte = 0;
}
} while (1);
-   kunmap_atomic(vaddr);
  
  	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;

  }
@@ -235,7 +232,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
goto err_scratch0;
}
  
-	ret = pin_pt_dma(vm, vm->scratch[1]);

+   ret = map_pt_dma(vm, vm->scratch[1]);
if (ret)
goto err_scratch1;
  
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c

index 176c19633412..f83496836f0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -242,11 +242,10 @@

Re: [Intel-gfx] [PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM


On 27/04/2021 14:34, Tang, CQ wrote:




-Original Message-
From: Intel-gfx  On Behalf Of
Matthew Auld
Sent: Tuesday, April 27, 2021 1:54 AM
To: intel-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: [Intel-gfx] [PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM

It's a requirement that for dgfx we place all the paging structures in device
local-memory.

v2: use i915_coherent_map_type()
v3: improve the shared dma-resv object comment

Signed-off-by: Matthew Auld 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 -
drivers/gpu/drm/i915/gt/intel_gtt.c  | 30 +---
drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
  3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f83496836f0f..11fb5df45a0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt
*gt)
 */
ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);

-   ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+   if (HAS_LMEM(gt->i915))
+   ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;


Here we might want to allocate lmem from the 'gt' in the argument,  however, 
below inside alloc_pt_lmem(), we always allocate lmem to tile0.
Is this desired?


Yeah, AFAIK that is all handled in some later patches which have not yet 
made their way upstream. For DG1 they don't really do anything 
interesting, but yes we need them for Xe HP at some point.




--CQ


+   else
+   ppgtt->vm.alloc_pt_dma = alloc_pt_dma;

err = gen8_init_scratch(&ppgtt->vm);
if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index d386b89e2758..061c39d2ad51 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,26 @@

  #include 

+#include "gem/i915_gem_lmem.h"
  #include "i915_trace.h"
  #include "intel_gt.h"
  #include "intel_gtt.h"

+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space
+*vm, int sz) {
+   struct drm_i915_gem_object *obj;
+
+   obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+   /*
+* Ensure all paging structures for this vm share the same dma-resv
+* object underneath, with the idea that one object_lock() will lock
+* them all at once.
+*/
+   if (!IS_ERR(obj))
+   obj->base.resv = &vm->resv;
+   return obj;
+}
+
  struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm,
int sz)  {
struct drm_i915_gem_object *obj;
@@ -19,7 +35,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct
i915_address_space *vm, int sz)
i915_gem_shrink_all(vm->i915);

obj = i915_gem_object_create_internal(vm->i915, sz);
-   /* ensure all dma objects have the same reservation class */
+   /*
+* Ensure all paging structures for this vm share the same dma-resv
+* object underneath, with the idea that one object_lock() will lock
+* them all at once.
+*/
if (!IS_ERR(obj))
obj->base.resv = &vm->resv;
return obj;
@@ -27,9 +47,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct
i915_address_space *vm, int sz)

  int map_pt_dma(struct i915_address_space *vm, struct
drm_i915_gem_object *obj)  {
+   enum i915_map_type type;
void *vaddr;

-   vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+   type = i915_coherent_map_type(vm->i915, obj, true);
+   vaddr = i915_gem_object_pin_map_unlocked(obj, type);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);

@@ -39,9 +61,11 @@ int map_pt_dma(struct i915_address_space *vm,
struct drm_i915_gem_object *obj)

  int map_pt_dma_locked(struct i915_address_space *vm, struct
drm_i915_gem_object *obj)  {
+   enum i915_map_type type;
void *vaddr;

-   vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+   type = i915_coherent_map_type(vm->i915, obj, true);
+   vaddr = i915_gem_object_pin_map(obj, type);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 40e486704558..44ce27c51631 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space
*vm);  void free_scratch(struct i915_address_space *vm);

  struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm,
int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space
+*vm, int sz);
  struct i915_page_table *alloc_pt(struct i915_address_space *vm);  struct
i915_page_directory *alloc_pd(struct i915_address_space *vm);  struct
i915_page_directory *__alloc_pd(int npde);
--
2.26.3


Re: [PATCH v2 4/7] drm/i915/gtt/dgfx: place the PD in LMEM

2021-04-27 Thread Tvrtko Ursulin




On 27/04/2021 09:54, Matthew Auld wrote:

It's a requirement that for dgfx we place all the paging structures in
device local-memory.

v2: use i915_coherent_map_type()
v3: improve the shared dma-resv object comment

Signed-off-by: Matthew Auld 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 -
  drivers/gpu/drm/i915/gt/intel_gtt.c  | 30 +---
  drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
  3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f83496836f0f..11fb5df45a0f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -712,7 +712,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	 */
 	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+	if (HAS_LMEM(gt->i915))
+		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
+	else
+		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index d386b89e2758..061c39d2ad51 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,26 @@
 
 #include 
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_trace.h"
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+	/*
+	 * Ensure all paging structures for this vm share the same dma-resv
+	 * object underneath, with the idea that one object_lock() will lock
+	 * them all at once.


Okay but I am still missing the part about why is this beneficial and 
not a downside. I suppose it is not a concept added by this patch so not 
fair to ask for explanation here anyway.


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


+	 */
+	if (!IS_ERR(obj))
+		obj->base.resv = &vm->resv;
+	return obj;
+}
+
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
 	struct drm_i915_gem_object *obj;
@@ -19,7 +35,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 		i915_gem_shrink_all(vm->i915);
 
 	obj = i915_gem_object_create_internal(vm->i915, sz);
-	/* ensure all dma objects have the same reservation class */
+	/*
+	 * Ensure all paging structures for this vm share the same dma-resv
+	 * object underneath, with the idea that one object_lock() will lock
+	 * them all at once.
+	 */
 	if (!IS_ERR(obj))
 		obj->base.resv = &vm->resv;
 	return obj;
@@ -27,9 +47,11 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	type = i915_coherent_map_type(vm->i915, obj, true);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -39,9 +61,11 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 
 int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	type = i915_coherent_map_type(vm->i915, obj, true);
+	vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 40e486704558..44ce27c51631 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -527,6 +527,7 @@ int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
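
The mechanism described in the patch comment (every paging structure pointing at the vm's reservation object, so that locking one locks all) can be pictured with a small userspace model. This is only a sketch under stated assumptions: `toy_resv`, `toy_vm`, `toy_obj` and the helpers are hypothetical stand-ins, not i915 structures, and a plain flag stands in for the real reservation lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dma-resv object: just a lock flag. */
struct toy_resv {
	bool locked;
};

/* Hypothetical stand-in for an address space owning one shared resv. */
struct toy_vm {
	struct toy_resv resv;
};

/* Hypothetical stand-in for the object backing one paging structure. */
struct toy_obj {
	struct toy_resv *resv;	/* the lock that protects this object */
};

/* Mirrors "obj->base.resv = &vm->resv": share the vm's lock. */
void toy_obj_init_shared(struct toy_obj *obj, struct toy_vm *vm)
{
	assert(obj != NULL && vm != NULL);
	obj->resv = &vm->resv;
}

/* Locking through any one object takes the single shared lock. */
void toy_obj_lock(struct toy_obj *obj)
{
	obj->resv->locked = true;
}

void toy_obj_unlock(struct toy_obj *obj)
{
	obj->resv->locked = false;
}

bool toy_obj_is_locked(const struct toy_obj *obj)
{
	return obj->resv->locked;
}
```

In this model, locking a page directory object also makes every page table object of the same vm report as locked. Whether that is a net benefit is exactly the question raised in the review; the sketch only shows the mechanism.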


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH 0/3] A drm_plane API to support HDR planes

2021-04-27 Thread Pekka Paalanen

On Mon, 26 Apr 2021 13:38:49 -0400
Harry Wentland  wrote:

> ## Introduction
> 
> We are looking to enable HDR support for a couple of single-plane and
> multi-plane scenarios. To do this effectively we recommend new
> interfaces to drm_plane. Below I'll give a bit of background on HDR
> and why we propose these interfaces.
> 
> 
> ## Defining a pixel's luminance
> 
> Currently the luminance space of pixels in a framebuffer/plane
> presented to the display is not well defined. It's usually assumed to
> be in a 2.2 or 2.4 gamma space and has no mapping to an absolute
> luminance value but is interpreted in relative terms.
> 
> Luminance can be measured and described in absolute terms as candela
> per meter squared, or cd/m2, or nits. Even though a pixel value can
> be mapped to luminance in a linear fashion to do so without losing a
> lot of detail requires 16-bpc color depth. The reason for this is
> that human perception can distinguish roughly between a 0.5-1%
> luminance delta. A linear representation is suboptimal, wasting
> precision in the highlights and losing precision in the shadows.
> 
> A gamma curve is a decent approximation to a human's perception of
> luminance, but the PQ (perceptual quantizer) function [1] improves on
> it. It also defines the luminance values in absolute terms, with the
> highest value being 10,000 nits and the lowest 0.0005 nits.
> 
> Using content that's defined in PQ space we can approximate the
> real world in a much better way.
> 
> Here are some examples of real-life objects and their approximate
> luminance values:
> 
> | Object            | Luminance in nits |
> | ----------------- | ----------------- |
> | Sun               | 1.6 million       |
> | Fluorescent light | 10,000            |
> | Highlights        | 1,000 - sunlight  |
> | White Objects     | 250 - 1,000       |
> | Typical objects   | 1 - 250           |
> | Shadows           | 0.01 - 1          |
> | Ultra Blacks      | 0 - 0.0005        |
> 
> 
> ## Describing the luminance space
> 
> **We propose a new drm_plane property to describe the Electro-Optical
> Transfer Function (EOTF) with which its framebuffer was composed.**
> Examples of EOTF are:
> 
> | EOTF      | Description                                                               |
> | --------- |:------------------------------------------------------------------------- |
> | Gamma 2.2 | a simple 2.2 gamma                                                        |
> | sRGB      | 2.4 gamma with small initial linear section                               |
> | PQ 2084   | SMPTE ST 2084; used for HDR video and allows for up to 10,000 nit support |
> | Linear    | Linear relationship between pixel value and luminance value               |
> 

The definitions agree with what I have learnt so far. However, with
these EOTF definitions, only PQ defines absolute luminance values
while the others do not. So this is not enough information to blend
planes together if they do not all use the same EOTF with the same
dynamic range. More below.


> 
> ## Mastering Luminances
> 
> Now we are able to use the PQ 2084 EOTF to define the luminance of
> pixels in absolute terms. Unfortunately we're again presented with
> physical limitations of the display technologies on the market today.
> Here are a few examples of luminance ranges of displays.
> 
> | Display                  | Luminance range in nits |
> | ------------------------ | ----------------------- |
> | Typical PC display       | 0.3 - 200               |
> | Excellent LCD HDTV       | 0.3 - 400               |
> | HDR LCD w/ local dimming | 0.05 - 1,500            |
> 
> Since no display can currently show the full 0.0005 to 10,000 nits
> luminance range, the display will need to tonemap the HDR content, i.e.
> to fit the content within a display's capabilities. To assist with
> tonemapping, HDR content is usually accompanied by metadata that
> describes (among other things) the minimum and maximum mastering
> luminance, i.e. the maximum and minimum luminance of the display that
> was used to master the HDR content.
> 
> The HDR metadata is currently defined on the drm_connector via the
> hdr_output_metadata blob property.
> 
> It might be useful to define per-plane hdr metadata, as different
> planes might have been mastered differently.

I don't think this would directly help with the dynamic range blending
problem. You still need to establish the mapping between the optical
values from two different EOTFs and dynamic ranges. Or can you know
which optical values match the mastering display maximum and minimum
luminances for not-PQ?


> ## SDR Luminance
> 
> Since SDR covers a smaller luminance range than HDR, an SDR plane
> might look dark when blended with HDR content. Since the max HDR
> luminance can be quite variable (200-1,500 nits on actual displays)
> it is best to make the SDR maximum luminance value configurable.
> 
> **We propose a drm_plane property to specify the desir

Re: [PATCH v2 08/10] drm/amdgpu: Add DMA mapping of GTT BOs

2021-04-27 Thread Felix Kuehling

On 2021-04-27 at 10:29 a.m., Zeng, Oak wrote:
> Regards,
> Oak 
>
>  
>
> On 2021-04-26, 11:56 PM, "Kuehling, Felix"  wrote:
>
> On 2021-04-26 at 8:35 p.m., Zeng, Oak wrote:
> > Regards,
> > Oak 
> >
> >  
> >
> > On 2021-04-21, 9:31 PM, "amd-gfx on behalf of Felix Kuehling" wrote:
> >
> > Use DMABufs with dynamic attachment to DMA-map GTT BOs on other GPUs.
> >
> > Signed-off-by: Felix Kuehling 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 76 ++-
> >  2 files changed, 77 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > index 63668433f5a6..b706e5a54782 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > @@ -41,6 +41,7 @@ struct amdgpu_device;
> >  enum kfd_mem_attachment_type {
> > KFD_MEM_ATT_SHARED, /* Share kgd_mem->bo or another attachment's */
> > KFD_MEM_ATT_USERPTR,/* SG bo to DMA map pages from a userptr bo */
> > +   KFD_MEM_ATT_DMABUF, /* DMAbuf to DMA map TTM BOs */
> >  };
> >
> >  struct kfd_mem_attachment {
> > @@ -56,6 +57,7 @@ struct kfd_mem_attachment {
> >  struct kgd_mem {
> > struct mutex lock;
> > struct amdgpu_bo *bo;
> > +   struct dma_buf *dmabuf;
> > struct list_head attachments;
> > /* protected by amdkfd_process_info.lock */
> > struct ttm_validate_buffer validate_list;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 9eeedd0c7920..18a1f9222a59 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -524,6 +524,16 @@ kfd_mem_dmamap_userptr(struct kgd_mem *mem,
> > return ret;
> >  }
> >
> > +static int
> > +kfd_mem_dmamap_dmabuf(struct kfd_mem_attachment *attachment)
> > +{
> > +   struct ttm_operation_ctx ctx = {.interruptible = true};
> > +   struct amdgpu_bo *bo = attachment->bo_va->base.bo;
> > +
> > +   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
> > +   return ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> > How does this work? The function name says this is dma mapping a buffer
> > but from the implementation, it is just a placement and validation
>
> Conceptually, calling ttm_bo_validate ensures that the BO is in the
> specified domain, in this case GTT. Before calling validate, it can be
> in the CPU domain, which means it may be swapped to disk so it's not GPU
> accessible. For a DMABuf attachment, the CPU domain means that the
> DMABuf is not attached because the underlying memory object may be on
> the move or swapped out.
>
> The actual implementation of the dmabuf attachment is currently in
> amdgpu_ttm_populate/unpopulate. This is incorrect. Patch 10 in this
> series fixes that to move the actual dmabuf attachment into
> amdgpu_ttm_backend_bind/unbind, which is called from amdgpu_bo_move when
> a BO is moved between the CPU and GTT domains.
>
> Thanks for the explanation. One more thing I don't quite understand: before
> this series, GTT memory should already have been validated somewhere before
> it is mapped to the GPU. You added GTT memory validation here - will this
> validation be duplicated?

When you have N GPUs there are now N BOs involved. Each GPU needs its
own BO because it needs its own DMA mapping. There will be one actual
GTT BO that allocates physical pages in TTM. The other BOs are dmabuf
imports that DMA-map the same physical pages for access by the other GPUs.

The validate call here validates one of the dmabuf imports. This does
not duplicate the validation of the underlying TTM BO with the actual
physical memory allocation.
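
The "one physical allocation, one DMA mapping per GPU" picture above can be sketched as a toy model. Everything here is hypothetical (`toy_gtt_bo`, `toy_dma_map`, the made-up bus addresses); it is not the amdgpu or TTM API, only an illustration of why a bus address handed to one device is not automatically valid for another when an IOMMU translates per device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_GPUS 4

/* Hypothetical model of one GTT allocation shared by several GPUs. */
struct toy_gtt_bo {
	uint64_t phys_addr;			/* the single physical allocation */
	bool     mapped[TOY_MAX_GPUS];		/* does GPU i have its own mapping? */
	uint64_t dma_addr[TOY_MAX_GPUS];	/* per-GPU bus address */
};

/*
 * Hypothetical IOMMU model: each GPU sees the page at a different bus
 * address. With a direct mapping (IOMMU disabled) every GPU simply sees
 * phys_addr.
 */
uint64_t toy_dma_map(struct toy_gtt_bo *bo, int gpu, bool iommu_enabled)
{
	assert(gpu >= 0 && gpu < TOY_MAX_GPUS);

	bo->dma_addr[gpu] = iommu_enabled
		? 0x100000000ULL * (uint64_t)(gpu + 1) + (bo->phys_addr & 0xfffULL)
		: bo->phys_addr;
	bo->mapped[gpu] = true;
	return bo->dma_addr[gpu];
}

/* True if 'addr' is the address that 'gpu' itself was given for this BO. */
bool toy_addr_valid_for(const struct toy_gtt_bo *bo, int gpu, uint64_t addr)
{
	assert(gpu >= 0 && gpu < TOY_MAX_GPUS);
	return bo->mapped[gpu] && bo->dma_addr[gpu] == addr;
}
```

With the toy IOMMU enabled, the address returned for GPU 0 fails `toy_addr_valid_for()` on GPU 1; with a direct mapping both GPUs get the same address, which is why reusing one GPU's mapping can appear to work.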


>
> The function name kfd_mem_dmamap_dmabuf is still confusing, since it seems
> to me it is only some preparation work before dynamically DMA-mapping GTT
> memory.

No, this series is not just preparation. It implements DMA mapping of
BOs for multiple GPUs. TTM already handles DMA mapping of the memory for
the device where the memory was allocated. (Yes, even GTT memory is
associated with a specific GPU even though it's physically in system
memory). What this patch series adds is additional DMA mappings for the
other GPUs. Without this patch, we were using the DMA mapping for GPU-1
in the page table of GPU-X, which is incorrect. It works in many cases
where the DMA mapping is a direct mapping:

  * IOMMU disabled
  * IOMMU

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

Uff good question. DMA-buf certainly supports that use case, but I have 
no idea if that is actually used somewhere.

Daniel do you know any case?

Christian.

On 27.04.21 at 15:26, Marek Olšák wrote:

Ok. So that would only make the following use cases broken for now:
- amd render -> external gpu
- amd video encode -> network device

What about the case when we get a buffer from an external device and 
we're supposed to make it "busy" when we are using it, and the 
external device wants to wait until we stop using it? Is it something 
that can happen, thus turning "external -> amd" into "external <-> amd"?

Marek

On Tue., Apr. 27, 2021, 08:50 Christian König wrote:

Only amd -> external.

We can easily install something in a user queue which waits for a
dma_fence in the kernel.

But we can't easily wait for a user queue as a dependency of a
dma_fence.

The good thing is we have this wait-before-signal case on Vulkan
timeline semaphores, which have the same problem in the kernel.

The good news is I think we can relatively easily convert i915 and
older amdgpu devices to something which is compatible with user fences.

So yes, getting that fixed case by case should work.

Christian

On 27.04.21 at 14:46, Marek Olšák wrote:

I'll defer to Christian and Alex to decide whether dropping sync
with non-amd devices (GPUs, cameras etc.) is acceptable.

Rewriting those drivers to this new sync model could be done on a
case by case basis.

For now, would we only lose the "amd -> external" dependency? Or
the "external -> amd" dependency too?

Marek

On Tue., Apr. 27, 2021, 08:15 Daniel Vetter wrote:

On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák wrote:
> Ok. I'll interpret this as "yes, it will work, let's do it".

It works if all you care about is drm/amdgpu. I'm not sure that's a
reasonable approach for upstream, but it definitely is an approach :-)

We've already gone somewhat through the pain of drm/amdgpu redefining
how implicit sync works without sufficiently talking with other
people, maybe we should avoid a repeat of this ...
-Daniel

>
> Marek
>
> On Tue., Apr. 27, 2021, 08:06 Christian König wrote:
>>
>> Correct, we wouldn't have synchronization between devices with and
>> without user queues any more.
>>
>> That could only be a problem for A+I Laptops.
>>
>> Memory management will just work with preemption fences which pause
>> the user queues of a process before evicting something. That will be
>> a dma_fence, but also a well known approach.
>>
>> Christian.
>>
>> On 27.04.21 at 13:49, Marek Olšák wrote:
>>
>> If we don't use future fences for DMA fences at all, e.g. we don't
>> use them for memory management, it can work, right? Memory management
>> can suspend user queues anytime. It doesn't need to use DMA fences.
>> There might be something that I'm missing here.
>>
>> What would we lose without DMA fences? Just inter-device
>> synchronization? I think that might be acceptable.
>>
>> The only case when the kernel will wait on a future fence is before a
>> page flip. Everything today already depends on userspace not hanging
>> the gpu, which makes everything a future fence.
>>
>> Marek
>>
>> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter wrote:
>>>
>>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
>>> > Thanks everybody. The initial proposal is dead. Here are some thoughts on
>>> > how to do it differently.
>>> >
>>> > I think we can have direct command submission from userspace via
>>> > memory-mapped queues ("user queues") without changing window systems.
>>> >
>>> > The memory management doesn't have to use GPU page faults like HMM.
>>> > Instead, it can wait for user queues of a specific process to go idle and
>>> > then unmap the queues, so that userspace can't submit anything. Buffer
>>> > evictions, pinning, etc. can be executed when all queues are unmapped
>>> > (suspended). Thus, no BO fences and page faults are needed.
>>> >
>>> > Inter-process synchronization can use timeline semaphores. Userspace will
>>> > query the wait and signal value for a shared buffer from the kernel. The
>>> > kernel will keep a history of those queries to know which process is
>>> >

Re: [PATCH v2 00/10] Implement multi-GPU DMA mappings for KFD

2021-04-27 Thread Zeng, Oak

This series is Acked-by: Oak Zeng  

Regards,
Oak 

 

On 2021-04-21, 9:31 PM, "dri-devel on behalf of Felix Kuehling" wrote:

This patch series fixes DMA-mappings of system memory (GTT and userptr)
for KFD running on multi-GPU systems with IOMMU enabled. One SG-BO per
GPU is needed to maintain the DMA mappings of each BO.

Changes in v2:
- Made the original BO parent of the SG BO to fix bo destruction order
- Removed the individualization hack, which is not needed with the parent BO
- Removed the resv locking hack in amdgpu_ttm_unpopulate, not needed without
  the individualization hack
- Added a patch to enable the Intel IOMMU driver in rock-dbg_defconfig
- Added a patch to move dmabuf attach/detach into backend_(un)bind

I'm still seeing some IOMMU access faults in the eviction test. They seem
to be related to userptr handling. They happen even without this patch
series on a single-GPU system, where this patch series is not needed. I
believe this is an old problem in KFD or amdgpu that is being exposed by
device isolation from the IOMMU. I'm debugging it, but it should not hold
up this patch series.

"drm/ttm: Don't count pages in SG BOs against pages_limit" was already
applied to drm-misc (I think). I'm still including it here because my
patches depend on it. Without that, the SG BOs created for DMA mappings
cause many tests to fail because TTM incorrectly thinks it's out of memory.

Felix Kuehling (10):
  rock-dbg_defconfig: Enable Intel IOMMU
  drm/amdgpu: Rename kfd_bo_va_list to kfd_mem_attachment
  drm/amdgpu: Keep a bo-reference per-attachment
  drm/amdgpu: Simplify AQL queue mapping
  drm/amdgpu: Add multi-GPU DMA mapping helpers
  drm/amdgpu: DMA map/unmap when updating GPU mappings
  drm/amdgpu: Move kfd_mem_attach outside reservation
  drm/amdgpu: Add DMA mapping of GTT BOs
  drm/ttm: Don't count pages in SG BOs against pages_limit
  drm/amdgpu: Move dmabuf attach/detach to backend_(un)bind

 arch/x86/configs/rock-dbg_defconfig   |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  18 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 530 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  51 +-
 drivers/gpu/drm/ttm/ttm_tt.c  |  27 +-
 5 files changed, 437 insertions(+), 200 deletions(-)

-- 
2.31.1


Re: [PATCH 1/8] drm/arm: Don't set allow_fb_modifiers explicitly

2021-04-27 Thread Liviu Dudau

On Tue, Apr 27, 2021 at 11:20:11AM +0200, Daniel Vetter wrote:
> Since
> 
> commit 890880ddfdbe256083170866e49c87618b706ac7
> Author: Paul Kocialkowski 
> Date:   Fri Jan 4 09:56:10 2019 +0100
> 
> drm: Auto-set allow_fb_modifiers when given modifiers at plane init
> 
> this is done automatically as part of plane init, if drivers set the
> modifier list correctly. Which is the case here for both komeda and
> malidp.
> 
> Signed-off-by: Daniel Vetter 
> Cc: "James (Qian) Wang" 
> Cc: Liviu Dudau 

Acked-by: Liviu Dudau 

Best regards,
Liviu

> Cc: Mihail Atanassov 
> Cc: Brian Starkey 
> ---
>  drivers/gpu/drm/arm/display/komeda/komeda_kms.c | 1 -
>  drivers/gpu/drm/arm/malidp_drv.c| 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> index aeda4e5ec4f4..ff45f23f3d56 100644
> --- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> @@ -247,7 +247,6 @@ static void komeda_kms_mode_config_init(struct komeda_kms_dev *kms,
>   config->min_height  = 0;
>   config->max_width   = 4096;
>   config->max_height  = 4096;
> - config->allow_fb_modifiers = true;
>  
>   config->funcs = &komeda_mode_config_funcs;
>   config->helper_private = &komeda_mode_config_helpers;
> diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
> index d83c7366b348..de59f3302516 100644
> --- a/drivers/gpu/drm/arm/malidp_drv.c
> +++ b/drivers/gpu/drm/arm/malidp_drv.c
> @@ -403,7 +403,6 @@ static int malidp_init(struct drm_device *drm)
>   drm->mode_config.max_height = hwdev->max_line_size;
>   drm->mode_config.funcs = &malidp_mode_config_funcs;
>   drm->mode_config.helper_private = &malidp_mode_config_helpers;
> - drm->mode_config.allow_fb_modifiers = true;
>  
>   ret = malidp_crtc_init(drm);
>   if (ret)
> -- 
> 2.31.0
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯

Re: [PATCH] drm: i915: fix build when ACPI is disabled and BACKLIGHT=m

2021-04-27 Thread Randy Dunlap

On 4/27/21 1:03 AM, Jani Nikula wrote:
> On Mon, 26 Apr 2021, Randy Dunlap  wrote:
>> When CONFIG_DRM_I915=y, CONFIG_ACPI is not set, and
>> CONFIG_BACKLIGHT_CLASS_DEVICE=m, not due to I915 config,
>> there are build errors trying to reference backlight_device_{un}register().
>>
>> Changing the use of IS_ENABLED() to IS_REACHABLE() in intel_panel.[ch]
>> fixes this.
> 
> I feel like a broken record...

Thanks! :)

I'll leave it b0rken as well.


> CONFIG_DRM_I915=y and CONFIG_BACKLIGHT_CLASS_DEVICE=m is an invalid
> configuration. The patch at hand just silently hides the problem,
> leaving you without backlight.
> 
> i915 should *depend* on backlight, not select it. It would express the
> dependency without chances for invalid configuration.
> 
> However, i915 alone can't depend on backlight, all users of backlight
> should depend on backlight, not select it. Otherwise, you end up with
> other configuration problems, circular dependencies and
> whatnot. Everyone should change. See also (*) why select is not a good
> idea here.
> 
> I've sent patches to this effect before, got rejected, and the same
> thing gets repeated ad infinitum.
> 
> Accepting this patch would stop the inflow of these reports and similar
> patches, but it does not fix the root cause. It just sweeps the problem
> under the rug.
> 
> 
> BR,
> Jani.
> 
> (*) Documentation/kbuild/kconfig-language.rst:
> 
>   select should be used with care. select will force
>   a symbol to a value without visiting the dependencies.
>   By abusing select you are able to select a symbol FOO even
>   if FOO depends on BAR that is not set.
>   In general use select only for non-visible symbols
>   (no prompts anywhere) and for symbols with no dependencies.
>   That will limit the usefulness but on the other hand avoid
>   the illegal configurations all over.

Yes, I'm well aware of that.

ta.
-- 
~Randy
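
The difference between the two macros discussed above can be modelled in a few lines of plain C. This is a simplified sketch, not the kernel's preprocessor implementation: the `toy_` names are hypothetical, and a `caller_is_module` flag stands in for whether the calling code is itself built as a module.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a tristate Kconfig symbol. */
enum toy_tristate { TOY_N, TOY_M, TOY_Y };

/* Models IS_ENABLED(CONFIG_FOO): true when FOO is built in or a module. */
bool toy_is_enabled(enum toy_tristate opt)
{
	return opt == TOY_Y || opt == TOY_M;
}

/*
 * Models IS_REACHABLE(CONFIG_FOO): true when FOO is built in, or when FOO
 * is a module and the calling code is itself built as a module.
 */
bool toy_is_reachable(enum toy_tristate opt, bool caller_is_module)
{
	return opt == TOY_Y || (opt == TOY_M && caller_is_module);
}

/* Would the i915 code guarded by IS_REACHABLE() register a backlight? */
bool toy_i915_registers_backlight(enum toy_tristate i915,
				  enum toy_tristate backlight)
{
	assert(i915 != TOY_N);	/* only meaningful when the driver is built */
	return toy_is_reachable(backlight, i915 == TOY_M);
}
```

For DRM_I915=y and BACKLIGHT_CLASS_DEVICE=m the model reports the backlight as enabled but not reachable, so the build succeeds and the backlight code is silently compiled out, which is the outcome Jani objects to.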


Re: [PATCH v3 2/3] drm/mediatek: init panel orientation property

2021-04-27 Thread Chun-Kuang Hu

Hi, Hsin-Yi:

Hsin-Yi Wang wrote on Tue, Apr 27, 2021 at 12:49 PM:
>
> Init panel orientation property after connector is initialized. Let the
> panel driver decide the orientation value later.
>
> Signed-off-by: Hsin-Yi Wang 
> ---
>  drivers/gpu/drm/mediatek/mtk_dsi.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
> index ae403c67cbd9..0bd27872f2a4 100644
> --- a/drivers/gpu/drm/mediatek/mtk_dsi.c
> +++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
> @@ -964,6 +964,7 @@ static int mtk_dsi_encoder_init(struct drm_device *drm, struct mtk_dsi *dsi)
> ret = PTR_ERR(dsi->connector);
> goto err_cleanup_encoder;
> }
> +   drm_connector_init_panel_orientation_property(dsi->connector);

Process the return value.

Regards,
Chun-Kuang.

> drm_connector_attach_encoder(dsi->connector, &dsi->encoder);
>
> return 0;
> --
> 2.31.1.498.g6c1eba8ee3d-goog
>
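
A minimal sketch of what "process the return value" could look like, written against stubs so it builds outside the kernel. The `toy_` types and functions are hypothetical stand-ins for the connector and for the two DRM helpers in the hunk above, and it is an assumption here that the new init helper returns 0 on success and a negative errno on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in so the sketch compiles outside the kernel. */
struct toy_connector {
	int fail_orientation_init;	/* test knob: make the init helper fail */
	int encoder_attached;
};

/* Stub standing in for drm_connector_init_panel_orientation_property(). */
int toy_init_panel_orientation_property(struct toy_connector *c)
{
	return c->fail_orientation_init ? -ENOMEM : 0;
}

/* Stub standing in for drm_connector_attach_encoder(). */
int toy_attach_encoder(struct toy_connector *c)
{
	c->encoder_attached = 1;
	return 0;
}

/* Shape of the requested fix: check the result and bail out on error. */
int toy_encoder_init(struct toy_connector *c)
{
	int ret;

	assert(c != NULL);

	ret = toy_init_panel_orientation_property(c);
	if (ret)
		return ret;	/* a real driver would jump to its cleanup label */

	return toy_attach_encoder(c);
}
```

In the driver itself the error path would presumably reuse the existing `err_cleanup_encoder` label rather than returning directly.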

Re: [PATCH 2/8] drm/arm/malidp: Always list modifiers

2021-04-27 Thread Liviu Dudau

On Tue, Apr 27, 2021 at 11:20:12AM +0200, Daniel Vetter wrote:
> Even when all we support is linear, make that explicit. Otherwise the
> uapi is rather confusing.

:)

> 
> Cc: sta...@vger.kernel.org
> Cc: Pekka Paalanen 
> Cc: Liviu Dudau 
> Cc: Brian Starkey 
> Signed-off-by: Daniel Vetter 

Acked-by: Liviu Dudau 

Best regards,
Liviu

> ---
>  drivers/gpu/drm/arm/malidp_planes.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/malidp_planes.c b/drivers/gpu/drm/arm/malidp_planes.c
> index ddbba67f0283..8c2ab3d653b7 100644
> --- a/drivers/gpu/drm/arm/malidp_planes.c
> +++ b/drivers/gpu/drm/arm/malidp_planes.c
> @@ -927,6 +927,11 @@ static const struct drm_plane_helper_funcs malidp_de_plane_helper_funcs = {
>   .atomic_disable = malidp_de_plane_disable,
>  };
>  
> +static const uint64_t linear_only_modifiers[] = {
> + DRM_FORMAT_MOD_LINEAR,
> + DRM_FORMAT_MOD_INVALID
> +};
> +
>  int malidp_de_planes_init(struct drm_device *drm)
>  {
>   struct malidp_drm *malidp = drm->dev_private;
> @@ -990,8 +995,8 @@ int malidp_de_planes_init(struct drm_device *drm)
>*/
>   ret = drm_universal_plane_init(drm, &plane->base, crtcs,
>   &malidp_de_plane_funcs, formats, n,
> - (id == DE_SMART) ? NULL : modifiers, plane_type,
> - NULL);
> + (id == DE_SMART) ? linear_only_modifiers : modifiers,
> + plane_type, NULL);
>  
>   if (ret < 0)
>   goto cleanup;
> -- 
> 2.31.0
> 
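
The modifier list in the patch is a sentinel-terminated array. As a rough userspace sketch of how such a list is walked, the two constants below are local copies of the values from the kernel's drm_fourcc.h UAPI header; the `toy_` helpers are hypothetical and not DRM core functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the UAPI values so the sketch builds on its own. */
#ifndef DRM_FORMAT_MOD_LINEAR
#define DRM_FORMAT_MOD_LINEAR	0x0ULL
#endif
#ifndef DRM_FORMAT_MOD_INVALID
#define DRM_FORMAT_MOD_INVALID	0x00ffffffffffffffULL
#endif

/* Count entries in a list terminated by DRM_FORMAT_MOD_INVALID. */
size_t toy_count_modifiers(const uint64_t *mods)
{
	size_t n = 0;

	if (!mods)
		return 0;	/* NULL list: nothing is listed explicitly */

	while (mods[n] != DRM_FORMAT_MOD_INVALID)
		n++;
	return n;
}

/* Does the list contain 'mod'? A NULL list contains nothing. */
bool toy_list_has_modifier(const uint64_t *mods, uint64_t mod)
{
	assert(mod != DRM_FORMAT_MOD_INVALID);	/* the sentinel is not a modifier */

	if (!mods)
		return false;
	for (; *mods != DRM_FORMAT_MOD_INVALID; mods++)
		if (*mods == mod)
			return true;
	return false;
}
```

With a `{ DRM_FORMAT_MOD_LINEAR, DRM_FORMAT_MOD_INVALID }` list the count is 1 and linear is found; with NULL nothing is listed, which is the ambiguity the patch removes by making the linear-only case explicit.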

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯

Re: [PATCH V2 2/2] drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver

2021-04-27 Thread Felix Radensky

 Hi Marek,

SN65DSI84 also supports single-link LVDS. We have quite a few customers using 
SN65DSI84 on Variscite SOMs with low resolution single-channel LVDS panels. I 
think having a DTS binding to indicate the number of links used by SN65DSI84 is 
more flexible than forcing dual-link mode on all SN65DSI84 users.
Setting "ti,sn65dsi83" compatible when actually SN65DSI84 is used by the board 
is a bit misleading.

There are also some bridge properties that are currently not supported by the 
driver and not reflected in the DTS bindings, e.g. RGB666 vs RGB888, Output 
Format 1 vs Output Format 2. Do you have any plans to support them?

Thanks a lot for working on this driver, your efforts are much appreciated.

Felix.



Re: [PATCH 000/190] Revertion of all of the umn.edu commits

2021-04-27 Thread Greg Kroah-Hartman

On Wed, Apr 21, 2021 at 07:35:44PM +0200, Daniel Vetter wrote:
> On Wed, Apr 21, 2021 at 3:01 PM Greg Kroah-Hartman
>  wrote:
> >
> > I have been meaning to do this for a while, but recent events have
> > finally forced me to do so.
> >
> > Commits from @umn.edu addresses have been found to be submitted in "bad
> > faith" to try to test the kernel community's ability to review "known
> > malicious" changes.  The result of these submissions can be found in a
> > paper published at the 42nd IEEE Symposium on Security and Privacy
> > entitled, "Open Source Insecurity: Stealthily Introducing
> > Vulnerabilities via Hypocrite Commits" written by Qiushi Wu (University
> > of Minnesota) and Kangjie Lu (University of Minnesota).
> >
> > Because of this, all submissions from this group must be reverted from
> > the kernel tree and will need to be re-reviewed again to determine if
> > they actually are a valid fix.  Until that work is complete, remove this
> > change to ensure that no problems are being introduced into the
> > codebase.
> >
> > This patchset has the "easy" reverts, there are 68 remaining ones that
> > need to be manually reviewed.  Some of them are not able to be reverted
> > as they already have been reverted, or fixed up with follow-on patches
> > as they were determined to be invalid.  Proof that these submissions
> > were almost universally wrong.
> 
> Will you take care of these remaining ones in subsequent patches too?

Yes I will.

> > I will be working with some other kernel developers to determine if any
> > of these reverted changes were actually valid, and
> > if so, will resubmit them properly later.  For now, it's better to be
> > safe.
> >
> > I'll take this through my tree, so no need for any maintainer to worry
> > about this, but they should be aware that future submissions from anyone
> > with a umn.edu address should be by default-rejected unless otherwise
> > determined to actually be a valid fix (i.e. they provide proof and you
> > can verify it, but really, why waste your time doing that extra work?)
> >
> > thanks,
> >
> > greg k-h
> >
> > Greg Kroah-Hartman (190):
> >   Revert "net/rds: Avoid potential use after free in
> > rds_send_remove_from_sock"
> >   Revert "media: st-delta: Fix reference count leak in delta_run_work"
> >   Revert "media: sti: Fix reference count leaks"
> >   Revert "media: exynos4-is: Fix several reference count leaks due to
> > pm_runtime_get_sync"
> >   Revert "media: exynos4-is: Fix a reference count leak due to
> > pm_runtime_get_sync"
> >   Revert "media: exynos4-is: Fix a reference count leak"
> >   Revert "media: ti-vpe: Fix a missing check and reference count leak"
> >   Revert "media: stm32-dcmi: Fix a reference count leak"
> >   Revert "media: s5p-mfc: Fix a reference count leak"
> >   Revert "media: camss: Fix a reference count leak."
> >   Revert "media: platform: fcp: Fix a reference count leak."
> >   Revert "media: rockchip/rga: Fix a reference count leak."
> >   Revert "media: rcar-vin: Fix a reference count leak."
> >   Revert "media: rcar-vin: Fix a reference count leak."
> >   Revert "firmware: Fix a reference count leak."
> >   Revert "drm/nouveau: fix reference count leak in
> > nouveau_debugfs_strap_peek"
> >   Revert "drm/nouveau: fix reference count leak in
> > nv50_disp_atomic_commit"
> >   Revert "drm/nouveau: fix multiple instances of reference count leaks"
> >   Revert "drm/nouveau/drm/noveau: fix reference count leak in
> > nouveau_fbcon_open"
> >   Revert "PCI: Fix pci_create_slot() reference count leak"
> >   Revert "omapfb: fix multiple reference count leaks due to
> > pm_runtime_get_sync"
> >   Revert "drm/radeon: Fix reference count leaks caused by
> > pm_runtime_get_sync"
> >   Revert "drm/radeon: fix multiple reference count leak"
> >   Revert "drm/amdkfd: Fix reference count leaks."
> 
> I didn't review these carefully, but from a quick look they all seem
> rather inconsequential. Either error paths that are very unlikely, or
> drivers which are very dead (looking at the entire list, not just what
> you reverted here).
> 
> Acked-by: Daniel Vetter 

Thanks for the quick review. I'm now going over them all again to see
whether they are valid; some of the pm reference count stuff looks
correct.  Others not at all.

> Also adding drm maintainers/lists, those aren't all on your cc it
> seems. I will also forward this to fd.o sitewranglers as abuse of our
> infrastructure, it's for community collaboration, not for inflicting
> experiments on unconsenting subjects.

Much appreciated.

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 0/9] drm: Add privacy-screen class and connector properties

2021-04-27 Thread Marco Trevisan

Hi,

>>> There now is GNOME userspace code using the new properties:
>>> https://hackmd.io/@3v1n0/rkyIy3BOw
>> 
>> Thanks for working on this.
>> 
>> Can these patches be submitted as merge requests against the upstream
>> projects? It would be nice to get some feedback from the maintainers,
>> and be able to easily leave some comments there as well.

FYI, I've discussed these with other upstream developers while
working on them, and afterwards about how to improve them.

> I guess Marco was waiting for the kernel bits to land before
> submitting these, but I agree that it would probably be good to have
> these submitted now; we can mark them as WIP to avoid them getting
> merged before the kernel side is finalized.

I'll submit them in the next few days, once I'm done with the refactor I
have in mind, and will notify the list.

And for sure we can keep them as WIP until the final bits are completed.

Cheers

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Lucas Stach

Hi,

On Tuesday, 27.04.2021 at 09:26 -0400, Marek Olšák wrote:
> Ok. So that would only make the following use cases broken for now:
> - amd render -> external gpu
> - amd video encode -> network device

FWIW, "only" breaking amd render -> external gpu will make us pretty
unhappy, as we have some cases where we are combining an AMD APU with an
FPGA-based graphics card. I can't go into the specifics of this use case
too much, but basically the AMD graphics is rendering content that
gets composited on top of a live video pipeline running through the
FPGA.

> What about the case when we get a buffer from an external device and
> we're supposed to make it "busy" when we are using it, and the
> external device wants to wait until we stop using it? Is it something
> that can happen, thus turning "external -> amd" into "external <->
> amd"?

Zero-copy texture sampling from a video input certainly appreciates
this very much. Trying to pass the render fence through the various
layers of userspace to be able to tell when the video input can reuse a
buffer is a great experience in yak shaving. Allowing the video input
to reuse the buffer as soon as the read dma_fence from the GPU is
signaled is much more straightforward.

Regards,
Lucas

> Marek
> 
> On Tue., Apr. 27, 2021, 08:50 Christian König, < 
> ckoenig.leichtzumer...@gmail.com> wrote:
> >  Only amd -> external.
> >  
> >  We can easily install something in a user queue which waits for a
> > dma_fence in the kernel.
> >  
> >  But we can't easily wait for a user queue as a dependency of a
> > dma_fence.
> >  
> >  The good thing is we have this wait-before-signal case on Vulkan
> > timeline semaphores, which have the same problem in the kernel.
> >  
> >  The good news is I think we can relatively easily convert i915 and
> > older amdgpu devices to something which is compatible with user
> > fences.
> >  
> >  So yes, getting that fixed case by case should work.
> >  
> >  Christian
> >  
> > On 27.04.21 at 14:46, Marek Olšák wrote:
> >  
> > > I'll defer to Christian and Alex to decide whether dropping sync
> > > with non-amd devices (GPUs, cameras etc.) is acceptable.
> > > 
> > > Rewriting those drivers to this new sync model could be done on a
> > > case by case basis.
> > > 
> > > For now, would we only lose the "amd -> external" dependency? Or
> > > the "external -> amd" dependency too?
> > > 
> > > Marek
> > > 
> > > On Tue., Apr. 27, 2021, 08:15 Daniel Vetter, 
> > > wrote:
> > >  
> > > > On Tue, Apr 27, 2021 at 2:11 PM Marek Olšák 
> > > > wrote:
> > > >  > Ok. I'll interpret this as "yes, it will work, let's do it".
> > > >  
> > > > >  It works if all you care about is drm/amdgpu. I'm not sure that's a
> > > > >  reasonable approach for upstream, but it definitely is an approach :-)
> > > > >  
> > > > >  We've already gone somewhat through the pain of drm/amdgpu redefining
> > > > >  how implicit sync works without sufficiently talking with other
> > > > >  people, maybe we should avoid a repeat of this ...
> > > > >  -Daniel
> > > >  
> > > >  >
> > > >  > Marek
> > > >  >
> > > >  > On Tue., Apr. 27, 2021, 08:06 Christian König,
> > > >  wrote:
> > > >  >>
> > > > >  >> Correct, we wouldn't have synchronization between devices with
> > > > >  >> and without user queues any more.
> > > > >  >>
> > > > >  >> That could only be a problem for A+I Laptops.
> > > > >  >>
> > > > >  >> Memory management will just work with preemption fences which
> > > > >  >> pause the user queues of a process before evicting something.
> > > > >  >> That will be a dma_fence, but also a well known approach.
> > > >  >>
> > > >  >> Christian.
> > > >  >>
> > > > >  >> On 27.04.21 at 13:49, Marek Olšák wrote:
> > > >  >>
> > > > >  >> If we don't use future fences for DMA fences at all, e.g. we
> > > > >  >> don't use them for memory management, it can work, right? Memory
> > > > >  >> management can suspend user queues anytime. It doesn't need to
> > > > >  >> use DMA fences. There might be something that I'm missing here.
> > > > >  >>
> > > > >  >> What would we lose without DMA fences? Just inter-device
> > > > >  >> synchronization? I think that might be acceptable.
> > > > >  >>
> > > > >  >> The only case when the kernel will wait on a future fence is
> > > > >  >> before a page flip. Everything today already depends on userspace
> > > > >  >> not hanging the gpu, which makes everything a future fence.
> > > >  >>
> > > >  >> Marek
> > > >  >>
> > > >  >> On Tue., Apr. 27, 2021, 04:02 Daniel Vetter,
> > > >  wrote:
> > > >  >>>
> > > > >  >>> On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
> > > > >  >>> > Thanks everybody. The initial proposal is dead. Here are some thoughts on
> > > > >  >>> > how to do it differently.
> > > > >  >>> >
> > > > >  >>> > I think we can have direct command submission from userspace via
> > > > >  >>> > memory-mapped queues ("user queues") without changing window systems.
> > > >  >>> >
> > > >  >>> > The

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-27 Thread Simon Ser

On Tuesday, April 27th, 2021 at 7:31 PM, Lucas Stach  
wrote:

> > Ok. So that would only make the following use cases broken for now:
> >
> > - amd render -> external gpu
> > - amd video encode -> network device
>
> FWIW, "only" breaking amd render -> external gpu will make us pretty
> unhappy

I concur. I have quite a few users with a multi-GPU setup involving
AMD hardware.

Note, if this brokenness can't be avoided, I'd prefer to get a clear
error, and not bad results on screen because nothing is synchronized
anymore.

Re: [PATCH] drm/i915/gem: Remove reference to struct drm_device.pdev