Re: [PATCH 6/8] drm/sched: Re-order struct drm_sched_rq members for clarity

2024-09-24 Thread Simona Vetter
On Mon, Sep 09, 2024 at 06:19:35PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> Lets re-order the members to make it clear which are protected by the lock
> and at the same time document it via kerneldoc.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Luben Tuikov 
> Cc: Matthew Brost 
> Cc: Philipp Stanner 
> ---
>  include/drm/gpu_scheduler.h | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index a06753987d93..d4a3ba333568 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -243,10 +243,10 @@ struct drm_sched_entity {
>  /**
>   * struct drm_sched_rq - queue of entities to be scheduled.
>   *
> - * @lock: to modify the entities list.
>   * @sched: the scheduler to which this rq belongs to.
> - * @entities: list of the entities to be scheduled.
> + * @lock: protects the list, tree and current entity.
>   * @current_entity: the entity which is to be scheduled.
> + * @entities: list of the entities to be scheduled.
>   * @rb_tree_root: root of time based priory queue of entities for FIFO 
> scheduling
>   *
>   * Run queue is a set of entities scheduling command submissions for
> @@ -254,10 +254,12 @@ struct drm_sched_entity {
>   * the next entity to emit commands from.
>   */
>  struct drm_sched_rq {
> - spinlock_t  lock;
>   struct drm_gpu_scheduler*sched;
> - struct list_headentities;
> +
> + spinlock_t  lock;
> + /* Following members are protected by the @lock: */

Bit a bikeshed, but I think the moment you feel like you need to sprinkle
additional comments around you should switch over to the in-line
documentation style, where you can have all the space to add everything.
For structures in general I think the top comment style for members isn't
a good idea except for really trivial structs.

But yeah I know it's a bit of work to shuffle stuff around, so maybe next
time ..
-Sima

>   struct drm_sched_entity *current_entity;
> +     struct list_headentities;
>   struct rb_root_cached   rb_tree_root;
>  };
>  
> -- 
> 2.46.0
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: AW: [PATCH] drm/amdgpu: Enable runtime modification of gpu_recovery parameter with validation

2025-01-06 Thread Simona Vetter
000:0a:00.0/ras/gpu_vram_bad_pages
>  | grep 27f5
> 0x027f5d43 : 0x1000 : R
> 0x027f5d4b : 0x1000 : R
> 0x027f5d53 : 0x1000 : R
> 0x027f5d5b : 0x1000 : R
> 0x027f5f43 : 0x1000 : R
> 0x027f5f4b : 0x1000 : R
> 0x027f5f53 : 0x1000 : R
> 0x027f5f5b : 0x1000 : R
> 
> AFAIK, the reserved bad pages will not be used any more.  Please correct me if
> I missed anything.
> 
> DRAM ECC issues are the most common problems. When it occurs, the kernel will
> attempt to hard-offline the page, by trying to unmap the page or killing any
> owner, or triggering IO errors if needed.
> 
> ECC error is also common for HBM and error isolation from each user's job is a
> basic requirement in public cloud. For NVIDIA GPU, a ECC error could be
> contained to a process.
> 
> > XID 94: Contained ECC error
> > XID 95: UnContained ECC error
> >
> > For Xid 94, these errors are contained to one application, and the 
> > application
> > that encountered this error must be restarted. All other applications 
> > running
> > at the time of the Xid are unaffected. It is recommended to reset the GPU 
> > when
> > convenient. Applications can continue to be run until the reset can be
> > performed.
> >
> > For Xid 95, these errors affect multiple applications, and the affected GPU
> > must be reset before applications can restart.
> >
> > https://docs.nvidia.com/deploy/xid-errors/
> 
> Does AMD GPU provide a similar way to achieve error isolation requirement?
> 
> Best Regards,
> Shuai
> 
> >
> >>   However, this parameter is
> >> read-only, necessitating correct settings at driver load. And reloading the
> >> GPU driver in a production environment can be challenging due to reference
> >> counts maintained by various monitoring services.
> >>
> >> Set the gpu_recovery parameter with read-write permission to enable runtime
> >> modification. It will enables users to dynamically manage GPU recovery
> >> mechanisms based on real-time requirements or conditions.
> >>
> >> Signed-off-by: Shuai Xue 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 26 -
> >>   1 file changed, 25 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> index 38686203bea6..03dd902e1cec 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> @@ -563,12 +563,36 @@ module_param_named(lbpw, amdgpu_lbpw, int, 0444);
> >>   MODULE_PARM_DESC(compute_multipipe, "Force compute queues to be spread 
> >> across pipes (1 = enable, 0 = disable, -1 = auto)");
> >>   module_param_named(compute_multipipe, amdgpu_compute_multipipe, int, 
> >> 0444);
> >> +static int amdgpu_set_gpu_recovery(const char *buf,
> >> +   const struct kernel_param *kp)
> >> +{
> >> +unsigned long val;
> >> +int ret;
> >> +
> >> +ret = kstrtol(buf, 10, &val);
> >> +if (ret < 0)
> >> +return ret;
> >> +
> >> +if (val != 1 && val != 0 && val != -1) {
> >> +pr_err("Invalid value for gpu_recovery: %ld, excepted 0,1,-1\n",
> >> +   val);
> >> +return -EINVAL;
> >> +}
> >> +
> >> +return param_set_int(buf, kp);
> >> +}
> >> +
> >> +static const struct kernel_param_ops amdgpu_gpu_recovery_ops = {
> >> +.set = amdgpu_set_gpu_recovery,
> >> +.get = param_get_int,
> >> +};
> >> +
> >>   /**
> >>* DOC: gpu_recovery (int)
> >>* Set to enable GPU recovery mechanism (1 = enable, 0 = disable). The 
> >> default is -1 (auto, disabled except SRIOV).
> >>*/
> >>   MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (1 = 
> >> enable, 0 = disable, -1 = auto)");
> >> -module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
> >> +module_param_cb(gpu_recovery, &amdgpu_gpu_recovery_ops, 
> >> &amdgpu_gpu_recovery, 0644);
> >>   /**
> >>* DOC: emu_mode (int)

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/amdgpu: mark a bunch of module parameters unsafe

2025-01-07 Thread Simona Vetter
On Tue, Jan 07, 2025 at 03:53:08PM +0100, Christian König wrote:
> We sometimes have people trying to use debugging options in production
> environments.
> 
> Mark options only meant to be used for debugging as unsafe so that the
> kernel is tainted when they are used.
> 
> Signed-off-by: Christian König 

Acked-by: Simona Vetter 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index acb9dc3705ac..9ddfdb02a6a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -401,7 +401,7 @@ module_param_named(runpm, amdgpu_runtime_pm, int, 0444);
>   * the kernel log for the list of IPs on the asic. The default is 0x 
> (enable all blocks on a device).
>   */
>  MODULE_PARM_DESC(ip_block_mask, "IP Block Mask (all blocks enabled 
> (default))");
> -module_param_named(ip_block_mask, amdgpu_ip_block_mask, uint, 0444);
> +module_param_named_unsafe(ip_block_mask, amdgpu_ip_block_mask, uint, 0444);
>  
>  /**
>   * DOC: bapm (int)
> @@ -459,7 +459,7 @@ module_param_named(vm_update_mode, amdgpu_vm_update_mode, 
> int, 0444);
>   * Enable experimental hw support (1 = enable). The default is 0 (disabled).
>   */
>  MODULE_PARM_DESC(exp_hw_support, "experimental hw support (1 = enable, 0 = 
> disable (default))");
> -module_param_named(exp_hw_support, amdgpu_exp_hw_support, int, 0444);
> +module_param_named_unsafe(exp_hw_support, amdgpu_exp_hw_support, int, 0444);
>  
>  /**
>   * DOC: dc (int)
> @@ -570,14 +570,14 @@ module_param_named(compute_multipipe, 
> amdgpu_compute_multipipe, int, 0444);
>   * Set to enable GPU recovery mechanism (1 = enable, 0 = disable). The 
> default is -1 (auto, disabled except SRIOV).
>   */
>  MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (1 = enable, 
> 0 = disable, -1 = auto)");
> -module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
> +module_param_named_unsafe(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
>  
>  /**
>   * DOC: emu_mode (int)
>   * Set value 1 to enable emulation mode. This is only needed when running on 
> an emulator. The default is 0 (disabled).
>   */
>  MODULE_PARM_DESC(emu_mode, "Emulation mode, (1 = enable, 0 = disable)");
> -module_param_named(emu_mode, amdgpu_emu_mode, int, 0444);
> +module_param_named_unsafe(emu_mode, amdgpu_emu_mode, int, 0444);
>  
>  /**
>   * DOC: ras_enable (int)
> @@ -732,7 +732,7 @@ module_param_named(noretry, amdgpu_noretry, int, 0644);
>   */
>  MODULE_PARM_DESC(force_asic_type,
>   "A non negative value used to specify the asic type for all supported 
> GPUs");
> -module_param_named(force_asic_type, amdgpu_force_asic_type, int, 0444);
> +module_param_named_unsafe(force_asic_type, amdgpu_force_asic_type, int, 
> 0444);
>  
>  /**
>   * DOC: use_xgmi_p2p (int)
> @@ -955,7 +955,7 @@ module_param_named(freesync_video, 
> amdgpu_freesync_vid_mode, uint, 0444);
>   * GPU reset method (-1 = auto (default), 0 = legacy, 1 = mode0, 2 = mode1, 
> 3 = mode2, 4 = baco)
>   */
>  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
> legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco/bamaco)");
> -module_param_named(reset_method, amdgpu_reset_method, int, 0644);
> +module_param_named_unsafe(reset_method, amdgpu_reset_method, int, 0644);
>  
>  /**
>   * DOC: bad_page_threshold (int) Bad page threshold is specifies the
> @@ -1051,7 +1051,7 @@ module_param_named(seamless, amdgpu_seamless, int, 
> 0444);
>   * - 0x4: Disable GPU soft recovery, always do a full reset
>   */
>  MODULE_PARM_DESC(debug_mask, "debug options for amdgpu, disabled by 
> default");
> -module_param_named(debug_mask, amdgpu_debug_mask, uint, 0444);
> +module_param_named_unsafe(debug_mask, amdgpu_debug_mask, uint, 0444);
>  
>  /**
>   * DOC: agp (int)
> -- 
> 2.34.1
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-16 Thread Simona Vetter
On Mon, Dec 16, 2024 at 11:46:13AM +0100, Lucas Stach wrote:
> Am Montag, dem 16.12.2024 um 10:27 +0100 schrieb Michel Dänzer:
> > On 2024-12-15 21:53, Marek Olšák wrote:
> > > The comment explains the problem with DRM_FORMAT_MOD_LINEAR.
> > >    
> > > Signed-off-by: Marek Olšák  > > <mailto:marek.ol...@amd.com>>
> > > 
> > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > > index 78abd819fd62e..8ec4163429014 100644
> > > --- a/include/uapi/drm/drm_fourcc.h
> > > +++ b/include/uapi/drm/drm_fourcc.h
> > > @@ -484,9 +484,27 @@ extern "C" {
> > >   * modifier (e.g. not setting DRM_MODE_FB_MODIFIERS in the DRM_ADDFB2 
> > > ioctl),
> > >   * which tells the driver to also take driver-internal information into 
> > > account
> > >   * and so might actually result in a tiled framebuffer.
> > > + *
> > > + * WARNING:
> > > + * There are drivers out there that expose DRM_FORMAT_MOD_LINEAR, but 
> > > only
> > > + * support a certain pitch alignment and can't import images with this 
> > > modifier
> > > + * if the pitch alignment isn't exactly the one supported. They can 
> > > however
> > > + * allocate images with this modifier and other drivers can import them 
> > > only
> > > + * if they support the same pitch alignment. Thus, DRM_FORMAT_MOD_LINEAR 
> > > is
> > > + * fundamentically incompatible across devices and is the only modifier 
> > > that
> > > + * has a chance of not working. The PITCH_ALIGN modifiers should be used
> > > + * instead.
> > >   */
> > >  #define DRM_FORMAT_MOD_LINEAR  fourcc_mod_code(NONE, 0)
> > >  
> > > +/* Linear layout modifiers with an explicit pitch alignment in bytes.
> > > + * Exposing this modifier requires that the pitch alignment is exactly
> > > + * the number in the definition.
> > > + */
> > > +#define DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_64B fourcc_mod_code(NONE, 1)
> > 
> > It's not clear what you mean by "requires that the pitch alignment is 
> > exactly
> > the number in the definition", since a pitch which is aligned to 256 bytes 
> > is
> > also aligned to 128 & 64 bytes. Do you mean the pitch must be exactly the 
> > width
> > rounded up to the next / smallest possible multiple of the specified number 
> > of bytes?
> 
> I guess that's the intention here, as some AMD GPUs apparently have
> this limitation that they need an exact aligned pitch.
> 
> If we open the can of worms to overhaul the linear modifier, I think it
> would make sense to also add modifiers for the more common restriction,
> where the pitch needs to be aligned to a specific granule, but the
> engine doesn't care if things get overaligned to a multiple of the
> granule. Having both sets would probably make it easier for the reader
> to see the difference to the exact pitch modifiers proposed in this
> patch.

Yeah I think with linear modifiers that sepcificy alignment limitations we
need to be _extremely_ precise in what exactly is required, and what not.
There's all kinds of hilarious stuff that might be incompatible, and if we
don't mind those we're right back to the "everyone agrees on what linear
means" falacy.

So if pitch must be a multiple of 64, but cannot be a multiple of 128
(because the hw does not cope with the padding, then that's important).
Other things are alignment constraints on the starting point, and any
padding you need to add at the bottom (yeah some hw overscans and gets
pissed if there's not memory there). So I think it's best to go overboard
here with verbosity.

There's also really funny stuff like power-of-two alignment and things
like that, but I guess we'll get there if that's ever needed. That's also
why I think we don't need to add all possible linear modifiers from the
start, unless there's maybe too much confusion with stuff like "exactly
64b aligned pitch" and "at least 64b aligned pitch, but you can add lots
of padding if you feel like".
-""ma

> 
> Regards,
> Lucas

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-18 Thread Simona Vetter
On Mon, Dec 16, 2024 at 04:58:20PM -0500, Marek Olšák wrote:
> On Mon, Dec 16, 2024 at 9:53 AM Simona Vetter 
> wrote:
> 
> > On Mon, Dec 16, 2024 at 11:46:13AM +0100, Lucas Stach wrote:
> > > Am Montag, dem 16.12.2024 um 10:27 +0100 schrieb Michel Dänzer:
> > > > On 2024-12-15 21:53, Marek Olšák wrote:
> > > > > The comment explains the problem with DRM_FORMAT_MOD_LINEAR.
> > > > >
> > > > > Signed-off-by: Marek Olšák  > marek.ol...@amd.com>>
> > > > >
> > > > > diff --git a/include/uapi/drm/drm_fourcc.h
> > b/include/uapi/drm/drm_fourcc.h
> > > > > index 78abd819fd62e..8ec4163429014 100644
> > > > > --- a/include/uapi/drm/drm_fourcc.h
> > > > > +++ b/include/uapi/drm/drm_fourcc.h
> > > > > @@ -484,9 +484,27 @@ extern "C" {
> > > > >   * modifier (e.g. not setting DRM_MODE_FB_MODIFIERS in the
> > DRM_ADDFB2 ioctl),
> > > > >   * which tells the driver to also take driver-internal information
> > into account
> > > > >   * and so might actually result in a tiled framebuffer.
> > > > > + *
> > > > > + * WARNING:
> > > > > + * There are drivers out there that expose DRM_FORMAT_MOD_LINEAR,
> > but only
> > > > > + * support a certain pitch alignment and can't import images with
> > this modifier
> > > > > + * if the pitch alignment isn't exactly the one supported. They can
> > however
> > > > > + * allocate images with this modifier and other drivers can import
> > them only
> > > > > + * if they support the same pitch alignment. Thus,
> > DRM_FORMAT_MOD_LINEAR is
> > > > > + * fundamentically incompatible across devices and is the only
> > modifier that
> > > > > + * has a chance of not working. The PITCH_ALIGN modifiers should be
> > used
> > > > > + * instead.
> > > > >   */
> > > > >  #define DRM_FORMAT_MOD_LINEAR  fourcc_mod_code(NONE, 0)
> > > > >
> > > > > +/* Linear layout modifiers with an explicit pitch alignment in
> > bytes.
> > > > > + * Exposing this modifier requires that the pitch alignment is
> > exactly
> > > > > + * the number in the definition.
> > > > > + */
> > > > > +#define DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_64B fourcc_mod_code(NONE,
> > 1)
> > > >
> > > > It's not clear what you mean by "requires that the pitch alignment is
> > exactly
> > > > the number in the definition", since a pitch which is aligned to 256
> > bytes is
> > > > also aligned to 128 & 64 bytes. Do you mean the pitch must be exactly
> > the width
> > > > rounded up to the next / smallest possible multiple of the specified
> > number of bytes?
> > >
> > > I guess that's the intention here, as some AMD GPUs apparently have
> > > this limitation that they need an exact aligned pitch.
> > >
> > > If we open the can of worms to overhaul the linear modifier, I think it
> > > would make sense to also add modifiers for the more common restriction,
> > > where the pitch needs to be aligned to a specific granule, but the
> > > engine doesn't care if things get overaligned to a multiple of the
> > > granule. Having both sets would probably make it easier for the reader
> > > to see the difference to the exact pitch modifiers proposed in this
> > > patch.
> >
> > Yeah I think with linear modifiers that sepcificy alignment limitations we
> > need to be _extremely_ precise in what exactly is required, and what not.
> > There's all kinds of hilarious stuff that might be incompatible, and if we
> > don't mind those we're right back to the "everyone agrees on what linear
> > means" falacy.
> >
> > So if pitch must be a multiple of 64, but cannot be a multiple of 128
> > (because the hw does not cope with the padding, then that's important).
> > Other things are alignment constraints on the starting point, and any
> > padding you need to add at the bottom (yeah some hw overscans and gets
> > pissed if there's not memory there). So I think it's best to go overboard
> > here with verbosity.
> >
> > There's also really funny stuff like power-of-two alignment and things
> > like that, but I guess we'll get there if that's ever needed. That's also
> > why I think we don't need to ad

Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-18 Thread Simona Vetter
On Wed, Dec 18, 2024 at 10:44:17AM +0100, Michel Dänzer wrote:
> On 2024-12-17 12:03, Brian Starkey wrote:
> > On Tue, Dec 17, 2024 at 11:13:05AM +, Michel Dänzer wrote:
> >> On 2024-12-17 10:14, Brian Starkey wrote:
> >>
> >>> Modifiers are meant to describe framebuffers, and this pitch alignment
> >>> requirement isn't really a framebuffer property - it's a device
> >>> constraint. It feels out of place to overload modifiers with it.
> 
> FWIW, KMS framebuffers aren't the only use case for sharing buffers
> between devices.
> 
> 
> >>> I'm not saying we don't need a way to describe constraints to
> >>> allocators, but I question if modifiers the right mechanism to
> >>> communicate them?
> >> While I agree with your concern in general, AFAIK there's no other
> >> solution for this even on the horizon, after years of talking about
> >> it. The solution proposed here seems like an acceptable stop gap,
> >> assuming it won't result in a gazillion linear modifiers.
> > 
> > UAPI is baked forever, so it's worth being a little wary IMO.
> > 
> > This sets a precedent for describing constraints via modifiers. The
> > reason no other proposal is on the horizon is because describing the
> > plethora of constraints across devices is a hard problem; and the
> > answer so far has been "userspace needs to know" (à la Android's
> > gralloc).
> > 
> > If we start down the road of describing constraints with modifiers, I
> > fear we'll end up in a mess. The full enumeration of modifiers is
> > already horrendous for parameterized types, please let's not
> > combinatorially multiply those by constraints.
> 
> I agree there's a slippery slope.
> 
> That said, linear buffers are special in that they're the only
> possibility which can work for sharing buffers between devices in many
> cases, in particular when the devices are from different vendors or even
> different generations from the same vendor.
> 
> So as long as device vendors don't get too creative with their linear
> pitch alignment restrictions, it still seems like this might be workable
> stop-gap solution for that specific purpose alone, until a better
> solution for handling constraints arrives.
> 
> 
> > P.S. "is the only modifier that has a chance of not working" is
> > fundamentally false.
> 
> My reading of that part of the comment is that pitch alignment shouldn't
> be an issue with non-linear modifiers, since the constraints for pitch
> should be part of the modifier definition. Maybe that could be clarified
> in the comment.

Yeah with all other modifiers pitch alignment or other alignment/sizing
requirements are generally implied. Mostly by stuff like tile size, but
there's others where the hw requirement flat out is that the buffer must
have a power-of-two stride (and maybe we should document these when they
pop up, but the only one I know of are the legacy i915 modifiers, which
are kinda busted anyway for interop).

For that reason I think linear modifiers with explicit pitch/size
alignment constraints is a sound concept and fits into how modifiers work
overall.
-Sima

> 
> 
> -- 
> Earthling Michel Dänzer   \GNOME / Xwayland / Mesa developer
> https://redhat.com \   Libre software enthusiast

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-19 Thread Simona Vetter
On Thu, Dec 19, 2024 at 09:02:27AM +, Daniel Stone wrote:
> On Wed, 18 Dec 2024 at 10:32, Brian Starkey  wrote:
> > On Wed, Dec 18, 2024 at 11:24:58AM +0000, Simona Vetter wrote:
> > > For that reason I think linear modifiers with explicit pitch/size
> > > alignment constraints is a sound concept and fits into how modifiers work
> > > overall.
> >
> > Could we make it (more) clear that pitch alignment is a "special"
> > constraint (in that it's really a description of the buffer layout),
> > and that constraints in-general shouldn't be exposed via modifiers?
> 
> It's still worryingly common to see requirements for contiguous
> allocation, if for no other reason than we'll all be stuck with
> Freescale/NXP i.MX6 for a long time to come. Would that be in scope
> for expressing constraints via modifiers as well, and if so, should we
> be trying to use feature bits to express this?
> 
> How this would be used in practice is also way too underdocumented. We
> need to document that exact-round-up 64b is more restrictive than
> any-multiple-of 64b is more restrictive than 'classic' linear. We need
> to document what people should advertise - if we were starting from
> scratch, the clear answer would be that anything which doesn't care
> should advertise all three, anything advertising any-multiple-of
> should also advertise exact-round-up, etc.
> 
> But we're not starting from scratch, and since linear is 'special',
> userspace already has explicit knowledge of it. So AMD is going to
> have to advertise LINEAR forever, because media frameworks know about
> DRM_FORMAT_MOD_LINEAR and pass that around explicitly when they know
> that the buffer is linear. That and not breaking older userspace
> running in containers or as part of a bisect or whatever.
> 
> There's also the question of what e.g. gbm_bo_get_modifier() should
> return. Again, if we were starting from scratch, most restrictive
> would make sense. But we're not, so I think it has to return LINEAR
> for maximum compatibility (because modifiers can't be morphed into
> other ones for fun), which further cements that we're not removing
> LINEAR.
> 
> And how should allocators determine what to go for? Given that, I
> think the only sensible semantics are, when only LINEAR has been
> passed, to pick the most restrictive set possible; when LINEAR
> variants have been passed as well as LINEAR, to act as if LINEAR were
> not passed at all.

Yeah I think this makes sense, and we'd need to add that to the kerneldoc
about how drivers/apps/frameworks need to work with variants of LINEAR.

Just deprecating LINEAR does indeed not work. The same way it was really
hard slow crawl (and we're still not there everywhere, if you include
stuff like bare metal Xorg) trying to retire the implied modifier. Maybe,
in an extremely bright future were all relevant drivers advertise a full
set of LINEAR variants, and all frameworks understand them, we'll get
there. But if AMD is the one special case that really needs this I don't
think it's realistic to plan for that, and what Daniel describe above
looks like the future we're stuck to.
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-20 Thread Simona Vetter
On Thu, Dec 19, 2024 at 05:09:33PM +0100, Michel Dänzer wrote:
> On 2024-12-19 10:02, Daniel Stone wrote:
> > 
> > How this would be used in practice is also way too underdocumented. We
> > need to document that exact-round-up 64b is more restrictive than
> > any-multiple-of 64b is more restrictive than 'classic' linear. We need
> > to document what people should advertise - if we were starting from
> > scratch, the clear answer would be that anything which doesn't care
> > should advertise all three, anything advertising any-multiple-of
> > should also advertise exact-round-up, etc.
> > 
> > But we're not starting from scratch, and since linear is 'special',
> > userspace already has explicit knowledge of it. So AMD is going to
> > have to advertise LINEAR forever, because media frameworks know about
> > DRM_FORMAT_MOD_LINEAR and pass that around explicitly when they know
> > that the buffer is linear. That and not breaking older userspace
> > running in containers or as part of a bisect or whatever.
> > 
> > There's also the question of what e.g. gbm_bo_get_modifier() should
> > return. Again, if we were starting from scratch, most restrictive
> > would make sense. But we're not, so I think it has to return LINEAR
> > for maximum compatibility (because modifiers can't be morphed into
> > other ones for fun), which further cements that we're not removing
> > LINEAR.
> > 
> > And how should allocators determine what to go for? Given that, I
> > think the only sensible semantics are, when only LINEAR has been
> > passed, to pick the most restrictive set possible; when LINEAR
> > variants have been passed as well as LINEAR, to act as if LINEAR were
> > not passed at all.
> 
> These are all very good points, which call for stricter-than-usual
> application of the "new UAPI requires corresponding user-space patches"
> rule:
> 
> * Patches adding support for the new modifiers in Mesa (and kernel)
> drivers for at least two separate vendors

I think this is too strict? At least I could come up with other scenarios
where we'd need a new linear variant:
- one driver, but two different devices that happen to have incompatible
  linear requirements which break and get fixed with the new linear mode.
- one driver, one device, but non-driver userspace allocates the linear
  buffer somewhere else (e.g. from dma-buf heaps) and gets pitch
  constraints wrong

> * Patches adding support in at least one non-Mesa user-space component,
> e.g. a Wayland compositor which has code using the existing linear
> modifier (e.g. mutter)

This also feels a bit too strict, since I think what Daniel proposed is
that drivers do the special LINEAR handling when there are variants
present in the list of compatible modifiers for an alloation. Hence I
don't think compositor patches are necessarily required, but we definitely
need to test to make sure it actually works and there's not surprises.

The exception is of course when non-mesa userspace allocates/sizes the
buffer itself (maybe for an soc where the display is separate and the gpu
has stricter stride constraints than the display).

> Ideally also describe a specific multi-vendor scenario which can't work
> with the existing linear modifier, and validate that it works with the
> new ones.

I think that's really the crucial part, because adding modifiers without
an actual use-case that they fix is just asking for more future trouble I
think.
-Sima

> 
> 
> -- 
> Earthling Michel Dänzer   \GNOME / Xwayland / Mesa developer
> https://redhat.com \   Libre software enthusiast

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2024-12-20 Thread Simona Vetter
On Fri, Dec 20, 2024 at 09:24:59AM -0500, Marek Olšák wrote:
> >
> >  * Modifiers must uniquely encode buffer layout. In other words, a buffer
> > must
> >  * match only a single modifier.
> >
> 
> That sentence is misleading and impossible to meet. Specifications are
> sometimes changed to reflect the overwhelming reality. Multiple modifiers
> can represent identical layouts - they already do between vendors, between
> generations of the same vendor, and accidentally even between chips of the
> same generation. Modifiers have already become 64-bit structures of
> bitfields with currently 2^16 possible modifiers for some vendors, and
> possibly exceeding 100k for all vendors combined. Encoding all linear
> constraints into the 64 bits is one option. It needs more thought, but
> encoding at least some constraints in the modifier is not totally off the
> table.

Yeah uniqueness is not required, we just try to avoid it since it causes
some pain. Worst case all drivers that care about working sharing just
need to make sure they advertise all overlapping variants they support.

> The semi-functional LINEAR modifier needs to go. The idea of modifiers is
> that nobody should have to expose one that is unsupported to keep things
> working for a subset of apps. If the LINEAR modifier is a requirement
> everywhere because of apps, and even drivers that can't support it must
> expose it, that's a problem. It causes GBM/EGL to fail to import a DMABUF
> for a random reason and it can't be prevented without, say, looking at PCI
> IDs. If that happened for any other API, it would be considered unusable.
> We can either fix it (by replacing/deprecating/removing LINEAR) or abandon
> modifiers and replace them with something that works.

I think Daniel's forward compatibility plan is more solid than trying to
deprecate LINEAR itself, that seems like too much an uphill battle to me.
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu, amdkfd drm-fixes-6.13

2024-12-06 Thread Simona Vetter
gt;  .../display/dc/resource/dcn321/dcn321_resource.c   |   1 +
>  .../amd/display/dc/resource/dcn35/dcn35_resource.c |   2 +
>  .../display/dc/resource/dcn351/dcn351_resource.c   |   2 +
>  .../display/dc/resource/dcn401/dcn401_resource.c   |   1 +
>  .../drm/amd/display/modules/freesync/freesync.c|  13 +-
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c |   6 +-
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c  | 169 +++-
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h  |  17 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c  | 167 +++-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c| 170 
>  .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c| 171 
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c   |  41 ++---
>  drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c|  43 ++---
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c   | 175 
> +++--
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c   | 139 +---
>  .../gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c   | 165 +++
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c |  33 +++-
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   6 +-
>  49 files changed, 1031 insertions(+), 615 deletions(-)

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2025-01-22 Thread Simona Vetter
On Tue, Jan 21, 2025 at 02:21:57PM -0500, Marek Olšák wrote:
> On Mon, Jan 20, 2025 at 1:41 PM Simona Vetter 
> wrote:
> 
> > On Mon, Jan 20, 2025 at 08:58:20AM +0100, Thomas Zimmermann wrote:
> > > Hi
> > >
> > >
> > > Am 18.01.25 um 03:37 schrieb Marek Olšák:
> > > [...]
> > > >
> > > > 3) Implementing DRM_FORMAT_MOD_LINEAR as having 256B pitch and offset
> > > > alignment. This is what we do today. Even if Intel and some AMD chips
> > > > can do 64B or 128B alignment, they overalign to 256B. With so many
> > > > AMD+NV laptops out there, NV is probably next, unless they already do
> > > > this in the closed source driver.
> >
> > I don't think this works, or at least not any better than the current
> > linear modifier. There's way too many users of that thing out there that I
> > think you can realistically redefine it.
> >
> 
> DRM_FORMAT_MOD_LINEAR was redefined on PC a long time ago to mean 256B
> pitch alignment because of laptops with AMD+Intel. Drivers redefined it
> because that's what happens when it's under-defined. As you say,
> DRM_FORMAT_MOD_LINEAR can't be removed, but then it can't work with any
> other pitch alignment on all PC hw either, so there is no other choice.
> 
> The options for PC are either a new parameterized linear modifier (with
> properly defined addressing and size equations) or DRM_FORMAT_MOD_LINEAR
> with 256B pitch alignment. There is no 3rd option. Even if you totally
> disregard AMD, you won't get it below 128B or 64B on the rest of PC hw
> anyway, and that's the same problem.

Ah I missed that, but just checked in mesa, happened in 2021 apparently.
Would be really good to document this in the kernel's drm_fourcc.h
comments as the defacto rule. It's better if the docs reflect actual
reality, whatever that is.
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu, amdkfd drm-next-6.14

2025-01-24 Thread Simona Vetter
amd/amdgpu/amdgpu_mes.h|2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h   |1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c|2 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c |8 +
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c |   49 +-
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c   |   22 +
>  drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 2391 
> ++--
>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  |4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c   |   22 +-
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |8 +
>  .../drm/amd/display/dc/dml2/display_mode_core.c|   12 +-
>  .../dml21/src/dml2_core/dml2_core_dcn4_calcs.c |   12 +-
>  .../gpu/drm/amd/display/dc/dpp/dcn10/dcn10_dpp.c   |   10 +-
>  .../drm/amd/display/dc/dpp/dcn401/dcn401_dpp_cm.c  |6 +-
>  .../gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c |   10 +-
>  .../gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h |2 +
>  .../gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c |9 +-
>  .../drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c   |1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c |3 +
>  .../gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c |3 +
>  .../gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c |1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c |1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c |1 +
>  .../drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c   |   13 +-
>  .../drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c|2 +
>  .../drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c|2 +
>  drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h   |2 +
>  drivers/gpu/drm/amd/include/amd_pcie.h |   18 +
>  drivers/gpu/drm/amd/include/amd_shared.h   |7 +-
>  .../amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h   |1 -
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h   |3 +-
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h   |1 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   |  286 ++-
>  40 files changed, 1743 insertions(+), 1400 deletions(-)

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-next-6.14

2025-01-24 Thread Simona Vetter
On Wed, Jan 22, 2025 at 11:32:52AM -0500, Alex Deucher wrote:
> Hi Dave, Simona,
> 
> Fixes for 6.14.
> 
> The following changes since commit 97e5c9e4139087a67c2469488360a6d6afdd4b69:
> 
>   drm/amd/display: fix CEC DC_DEBUG_MASK documentation (2025-01-16 16:23:22 
> -0500)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-next-6.14-2025-01-22

I think you forgot to add your sob when rebasing/cherry-picking some of
the commits in this one. Apologies for only looking at it now, but can
you please respin?

dim: f4f65756091c ("drm/amdkfd: Clear MODE.VSKIP in gfx9 trap handler"): 
committer Signed-off-by missing.
dim: 8507f33a13b8 ("drm/amdgpu: Refine ip detection log message"): committer 
Signed-off-by missing.
dim: d3690f0e7b50 ("drm/amdgpu: Add handler for SDMA context empty"): committer 
Signed-off-by missing.
dim: 5349658fa4a1 ("drm/amd: Add debug option to disable subvp"): committer 
Signed-off-by missing.
dim: bede35aa9ebe ("drm/amdkfd: Sync trap handler binary with source"): 
committer Signed-off-by missing.
dim: 1cd21c472883 ("drm/amdkfd: Fix partial migrate issue"): committer 
Signed-off-by missing.

Cheers, Sima
> 
> for you to fetch changes up to 74d099cbc632a76ea80cdf31126c8b110ff34865:
> 
>   drm/amd/display: Optimize cursor position updates (2025-01-21 10:38:59 
> -0500)
> 
> 
> amd-drm-next-6.14-2025-01-22:
> 
> amdgpu:
> - Documentation fixes
> - SMU 13.x fixes
> - SR-IOV fix
> - Display fix
> - PCIe calculation fix
> - MES 12 fix
> - HUBP fix
> - Cursor fix
> 
> 
> Alex Deucher (1):
>   drm/amd/display: fix SUBVP DC_DEBUG_MASK documentation
> 
> Aric Cyr (2):
>   drm/amd/display: Add hubp cache reset when powergating
>   drm/amd/display: Optimize cursor position updates
> 
> jesse.zh...@amd.com (3):
>   revert "drm/amdgpu/pm: Implement SDMA queue reset for different asic"
>   revert "drm/amdgpu/pm: add definition PPSMC_MSG_ResetSDMA2"
>   drm/amd/pm: Refactor SMU 13.0.6 SDMA reset firmware version checks
> 
> Lijo Lazar (2):
>   drm/amd/pm: Add capability flags for SMU v13.0.6
>   drm/amd/pm: Fix smu v13.0.6 caps initialization
> 
> Lin.Cao (1):
>   drm/amdgpu: fix ring timeout issue in gfx10 sr-iov environment
> 
> Mario Limonciello (1):
>   drm/amd: Clarify kdoc for amdgpu.gttsize
> 
> Shaoyun Liu (1):
>   drm/amd/amdgpu: Enable scratch data dump for mes 12
> 
> Srinivasan Shanmugam (2):
>   drm/amd/display: Fix error pointers in amdgpu_dm_crtc_mem_type_changed
>   drm/amd/amdgpu: Prevent null pointer dereference in GPU bandwidth 
> calculation
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h|   2 +-
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c |  49 +++-
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |   5 +
>  .../gpu/drm/amd/display/dc/dpp/dcn10/dcn10_dpp.c   |  10 +-
>  .../drm/amd/display/dc/dpp/dcn401/dcn401_dpp_cm.c  |   6 +-
>  .../gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.c |  10 +-
>  .../gpu/drm/amd/display/dc/hubp/dcn10/dcn10_hubp.h |   2 +
>  .../gpu/drm/amd/display/dc/hubp/dcn20/dcn20_hubp.c |   9 +-
>  .../drm/amd/display/dc/hubp/dcn201/dcn201_hubp.c   |   1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn21/dcn21_hubp.c |   3 +
>  .../gpu/drm/amd/display/dc/hubp/dcn30/dcn30_hubp.c |   3 +
>  .../gpu/drm/amd/display/dc/hubp/dcn31/dcn31_hubp.c |   1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn32/dcn32_hubp.c |   1 +
>  .../gpu/drm/amd/display/dc/hubp/dcn35/dcn35_hubp.c |   1 +
>  .../drm/amd/display/dc/hubp/dcn401/dcn401_hubp.c   |  13 +-
>  .../drm/amd/display/dc/hwss/dcn10/dcn10_hwseq.c|   2 +
>  .../drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c|   2 +
>  drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h   |   2 +
>  drivers/gpu/drm/amd/include/amd_shared.h   |   2 +-
>  .../amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h   |   1 -
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h   |   3 +-
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h   |   1 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   | 286 
> ++---
>  26 files changed, 300 insertions(+), 123 deletions(-)

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2025-01-20 Thread Simona Vetter
On Mon, Jan 20, 2025 at 08:58:20AM +0100, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 18.01.25 um 03:37 schrieb Marek Olšák:
> [...]
> > 
> > 3) Implementing DRM_FORMAT_MOD_LINEAR as having 256B pitch and offset
> > alignment. This is what we do today. Even if Intel and some AMD chips
> > can do 64B or 128B alignment, they overalign to 256B. With so many
> > AMD+NV laptops out there, NV is probably next, unless they already do
> > this in the closed source driver.

I don't think this works, or at least not any better than the current
linear modifier. There's way too many users of that thing out there that I
think you can realistically redefine it.

Adding new linear modifiers and then preferring those above the old LINEAR
(if there is one left after all the wittling down to a common set) is I
think the only option that really works to fix something.

> The dumb-buffer series currently being discussed on dri-devel also touches
> handling of scanline pitches. THe actual value varies with each driver. 
> Should dumb buffers use a default pitch alignment of 256 on al hardware?

If you go with new modifiers then there could be shared code that dtrt by
just looking at the modifier list.
-Sima

> Best regards
> Thomas
> 
> > 
> > Marek
> > 
> > On Fri, Jan 17, 2025 at 9:18 AM Simona Vetter 
> > wrote:
> > 
> > On Wed, Jan 15, 2025 at 12:20:07PM +, Daniel Stone wrote:
> > > On Wed, 15 Jan 2025 at 04:05, Marek Olšák  wrote:
> > > > On Tue, Jan 14, 2025 at 12:58 PM Daniel Stone
> >  wrote:
> > > >> AMD hardware is the only hardware I know of which doesn't support
> > > >> overaligning. Say (not hypothetically) we have a GPU and a
> > display
> > > >> controller which have a minimum pitch alignment of 32 bytes, no
> > > >> minimum height alignment, minimum 32-byte offset alignment,
> > minimum
> > > >> pitch of 32 bytes, and minimum image size of 32 bytes.
> > > >>
> > > >> To be maximally compatible, we'd have to expose 28 (pitch
> > align) * 32
> > > >> (height align) * 28 (offset align) * 28 (min pitch) * 28 (min
> > size) ==
> > > >> 19668992 individual modifiers when queried, which is 150MB
> > per format
> > > >> just to store the list of modifiers.
> > > >
> > > > Maximum compatibility is not required nor expected.
> > > >
> > > > In your case, only 1 linear modifier would be added for that
> > driver, which is: [5 / 0 / 5 / 5 / 5]
> > > >
> > > > Then if, and only if, compatibility with other devices is
> > desired, the driver developer could look at drivers of those other
> > devices and determine which other linear modifiers to add. Ideally
> > it would be just 1, so there would be a total of 2.
> > >
> > > Mali (actually two DRM drivers and sort of three Mesa drivers)
> > can be
> > > paired with any one of 11 KMS drivers (really 12 given that one is a
> > > very independent subdriver), and something like 20 different codecs
> > > (at least 12 different vendors; I didn't bother counting the actual
> > > subdrivers which are all quite different). The VeriSilicon Hantro G2
> > > codec driver is shipped by five (that we know of) vendors who
> > all have
> > > their own KMS drivers. One of those is in the Rockchip RK3588, which
> > > (don't ask me why) ships six different codec blocks, with three
> > > different drivers, from two different vendors - that's before
> > you even
> > > get to things like the ISP and NPU which really need to be sharing
> > > buffers properly without copies.
> > >
> > > So yeah, working widely without having to encode specific knowledge
> > > everywhere isn't a nice-to-have, it's a hard baseline requirement.
> > >
> > > >> > DRM_FORMAT_MOD_LINEAR needs to go because it prevents apps
> > from detecting whether 2 devices have 0 compatible memory layouts,
> > which is a useful thing to know.
> > > >>
> > > >> I get the point, but again, we have the exact same problem
> > today with
> > > >> placement, i.e. some devices require buffers to be in or not
> > be in
> > > >> VRAM or GTT or sysram for some uses, and some devices require
> > phys

Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2025-01-15 Thread Simona Vetter
On Tue, Jan 14, 2025 at 12:33:57PM -0600, Faith Ekstrand wrote:
> On January 14, 2025 03:39:45 Marek Olšák  wrote:
> > I would keep the existing modifier interfaces, API extensions, and
> > expectations the same as today for simplicity.
> > 
> > The new linear modifier definition (proposal) will have these fields:
> >   5 bits for log2 pitch alignment in bytes
> >   5 bits for log2 height alignment in rows
> > 
> >   5 bits for log2 offset alignment in bytes
> >   5 bits for log2 minimum pitch in bytes
> > 
> >   5 bits for log2 minimum (2D) image size in bytes
> 
> I'm not strictly opposed to adding a new modifier or two but this seems
> massively over-designed. First off, no one uses anything but simple 2D
> images for WSI and BOs are allocated in units of 4k pages so 2, 4, and 5 can
> go. If we assume pitch alignment and offset alignment are the same (and
> offset is almost always 0 anyway), 3 can go.
> 
> Even with that, I'm struggling to see how useful this is. My understanding
> is that you're trying to solve a problem where you need an exact 64-byte
> alignment for some AMD scanout stuff. That's not even possible to support on
> Nvidia (minimum alignment is 128B) so practically you're looking at one
> modifier that's shared between AMD and Intel. Why can't we just add an AMD
> modifier, make Intel support it, and move on?
> 
> Otherwise we're massively exploding the modifier space for... Why? Intel
> will have to advertise basically all of them. Nvidia will advertise most of
> them. AMD will advertise something. And now apps have tens of thousands of
> modifiers to sort through when we could have just added one and solved the
> problem.

Yeah I feel like step 1 here would be to just add LINEAR_AMD, document the
requirements, add it to drivers that can pull it in for display or
whatever or have some other interop requirement, and see where we go.

The combinatorial explosion of linear constraints is way too much as
Daniel/James both point out, but luckily there's not that many
gpu/display/camera vendors in this world, and we do have like almost
56bits of LINEAR space we can waste.

What we cannot do is drop LINEAR itself, since that would break the world.
Maybe in some glorious future, if there's enough drivers out there
supporting vendor linear modifiers, userspace could know that LINEAR is
special, and if it finds a matching vendor LINEAR_FOO modifier, it should
prefer that one. But then we could probably achieve the same by something
like "try common modifiers in order" and put LINEAR last consistently,
after all the vendor linear modifers, and have the same effect.
-Sima

> ~Faith
> 
> 
> > 
> > 
> > 
> > The pitch and the image size in bytes are no longer arbitrary values.
> > They are fixed values computed from {width, height, bpp, modifier} as
> > follows:
> >   aligned_width = align(width * bpp / 8, 1 << log2_pitch_alignment);
> > 
> >   aligned_height = align(height, 1 << log2_height_alignment);
> >   pitch = max(1 << log2_minimum_pitch, aligned_width);
> > 
> >   image_size = max(1 << log2_minimum_image_size, pitch * aligned_height);
> > 
> > 
> > The modifier defines the layout exactly and non-ambiguously.
> > Overaligning the pitch or height is not supported. Only the offset
> > alignment has some freedom regarding placement. Drivers can expose
> > whatever they want within that definition, even exposing only 1 linear
> > modifier is OK. Then, you can look at modifiers of other drivers if you
> > want to find commonalities.
> > 
> > 
> > DRM_FORMAT_MOD_LINEAR needs to go because it prevents apps from
> > detecting whether 2 devices have 0 compatible memory layouts, which is a
> > useful thing to know.
> > 
> > 
> > Marek
> > 
> > 
> > 
> > On Fri, Jan 10, 2025 at 4:23 PM James Jones  wrote:
> > 
> > On 12/19/24 10:03, Simona Vetter wrote:
> > > On Thu, Dec 19, 2024 at 09:02:27AM +, Daniel Stone wrote:
> > > > On Wed, 18 Dec 2024 at 10:32, Brian Starkey  
> > > > wrote:
> > > > > On Wed, Dec 18, 2024 at 11:24:58AM +, Simona Vetter wrote:
> > > > > > For that reason I think linear modifiers with explicit pitch/size
> > > > > > alignment constraints is a sound concept and fits into how 
> > > > > > modifiers work
> > > > > > overall.
> > > > > 
> > > > > Could we make it (more) clear that pitch alignment is a "special"
> > > > > constraint (in that it's really a description of the buffer lay

Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

2025-01-17 Thread Simona Vetter
H_ALIGN_EQ(256), or
> DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_GE(32). But then that doesn't really
> work at all with how people actually use modifiers: as the doc
> describes, userspace takes and intersects the declared modifier lists
> and passes the result through. The intersection of LINEAR+EQ256 and
> LINEAR+GE32 is LINEAR, so a userspace that follows the rules will just
> drop the hints on the floor and pick whatever linear allocation it
> feels like.

Yeah I think latest when we also take into account logical image size (not
just pitch) with stuff like it needs to be aligned to 2 pixels in both
directions just using modifiers falls apart.

And the problem with linear, unlike device modifiers is that we can't just
throw up our hands and enumerate the handful of formats in actual use for
interop. There's so many produces and consumers of linera buffers
(Daniel's list above missed camera/image processors) that save assumption
is that anything really can happen.

> I think I've just talked myself into the position that passing
> allocator constraints together with modifiers is the only way to
> actually solve this problem, at least without creating the sort of
> technical debt that meant we spent years fixing up implicit/explicit
> modifier interactions when it really should've just been adding a
> !)@*(#$ u64 next to the u32.

Yeah probably.

Otoh I know inertia, so I am tempted to go with the oddball
LINEAR_VEDNOR_A_VENDOR_B_INTEROP thing and stretch the runway for a bit.
And we just assign those as we go as a very special thing, and the drivers
that support it would prefer it above just LINEAR if there's no other
common format left.

Also makes it really obvious what all userspace/kernel driver enabling
would be needed to justify such a modifier.
-Sima

> 
> Cheers,
> Daniel

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-01-09 Thread Simona Vetter
On Thu, 9 Jan 2025 at 17:58, Felix Kuehling  wrote:
>
> From: Christian König 
>
> Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> support if all attachments can do P2P. If any attachment can't do
> P2P just pin into GTT instead.
>
> Signed-off-by: Christian König 
> Signed-off-by: Felix Kuehling 
> Reviewed-by: Felix Kuehling 
> Tested-by: Pak Nin Lui 
> Cc: Simona Vetter 

I thought the big reason we never did this was that there's no way to
account vram pinning properly, until now? But with the cgroup stuff
finally moving that's no longer a concern I guess, but I'd at least
note that down in the commit message. And maybe land this only after
the cgroup stuff is in?

Anyway code looks like how we designed this all to work, so ack.
-Sima

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 25 +++--
>  1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 8e81a83d37d84..83390143c2e9f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -72,11 +72,25 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
>   */
>  static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
>  {
> -   struct drm_gem_object *obj = attach->dmabuf->priv;
> -   struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> +   struct dma_buf *dmabuf = attach->dmabuf;
> +   struct amdgpu_bo *bo = gem_to_amdgpu_bo(dmabuf->priv);
> +   u32 domains = bo->preferred_domains;
>
> -   /* pin buffer into GTT */
> -   return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
> +   dma_resv_assert_held(dmabuf->resv);
> +
> +   /*
> +* Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> +* support if all attachments can do P2P. If any attachment can't do
> +* P2P just pin into GTT instead.
> +*/
> +   list_for_each_entry(attach, &dmabuf->attachments, node)
> +   if (!attach->peer2peer)
> +   domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
> +
> +   if (domains & AMDGPU_GEM_DOMAIN_VRAM)
> +   bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> +
> +   return amdgpu_bo_pin(bo, domains);
>  }
>
>  /**
> @@ -131,9 +145,6 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
> dma_buf_attachment *attach,
> r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> if (r)
> return ERR_PTR(r);
> -
> -   } else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
> -   return ERR_PTR(-EBUSY);
> }
>
> switch (bo->tbo.resource->mem_type) {
> --
> 2.34.1
>


-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH RFC v3 0/7] drm/display: dp: add new DPCD access functions

2025-03-07 Thread Simona Vetter
On Fri, Mar 07, 2025 at 06:34:42AM +0200, Dmitry Baryshkov wrote:
> Existing DPCD access functions return an error code or the number of
> bytes being read / write in case of partial access. However a lot of
> drivers either (incorrectly) ignore partial access or mishandle error
> codes. In other cases this results in a boilerplate code which compares
> returned value with the size.
> 
> As suggested by Jani implement new set of DPCD access helpers, which
> ignore partial access, always return 0 or an error code. Implement
> new helpers using existing functions to ensure backwards compatibility
> and to assess necessity to handle incomplete reads on a global scale.
> Currently only one possible place has been identified, dp-aux-dev, which
> needs to handle possible holes in DPCD.
> 
> This series targets only the DRM helpers code. If the approach is found
> to be acceptable, each of the drivers should be converted on its own.
> 
> Signed-off-by: Dmitry Baryshkov 

Just wanted to drop my "I like this" on your series here. Short
read/writes come from unix pipes, and they're everywhere, and yes ime
everyone gets them wrong. So ack or whatever that means :-)
-Sima

> ---
> Changes in v3:
> - Fixed cover letter (Jani)
> - Added intel-gfx and intel-xe to get the series CI-tested (Jani)
> - Link to v2: 
> https://lore.kernel.org/r/20250301-drm-rework-dpcd-access-v2-0-4d92602fc...@linaro.org
> 
> Changes in v2:
> - Reimplemented new helpers using old ones (Lyude)
> - Reworked the drm_dp_dpcd_read_link_status() patch (Lyude)
> - Dropped the dp-aux-dev patch (Jani)
> - Link to v1: 
> https://lore.kernel.org/r/20250117-drm-rework-dpcd-access-v1-0-7fc020e04...@linaro.org
> 
> ---
> Dmitry Baryshkov (7):
>   drm/display: dp: implement new access helpers
>   drm/display: dp: change drm_dp_dpcd_read_link_status() return value
>   drm/display: dp: use new DCPD access helpers
>   drm/display: dp-aux-dev: use new DCPD access helpers
>   drm/display: dp-cec: use new DCPD access helpers
>   drm/display: dp-mst-topology: use new DCPD access helpers
>   drm/display: dp-tunnel: use new DCPD access helpers
> 
>  drivers/gpu/drm/amd/amdgpu/atombios_dp.c   |   8 +-
>  .../gpu/drm/bridge/cadence/cdns-mhdp8546-core.c|   2 +-
>  drivers/gpu/drm/display/drm_dp_aux_dev.c   |  12 +-
>  drivers/gpu/drm/display/drm_dp_cec.c   |  37 ++-
>  drivers/gpu/drm/display/drm_dp_helper.c| 307 
> +
>  drivers/gpu/drm/display/drm_dp_mst_topology.c  | 105 ---
>  drivers/gpu/drm/display/drm_dp_tunnel.c|  20 +-
>  drivers/gpu/drm/hisilicon/hibmc/dp/dp_link.c   |   4 +-
>  drivers/gpu/drm/msm/dp/dp_ctrl.c   |  24 +-
>  drivers/gpu/drm/msm/dp/dp_link.c   |  18 +-
>  drivers/gpu/drm/radeon/atombios_dp.c   |   8 +-
>  include/drm/display/drm_dp_helper.h|  92 +-
>  12 files changed, 322 insertions(+), 315 deletions(-)
> ---
> base-commit: 565351ae7e0cee80e9b5ed84452a5b13644ffc4d
> change-id: 20241231-drm-rework-dpcd-access-b0fc2e47d613
> 
> Best regards,
> -- 
> Dmitry Baryshkov 
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 00/63] Fix CONFIG_DRM_USE_DYNAMIC_DEBUG=y

2025-02-20 Thread Simona Vetter
On Thu, Feb 20, 2025 at 09:31:41AM +0100, Greg KH wrote:
> On Fri, Jan 24, 2025 at 11:45:14PM -0700, Jim Cromie wrote:
> > This series fixes dynamic-debug's support for DRM debug-categories.
> > Classmaps-v1 evaded full review, and got committed in 2 chunks:
> > 
> >   b7b4eebdba7b..6ea3bf466ac6# core dyndbg changes
> >   0406faf25fb1..ee7d633f2dfb# drm adoption
> > 
> > DRM-CI found a regression during init with drm.debug=; the
> > static-keys under the drm-dbgs in drm.ko got enabled, but those in
> > drivers & helpers did not.
> > 
> > Root Problem:
> > 
> > DECLARE_DYNDBG_CLASSMAP violated a K&R rule "define once, refer
> > afterwards".  Replace it with DYNDBG_CLASSMAP_DEFINE (invoked once in
> > drm-core) and DYNDBG_CLASSMAP_USE (invoked repeatedly, in drivers &
> > helpers).
> > 
> > _DEFINE exports the classmap it creates (in drm.ko), other modules
> > _USE the classmap.  The _USE adds a record ref'g the _DEFINEd (&
> > exported) classmap, in a 2nd __dyndbg_class_users section.
> > 
> > So now at modprobe, dyndbg scans the new section after the 1st
> > __dyndbg_class_maps section, follows the linkage to the _DEFINEr
> > module, finds the (optional) kernel-param controlling the classmap,
> > examines its drm.debug=, and applies it to the module being
> > initialized.
> > 
> > To recapitulate the multi-module problem wo DRM involvement, Add:
> > 
> > A. tools/testing/selftests/dynamic_debug/*
> > 
> > This alters pr_debugs in the test-modules, counts the results and
> > checks them against expectations.  It uses this formula to test most
> > of the control grammar, including the new class keyword.
> > 
> > B. test_dynamic_debug_submod.ko
> > 
> > This alters the test-module to build both parent & _submod ko's, with
> > _DEFINE and _USE inside #if/#else blocks.  This recap's DRM's 2 module
> > failure scenario, allowing A to exersize several cases.
> > 
> > The #if/#else puts the 2 macro uses together for clarity, and gives
> > the 2 modules identical sets of debugs.
> > 
> > Recent DRM-CI tests are here:
> >   https://patchwork.freedesktop.org/series/139147/
> > 
> > Previous rev:
> >   
> > https://lore.kernel.org/lkml/20240716185806.1572048-1-jim.cro...@gmail.com/
> > 
> > Noteworthy Additions:
> > 
> > 1- drop class "protection" special case, per JBaron's preference.
> >only current use is marked BROKEN so nobody to affect.
> >now framed as policy-choice:
> >#define ddebug_client_module_protects_classes() false
> >subsystems wanting protection can change this.
> > 
> > 2- compile-time arg-tests in DYNDBG_CLASSMAP_DEFINE
> >implement several required constraints, and fail obviously.
> > 
> > 3- modprobe time check of conflicting class-id reservations
> >only affects 2+classmaps users.
> >compile-time solution not apparent.
> > 
> > 4- dyndbg can now cause modprobe to fail.
> >needed to catch 3.
> >maybe some loose ends here on failure.
> > 
> > 5- refactor & rename ddebug_attach_*module_classes
> >reduce repetetive boilerplate on 2 types: maps, users.
> >rework mostly brought forward in patchset to reduce churn
> >TBD: maybe squash more.
> > 
> > Several recent trybot submissions (against drm-tip) have been passing
> > CI.BAT, and failing one or few CI.IGT tests randomly; re-tests do not
> > reliably repeat the failures.
> > 
> > its also at github.com:jimc/linux.git
> >   dd-fix-9[st]-ontip  &  dd-fix-9-13
> > 
> > Ive been running it on my desktop w/o issues.
> > 
> > The drivers/gpu/drm patches are RFC, I think there might be a single
> > place to call DRM_CLASSMAP_USE(drm_dedbug_classes) to replace the
> > sprinkling of _USEs in drivers and helpers.  IIRC, I tried adding a
> > _DEFINE into drm_drv.c, that didn't do it, so I punted for now.
> > 
> > I think the dyndbg core additions are ready for review and merging
> > into a (next-next) test/integration tree.
> 
> So whose tree should this go through?

I'm trying to get some drm folks to review/test this, but thus far not
much success :-/ I think it's good stuff, but I'm somewhat hesitant if no
one else agrees that it's useful for CI or in-field crash-recording or
whatever ...

I guess worst case we can land it and hope it attracts more folks?

Wrt tree I don't care, but I guess we should then also land the drm side
too.
-Sima

> And I think the last patch in this series isn't correct, it looks like a
> 000 email somehow.
> 
> thanks,
> 
> greg k-h

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 5/8] drm/amd/kfd: Add comment about possible drm_gem_handle_create() race

2025-05-28 Thread Simona Vetter
I've long ago stopped trying to fully understand all the locking in
amdkfd, so maybe this is safe for a contrived reason. It's definitely
not how this should be done. Considers this more a request for a
proper patch.

Cc: Felix Kuehling 
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Simona Vetter 
Signed-off-by: Simona Vetter 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 260165bbe373..aa51930a012b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1774,6 +1774,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
ret = drm_gem_handle_create(adev->kfd.client.file, gobj, 
&(*mem)->gem_handle);
if (ret)
goto err_gem_handle_create;
+   /* FIXME: Thou shall completely initialize the bo before calling
+* drm_gem_handle_create. Or explain why this is safe. */
bo = gem_to_amdgpu_bo(gobj);
if (bo_type == ttm_bo_type_sg) {
bo->tbo.sg = sg;
-- 
2.49.0



Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-28 Thread Simona Vetter
ere can
be really nasty sometimes, and storage is cheap.
-Sima

> 
> 
> Thx
> P.
> 
> 
> > 
> > Regards,
> > Christian.
> > 
> > > 
> > > 
> > > Signed-off-by: Gordon Freeman 
> > > Reviewed-by: Alyx Vance 
> > > ---
> > > Changes in v2:
> > >   - Provide more docu for crowbar-alloc-function.
> > >   - Use NULL pointers for reserved xarray entries
> > > ---
> > > 
> > > 
> > > 
> > > P.
> > > 
> > > 
> > > > 
> > > > Regards,
> > > > Christian.
> > > > 
> > > > > > 
> > > > > > > Signed-off-by: Christian König 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/scheduler/sched_main.c | 29
> > > > > > > ++
> > > > > > >  1 file changed, 20 insertions(+), 9 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index f7118497e47a..cf200b1b643e 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -871,10 +871,8 @@ EXPORT_SYMBOL(drm_sched_job_arm);
> > > > > > >  int drm_sched_job_add_dependency(struct drm_sched_job
> > > > > > > *job,
> > > > > > >   struct dma_fence *fence)
> > > > > > >  {
> > > > > > > + XA_STATE(xas, &job->dependencies, 0);
> > > > > > >   struct dma_fence *entry;
> > > > > > > - unsigned long index;
> > > > > > > - u32 id = 0;
> > > > > > > - int ret;
> > > > > > >  
> > > > > > >   if (!fence)
> > > > > > >   return 0;
> > > > > > > @@ -883,24 +881,37 @@ int
> > > > > > > drm_sched_job_add_dependency(struct
> > > > > > > drm_sched_job *job,
> > > > > > >   * This lets the size of the array of deps scale with
> > > > > > > the number of
> > > > > > >   * engines involved, rather than the number of BOs.
> > > > > > >   */
> > > > > > > - xa_for_each(&job->dependencies, index, entry) {
> > > > > > > + xa_lock(&job->dependencies);
> > > > > > > + xas_for_each(&xas, entry, ULONG_MAX) {
> > > > > > >   if (entry->context != fence->context)
> > > > > > >   continue;
> > > > > > >  
> > > > > > >   if (dma_fence_is_later(fence, entry)) {
> > > > > > >   dma_fence_put(entry);
> > > > > > > - xa_store(&job->dependencies, index,
> > > > > > > fence, GFP_KERNEL);
> > > > > > > + xas_store(&xas, fence);
> > > > > > >   } else {
> > > > > > >   dma_fence_put(fence);
> > > > > > >   }
> > > > > > > - return 0;
> > > > > > > + xa_unlock(&job->dependencies);
> > > > > > > + return xas_error(&xas);
> > > > > > >   }
> > > > > > >  
> > > > > > > - ret = xa_alloc(&job->dependencies, &id, fence,
> > > > > > > xa_limit_32b, GFP_KERNEL);
> > > > > > > - if (ret != 0)
> > > > > > > +retry:
> > > > > > > + entry = xas_store(&xas, fence);
> > > > > > > + xa_unlock(&job->dependencies);
> > > > > > > +
> > > > > > > + /* There shouldn't be any concurrent add, so no need
> > > > > > > to loop again */
> > > > > > 
> > > > > > Concurrency shouldn't matter, xas_nomem() stores the pre-
> > > > > > allocated memory in the
> > > > > > XA_STATE not the xarray. Hence, I think we should remove the
> > > > > > comment.
> > > > > > 
> > > > > > > + if (xas_nomem(&xas, GFP_KERNEL)) {
> > > > > > > + xa_lock(&job->dependencies);
> > > > > > > + goto retry;
> > > > > > 
> > > > > > Please don't use a goto here, if we would have failed to
> > > > > > allocate
> > > > > > memory here,
> > > > > > this would be an endless loop until we succeed eventually. It
> > > > > > would be equal to:
> > > > > > 
> > > > > > while (!ptr) {
> > > > > > ptr = kmalloc();
> > > > > > }
> > > > > > 
> > > > > > Instead just take the lock and call xas_store() again.
> > > > > > 
> > > > > > > + }
> > > > > > > +
> > > > > > > + if (xas_error(&xas))
> > > > > > >   dma_fence_put(fence);
> > > > > > > + else
> > > > > > > + WARN_ON(entry);
> > > > > > 
> > > > > > Please don't call WARN_ON() here, this isn't fatal, we only
> > > > > > need
> > > > > > to return the
> > > > > > error code.
> > > > 
> > > 
> > 
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-06-03 Thread Simona Vetter
On Wed, May 28, 2025 at 04:47:00PM +0200, Danilo Krummrich wrote:
> On Wed, May 28, 2025 at 04:39:01PM +0200, Danilo Krummrich wrote:
> > On Wed, May 28, 2025 at 09:29:30AM -0400, Alex Deucher wrote:
> > > On Wed, May 28, 2025 at 8:45 AM Simona Vetter  
> > > wrote:
> > > > I do occasionally find it useful as a record of different approaches
> > > > considered, which sometimes people fail to adequately cover in their
> > > > commit messages. Also useful indicator of how cursed a patch is :-)
> > > >
> > > > But as long as anything relevant does end up in the commit message and
> > > > people don't just delete stuff I don't care how it's done at all. It's
> > > > just that the cost of deleting something that should have been there can
> > > > be really nasty sometimes, and storage is cheap.
> > > 
> > > I like them for the same reasons.  Also, even with links, sometimes
> > > there are forks of the conversation that get missed that a changelog
> > > provides some insight into.  I find it useful in my own development as
> > > I can note what I've changed in a patch and can retain that in the
> > > commit rather than as something I need to track separately and then
> > > add to the patches when I send them out.
> > 
> > Personally, I don't think it's super useful in the commit message, it still
> > remains in the patches sent to the mailing list though. And since we put 
> > lore
> > links everywhere, it's easily accessible, *including* the context of why a
> > change was made from one version to another, i.e. the full conversation.
> > 
> > However, if we really want that, we should make it an offical thing, since
> > currently the kernel's process documentation [1] clearly states otherwise:
> > 
> > "Please put this information after the '---' line which separates the 
> > changelog
> > from the rest of the patch. The version information is not part of the 
> > changelog
> > which gets committed to the git tree. It is additional information for the
> > reviewers. If it's placed above the commit tags, it needs manual 
> > interaction to
> > remove it."
> > 
> > Alternatively, it can go into the cover letter.
> 
> One additional note:
> 
> This is not me trying to be super bureaucratic; instead I think being 
> consistent
> in the process across the whole kernel results in a better experience for 
> (new)
> contributors.

Yeah I agree with this part, which is why in the past I didn't ask people
to keep that part, but also won't complain if it's kept. The entire goal
being "minimal amount to get to a commit message that's hopefully
complete". I think with b4 this has now also become a bit easier than 10+
years ago.

Also all the kernel fd.o lists are on lore and the archive on fd.o is
under our control, so hopefully the archive situation shouldn't ever be an
issue for us.

Anyway no strong opinion from me, but we might want to document that we're
a bit more relaxed here.

*shrugs*

Cheers, Sima

> 
> > [1] https://docs.kernel.org/process/submitting-patches.html#commentary

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: WARNING: possible circular locking dependency detected: drm_client_dev_suspend() & radeon_suspend_kms()

2025-07-11 Thread Simona Vetter
ca-HP-ZBook-14-G2 kernel:  ? __pfx_kthread+0x10/0x10
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ? ret_from_fork+0x1f/0x2f0
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ? lock_release+0xc6/0x2a0
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ? __pfx_kthread+0x10/0x10
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ret_from_fork+0x215/0x2f0
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ? __pfx_kthread+0x10/0x10
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  ret_from_fork_asm+0x1a/0x30
> Jul 10 16:12:52 qca-HP-ZBook-14-G2 kernel:  
>
> This doesn't seem to be the cause of the ath12k issue I'm debugging,
> but thought it worth mentioning since I only see one similar report
> on lore, and that didn't have any apparent follow-up:
> https://lore.kernel.org/all/20250202161048.373f89c0@yea/
>
> /jeff

--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] Revert "drm/amdgpu: Use dma_buf from GEM object instance"

2025-07-15 Thread Simona Vetter
On Tue, Jul 15, 2025 at 10:26:22AM +0200, Thomas Zimmermann wrote:
> This reverts commit 515986100d176663d0a03219a3056e4252f729e6.
> 
> The dma_buf field in struct drm_gem_object is not stable over the
> object instance's lifetime. The field becomes NULL when user space
> releases the final GEM handle on the buffer object. This resulted
> in a NULL-pointer deref.
> 
> Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
> GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
> Acquire internal references on GEM handles") only solved the problem
> partially. They especially don't work for buffer objects without a DRM
> framebuffer associated.
> 
> Hence, this revert to going back to using .import_attach->dmabuf.
> 
> Signed-off-by: Thomas Zimmermann 

Reviewed-by: Simona Vetter 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 2 +-
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index ff98c87b2e0b..5743ebb2f1b7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -514,7 +514,7 @@ bool amdgpu_dmabuf_is_xgmi_accessible(struct 
> amdgpu_device *adev,
>   return false;
>  
>   if (drm_gem_is_imported(obj)) {
> - struct dma_buf *dma_buf = obj->dma_buf;
> + struct dma_buf *dma_buf = obj->import_attach->dmabuf;
>  
>   if (dma_buf->ops != &amdgpu_dmabuf_ops)
>   /* No XGMI with non AMD GPUs */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 6626a6e64ff5..d1ccbfcf21fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -317,7 +317,8 @@ static int amdgpu_gem_object_open(struct drm_gem_object 
> *obj,
>*/
>   if (!vm->is_compute_context || !vm->process_info)
>   return 0;
> - if (!drm_gem_is_imported(obj) || !dma_buf_is_dynamic(obj->dma_buf))
> + if (!drm_gem_is_imported(obj) ||
> + !dma_buf_is_dynamic(obj->import_attach->dmabuf))
>   return 0;
>   mutex_lock_nested(&vm->process_info->lock, 1);
>   if (!WARN_ON(!vm->process_info->eviction_fence)) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index af0f655dfd5b..b9b4f7d9186e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1272,7 +1272,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, 
> struct amdgpu_bo_va *bo_va,
>   struct drm_gem_object *obj = &bo->tbo.base;
>  
>   if (drm_gem_is_imported(obj) && bo_va->is_xgmi) {
> - struct dma_buf *dma_buf = obj->dma_buf;
> + struct dma_buf *dma_buf = obj->import_attach->dmabuf;
>   struct drm_gem_object *gobj = dma_buf->priv;
>   struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj);
>  
> -- 
> 2.50.0
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v9] Add CRIU support for amdgpu dmabuf

2025-07-31 Thread Simona Vetter
On Fri, Jul 25, 2025 at 12:07:54PM -0400, David Francis wrote:
> This patch series adds support for CRIU checkpointing of processes that
> share memory with the amdgpu dmabuf interface.
> 
> This v9 adds padding to the structs to align them.
> 
> Adding Alex Deucher beause these patches add two new amdgpu drm ioctls.

Bit aside, but I think big infrastructure stuff like CRIU support would be
good to cc to dri-devel, as an fyi. I think it's pretty ok to let the
leading driver roll this all in driver specific ways, but keeping the
wider subsystem in the loop on things like that would be good. That's also
kinda what I meant with my cc request, not cc me personally, I get _way_
too much mail myself to stay on top of anything at all.

Thanks, Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch