On Sun, Dec 7, 2025 at 11:49 PM Nicolas Frattaroli
<[email protected]> wrote:
>
> On Friday, 5 December 2025 22:16:44 Central European Standard Time Chia-I Wu wrote:
> > On Fri, Dec 5, 2025 at 2:48 AM Nicolas Frattaroli
> > <[email protected]> wrote:
> > >
> > > On Thursday, 4 December 2025 21:21:08 Central European Standard Time Chia-I Wu wrote:
> > > > On Wed, Dec 3, 2025 at 6:04 AM Nicolas Frattaroli
> > > > <[email protected]> wrote:
> > > > >
> > > > > Mali GPUs have three registers that indicate which parts of the hardware
> > > > > are powered and active at any moment. These take the form of bitmaps. In
> > > > > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > > > > shader core corresponding to that bit index is active. These bitmaps
> > > > > aren't solely contiguous bits, as it's common to have holes in the
> > > > > sequence of shader core indices, and the actual set of which cores are
> > > > > present is defined by the "shader present" register.
> > > > >
> > > > > When the GPU finishes a power state transition, it fires a
> > > > > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > > > > received, the PWRACTIVE registers will likely contain interesting new
> > > > > information.
> > > > I am seeing
> > > >
> > > >    irq/342-panthor-412     [000] .....   934.526754: gpu_power_active: shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > > >    irq/342-panthor-412     [000] .....   936.640356: gpu_power_active: shader_bitmap=0x0 tiler_bitmap=0x0 l2_bitmap=0x0
> > > >
> > > > on a gpu-bound test. It does not look like collecting samples on
> > > > GPU_IRQ_POWER_CHANGED_ALL gives too much info.
> > >
> > > On what GPU and SoC is that? If it's MT8196 then I wouldn't be
> > > surprised if it just broke that hardware register, considering
> > > what it did to the SHADER_PRESENT register.
> > Indeed I was on mt8196.
>
> I don't have much faith in the Mali integration of that SoC being
> representative of how the Mali hardware is supposed to work. The
> SHADER_PRESENT thing is just the tip of the iceberg, I've also
> noticed while developing mtk-mfg-pmdomain that it seemingly messes
> with the Mali GPU's internal MCU from the GPUEB depending on the
> commands you send it, and can get it into a broken state with
> enough luck.
>
> Check if the registers ever read anything but 0, e.g. by dumping
> them from sysfs like this:
>
> ---
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index d1d4c50da5bf..b0e67dc17c92 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1678,8 +1678,69 @@ static ssize_t profiling_store(struct device *dev,
>
>  static DEVICE_ATTR_RW(profiling);
>
> +static ssize_t print_active_bitmask(char *buf, ssize_t len, u64 present, u64 active)
> +{
> +       unsigned int i = 0;
> +       u64 bit;
> +
> +       while (present) {
> +               bit = BIT_ULL(i);
> +               if (present & bit) {
> +                       present &= ~bit;
> +                       len += sysfs_emit_at(buf, len, "%s", (active & bit) ? "1" : "0");
> +               } else {
> +                       len += sysfs_emit_at(buf, len, "_");
> +               }
> +               i++;
> +       }
> +
> +       return len;
> +}
> +
> +static ssize_t power_active_show(struct device *dev, struct device_attribute *attr,
> +                                char *buf)
> +{
> +       struct panthor_device *ptdev = dev_get_drvdata(dev);
> +       ssize_t len = 0;
> +       u64 present;
> +       int ret;
> +
> +       if (pm_runtime_suspended(ptdev->base.dev))
> +               return sysfs_emit(buf, "Shader:\t0\nTiler:\t0\nL2:\t0\n");
> +
> +       ret = pm_runtime_resume_and_get(ptdev->base.dev);
> +       if (ret)
> +               return ret;
> +
> +       len += sysfs_emit_at(buf, len, "Shader:\t");
> +       len = print_active_bitmask(buf, len, gpu_read64(ptdev, GPU_SHADER_PRESENT),
> +                                  gpu_read64(ptdev, SHADER_PWRACTIVE));
> +       len += sysfs_emit_at(buf, len, "\n");
> +
> +       present = gpu_read64(ptdev, GPU_TILER_PRESENT);
> +       if (present == 0x1) /* "Implementation defined", just try to dump all */
> +               present = U64_MAX;
> +       len += sysfs_emit_at(buf, len, "Tiler:\t");
> +       len = print_active_bitmask(buf, len, present, gpu_read64(ptdev, TILER_PWRACTIVE));
> +       len += sysfs_emit_at(buf, len, "\n");
> +
> +       present = gpu_read64(ptdev, GPU_L2_PRESENT);
> +       if (present == 0x1) /* "Implementation defined", just try to dump all */
> +               present = U64_MAX;
> +       len += sysfs_emit_at(buf, len, "L2:\t");
> +       len = print_active_bitmask(buf, len, present, gpu_read64(ptdev, L2_PWRACTIVE));
> +       len += sysfs_emit_at(buf, len, "\n");
> +
> +       pm_runtime_put(ptdev->base.dev);
> +
> +       return len;
> +}
> +
> +static DEVICE_ATTR_RO(power_active);
> +
>  static struct attribute *panthor_attrs[] = {
>         &dev_attr_profiling.attr,
> +       &dev_attr_power_active.attr,
>         NULL,
>  };
> ---
>
> If they always read 0 regardless of whether you're running a GPU
> workload or not, then it's just not properly wired up.
They can be non-zero.
>
> > >
> > > On RK3588 (v10), GPU_IRQ_POWER_CHANGED_ALL reliably fires when
> > > there is new information available in those registers. I haven't
> > > tried on MT8196 (v13) yet because that still doesn't boot with
> > > mainline so testing anything is a pain.
> > >
> > > I don't have any v12 or v11 hardware to test with. From what I
> > > understand, there's no open enough platform to do v11 testing on,
> > > just the Pixel 8 and Pixel 9. I could look into the Cix SoC for v12
> > > though some day, but I don't own one at the moment.
> > >
> > > >
> > > > I think they are more useful to be collected periodically, such that
> > > > we know that in the past X seconds, Y out of a total of Z samples
> > > > indicates activities. That's best done in userspace, and panthor's
> > > > role should be to provide an uapi such as
> > > > https://lore.kernel.org/all/[email protected]/.
> > >
> > > This wouldn't give you information on the time a power transition has
> > > completed, which is one of the motivations. A periodically collected
> > > PWRACTIVE would just be roughly correlated to how busy the GPU is,
> > > which isn't very useful additional information as the performance
> > > counters themselves are likely a better source of that kind of info.
> > {SHADER,TILER,L2}_READY might be more appropriate if you want to trace
> > power transitions?
>
> Depends, the documentation I have access to isn't explicit about
> what "READY" means. Is a busy core non-ready? Is there ever a case
> where a significant number of cores are READY but not PWRACTIVE?
>
> I can answer the first question with some more poking on RK3588,
> but for the latter a simple experiment on one piece of hardware
> isn't going to answer it. Plus, the core being active will probably
> be more interesting than it either sitting idle but powered or
> actually doing work.
From what I can see, *_READY are non-zero when powered and *_PWRACTIVE
are non-zero when powered and busy on mt8196.
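
To make that reading concrete, here is a rough userspace sketch of how one
core's state could be derived from snapshots of the three bitmaps. It assumes
the behaviour I observed on mt8196 (READY means powered, PWRACTIVE means
powered and busy), which may not hold on other integrations. The enum and
function names are made up for illustration and are not part of panthor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-core states, named for this sketch only. */
enum core_state { CORE_ABSENT, CORE_OFF, CORE_IDLE, CORE_BUSY };

/*
 * Classify core `idx` from three register snapshots. Assumption (from the
 * mt8196 observation): a READY bit means the core is powered, and a
 * PWRACTIVE bit means it is powered and busy.
 */
static enum core_state classify_core(uint64_t present, uint64_t ready,
				     uint64_t pwractive, unsigned int idx)
{
	uint64_t bit;

	assert(idx < 64);
	bit = UINT64_C(1) << idx;

	if (!(present & bit))
		return CORE_ABSENT;	/* hole in the core index sequence */
	if (!(ready & bit))
		return CORE_OFF;
	return (pwractive & bit) ? CORE_BUSY : CORE_IDLE;
}
```

With present=0x5, ready=0x5 and pwractive=0x1, core 0 would be classified as
busy, core 1 as absent and core 2 as idle.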

If you want to generate a trace event upon GPU_IRQ_POWER_CHANGED_ALL,
*_READY seems more appropriate at least on mt8196. If you want to
track busyness with *_PWRACTIVE, you probably need to sample
periodically.
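
As a sketch of what I mean by periodic sampling (Y out of Z samples showing
activity), the userspace side could reduce a window of sampled *_PWRACTIVE
values roughly like this. How the samples are obtained is left out, and the
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many of the n sampled PWRACTIVE values show at least one
 * present core active. Bits outside `present` are ignored, since the
 * core index sequence can have holes.
 */
static size_t count_busy_samples(const uint64_t *samples, size_t n,
				 uint64_t present)
{
	size_t busy = 0;
	size_t i;

	assert(samples != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (samples[i] & present)
			busy++;
	}
	return busy;
}
```

The caller would then report busy / n for the window. A per-core breakdown
would need one counter per bit in `present` instead of a single total.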

>
> >
> > >
> > > What I need to do is restrict this to <= v13 in the next revision
> > > however, because v14 reworks this stuff.
> > >
> > > Kind regards,
> > > Nicolas Frattaroli
> > >
> > >
> >
>
>
>
>
