On Wed, Sep 03, 2025 at 02:08:28PM +0530, Swamil Jain wrote: > Hi Tomi, Maxime, > > On 8/27/25 15:19, Tomi Valkeinen wrote: > > Hi, > > > > On 27/08/2025 12:27, Maxime Ripard wrote: > > > On Wed, Aug 27, 2025 at 11:49:22AM +0300, Tomi Valkeinen wrote: > > > > On 19/08/2025 22:21, Swamil Jain wrote: > > > > > From: Jayesh Choudhary <j-choudh...@ti.com> > > > > > > > > > > TIDSS hardware by itself does not have variable max_pclk for each VP. > > > > > The maximum pixel clock is determined by the limiting factor between > > > > > the functional clock and the PLL (parent to the VP/pixel clock). > > > > > > > > Hmm, this is actually not in the driver, is it? We're not limiting the > > > > pclk based on the fclk. > > > > > > > > > The limitation that has been modeled till now comes from the clock > > > > > (PLL can only be programmed to a particular max value). Instead of > > > > > putting it as a constant field in dispc_features, we can query the > > > > > DM to see if requested clock can be set or not and use it in > > > > > mode_valid(). > > > > > > > > > > Replace constant "max_pclk_khz" in dispc_features with > > > > > max_successful_rate and max_attempted_rate, both of these in > > > > > tidss_device structure would be modified in runtime. In mode_valid() > > > > > call, check if a best frequency match for mode clock can be found or > > > > > not using "clk_round_rate()". Based on that, propagate > > > > > max_successful_rate and max_attempted_rate and query DM again only if > > > > > the requested mode clock is greater than max_attempted_rate. (As the > > > > > preferred display mode is usually the max resolution, driver ends up > > > > > checking the highest clock the first time itself which is used in > > > > > subsequent checks). > > > > > > > > > > Since TIDSS display controller provides clock tolerance of 5%, we use > > > > > this while checking the max_successful_rate. Also, move up > > > > > "dispc_pclk_diff()" before it is called. > > > > > > > > > > This will make the existing compatibles reusable if DSS features are > > > > > same across two SoCs with the only difference being the pixel clock. > > > > > > > > > > Fixes: 7246e0929945 ("drm/tidss: Add OLDI bridge support") > > > > > Reviewed-by: Devarsh Thakkar <devar...@ti.com> > > > > > Signed-off-by: Jayesh Choudhary <j-choudh...@ti.com> > > > > > Signed-off-by: Swamil Jain <s-ja...@ti.com> > > > > > --- > > > > > drivers/gpu/drm/tidss/tidss_dispc.c | 85 > > > > > +++++++++++++---------------- > > > > > drivers/gpu/drm/tidss/tidss_dispc.h | 1 - > > > > > drivers/gpu/drm/tidss/tidss_drv.h | 11 +++- > > > > > 3 files changed, 47 insertions(+), 50 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c > > > > > b/drivers/gpu/drm/tidss/tidss_dispc.c > > > > > index c0277fa36425..c2c0fe0d4a0f 100644 > > > > > --- a/drivers/gpu/drm/tidss/tidss_dispc.c > > > > > +++ b/drivers/gpu/drm/tidss/tidss_dispc.c > > > > > @@ -58,10 +58,6 @@ static const u16 > > > > > tidss_k2g_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { > > > > > const struct dispc_features dispc_k2g_feats = { > > > > > .min_pclk_khz = 4375, > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_DPI] = 150000, > > > > > - }, > > > > > - > > > > > /* > > > > > * XXX According TRM the RGB input buffer width up to 2560 > > > > > should > > > > > * work on 3 taps, but in practice it only works up to 1280. > > > > > @@ -144,11 +140,6 @@ static const u16 > > > > > tidss_am65x_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { > > > > > }; > > > > > const struct dispc_features dispc_am65x_feats = { > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_DPI] = 165000, > > > > > - [DISPC_VP_OLDI_AM65X] = 165000, > > > > > - }, > > > > > - > > > > > .scaling = { > > > > > .in_width_max_5tap_rgb = 1280, > > > > > .in_width_max_3tap_rgb = 2560, > > > > > @@ -244,11 +235,6 @@ static const u16 > > > > > tidss_j721e_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { > > > > > }; > > > > > const struct dispc_features dispc_j721e_feats = { > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_DPI] = 170000, > > > > > - [DISPC_VP_INTERNAL] = 600000, > > > > > - }, > > > > > - > > > > > .scaling = { > > > > > .in_width_max_5tap_rgb = 2048, > > > > > .in_width_max_3tap_rgb = 4096, > > > > > @@ -315,11 +301,6 @@ const struct dispc_features dispc_j721e_feats = { > > > > > }; > > > > > const struct dispc_features dispc_am625_feats = { > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_DPI] = 165000, > > > > > - [DISPC_VP_INTERNAL] = 170000, > > > > > - }, > > > > > - > > > > > .scaling = { > > > > > .in_width_max_5tap_rgb = 1280, > > > > > .in_width_max_3tap_rgb = 2560, > > > > > @@ -376,15 +357,6 @@ const struct dispc_features dispc_am625_feats = { > > > > > }; > > > > > const struct dispc_features dispc_am62a7_feats = { > > > > > - /* > > > > > - * if the code reaches dispc_mode_valid with VP1, > > > > > - * it should return MODE_BAD. > > > > > - */ > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_TIED_OFF] = 0, > > > > > - [DISPC_VP_DPI] = 165000, > > > > > - }, > > > > > - > > > > > .scaling = { > > > > > .in_width_max_5tap_rgb = 1280, > > > > > .in_width_max_3tap_rgb = 2560, > > > > > @@ -441,10 +413,6 @@ const struct dispc_features dispc_am62a7_feats = > > > > > { > > > > > }; > > > > > const struct dispc_features dispc_am62l_feats = { > > > > > - .max_pclk_khz = { > > > > > - [DISPC_VP_DPI] = 165000, > > > > > - }, > > > > > - > > > > > .subrev = DISPC_AM62L, > > > > > .common = "common", > > > > > @@ -1347,25 +1315,57 @@ static void dispc_vp_set_default_color(struct > > > > > dispc_device *dispc, > > > > > DISPC_OVR_DEFAULT_COLOR2, (v >> 32) & 0xffff); > > > > > } > > > > > +/* > > > > > + * Calculate the percentage difference between the requested pixel > > > > > clock rate > > > > > + * and the effective rate resulting from calculating the clock > > > > > divider value. > > > > > + */ > > > > > +unsigned int dispc_pclk_diff(unsigned long rate, unsigned long > > > > > real_rate) > > > > > +{ > > > > > + int r = rate / 100, rr = real_rate / 100; > > > > > + > > > > > + return (unsigned int)(abs(((rr - r) * 100) / r)); > > > > > +} > > > > > + > > > > > +static int check_pixel_clock(struct dispc_device *dispc, > > > > > + u32 hw_videoport, unsigned long clock) > > > > > +{ > > > > > + unsigned long round_clock; > > > > > + > > > > > + if (dispc->tidss->is_ext_vp_clk[hw_videoport]) > > > > > + return 0; > > > > > + > > > > > + if (clock <= dispc->tidss->max_successful_rate[hw_videoport]) > > > > > + return 0; > > > > > + > > > > > + if (clock < dispc->tidss->max_attempted_rate[hw_videoport]) > > > > > + return -EINVAL; > > > > > + > > > > > + round_clock = clk_round_rate(dispc->vp_clk[hw_videoport], > > > > > clock); > > > > > + > > > > > + if (dispc_pclk_diff(clock, round_clock) > 5) > > > > > + return -EINVAL; > > > > > + > > > > > + dispc->tidss->max_successful_rate[hw_videoport] = round_clock; > > > > > + dispc->tidss->max_attempted_rate[hw_videoport] = clock; > > > > > > > > I still don't think this logic is sound. This is trying to find the > > > > maximum clock rate, and optimize by avoiding the calls to > > > > clk_round_rate() if possible. That makes sense. > > > > > > > > But checking for the 5% tolerance breaks it, in my opinion. If we find > > > > out that the PLL can do, say, 100M, but we need pclk of 90M, the current > > > > maximum is still the 100M, isn't it? > > > > > > 5% is pretty large indeed. We've been using .5% in multiple drivers and > > > it proved to be pretty ok. I would advise you tu use it too. > > > > The 5% comes from OMAP DSS, where we had to do pixel clock with a few > > dividers and multipliers. The rates were quite coarse, and we ended up > > having quite a large tolerance. > > > > I think with tidss, we always have a PLL we control, so we should always > > have very exact clocks. So I'm fine with dropping it to .5%. However, > > this patch and series is about removing the a-bit-too-hardcoded VP clk > > max rate code in the driver, so I would leave everything else to another > > series. > > > > > It's not clear to me why avoiding a clk_round_rate() call is something > > > worth doing though? > > > > Hard to say if it's worth doing, someone should make some perf tests. > > However, afaik, the calls do go to the firmware, so it involves > > inter-processor calls. On OMAP DSS checking the clock rates was slow, as > > it involved lots of iterating with dividers and multipliers. Perhaps > > it's much faster here. > > > > > Even caching the maximum rate you have been able to reach before is > > > pretty fragile: if the PLL changes its rate, or if a sibling clock has > > > set some limits on what the PLL can do, your maximum isn't relevant > > > anymore. > > > > You're right, although afaik it should not happen with TI's SoCs. We > > would be in trouble anyway if that were the case (e.g. someone starts > > the camera, and suddenly we can't support 1080p anymore). > > > > > in other words, what's wrong with simply calling clk_round_rate() and > > > checking if it's within a .5% deviation? > > > > This started with discussions how to replace the hardcoded max VP clock > > rate (used to quickly weed out impossible rates), which in reality was > > actually PLL max clock rate. We don't know the PLL max rate, and can't > > query it, so this approach was taken. > > > > > At the very least, this should be explained in comments or the commit > > > message. > > > > I agree. > > > > Swamil, can you do some perf tests with clk_round_rate()? If it's fast > > (enough), it will simplify the driver. > > Average execution time is around 112 us. > Trace file including the execution time for clk_round_rate(): > https://gist.github.com/swamiljain/2abe86982cdeba1d69223d2d525e0cb6 > It is better to reduce calls to clk_round_rate().
But why? 100us is ridiculously small in that context. Even assuming you have like 10 modes, we're in the millisecond order of magnitude. Assuming no contention on the bus, it's about the same than reading the EDIDs. Hotplug pulses in HDMI take 100 *milli* seconds. Sync time is in seconds. Unless you have a real world benchmark that shows that it's too slow, you shouldn't care. And if you do have that benchmark, it should be fixed for all drivers. Either way, you don't need your caching logic in the tidss driver. Maxime
signature.asc
Description: PGP signature