Hi, On 03/09/2025 11:38, Swamil Jain wrote: > Hi Tomi, Maxime, > > On 8/27/25 15:19, Tomi Valkeinen wrote: >> Hi, >> >> On 27/08/2025 12:27, Maxime Ripard wrote: >>> On Wed, Aug 27, 2025 at 11:49:22AM +0300, Tomi Valkeinen wrote: >>>> On 19/08/2025 22:21, Swamil Jain wrote: >>>>> From: Jayesh Choudhary <j-choudh...@ti.com> >>>>> >>>>> TIDSS hardware by itself does not have variable max_pclk for each VP. >>>>> The maximum pixel clock is determined by the limiting factor between >>>>> the functional clock and the PLL (parent to the VP/pixel clock). >>>> >>>> Hmm, this is actually not in the driver, is it? We're not limiting the >>>> pclk based on the fclk. >>>> >>>>> The limitation that has been modeled till now comes from the clock >>>>> (PLL can only be programmed to a particular max value). Instead of >>>>> putting it as a constant field in dispc_features, we can query the >>>>> DM to see if requested clock can be set or not and use it in >>>>> mode_valid(). >>>>> >>>>> Replace constant "max_pclk_khz" in dispc_features with >>>>> max_successful_rate and max_attempted_rate, both of these in >>>>> tidss_device structure would be modified in runtime. In mode_valid() >>>>> call, check if a best frequency match for mode clock can be found or >>>>> not using "clk_round_rate()". Based on that, propagate >>>>> max_successful_rate and max_attempted_rate and query DM again only if >>>>> the requested mode clock is greater than max_attempted_rate. (As the >>>>> preferred display mode is usually the max resolution, driver ends up >>>>> checking the highest clock the first time itself which is used in >>>>> subsequent checks). >>>>> >>>>> Since TIDSS display controller provides clock tolerance of 5%, we use >>>>> this while checking the max_successful_rate. Also, move up >>>>> "dispc_pclk_diff()" before it is called. >>>>> >>>>> This will make the existing compatibles reusable if DSS features are >>>>> same across two SoCs with the only difference being the pixel clock. >>>>> >>>>> Fixes: 7246e0929945 ("drm/tidss: Add OLDI bridge support") >>>>> Reviewed-by: Devarsh Thakkar <devar...@ti.com> >>>>> Signed-off-by: Jayesh Choudhary <j-choudh...@ti.com> >>>>> Signed-off-by: Swamil Jain <s-ja...@ti.com> >>>>> --- >>>>> drivers/gpu/drm/tidss/tidss_dispc.c | 85 ++++++++++++ >>>>> +---------------- >>>>> drivers/gpu/drm/tidss/tidss_dispc.h | 1 - >>>>> drivers/gpu/drm/tidss/tidss_drv.h | 11 +++- >>>>> 3 files changed, 47 insertions(+), 50 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/ >>>>> tidss/tidss_dispc.c >>>>> index c0277fa36425..c2c0fe0d4a0f 100644 >>>>> --- a/drivers/gpu/drm/tidss/tidss_dispc.c >>>>> +++ b/drivers/gpu/drm/tidss/tidss_dispc.c >>>>> @@ -58,10 +58,6 @@ static const u16 >>>>> tidss_k2g_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { >>>>> const struct dispc_features dispc_k2g_feats = { >>>>> .min_pclk_khz = 4375, >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_DPI] = 150000, >>>>> - }, >>>>> - >>>>> /* >>>>> * XXX According TRM the RGB input buffer width up to 2560 >>>>> should >>>>> * work on 3 taps, but in practice it only works up to 1280. >>>>> @@ -144,11 +140,6 @@ static const u16 >>>>> tidss_am65x_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { >>>>> }; >>>>> const struct dispc_features dispc_am65x_feats = { >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_DPI] = 165000, >>>>> - [DISPC_VP_OLDI_AM65X] = 165000, >>>>> - }, >>>>> - >>>>> .scaling = { >>>>> .in_width_max_5tap_rgb = 1280, >>>>> .in_width_max_3tap_rgb = 2560, >>>>> @@ -244,11 +235,6 @@ static const u16 >>>>> tidss_j721e_common_regs[DISPC_COMMON_REG_TABLE_LEN] = { >>>>> }; >>>>> const struct dispc_features dispc_j721e_feats = { >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_DPI] = 170000, >>>>> - [DISPC_VP_INTERNAL] = 600000, >>>>> - }, >>>>> - >>>>> .scaling = { >>>>> .in_width_max_5tap_rgb = 2048, >>>>> .in_width_max_3tap_rgb = 4096, >>>>> @@ -315,11 +301,6 @@ const struct dispc_features dispc_j721e_feats = { >>>>> }; >>>>> const struct dispc_features dispc_am625_feats = { >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_DPI] = 165000, >>>>> - [DISPC_VP_INTERNAL] = 170000, >>>>> - }, >>>>> - >>>>> .scaling = { >>>>> .in_width_max_5tap_rgb = 1280, >>>>> .in_width_max_3tap_rgb = 2560, >>>>> @@ -376,15 +357,6 @@ const struct dispc_features dispc_am625_feats = { >>>>> }; >>>>> const struct dispc_features dispc_am62a7_feats = { >>>>> - /* >>>>> - * if the code reaches dispc_mode_valid with VP1, >>>>> - * it should return MODE_BAD. >>>>> - */ >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_TIED_OFF] = 0, >>>>> - [DISPC_VP_DPI] = 165000, >>>>> - }, >>>>> - >>>>> .scaling = { >>>>> .in_width_max_5tap_rgb = 1280, >>>>> .in_width_max_3tap_rgb = 2560, >>>>> @@ -441,10 +413,6 @@ const struct dispc_features dispc_am62a7_feats >>>>> = { >>>>> }; >>>>> const struct dispc_features dispc_am62l_feats = { >>>>> - .max_pclk_khz = { >>>>> - [DISPC_VP_DPI] = 165000, >>>>> - }, >>>>> - >>>>> .subrev = DISPC_AM62L, >>>>> .common = "common", >>>>> @@ -1347,25 +1315,57 @@ static void >>>>> dispc_vp_set_default_color(struct dispc_device *dispc, >>>>> DISPC_OVR_DEFAULT_COLOR2, (v >> 32) & 0xffff); >>>>> } >>>>> +/* >>>>> + * Calculate the percentage difference between the requested pixel >>>>> clock rate >>>>> + * and the effective rate resulting from calculating the clock >>>>> divider value. >>>>> + */ >>>>> +unsigned int dispc_pclk_diff(unsigned long rate, unsigned long >>>>> real_rate) >>>>> +{ >>>>> + int r = rate / 100, rr = real_rate / 100; >>>>> + >>>>> + return (unsigned int)(abs(((rr - r) * 100) / r)); >>>>> +} >>>>> + >>>>> +static int check_pixel_clock(struct dispc_device *dispc, >>>>> + u32 hw_videoport, unsigned long clock) >>>>> +{ >>>>> + unsigned long round_clock; >>>>> + >>>>> + if (dispc->tidss->is_ext_vp_clk[hw_videoport]) >>>>> + return 0; >>>>> + >>>>> + if (clock <= dispc->tidss->max_successful_rate[hw_videoport]) >>>>> + return 0; >>>>> + >>>>> + if (clock < dispc->tidss->max_attempted_rate[hw_videoport]) >>>>> + return -EINVAL; >>>>> + >>>>> + round_clock = clk_round_rate(dispc->vp_clk[hw_videoport], clock); >>>>> + >>>>> + if (dispc_pclk_diff(clock, round_clock) > 5) >>>>> + return -EINVAL; >>>>> + >>>>> + dispc->tidss->max_successful_rate[hw_videoport] = round_clock; >>>>> + dispc->tidss->max_attempted_rate[hw_videoport] = clock; >>>> >>>> I still don't think this logic is sound. This is trying to find the >>>> maximum clock rate, and optimize by avoiding the calls to >>>> clk_round_rate() if possible. That makes sense. >>>> >>>> But checking for the 5% tolerance breaks it, in my opinion. If we find >>>> out that the PLL can do, say, 100M, but we need pclk of 90M, the >>>> current >>>> maximum is still the 100M, isn't it? >>> >>> 5% is pretty large indeed. We've been using .5% in multiple drivers and >>> it proved to be pretty ok. I would advise you tu use it too. >> >> The 5% comes from OMAP DSS, where we had to do pixel clock with a few >> dividers and multipliers. The rates were quite coarse, and we ended up >> having quite a large tolerance. >> >> I think with tidss, we always have a PLL we control, so we should always >> have very exact clocks. So I'm fine with dropping it to .5%. However, >> this patch and series is about removing the a-bit-too-hardcoded VP clk >> max rate code in the driver, so I would leave everything else to another >> series. >> >>> It's not clear to me why avoiding a clk_round_rate() call is something >>> worth doing though? >> >> Hard to say if it's worth doing, someone should make some perf tests. >> However, afaik, the calls do go to the firmware, so it involves >> inter-processor calls. On OMAP DSS checking the clock rates was slow, as >> it involved lots of iterating with dividers and multipliers. Perhaps >> it's much faster here. >> >>> Even caching the maximum rate you have been able to reach before is >>> pretty fragile: if the PLL changes its rate, or if a sibling clock has >>> set some limits on what the PLL can do, your maximum isn't relevant >>> anymore. >> >> You're right, although afaik it should not happen with TI's SoCs. We >> would be in trouble anyway if that were the case (e.g. someone starts >> the camera, and suddenly we can't support 1080p anymore). >> >>> in other words, what's wrong with simply calling clk_round_rate() and >>> checking if it's within a .5% deviation? >> >> This started with discussions how to replace the hardcoded max VP clock >> rate (used to quickly weed out impossible rates), which in reality was >> actually PLL max clock rate. We don't know the PLL max rate, and can't >> query it, so this approach was taken. >> >>> At the very least, this should be explained in comments or the commit >>> message. >> >> I agree. >> >> Swamil, can you do some perf tests with clk_round_rate()? If it's fast >> (enough), it will simplify the driver. > > Average execution time is around 112 us. > Trace file including the execution time for clk_round_rate(): https:// > gist.github.com/swamiljain/2abe86982cdeba1d69223d2d525e0cb6 > It is better to reduce calls to clk_round_rate(). > > Need your suggestions for a better approach.
We can cache the clk_round_rate calls. Checking my monitor, there are 36 modes it offers me, but only 20 different pclk rates. Also, we could have multiple clk_round_rate calls happening in the driver for the same mode, and that would also be handled. Even if clk_round_rate takes a bit long, it only happens once (I hope) when an app does a modeset, and multiple times when a display is connected, I wonder if 100 us is an issue? Just using clk_round_rate() without any tricks would simplify the driver nicely, so I think we should try to see if we can get that working. Do you know if there's anything to improve on the clock side, ti-sci or firmare? Tomi