Ok. So, sorry about all the back and forth. Partially this is because I'm more familiar with thermald than others on the SRU team, and so don't necessarily make things explicit that should be.
At a high level, what the SRU team (in general, so that *I* don't have to be the single point of failure) is looking for is:

*) What is the scope of potential regressions - what hardware *could* this affect, and what possible effects could it have?
   - The answer to “what hardware” is: the list of CPUIDs in thd_engine.cpp:id_table. This is SandyBridge onwards.
   - The answer to “what effects” is: temperature throttling problems - either reduced performance of CPU (and GPU?) due to unnecessary throttling, or instability due to not controlling temperatures <FEEL FREE TO EXPAND HERE>
*) What is the scope of *upstream* support - what systems do *they* test on, and expect to continue to work?
   - Relatedly: what testing does upstream do?
   - What do we do if upstream doesn't test on hardware that we support (ie: *we* care about all the hardware)?
*) What is the process we are going to use to verify that upstream doesn't drop support for systems?
   - Upstream doesn't seem to make it very easy to identify this - eg: the current SRU includes dropping the MSR poking support. What process do we/will we have to catch such cases?
*) What is the process for testing that an upload does not regress?
   - The [Test Case] above is good for systems with KBL or newer processors.
   - thermald also supports systems from SandyBridge onwards - how are we testing these? These are still supported by Ubuntu; we need a testing system that is more than “maybe users will report regressions”, particularly since it's not necessarily going to be clear to users that “my system got slower” is related to the thermald update.

Most of those questions are covered above, I think, but some could do with your input. In particular, the SandyBridge+ question is an important one to answer.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to thermald in Ubuntu.
https://bugs.launchpad.net/bugs/1995606

Title:
  Upgrade thermald to 2.5.1

Status in thermald package in Ubuntu:
  Fix Released
Status in thermald source package in Jammy:
  Incomplete

Bug description:
  [Justification]
  The purpose of this bug is to prevent regressions in the future. Automated test scripts would be better for future SRUs; they are still being planned.

  [Test case]
  For each supported CPU series (RPL/ADL/TGL/CML/CFL/KBL) the following tests will be run on machines in the CI lab:

  1. Run stress-ng, and observe the temperature/frequency/power with s-tui.
     - Temperatures should stay just below trip values.
     - Power/performance profiles should stay roughly the same between old thermald and new thermald (unless specifically expected, eg: to fix premature/insufficient throttling).

  2. Check whether thermald can read rules from /dev/acpi_thermal_rel and generate the XML file in /etc/thermald/ correctly.
     - This depends on whether acpi_thermal_rel exists.
     - If the machine supports acpi_thermal_rel, "thermal-conf.xml.auto" should be generated in /etc/thermald/.
     - If not, a user-defined XML file can be created instead; then skip to (3).
     - Run thermald with --loglevel=debug, and compare the log with the xml.auto file. Check that the configuration is parsed correctly.

  3. Check whether thermal-conf.xml and thermal-cpu-cdev-order.xml are loaded correctly.
     - Run thermald with --loglevel=debug, and compare the log with the XML files.
     - If parsed correctly, the configurations from the XML files appear in the log.

  4. Run the unit tests. The scripts are under the test folder; they use emul_temp to simulate a high temperature and check that thermald throttles the CPU through the related cooling device.
     - rapl.sh
     - intel_pstate.sh
     - powerclamp.sh
     - processor.sh

  5. Check whether power/frequency is throttled once the temperature reaches the trip points of the thermal zone.

  6. Check whether the system is throttled even when the temperature is below the trip points.
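The "temperatures should stay just below trip values" check in steps 1 and 5 could be partly scripted. The following is only a minimal sketch, not part of the thermald test suite: it assumes the standard Linux thermal sysfs layout (thermal_zone*/temp and trip_point_N_temp/trip_point_N_type, in millidegrees Celsius), and the function name read_trip_margins is made up for illustration.

```python
from pathlib import Path


def read_trip_margins(root="/sys/class/thermal"):
    """Hypothetical helper: map each thermal zone to its current temperature
    and the margin (trip - temp) to each of its trip points, all in
    millidegrees Celsius as exposed by sysfs."""
    results = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            temp = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone cannot be read right now; skip it
        trips = []
        for trip_file in sorted(zone.glob("trip_point_*_temp")):
            type_file = zone / trip_file.name.replace("_temp", "_type")
            try:
                trip = int(trip_file.read_text().strip())
                kind = type_file.read_text().strip()
            except (OSError, ValueError):
                continue  # incomplete trip point entry; skip it
            if trip <= 0:
                continue  # treat non-positive values as "trip not set"
            trips.append((kind, trip, trip - temp))
        results[zone.name] = (temp, trips)
    return results
```

Sampling this periodically while stress-ng runs, once with the old thermald and once with the new one, would give comparable numbers: a negative margin on a passive trip point means the temperature went past the point where throttling should have started. The root argument exists only so the function can be pointed at a fake directory tree for testing.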
  [ Where problems could occur ]
  Since PL1 min/max is newly introduced, we may encounter edge cases in the future.