On Fri, Sep 06, 2024 at 04:36:30PM +0800, Yicong Yang wrote:
> On 2024/9/6 15:06, Morten Rasmussen wrote:
> > Hi Yicong,
> > 
> > On Thu, Sep 05, 2024 at 08:02:20PM +0800, Yicong Yang wrote:
> >> On 2024/9/5 16:34, Pierre Gondois wrote:
> >>> Hello Yicong,
> >>>
> >>> If a platform has CPUs with:
> >>> - 1 thread
> >>> - X (!= 1) threads
> >>> Then I think that the asymmetry is not detected
> >>
> >> Ah ok, I only handle the case where there are several thread numbers 
> >> except no SMT CPUs in the
> >> system. For this case I was thinking we don't need to handle this since 
> >> there's only one kind
> >> of SMT core in the system, control should works fine for the SMT CPU 
> >> clusters and we may not
> >> care about the CPUs with no SMT.
> >>
> >> Below code should handle the case if we initialize the max_smt_thread_num 
> >> to 0. I also
> >> reword the warning messages to match the fact. For heterogeneous SMT 
> >> topology we still
> >> support control SMT by on/off toggle but not fully support setting the 
> >> thread number.
> >>
> >>    int max_smt_thread_num = 0;
> >>    [...]
> >>    /*
> >>     * This should be a short loop depending on the number of heterogeneous
> >>     * CPU clusters. Typically on a homogeneous system there's only one
> >>     * entry in the XArray.
> >>     */
> >>    xa_for_each(&hetero_cpu, hetero_id, entry) {
> >>            /*
> >>             * If max_smt_thread_num has been initialized and doesn't match
> >>             * the thread number of this entry, then the system has
> >>             * heterogeneous SMT topology.
> >>             */
> >>            if (entry->thread_num != max_smt_thread_num && 
> >> max_smt_thread_num)
> >>                    pr_warn_once("Heterogeneous SMT topology is partly 
> >> supported by SMT control\n");
> > 
> > What does 'partly supported' mean here?
> > 
> > If the SMT control doesn't work as intended for this topology, I don't
> > think it should be enabled for it.
> > 
> 
> The /sys/devices/system/cpu/smt/control supports 2 kind of controls [1]
> (a) enable/disable SMT entirely by writing on/off
> (b) enable/disable SMT partially by writing a valid thread number 
> (CONFIG_SMT_NUM_THREADS_DYNAMIC)
> 
> We assume 3 kind of SMT topology:
> 1. homogeneous SMT topology, all the CPUs support SMT or not
> 2.1 heterogeneous SMT topology, part of CPU clusters have SMT and others not. 
> Clusters support SMT
>     have the same SMT thread number
> 2.2 heterogeneous SMT topology, part of CPU clusters have SMT and the thread 
> number may differs
>     (e.g. cluster 1 is of SMT 2 and cluster 2 is of SMT 4. not sure there's 
> such a real system)
> 
> For any of above SMT topology, control (a) should work as expected. When 
> enabling SMT by writing "on"
> the SMT is disabled for those CPU clusters who have SMT. Same for disabling 
> SMT by writing "off".
> But control (b) may not work for case 2.2 since the SMT thread number is not 
> the same across
> the system.
> 
> For this series alone we won't met this since CONFIG_SMT_NUM_THREADS_DYNAMIC 
> is not enabled.
> So control (b) only supports write 1/max_thread which behaves same as write 
> off/on and will
> work as intended for all the 3 cases. As discussed Pierre will add support for
>  CONFIG_SMT_NUM_THREADS_DYNAMIC since thunderX2 is a symmetric SMT4 machine 
> and
> CONFIG_SMT_NUM_THREADS_DYNAMIC would be useful. We thought a warning should 
> be useful
> if running on a system of case 2.2.

Thanks for explaining the situation.

So IIUC, for case 2.2 there will be _no_ failures if someone writes a
value different from 1 or max_threads?

The SMT control code can handle that max_threads isn't the correct
number of threads for all cores in the system?

Reply via email to