On 6/19/20 8:34 AM, Scott Cheloha wrote: > The H_GetPerformanceCounterInfo PHYP hypercall has a subcall, > Affinity_Domain_Info_By_Partition, which returns, among other things, > a "partition affinity score" for a given LPAR. This score, a value on > [0-100], represents the processor-memory affinity for the LPAR in > question. A score of 0 indicates the worst possible affinity while a > score of 100 indicates perfect affinity. The score can be used to > reason about performance. > > This patch adds the score for the local LPAR to the lparcfg procfile > under a new 'partition_affinity_score' key.
I expect that you will probably get a NACK from Michael on this. The overall desire is to move away from these dated /proc interfaces. While its true that I did add a new value recently it was strictly to facilitate and correct the calculation of a derived value that was already dependent on a couple other existing values in lparcfg. With that said I would expect that you would likely be advised to expose this as a sysfs attribute. The question is where? We probably should put some thought in to this as I would like to port each lparcfg value over to sysfs so that we can move to deprecating lparcfg. Putting everything under something like /sys/kernel/lparcfg/* maybe. Michael may have a better suggestion. > > The H_GetPerformanceCounterInfo hypercall is already used elsewhere in > the kernel, in powerpc/perf/hv-gpci.c. Refactoring that code and this > code into a more general API might be worthwhile if additional modules > require the hypercall in the future. If you are duplicating code its likely you should already be doing this. See the rest of my comments about below. > > Signed-off-by: Scott Cheloha <chel...@linux.ibm.com> > --- > arch/powerpc/platforms/pseries/lparcfg.c | 53 ++++++++++++++++++++++++ > 1 file changed, 53 insertions(+) > > diff --git a/arch/powerpc/platforms/pseries/lparcfg.c > b/arch/powerpc/platforms/pseries/lparcfg.c > index b8d28ab88178..b75151eee0f0 100644 > --- a/arch/powerpc/platforms/pseries/lparcfg.c > +++ b/arch/powerpc/platforms/pseries/lparcfg.c > @@ -136,6 +136,57 @@ static unsigned int h_get_ppp(struct hvcall_ppp_data > *ppp_data) > return rc; > } > > +/* > + * Based on H_GetPerformanceCounterInfo v1.10. > + */ > +static void show_gpci_data(struct seq_file *m) > +{ > + struct perf_counter_info_params { > + __be32 counter_request; > + __be32 starting_index; > + __be16 secondary_index; > + __be16 returned_values; > + __be32 detail_rc; > + __be16 counter_value_element_size; > + u8 counter_info_version_in; > + u8 counter_info_version_out; > + u8 reserved[0xC]; > + } __packed; This looks to duplicate the hv_get_perf_counter_info_params struct from arch/powerpc/perf/hv-gpci.h. Maybe this include file needs to move to arch/powerpc/asm/inlcude so you don't have to redefine this struct. > + struct hv_gpci_request_buffer { > + struct perf_counter_info_params params; > + u8 output[4096 - sizeof(struct perf_counter_info_params)]; > + } __packed; This struct is code duplication of the one defined in arch/powerpc/perf/hv-gpci.c and could be moved into hv-gpci.h along with HGPCI_MAX_DATA_BYTES so that you can use those versions here. > + struct hv_gpci_request_buffer *buf; > + long ret; > + unsigned int affinity_score; > + > + buf = kmalloc(sizeof(*buf), GFP_KERNEL); > + if (buf == NULL) > + return; > + > + /* > + * Show the local LPAR's affinity score. > + * > + * 0xB1 selects the Affinity_Domain_Info_By_Partition subcall. > + * The score is at byte 0xB in the output buffer. > + */ > + memset(&buf->params, 0, sizeof(buf->params)); > + buf->params.counter_request = cpu_to_be32(0xB1); > + buf->params.starting_index = cpu_to_be32(-1); /* local LPAR */ > + buf->params.counter_info_version_in = 0x5; /* v5+ for score */ > + ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, virt_to_phys(buf), > + sizeof(*buf)); > + if (ret != H_SUCCESS) { > + pr_debug("hcall failed: H_GET_PERF_COUNTER_INFO: %ld, %x\n", > + ret, be32_to_cpu(buf->params.detail_rc)); > + goto out; > + } > + affinity_score = buf->output[0xB]; > + seq_printf(m, "partition_affinity_score=%u\n", affinity_score); > +out: > + kfree(buf); > +} > + IIUC we should already be able to get this value from userspace using perf tool, right? If thats the case can't we also programatically retrieve it via the perf_event interface in userspace as well? -Tyrel > static unsigned h_pic(unsigned long *pool_idle_time, > unsigned long *num_procs) > { > @@ -487,6 +538,8 @@ static int pseries_lparcfg_data(struct seq_file *m, void > *v) > partition_active_processors * 100); > } > > + show_gpci_data(m); > + > seq_printf(m, "partition_active_processors=%d\n", > partition_active_processors); > >