On 2/21/19 4:13 AM, David Gibson wrote: > On Tue, Feb 19, 2019 at 08:31:25AM +0100, Cédric Le Goater wrote: >> On 2/12/19 6:40 AM, David Gibson wrote: >>> On Mon, Jan 28, 2019 at 10:46:11AM +0100, Cédric Le Goater wrote: > [snip] >>>> #endif /* _PPC_PNV_H */ >>>> diff --git a/include/hw/ppc/pnv_core.h b/include/hw/ppc/pnv_core.h >>>> index 9961ea3a92cd..8e57c064e661 100644 >>>> --- a/include/hw/ppc/pnv_core.h >>>> +++ b/include/hw/ppc/pnv_core.h >>>> @@ -49,6 +49,7 @@ typedef struct PnvCoreClass { >>>> >>>> typedef struct PnvCPUState { >>>> struct ICPState *icp; >>>> + struct XiveTCTX *tctx; >>> >>> Unlike sPAPR, we really do always know in advance the interrupt >>> controller for a particular machine. I think it makes sense to >>> further split the POWER8 and POWER9 cases here, so we only track one >>> for any given setup. >> >> So, you would define a : >> >> typedef struct Pnv9CPUState { >> struct XiveTCTX *tctx; >> } Pnv9CPUState; >> >> to be allocated when the core is realized ? and later the routine >> pnv_chip_power9_intc_create() would assign the ->tctx pointer. > > Sounds about right. > > [snip] >>>> + uint32_t nr_ends; >>>> + XiveENDSource end_source; >>>> + >>>> + /* Interrupt controller registers */ >>>> + uint64_t regs[0x300]; >>>> + >>>> + /* Can be configured by FW */ >>>> + uint32_t tctx_chipid; >>>> + uint32_t chip_id; >>> >>> Can't you derive that since you have a pointer to the owning chip? >> >> Not always, there are register fields to purposely override this value. >> I can improve the current code a little I think. > > Ok. > > [snip] >>>> +/* >>>> + * Virtual structures table (VST) >>>> + */ >>>> +typedef struct XiveVstInfo { >>>> + uint32_t type; >>>> + const char *name; >>>> + uint32_t size; >>>> + uint32_t max_blocks; >>>> +} XiveVstInfo; >>>> + >>>> +static const XiveVstInfo vst_infos[] = { >>>> + [VST_TSEL_IVT] = { VST_TSEL_IVT, "EAT", sizeof(XiveEAS), 16 }, >>> >>> I don't love explicitly storing the type/index in each record, as well >>> as it being implicit in the table slot. >> >> The 'vst_infos' table decribes the different table types and the 'type' >> field is used to index the runtime table of VSDs. See >> pnv_xive_vst_addr() > > Yes, I know what it's for but it's still redundant information. You > could avoid it, for example, by passing around an index instead of a > pointer to a vst_infos[] slot - then you can look up both vst_infos > and the other table using that index.
:) This is exactly what the code was doing before and I thought passing the pointer was cleaner ! No problem. This is minor. I will revert. > [snip] >>>> + case CQ_TM1_BAR: /* TM BAR. 4 pages. Map only once */ >>>> + case CQ_TM2_BAR: /* second TM BAR. for hotplug. Not modeled */ >>>> + xive->tm_shift = val & CQ_TM_BAR_64K ? 16 : 12; >>>> + if (!(val & CQ_TM_BAR_VALID)) { >>>> + xive->tm_base = 0; >>>> + if (xive->regs[reg] & CQ_TM_BAR_VALID && xive->chip_id == 0) { >>>> + memory_region_del_subregion(sysmem, &xive->tm_mmio); >>>> + } >>>> + } else { >>>> + xive->tm_base = val & ~(CQ_TM_BAR_VALID | CQ_TM_BAR_64K); >>>> + if (!(xive->regs[reg] & CQ_TM_BAR_VALID) && xive->chip_id == >>>> 0) { >>>> + memory_region_add_subregion(sysmem, xive->tm_base, >>>> + &xive->tm_mmio); >>>> + } >>>> + } >>>> + break; >>>> + >>>> + case CQ_PC_BARM: >>>> + xive->regs[reg] = val; >>> >>> As discussed elsewhere, this seems to be a big mix of writing things >>> directly into regs[reg] and doing other things instead, and you really >>> want to go one way or the other. I'd suggest dropping xive->regs[] >>> and instead putting the state you need persistent into its own >>> variables. >> >> I made a big effort to introduce helper routines to avoid storing values >> that can be calculated under the PnvXive model, as you asked for it. >> The assignment above is only necessary for the pnv_xive_pc_size() below >> and I don't know how handle this current case without duplicating the >> switch statement, which I think is ugly. > > I'm not sure quite what you mean about duplicating the case. >> The point here is that since you're only storing in a couple of the > switch cases, you can just have explicit data backing just those > values and write to those in the switch cases instead of having a > great big regs[] array of which only a few bits are used. The model uses these registers : xive->regs[PC_GLOBAL_CONFIG >> 3] xive->regs[CQ_VC_BARM >> 3] xive->regs[CQ_PC_BARM >> 3] xive->regs[CQ_TAR >> 3] xive->regs[VC_GLOBAL_CONFIG >> 3] xive->regs[VC_VSD_TABLE_ADDR >> 3]); xive->regs[CQ_IC_BAR] xive->regs[CQ_TM1_BAR] xive->regs[CQ_VC_BAR] xive->regs[PC_THREAD_EN_REG0 >> 3] xive->regs[PC_THREAD_EN_REG1 >> 3] xive->regs[(VC_EQC_CWATCH_DAT0 >> 3) + i]; xive->regs[(PC_VPC_CWATCH_DAT0 >> 3) + i] xive->regs[VC_EQC_CONFIG]; xive->regs[VC_EQC_SCRUB_TRIG] xive->regs[PC_AT_KILL]; xive->regs[VC_AT_MACRO_KILL] xive->regs[(PC_TCTXT_INDIR0 >> 3) + i] The regs array is useful for the different cache watch but apart from that, yes, we could use independent fields. I will see how much energy I have to put into this change. All the registers change in P10, may be this will be a better time to share a common PnvXive model and isolate the HW interface of each processor. >> So, I will keep the xive->regs[] and make the couple of fixes still needed. > > [snip] >>>> +/* >>>> + * Virtualization Controller MMIO region containing the IPI and END ESB >>>> pages >>>> + */ >>>> +static uint64_t pnv_xive_vc_read(void *opaque, hwaddr offset, >>>> + unsigned size) >>>> +{ >>>> + PnvXive *xive = PNV_XIVE(opaque); >>>> + uint64_t edt_index = offset >> pnv_xive_edt_shift(xive); >>>> + uint64_t edt_type = 0; >>>> + uint64_t edt_offset; >>>> + MemTxResult result; >>>> + AddressSpace *edt_as = NULL; >>>> + uint64_t ret = -1; >>>> + >>>> + if (edt_index < XIVE_TABLE_EDT_MAX) { >>>> + edt_type = GETFIELD(CQ_TDR_EDT_TYPE, xive->edt[edt_index]); >>>> + } >>>> + >>>> + switch (edt_type) { >>>> + case CQ_TDR_EDT_IPI: >>>> + edt_as = &xive->ipi_as; >>>> + break; >>>> + case CQ_TDR_EDT_EQ: >>>> + edt_as = &xive->end_as; >>> >>> I'm not entirely understanding the function of these AddressSpace objectz. >> >> The IPI and END ESB pages are exposed in the same VC region. But this VC >> region is not splitted between the two types with a single offset. It >> is segmented with the EDT tables which defines the type of each segment. >> >> The purpose of these address spaces is to translate the load/stores in >> the VC region in the equivalent IPI or END region exposing contigously >> ESB pages of the same type: 'end_edt_mmio' and 'ipi_edt_mmio'. >> >> The memory regions of the XiveENDSource and the XiveSource for the IPIs >> are mapped in 'end_edt_mmio' and 'ipi_edt_mmio'. > > Hmm. Ok, I'm not immediately seeing why you can't map the various IPI > or END blocks directly into address_space memory rather than having > this extra internal layer of indirection. I think I see what you mean. You would get rid of the intermediate MR &xive->ipi_edt_mmio and map directly &xsrc->esb_mmio into &xive->ipi_mmio same for the ENDs, map directly &end_xsrc->esb_mmio in &xive->end_mmio That might be overkill indeed. I will check. >> (This is changing for P10, should be simpler) > > [snip] >>>> +/* >>>> + * Maximum number of IRQs and ENDs supported by HW >>>> + */ >>>> +#define PNV_XIVE_NR_IRQS (PNV9_XIVE_VC_SIZE / (1ull << >>>> XIVE_ESB_64K_2PAGE)) >>>> +#define PNV_XIVE_NR_ENDS (PNV9_XIVE_VC_SIZE / (1ull << >>>> XIVE_ESB_64K_2PAGE)) >>>> + >>>> +static void pnv_xive_realize(DeviceState *dev, Error **errp) >>>> +{ >>>> + PnvXive *xive = PNV_XIVE(dev); >>>> + XiveSource *xsrc = &xive->source; >>>> + XiveENDSource *end_xsrc = &xive->end_source; >>>> + Error *local_err = NULL; >>>> + Object *obj; >>>> + >>>> + obj = object_property_get_link(OBJECT(dev), "chip", &local_err); >>>> + if (!obj) { >>>> + error_propagate(errp, local_err); >>>> + error_prepend(errp, "required link 'chip' not found: "); >>>> + return; >>>> + } >>>> + >>>> + /* The PnvChip id identifies the XIVE interrupt controller. */ >>>> + xive->chip = PNV_CHIP(obj); >>>> + >>>> + /* >>>> + * Xive Interrupt source and END source object with the maximum >>>> + * allowed HW configuration. The ESB MMIO regions will be resized >>>> + * dynamically when the controller is configured by the FW to >>>> + * limit accesses to resources not provisioned. >>>> + */ >>>> + object_property_set_int(OBJECT(xsrc), PNV_XIVE_NR_IRQS, "nr-irqs", >>>> + &error_fatal); >>> >>> You have a constant here, but your router object also includes a >>> nr_irqs field. What's up with that? >> >> The XiveSource object of PnvXive is realized for the maximum size allowed >> by HW because we don't know how much IRQs the FW will provision for. >> >> The 'nr_irqs' is the number of IRQs configured by the FW (We changed the >> name to 'nr_ipis') See routine pnv_xive_vst_set_exclusive() > > Ah, ok. > I will try to get that done for 4.0. PSI and LPC P9 models should be less complex to review. Thanks, C.