On Wed, Nov 28, 2018 at 11:59:58AM +0100, Cédric Le Goater wrote: > On 11/28/18 12:49 AM, David Gibson wrote: > > On Fri, Nov 16, 2018 at 11:57:01AM +0100, Cédric Le Goater wrote: > >> The last sub-engine of the XIVE architecture is the Interrupt > >> Virtualization Presentation Engine (IVPE). On HW, they share elements, > >> the Power Bus interface (CQ), the routing table descriptors, and they > >> can be combined in the same HW logic. We do the same in QEMU and > >> combine both engines in the XiveRouter for simplicity. > > > > Ok, I'm not entirely convinced combining the IVPE and IVRE into a > > single object is a good idea, but we can probably discuss that once > > I've read further. > > We could introduce a simplified presenter for sPAPR but I am not even > sure of that as it will get more complex if we support the EBB one day.
I wasn't really thinking about PAPR for this comment. > >> When the IVRE has completed its job of matching an event source with a > >> Notification Virtual Target (NVT) to notify, it forwards the event > >> notification to the IVPE sub-engine. The IVPE scans the thread > >> interrupt contexts of the Notification Virtual Targets (NVT) > >> dispatched on the HW processor threads and if a match is found, it > >> signals the thread. If not, the IVPE escalates the notification to > >> some other targets and records the notification in a backlog queue. > >> > >> The IVPE maintains the thread interrupt context state for each of its > >> NVTs not dispatched on HW processor threads in the Notification > >> Virtual Target table (NVTT). > >> > >> The model currently only supports single NVT notifications. > >> > >> Signed-off-by: Cédric Le Goater <c...@kaod.org> > >> --- > >> include/hw/ppc/xive.h | 13 +++ > >> include/hw/ppc/xive_regs.h | 22 ++++ > >> hw/intc/xive.c | 223 +++++++++++++++++++++++++++++++++++++ > >> 3 files changed, 258 insertions(+) > >> > >> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h > >> index 5987f26ddb98..e715a6c6923d 100644 > >> --- a/include/hw/ppc/xive.h > >> +++ b/include/hw/ppc/xive.h > >> @@ -197,6 +197,10 @@ typedef struct XiveRouterClass { > >> XiveEND *end); > >> int (*set_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx, > >> XiveEND *end); > >> + int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx, > >> + XiveNVT *nvt); > >> + int (*set_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx, > >> + XiveNVT *nvt); > > > > As with the ENDs, I don't think get/set is a good interface for a > > bigger-than-word-size object. > > We need to agree on this interface before I respin. So you would like > to add a extra argument specifying the word being accessed ? Yes. Ok, 3 options I can see at this point: 1) read/write accessors which take a word number 2) A "get" accessor which copies the whole structure, but "write" accessor which takes a word number. The asymmetry is a bit ugly, but it's the non-atomic writeback of the whole structure which I'm most uncomfortable with. 3) A map/unmap interface which gives you / releases a pointer to the "live" structure. For powernv that would become address_space_map()/unmap(). For PAPR it would just be reutn pointer / no-op. > > > > >> } XiveRouterClass; > >> > >> void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon); > >> @@ -207,6 +211,10 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t > >> end_blk, uint32_t end_idx, > >> XiveEND *end); > >> int xive_router_set_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t > >> end_idx, > >> XiveEND *end); > >> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t > >> nvt_idx, > >> + XiveNVT *nvt); > >> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t > >> nvt_idx, > >> + XiveNVT *nvt); > >> > >> /* > >> * XIVE END ESBs > >> @@ -274,4 +282,9 @@ extern const MemoryRegionOps xive_tm_ops; > >> > >> void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon); > >> > >> +static inline uint32_t xive_tctx_cam_line(uint8_t nvt_blk, uint32_t > >> nvt_idx) > >> +{ > >> + return (nvt_blk << 19) | nvt_idx; > > > > I'm guessing this formula is the standard way of combining the NVT > > block and index into a single word? > > That number is the VP/NVT identifier which is written in the CAM value. > The index is on 19 bits because of the NVT definition in the END > structure. It is being increased to 24 bits on Power10 > > > If so, I think we should > > standardize on passing a single word "nvt_id" around and only > > splitting it when we need to use the block separately. > > This is really the only place where we concatenate the two NVT values, > block and index. Hm, ok. I know we don't model them (yet, maybe ever) but could combined values appear in the PowerBUS messages that handle remote notifications? > > Same goes for > > the end_id, assuming there's a standard way of putting that into a > > single word. That will address the point I raised earlier about lisn > > being passed around as a single word, but these later stage ids being > > split. > > Hmm, I am not sure this is a good option. It is not how the PowerNV > model would use it, skiboot is very much aware of these blocks and > indexes and for remote accesses chips are identified using the block. > I will take a look at it but I am not found of it. I can add helpers > in some places though. Hm, ok. Do the block and index appear as an (effectively) single field in the EAS? > I agree we have some kind of issue linking the HW model with the sPAPR > machine. The guest interface is only about IRQ numbers, priorities and > cpu numbers. We really don't care about XIVE blocks and indexes in that > case. we can clarify the code by bypassing the XiveRouter interfaces > to the table and directly use the sPAPR interrupt controller. That > should help a bit for the hcalls but we would still have to fill in > the EAT and the END with some index values if we want to use the router > algorithm. I don't think this is too much of a problem. These are essentially machine internal details so we can choose an allocation to suit us. The obvious one is to put everything in a single block, at least as long as that won't limit our numbers too much. > > We'll probably want some inlines or macros to build an > > nvt/end/lisn/whatever id from block and index as well. > > > >> +} > >> + > >> #endif /* PPC_XIVE_H */ > >> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h > >> index 2e3d6cb507da..05cb992d2815 100644 > >> --- a/include/hw/ppc/xive_regs.h > >> +++ b/include/hw/ppc/xive_regs.h > >> @@ -158,4 +158,26 @@ typedef struct XiveEND { > >> #define END_W7_F1_LOG_SERVER_ID PPC_BITMASK32(1, 31) > >> } XiveEND; > >> > >> +/* Notification Virtual Target (NVT) */ > >> +typedef struct XiveNVT { > >> + uint32_t w0; > >> +#define NVT_W0_VALID PPC_BIT32(0) > >> + uint32_t w1; > >> + uint32_t w2; > >> + uint32_t w3; > >> + uint32_t w4; > >> + uint32_t w5; > >> + uint32_t w6; > >> + uint32_t w7; > >> + uint32_t w8; > >> +#define NVT_W8_GRP_VALID PPC_BIT32(0) > >> + uint32_t w9; > >> + uint32_t wa; > >> + uint32_t wb; > >> + uint32_t wc; > >> + uint32_t wd; > >> + uint32_t we; > >> + uint32_t wf; > >> +} XiveNVT; > >> + > >> #endif /* PPC_XIVE_REGS_H */ > >> diff --git a/hw/intc/xive.c b/hw/intc/xive.c > >> index 4c6cb5d52975..5ba3b06e6e25 100644 > >> --- a/hw/intc/xive.c > >> +++ b/hw/intc/xive.c > >> @@ -373,6 +373,32 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor > >> *mon) > >> } > >> } > >> > >> +/* The HW CAM (23bits) is hardwired to : > >> + * > >> + * 0x000||0b1||4Bit chip number||7Bit Thread number. > >> + * > >> + * and when the block grouping extension is enabled : > >> + * > >> + * 4Bit chip number||0x001||7Bit Thread number. > >> + */ > >> +static uint32_t tctx_hw_cam_line(bool block_group, uint8_t chip_id, > >> uint8_t tid) > >> +{ > >> + if (block_group) { > >> + return 1 << 11 | (chip_id & 0xf) << 7 | (tid & 0x7f); > >> + } else { > >> + return (chip_id & 0xf) << 11 | 1 << 7 | (tid & 0x7f); > >> + } > >> +} > >> + > >> +static uint32_t xive_tctx_hw_cam_line(XiveTCTX *tctx, bool block_group) > >> +{ > >> + PowerPCCPU *cpu = POWERPC_CPU(tctx->cs); > >> + CPUPPCState *env = &cpu->env; > >> + uint32_t pir = env->spr_cb[SPR_PIR].default_value; > > > > I don't much like reaching into the cpu state itself. I think a > > better idea would be to have the TCTX have its HW CAM id set during > > initialization (via a property) and then use that. This will mean > > less mucking about if future cpu revisions don't split the PIR into > > chip and tid ids in the same way. > > yes good idea. I will see how to handle the block_group boolean. may be we > can leave it out of the model for now as it is not used. Yes, it would be nice to leave the block_group stuff as a later extensions when/if we need it. If we put it in as a stub and nothing is using/testing it, it's likely it will be broken if we ever do actually try to use it. > > > > >> + return tctx_hw_cam_line(block_group, (pir >> 8) & 0xf, pir & 0x7f); > >> +} > >> + > >> static void xive_tctx_reset(void *dev) > >> { > >> XiveTCTX *tctx = XIVE_TCTX(dev); > >> @@ -1013,6 +1039,195 @@ int xive_router_set_end(XiveRouter *xrtr, uint8_t > >> end_blk, uint32_t end_idx, > >> return xrc->set_end(xrtr, end_blk, end_idx, end); > >> } > >> > >> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t > >> nvt_idx, > >> + XiveNVT *nvt) > >> +{ > >> + XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr); > >> + > >> + return xrc->get_nvt(xrtr, nvt_blk, nvt_idx, nvt); > >> +} > >> + > >> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t > >> nvt_idx, > >> + XiveNVT *nvt) > >> +{ > >> + XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr); > >> + > >> + return xrc->set_nvt(xrtr, nvt_blk, nvt_idx, nvt); > >> +} > >> + > >> +static bool xive_tctx_ring_match(XiveTCTX *tctx, uint8_t ring, > >> + uint8_t nvt_blk, uint32_t nvt_idx, > >> + bool cam_ignore, uint32_t logic_serv) > >> +{ > >> + uint8_t *regs = &tctx->regs[ring]; > >> + uint32_t w2 = be32_to_cpu(*((uint32_t *) ®s[TM_WORD2])); > >> + uint32_t cam = xive_tctx_cam_line(nvt_blk, nvt_idx); > >> + bool block_group = false; /* TODO (PowerNV) */ > >> + > >> + /* TODO (PowerNV): ignore low order bits of nvt id */ > >> + > >> + switch (ring) { > >> + case TM_QW3_HV_PHYS: > >> + return (w2 & TM_QW3W2_VT) && xive_tctx_hw_cam_line(tctx, > >> block_group) == > >> + tctx_hw_cam_line(block_group, nvt_blk, nvt_idx); > > > > The difference between "xive_tctx_hw_cam_line" and "tctx_hw_cam_line" > > here is far from obvious. > > yes. I lacked inspiration ... I'd suggest that the one which takes the tctx as a parameter be tctx_hw_cam_line() and the other be nvt_hw_cam_line() or similar. The crucial difference here is that one is what the thread is looking for, the other is what the NVT is advertising. > > Remember that namespacing prefixes aren't > > necessary for static functions, which can let you give more > > descriptive names without getting excessively long. > > OK. > > >> + case TM_QW2_HV_POOL: > >> + return (w2 & TM_QW2W2_VP) && (cam == GETFIELD(TM_QW2W2_POOL_CAM, > >> w2)); > >> + > >> + case TM_QW1_OS: > >> + return (w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, > >> w2)); > >> + > >> + case TM_QW0_USER: > >> + return ((w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, > >> w2)) && > >> + (w2 & TM_QW0W2_VU) && > >> + (logic_serv == GETFIELD(TM_QW0W2_LOGIC_SERV, w2))); > >> + > >> + default: > >> + g_assert_not_reached(); > >> + } > >> +} > >> + > >> +static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format, > >> + uint8_t nvt_blk, uint32_t nvt_idx, > >> + bool cam_ignore, uint32_t logic_serv) > >> +{ > >> + if (format == 0) { > >> + /* F=0 & i=1: Logical server notification */ > >> + if (cam_ignore == true) { > >> + qemu_log_mask(LOG_GUEST_ERROR, "XIVE: no support for LS " > >> + "NVT %x/%x\n", nvt_blk, nvt_idx); > >> + return -1; > >> + } > >> + > >> + /* F=0 & i=0: Specific NVT notification */ > >> + if (xive_tctx_ring_match(tctx, TM_QW3_HV_PHYS, > >> + nvt_blk, nvt_idx, false, 0)) { > >> + return TM_QW3_HV_PHYS; > >> + } > >> + if (xive_tctx_ring_match(tctx, TM_QW2_HV_POOL, > >> + nvt_blk, nvt_idx, false, 0)) { > >> + return TM_QW2_HV_POOL; > >> + } > >> + if (xive_tctx_ring_match(tctx, TM_QW1_OS, > >> + nvt_blk, nvt_idx, false, 0)) { > >> + return TM_QW1_OS; > >> + } > > > > Hm. It's a bit pointless to iterate through each ring calling a > > common function, when that "common" function consists entirely of a > > switch which makes it not really common at all. > > > > So I think you want separate helper functions for each ring's match, > > or even just fold the previous function into this one. > > yes. It can be improved. I did try different layouts. I might just fold > both routine in one as you propose. > > >> + } else { > >> + /* F=1 : User level Event-Based Branch (EBB) notification */ > >> + if (xive_tctx_ring_match(tctx, TM_QW0_USER, > >> + nvt_blk, nvt_idx, false, logic_serv)) { > >> + return TM_QW0_USER; > >> + } > >> + } > >> + return -1; > >> +} > >> + > >> +typedef struct XiveTCTXMatch { > >> + XiveTCTX *tctx; > >> + uint8_t ring; > >> +} XiveTCTXMatch; > >> + > >> +static bool xive_presenter_match(XiveRouter *xrtr, uint8_t format, > >> + uint8_t nvt_blk, uint32_t nvt_idx, > >> + bool cam_ignore, uint8_t priority, > >> + uint32_t logic_serv, XiveTCTXMatch > >> *match) > >> +{ > >> + CPUState *cs; > >> + > >> + /* TODO (PowerNV): handle chip_id overwrite of block field for > >> + * hardwired CAM compares */ > >> + > >> + CPU_FOREACH(cs) { > >> + PowerPCCPU *cpu = POWERPC_CPU(cs); > >> + XiveTCTX *tctx = XIVE_TCTX(cpu->intc); > >> + int ring; > >> + > >> + /* > >> + * HW checks that the CPU is enabled in the Physical Thread > >> + * Enable Register (PTER). > >> + */ > >> + > >> + /* > >> + * Check the thread context CAM lines and record matches. We > >> + * will handle CPU exception delivery later > >> + */ > >> + ring = xive_presenter_tctx_match(tctx, format, nvt_blk, nvt_idx, > >> + cam_ignore, logic_serv); > >> + /* > >> + * Save the context and follow on to catch duplicates, that we > >> + * don't support yet. > >> + */ > >> + if (ring != -1) { > >> + if (match->tctx) { > >> + qemu_log_mask(LOG_GUEST_ERROR, "XIVE: already found a > >> thread " > >> + "context NVT %x/%x\n", nvt_blk, nvt_idx); > >> + return false; > >> + } > >> + > >> + match->ring = ring; > >> + match->tctx = tctx; > >> + } > >> + } > >> + > >> + if (!match->tctx) { > >> + qemu_log_mask(LOG_GUEST_ERROR, "XIVE: NVT %x/%x is not > >> dispatched\n", > >> + nvt_blk, nvt_idx); > >> + return false; > > > > Hmm.. this isn't actually an error isn't it? At least not for powernv > > It is on sPAPR, it would mean the END was configured with an unknow CPU. Right. > It is not error on PowerNV, when we support escalations. > > > - that just means the NVT isn't currently dispatched, so we'll need to > > trigger the escalation interrupt. > > Yes. > > > Does this get changed later in the series? > > No. But this code is common to PAPR and powernv, yes, so it will need to? -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature