On Tue, Jan 23, 2024 at 05:19:18PM +0100, Mattias Rönnblom wrote:
> On 2024-01-19 18:43, Bruce Richardson wrote:
> > The description of ordered and atomic scheduling given in the eventdev
> > doxygen documentation was not always clear. Try and simplify this so
> > that it is clearer for the end-user of the application.
> >
> > Signed-off-by: Bruce Richardson <bruce.richard...@intel.com>
> > ---
> >
> > NOTE TO REVIEWERS:
> > I've updated this based on my understanding of what these scheduling
> > types are meant to do. It matches my understanding of the support
> > offered by our Intel DLB2 driver, as well as the SW eventdev, and I
> > believe the DSW eventdev too. If it does not match the behaviour of
> > other eventdevs, let's have a discussion to see if we can reach a good
> > definition of the behaviour that is common.
> > ---
> >  lib/eventdev/rte_eventdev.h | 47 ++++++++++++++++++++-----------------
> >  1 file changed, 25 insertions(+), 22 deletions(-)
> >
> > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > index 2c6576e921..cb13602ffb 100644
> > --- a/lib/eventdev/rte_eventdev.h
> > +++ b/lib/eventdev/rte_eventdev.h
> > @@ -1313,26 +1313,24 @@ struct rte_event_vector {
> >  #define RTE_SCHED_TYPE_ORDERED 0
> >  /**< Ordered scheduling
> >   *
> > - * Events from an ordered flow of an event queue can be scheduled to multiple
> > + * Events from an ordered event queue can be scheduled to multiple
>
> What is the rationale for this change?
>
> An implementation that imposes a total order on all events on a particular
> ordered queue will still adhere to the current, more relaxed, per-flow
> ordering semantics.
>
> An application wanting a total order would just set the flow id to 0 on all
> events destined for that queue, and it would work on all event devices.
>
> Why don't you just put a note in the DLB driver saying "btw it's total
> order", so any application where per-flow ordering is crucial for
> performance (i.e., where the potentially needless head-of-line blocking is
> an issue) can use multiple queues when running with the DLB.
>
> In the API as written, the app is free to express more relaxed ordering
> requirements (i.e., to have multiple flows), and it's up to the event device
> to figure out whether it's in a position to translate this to lower
> latency.
>

Yes, you are right. I'll roll back or rework this change in V3: keep it
documented that flow ordering is guaranteed, but note that some
implementations may use total ordering to achieve that.
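For illustration, an application wanting total order could already get it
from the API as written by putting every event destined for the queue into
a single flow, along the lines below. This is a rough, untested sketch: the
device/port/queue ids, function name and burst size are all placeholders.

#include <stdint.h>

#include <rte_common.h>
#include <rte_eventdev.h>
#include <rte_mbuf.h>

#define BURST 32

static void
enqueue_total_order(uint8_t dev_id, uint8_t port_id, uint8_t queue_id,
		struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	struct rte_event ev[BURST];
	uint16_t i, n = RTE_MIN(nb_pkts, (uint16_t)BURST);

	for (i = 0; i < n; i++) {
		ev[i] = (struct rte_event){
			.queue_id = queue_id,
			.flow_id = 0, /* everything in one flow => total order */
			.sched_type = RTE_SCHED_TYPE_ORDERED,
			.op = RTE_EVENT_OP_NEW,
			.event_type = RTE_EVENT_TYPE_CPU,
			.mbuf = pkts[i],
		};
	}
	/* the device may accept fewer than n events; retries omitted */
	rte_event_enqueue_burst(dev_id, port_id, ev, n);
}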
> >  * ports for concurrent processing while maintaining the original event order.
>
> Maybe it's worth mentioning what the original event order is: "(i.e., the
> order in which the events were enqueued to the queue)". Especially since one
> would like to specify what ordering guarantees one has for events enqueued
> to the same queue on different ports and by different lcores.
>
> I don't know where that information should go though, since it's relevant
> for both atomic and ordered-type queues.
>

It's probably more relevant for ordered, but I'll try and see where it's
best to go.

> >  * This scheme enables the user to achieve high single flow throughput by
> > - * avoiding SW synchronization for ordering between ports which bound to cores.
> > - *
> > - * The source flow ordering from an event queue is maintained when events are
> > - * enqueued to their destination queue within the same ordered flow context.
> > - * An event port holds the context until application call
> > - * rte_event_dequeue_burst() from the same port, which implicitly releases
> > - * the context.
> > - * User may allow the scheduler to release the context earlier than that
> > - * by invoking rte_event_enqueue_burst() with RTE_EVENT_OP_RELEASE operation.
> > - *
> > - * Events from the source queue appear in their original order when dequeued
> > - * from a destination queue.
> > - * Event ordering is based on the received event(s), but also other
> > - * (newly allocated or stored) events are ordered when enqueued within the same
> > - * ordered context. Events not enqueued (e.g. released or stored) within the
> > - * context are considered missing from reordering and are skipped at this time
> > - * (but can be ordered again within another context).
> > + * avoiding SW synchronization for ordering between ports which are polled by
> > + * different cores.
> > + *
> > + * As events are scheduled to ports/cores, the original event order from the
> > + * source event queue is recorded internally in the scheduler. As events are
> > + * returned (via FORWARD type enqueue) to the scheduler, the original event
> > + * order is restored before the events are enqueued into their new destination
> > + * queue.
>
> Delete the first sentence on implementation.
>
> "As events are re-enqueued to the next queue (with the op field set to
> RTE_EVENT_OP_FORWARD), the event device restores the original event order
> before the events arrive on the destination queue."
>
> > + *
> > + * Any events not forwarded, ie. dropped explicitly via RELEASE or implicitly
> > + * released by the next dequeue from a port, are skipped by the reordering
> > + * stage and do not affect the reordering of returned events.
> > + *
> > + * The ordering behaviour of NEW events with respect to FORWARD events is
> > + * undefined and implementation dependent.
>
> For some reason I find this a little vague. "NEW and FORWARD events enqueued
> to a queue are not ordered in relation to each other (even if the flow id is
> the same)."
>
> I think I agree that NEW shouldn't be ordered vis-à-vis FORWARD, but maybe
> one should say that an event device should avoid excessive reordering of NEW
> and FORWARD events.
>
> I think it would also be helpful to address port-to-port ordering guarantees
> (or a lack thereof).
>
> "Events enqueued on one port are not ordered in relation to events enqueued
> on some other port."
>
> Or are they? Not in DSW, at least, and I'm not sure I see a use case for
> such a guarantee, but it's a little counter-intuitive to have them
> potentially re-shuffled.
>
> (This is also relevant for atomic queues.)
>

Ack.
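To make the FORWARD/RELEASE semantics above concrete, a worker polling an
ordered queue would look something like the below. Again an untested
sketch: process_pkt() stands in for whatever per-packet work the
application does, and burst sizing and enqueue retries are simplified.

#include <stdint.h>

#include <rte_eventdev.h>
#include <rte_mbuf.h>

#define BURST 32

/* application-defined per-packet work; placeholder for this sketch */
extern int process_pkt(struct rte_mbuf *m);

static void
ordered_worker(uint8_t dev_id, uint8_t port_id, uint8_t next_queue_id)
{
	struct rte_event ev[BURST];
	uint16_t n, i;

	for (;;) {
		n = rte_event_dequeue_burst(dev_id, port_id, ev, BURST, 0);
		for (i = 0; i < n; i++) {
			if (process_pkt(ev[i].mbuf) < 0) {
				/* drop: RELEASE takes this event out of the
				 * reordering sequence; the remaining events
				 * still come out in their original order */
				rte_pktmbuf_free(ev[i].mbuf);
				ev[i].op = RTE_EVENT_OP_RELEASE;
				continue;
			}
			/* the event device restores the original order
			 * before these arrive on next_queue_id, however
			 * the workers are scheduled */
			ev[i].op = RTE_EVENT_OP_FORWARD;
			ev[i].queue_id = next_queue_id;
		}
		/* may accept fewer than n events; retry loop omitted */
		rte_event_enqueue_burst(dev_id, port_id, ev, n);
	}
}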
> > "Events enqueued on one port are not ordered in relation to events enqueued > on some other port." > > Or are they? Not in DSW, at least, and I'm not sure I see a use case for > such a guarantee, but it's a little counter-intuitive to have them > potentially re-shuffled. > > (This is also relevant for atomic queues.) > Ack. > > * > > * @see rte_event_queue_setup(), rte_event_dequeue_burst(), > > RTE_EVENT_OP_RELEASE > > */ > > @@ -1340,18 +1338,23 @@ struct rte_event_vector { > > #define RTE_SCHED_TYPE_ATOMIC 1 > > /**< Atomic scheduling > > * > > - * Events from an atomic flow of an event queue can be scheduled only to a > > + * Events from an atomic flow, identified by @ref rte_event.flow_id, > > A flow is identified by the combination of queue_id and flow_id, so if you > reference one you should also reference the other. > Yes, this is probably one to be reflected globally. Also on your previous comment about priority, I believe that a flow for ordering guarantees should be a combination of queue_id, flow_id and priority. Two packets with different priorities should expect to be reordered, since that tends to be what priority implies. > > + * of an event queue can be scheduled only to a > > * single port at a time. The port is guaranteed to have exclusive > > (atomic) > > * access to the associated flow context, which enables the user to avoid > > SW > > * synchronization. Atomic flows also help to maintain event ordering > > "help" here needs to go, I think. It sounds like a best-effort affair. The > atomic queue ordering guarantees (or the lack thereof) should be spelled > out. > > "Event order in an atomic flow is maintained." Ack. > > > - * since only one port at a time can process events from a flow of an > > + * since only one port at a time can process events from each flow of an > > * event queue. > > Yes, and *but also since* the event device is not reshuffling events > enqueued to an atomic queue. And that's more complicated than just something > that falls out of atomicity, especially if you assume that FORWARD type > enqueues are not ordered with other FORWARD type enqueues on a different > port. > Ack. > > * > > - * The atomic queue synchronization context is dedicated to the port until > > + * The atomic queue synchronization context for a flow is dedicated to the > > port until > > What is an "atomic queue synchronization context" (except for something that > makes for long sentences). > Yes, it's rather wordy. I like the idea of using lock terminology you suggest. The use of the word "contexts" in relation to atomic/ordered I find confusing myself too. > How about: > "The atomic flow is locked to the port until /../" > > You could also used the word "bound" instead of "locked". > > > * application call rte_event_dequeue_burst() from the same port, > > * which implicitly releases the context. User may allow the scheduler to > > * release the context earlier than that by invoking > > rte_event_enqueue_burst() > > - * with RTE_EVENT_OP_RELEASE operation. > > + * with RTE_EVENT_OP_RELEASE operation for each event from that flow. The > > context > > + * is only released once the last event from the flow, outstanding on the > > port, > > + * is released. So long as there is one event from an atomic flow > > scheduled to > > + * a port/core (including any events in the port's dequeue queue, not yet > > read > > + * by the application), that port will hold the synchronization context. 
>
> In case you like the "atomic flow locked/bound to port" wording, this part
> would also need updating.
>
> Maybe here is a good place to add a note on memory ordering and event
> ordering.
>
> "Any memory stores done as a part of event processing will be globally
> visible before the next event in the same atomic flow is dequeued on a
> different lcore."
>
> I.e., enqueue includes a write barrier before the event can be seen.
>
> One should probably mention an rmb in dequeue as well.
>

Do we think that is necessary? I can add it, but I would have thought that -
as with rings - it could be assumed.
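For reference, the kind of pattern where that guarantee matters is sketched
below: per-flow state updated with plain stores, relying only on atomic
scheduling plus store visibility when a flow migrates between lcores. The
flow table, names and sizes here are all hypothetical, and the code is
untested.

#include <stdint.h>

#include <rte_eventdev.h>
#include <rte_mbuf.h>

#define MAX_FLOWS 1024
#define BURST 32

/* per-flow state with plain (non-atomic) fields: safe because only one
 * port holds a given atomic flow at a time, and stores are assumed
 * visible before the flow migrates to another lcore */
static struct flow_state {
	uint64_t pkts;
	uint64_t bytes;
} flow_tbl[MAX_FLOWS];

static void
atomic_worker(uint8_t dev_id, uint8_t port_id, uint8_t next_queue_id)
{
	struct rte_event ev[BURST];
	uint16_t n, i;

	for (;;) {
		n = rte_event_dequeue_burst(dev_id, port_id, ev, BURST, 0);
		for (i = 0; i < n; i++) {
			struct flow_state *fs =
				&flow_tbl[ev[i].flow_id % MAX_FLOWS];

			fs->pkts++;
			fs->bytes += rte_pktmbuf_pkt_len(ev[i].mbuf);

			ev[i].op = RTE_EVENT_OP_FORWARD;
			ev[i].queue_id = next_queue_id;
		}
		/* the port's hold on each flow ends once its last
		 * outstanding event is forwarded or released */
		rte_event_enqueue_burst(dev_id, port_id, ev, n);
	}
}

/Bruce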