On Thu, Apr 18, 2019 at 04:34:53AM +0000, Honnappa Nagarahalli wrote:
> > 
> > On Wed, Apr 17, 2019 at 05:12:43AM +0000, Honnappa Nagarahalli wrote:
> > > Hello,
> > >   There was a conversation [1] in the context of RCU library. I thought
> > > it needs participation from broader audience. Summary for the context
> > > (please look at [1] for full details)
> > >
> > 
> > Thanks for kicking off this discussion
> > 
> > > 1) How do we provide ABI compatibility when the code base contains
> > inline functions? Unless we freeze development in these inline functions and
> > corresponding structures, it might not be possible to claim ABI 
> > compatibility.
> > Application has to be recompiled.
> > 
> > I agree that in some cases the application "might" need to be recompiled,
> > but on the other hand I also think that there are many cases where ABI
> > function versioning can still be used to mitigate things. For example, if we
> > think of a couple of various scenarios:
> > 
> > 1. If everything is inline and variables are allocated by app, e.g.
> > spinlock on stack, then there is no issue as everything is application
> > contained.
> If there is a bug fix which requires the structure to change, the application 
> would need to recompile. I guess you are talking about a scenario when 
> nothing changed in the inline function/variables.
> 

If the application wants the bugfix, then yes. However, if the app is
unaffected by the bug, then it should have the option of updating DPDK
without a recompile.

> > 
> > 2. If the situation is as in #1, but the structures in question are passed 
> > to
> > non-inline DPDK functions. In this case, any changes to the structures 
> > require
> > those functions taking the structures to be versioned for old and new
> > structures
> I think this can get complicated on the inline function side. The application 
> and the DPDK library will end up having different inline functions. The 
> changed inline function needs to be aware of 2 structure formats or the 
> inline function needs to be duplicated (one for each version of the 
> structure). I guess these are the workarounds we have to do.
> 
No, there is never any need for two versions of the inline functions, only
the newest version is needed. This is because in the case of a newly compiled
application only the newest version of the non-inline functions is ever
used. The other older versions are only used at runtime for compatilibity
with pre-compiled apps with the older inlines.

> > 
> > 3. If instead we have the case, like in rte_ring, where the data
> > structures are allocated using functions, but they are used via inlines
> > in the app. In this case the creation functions (and any other function
> > using the structures) need to be versioned so that the application gets
> > the expected structure back from the creation.
> The new structure members have to be added such that the previous layout
> is not affected. Either add at the end or use the gaps that are left
> because of cache line alignment.
> 
Not necessarily. There is nothing that requires the older and newer
versions of a function to use the same structure. We can rename the
original structure when versioning the old function, and then create a new
structure with the original name for later recompiled code using the newest
ABIs.

> > 
> > It might be useful to think of what other scenarios we have and what ones
> > are likely to be problematic, especially those that can't be solved by 
> > having
> > multiple function versions.
> > 
> > > 2) Every function that is not in the direct datapath (fastpath?)
> > > should not be inline. Exceptions or things like rx/tx burst, ring
> > > enqueue/dequeue, and packet alloc/free - Stephen
> > 
> > Agree with this. Anything not data path should not be inline. The next
> Yes, very clear on this.
> 
> > question then is for data path items how to determine whether they need to
> > be inline or not. In general my rule-of-thumb:
> > * anything dealing with bursts of packets should not be inline
> > * anything what works with single packet at a time should be inline
> > 
> > The one exception to this rule is cases where we have to consider "empty
> > polling" as a likely use-case. For example, rte_ring_dequeue_burst works
> > with bursts of packets, but there is a valid application use case where a
> > thread could be polling a set of rings where only a small number are
> > actually busy. Right now, polling an empty ring only costs a few cycles,
> > meaning that it's neglible to have a few polls of empty rings before you get
> > to a busy one. Having that function not-inline will cause that cost to jump
> > significantly, so could cause problems. (This leads to the interesting 
> > scenario
> > where ring enqueue is safe to un-inline, while dequeue is not).
> A good way to think about it would be - ratio of amount of work done in the 
> function to cost of function call.
> 

Yes, I would tend to agree in general.  The other thing is the frequency of
calls, and - as already stated - whether it takes a burst of not.  Because
even if it's a trivial function that takes only 10 cycles and we want to
uninline it, the cost may double; but if it takes a burst of packets and is
only used once/twice per burst the cost per packet should still only be a
fraction of a cycle.

> > 
> > > 3) Plus synchronization routines: spin/rwlock/barrier, etc. I think
> > > rcu should be one of such exceptions - it is just another
> > > synchronization mechanism after all (just a bit more sophisticated). -
> > > Konstantin
> > >
> > In general I believe the synchronisation primitives should be inline.
> > However, it does come down to cost too - if a function takes 300 cycles, do
> > we really care if it takes 305 or 310 instead to make it not inline?
> > Hopefully most synchronisation primitives are faster than this so this
> > situation should not occur.
> > 
> > > 2) and 3) can be good guidelines to what functions/APIs can be inline. 
> > > But,
> > I believe this guideline exists today too. Are there any thoughts to change
> > this?
> > >
> > > Coming to 1), I think DPDK cannot provide ABI compatibility unless all the
> > inline functions are converted to normal functions and symbol versioning is
> > done for those (not bothering about performance).
> > >
> > I disagree. I think even in the case of #1, we should be able to manage some
> > changes without breaking ABI.
> I completely agree with you on trying to keep the ABI break surface small and 
> doing due diligence when one is required. However, I think claiming 100% ABI 
> compatibility all the time (or as frequently as other projects claim) might 
> be tough. IMO, we need to look beyond this to solve the end user problem. May 
> be packaging multiple LTS with distros when DPDK could not avoid breaking ABI 
> compatibility?
> 
Having multiple LTS's per distro would be nice, but it's putting a lot more
work on the distro folks, I think.

> > 
> > > In this context, does it make sense to say that we will maintain API
> > > compatibility rather than saying ABI compatibility? This will also
> > > send the right message to the end users.
> > >
> > I would value ABI compatibility much higher than API compatibility. If
> > someone is recompiling the application anyway, making a couple of small
> > changes (large rework is obviously a different issue) to the code should not
> > be a massive issue, I hope. On the other hand, ABI compatibility is needed 
> > to
> > allow seamless update from one version to another, and it's that ABI
> > compatiblity that allows distro's to pick up our latest and greatest 
> > versions.
> > 
> I think it is also important to setup the right expectations to DPDK users. 
> i.e. DPDK will police itself better to provide ABI compatibility but 
> occasionally it might not be possible.
> 
The trouble here is that a DPDK release, as a product is either backward
compatible or not. Either a user can take it as a drop-in replacement for
the previous version or not. Users do not want to have to go through a
checklist for each app to see if the app uses only "compatible" APIs or
not. Same for the distro folks, they are not going to take some libs from a
release but not others because the ABI is broken for some but not others.

Regards,
/Bruce

Reply via email to