> On 18/04/2019 11:28, Bruce Richardson wrote:
> > On Thu, Apr 18, 2019 at 04:34:53AM +0000, Honnappa Nagarahalli wrote:
> >>>
> >>> On Wed, Apr 17, 2019 at 05:12:43AM +0000, Honnappa Nagarahalli wrote:
> >>>> Hello,
> >>>> There was a conversation [1] in the context of the RCU library. I
> >>>> thought it needs participation from a broader audience. Summary of
> >>>> the context (please look at [1] for full details):
> >>>>
> >>>
> >>> Thanks for kicking off this discussion.
> >>>
> >>>> 1) How do we provide ABI compatibility when the code base contains
> >>>> inline functions? Unless we freeze development in these inline
> >>>> functions and corresponding structures, it might not be possible to
> >>>> claim ABI compatibility. The application has to be recompiled.
> >>>
> >>> I agree that in some cases the application "might" need to be
> >>> recompiled, but on the other hand I also think that there are many
> >>> cases where ABI function versioning can still be used to mitigate
> >>> things. For example, if we think of a couple of various scenarios:
> >>>
> >>> 1. If everything is inline and variables are allocated by the app,
> >>> e.g. a spinlock on the stack, then there is no issue as everything
> >>> is application-contained.
> >>
> >> If there is a bug fix which requires the structure to change, the
> >> application would need to recompile. I guess you are talking about a
> >> scenario when nothing changed in the inline functions/variables.
> >
> > If the application wants the bugfix, then yes. However, if the app is
> > unaffected by the bug, then it should have the option of updating DPDK
> > without a recompile.
>
> I would also imagine that should be an extremely rare case, that a
> bugfix would require a structure change ... perhaps for an alignment
> issue?
>
> >>> 2. If the situation is as in #1, but the structures in question are
> >>> passed to non-inline DPDK functions.
> >>> In this case, any changes to the structures require those functions
> >>> taking the structures to be versioned for old and new structures.
> >>
> >> I think this can get complicated on the inline function side. The
> >> application and the DPDK library will end up having different inline
> >> functions. The changed inline function needs to be aware of 2
> >> structure formats, or the inline function needs to be duplicated (one
> >> for each version of the structure). I guess these are the workarounds
> >> we have to do.
> >
> > No, there is never any need for two versions of the inline functions,
> > only the newest version is needed. This is because in the case of a
> > newly compiled application only the newest version of the non-inline
> > functions is ever used. The other older versions are only used at
> > runtime for compatibility with pre-compiled apps with the older
> > inlines.
> >
> >>> 3. If instead we have the case, like in rte_ring, where the data
> >>> structures are allocated using functions, but they are used via
> >>> inlines in the app. In this case the creation functions (and any
> >>> other function using the structures) need to be versioned so that
> >>> the application gets the expected structure back from the creation.
> >>
> >> The new structure members have to be added such that the previous
> >> layout is not affected. Either add at the end or use the gaps that
> >> are left because of cache line alignment.
> >
> > Not necessarily. There is nothing that requires the older and newer
> > versions of a function to use the same structure. We can rename the
> > original structure when versioning the old function, and then create
> > a new structure with the original name for later recompiled code
> > using the newest ABIs.
> >
> >>> It might be useful to think of what other scenarios we have and what
> >>> ones are likely to be problematic, especially those that can't be
> >>> solved by having multiple function versions.
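Bruce's struct-renaming approach can be sketched roughly as below. This is a hypothetical example (the `rte_widget` names and functions are invented purely for illustration); a real shared library would additionally bind the old function to the old ABI version node via symbol versioning (a `.symver` directive or a linker version script), which is omitted here to keep the sketch self-contained and compilable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example: the original public struct, renamed when the
 * ABI changed. Pre-compiled apps still see this layout at runtime. */
struct rte_widget_v1 {
	int count;
};

/* The new struct reuses the original public name, so newly compiled
 * code picks it up transparently. The new member is added at the end,
 * leaving the old members at their old offsets. */
struct rte_widget {
	int count;
	int flags;	/* member added in the new ABI */
};

/* Old creation function, kept only for binary compatibility. In a real
 * library this would be bound to the old version node (e.g. with a
 * .symver directive), so old binaries keep resolving to it. */
struct rte_widget_v1 *
rte_widget_create_v1(void)
{
	return calloc(1, sizeof(struct rte_widget_v1));
}

/* New default version; newly compiled applications resolve to this. */
struct rte_widget *
rte_widget_create(void)
{
	return calloc(1, sizeof(struct rte_widget));
}
```

The key point is that the two function versions are free to use different structures: old binaries keep getting the old layout, while recompiled code sees only the new one.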
> >>>
> >>>> 2) Every function that is not in the direct datapath (fastpath?)
> >>>> should not be inline. Exceptions are things like rx/tx burst, ring
> >>>> enqueue/dequeue, and packet alloc/free - Stephen
> >>>
> >>> Agree with this. Anything not data path should not be inline.
> >>
> >> Yes, very clear on this.
> >>
> >>> The next question then is, for data path items, how to determine
> >>> whether they need to be inline or not. In general my rule-of-thumb:
> >>> * anything dealing with bursts of packets should not be inline
> >>> * anything that works with a single packet at a time should be inline
> >>>
> >>> The one exception to this rule is cases where we have to consider
> >>> "empty polling" as a likely use-case. For example,
> >>> rte_ring_dequeue_burst works with bursts of packets, but there is a
> >>> valid application use case where a thread could be polling a set of
> >>> rings where only a small number are actually busy. Right now,
> >>> polling an empty ring only costs a few cycles, meaning that it's
> >>> negligible to have a few polls of empty rings before you get to a
> >>> busy one. Having that function not inline will cause that cost to
> >>> jump significantly, so could cause problems. (This leads to the
> >>> interesting scenario where ring enqueue is safe to un-inline, while
> >>> dequeue is not.)
> >>
> >> A good way to think about it would be the ratio of the amount of work
> >> done in the function to the cost of the function call.
> >
> > Yes, I would tend to agree in general. The other thing is the
> > frequency of calls, and - as already stated - whether it takes a burst
> > or not. Because even if it's a trivial function that takes only 10
> > cycles and we want to uninline it, the cost may double; but if it
> > takes a burst of packets and is only used once/twice per burst, the
> > cost per packet should still only be a fraction of a cycle.
> >
> >>>
> >>>> 3) Plus synchronization routines: spin/rwlock/barrier, etc.
> >>>> I think rcu should be one of such exceptions - it is just another
> >>>> synchronization mechanism after all (just a bit more
> >>>> sophisticated). - Konstantin
> >>>
> >>> In general I believe the synchronisation primitives should be inline.
> >>> However, it does come down to cost too - if a function takes 300
> >>> cycles, do we really care if it takes 305 or 310 instead to make it
> >>> not inline? Hopefully most synchronisation primitives are faster
> >>> than this, so this situation should not occur.
> >>>
> >>>> 2) and 3) can be good guidelines to what functions/APIs can be
> >>>> inline. But,
> >>>
> >>> I believe this guideline exists today too. Are there any thoughts to
> >>> change this?
> >>>
> >>>> Coming to 1), I think DPDK cannot provide ABI compatibility unless
> >>>> all the inline functions are converted to normal functions and
> >>>> symbol versioning is done for those (not bothering about
> >>>> performance).
> >>>
> >>> I disagree. I think even in the case of #1, we should be able to
> >>> manage some changes without breaking ABI.
> >>
> >> I completely agree with you on trying to keep the ABI break surface
> >> small and doing due diligence when one is required. However, I think
> >> claiming 100% ABI compatibility all the time (or as frequently as
> >> other projects claim) might be tough. IMO, we need to look beyond
> >> this to solve the end user problem. Maybe packaging multiple LTS
> >> releases with distros when DPDK could not avoid breaking ABI
> >> compatibility?
> >
> > Having multiple LTS's per distro would be nice, but it's putting a lot
> > more work on the distro folks, I think.
>
> I completely disagree with this approach.
> Anytime I have seen this done, most frequently with interpreted
> languages - think multiple versions of Java and Python concurrently on
> a system - it has always been a usability nightmare.
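Bruce's earlier empty-polling argument can be illustrated with a toy single-producer/single-consumer ring (a sketch for illustration only - this is not the rte_ring layout or API): the empty-poll path is just a load, a subtraction and a branch, which is essentially free while inlined but would be dominated by call/return overhead as an out-of-line function.

```c
#include <assert.h>

/* Toy SPSC ring, purely illustrative -- not the rte_ring implementation. */
#define TOY_RING_SIZE 8u		/* must be a power of two */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

struct toy_ring {
	unsigned int head;		/* next slot to write */
	unsigned int tail;		/* next slot to read */
	void *objs[TOY_RING_SIZE];
};

static inline unsigned int
toy_ring_enqueue(struct toy_ring *r, void *obj)
{
	if (r->head - r->tail == TOY_RING_SIZE)
		return 0;		/* full */
	r->objs[r->head & TOY_RING_MASK] = obj;
	r->head++;
	return 1;
}

/* While inline, the empty-poll early exit below costs only a couple of
 * instructions, so a thread can cheaply scan many mostly-idle rings.
 * As an out-of-line call, the call/return overhead would dominate
 * exactly this path. */
static inline unsigned int
toy_ring_dequeue_burst(struct toy_ring *r, void **objs, unsigned int n)
{
	unsigned int avail = r->head - r->tail;
	unsigned int i;

	if (avail == 0)
		return 0;		/* the hot empty-poll path */
	if (n > avail)
		n = avail;
	for (i = 0; i < n; i++)
		objs[i] = r->objs[(r->tail + i) & TOY_RING_MASK];
	r->tail += n;
	return n;
}
```

Note that enqueue has no comparable hot empty path, which is why un-inlining enqueue is safer than un-inlining dequeue, as observed above.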
> >>>
> >>>> In this context, does it make sense to say that we will maintain
> >>>> API compatibility rather than saying ABI compatibility? This will
> >>>> also send the right message to the end users.
> >>>
> >>> I would value ABI compatibility much higher than API compatibility.
> >>> If someone is recompiling the application anyway, making a couple of
> >>> small changes (large rework is obviously a different issue) to the
> >>> code should not be a massive issue, I hope. On the other hand, ABI
> >>> compatibility is needed to allow seamless updates from one version
> >>> to another, and it's that ABI compatibility that allows distros to
> >>> pick up our latest and greatest versions.
> >>
> >> I think it is also important to set up the right expectations for
> >> DPDK users, i.e. DPDK will police itself better to provide ABI
> >> compatibility, but occasionally it might not be possible.
> >
> > The trouble here is that a DPDK release, as a product, is either
> > backward compatible or not. Either a user can take it as a drop-in
> > replacement for the previous version or not. Users do not want to
> > have to go through a checklist for each app to see if the app uses
> > only "compatible" APIs or not. Same for the distro folks; they are
> > not going to take some libs from a release but not others because
> > the ABI is broken for some but not others.
>
> Agreed, a DPDK release is either backwards compatible or it isn't, and
> to Bruce's point, if it isn't, it matters less whether this is for a
> big API change or a small API change - the failure of the API stability
> guarantee is the significant piece.
>
> The reality is that most other system libraries provide strong
> guarantees ... to date we have provided very little.

It would be good to look at libraries that provide inline functions and
ABI compatibility to better understand the methods used.
>
> If we start saying "Yes, but except when", the exception case very
> quickly becomes the general case and then we are back to the beginning
> again.
>
> > Regards,
> > /Bruce