Hi Adrien, Konstantin, I'm jumping in this (interesting) discussion. I already talked a bit with Adrien in point to point, and I think its patch is valid. Please see some comments below.
On 06/08/2016 03:57 PM, Adrien Mazarguil wrote: > On Wed, Jun 08, 2016 at 01:09:18PM +0000, Ananyev, Konstantin wrote: >>> >>> Hi Konstantin, >>> >>> On Wed, Jun 08, 2016 at 10:34:17AM +0000, Ananyev, Konstantin wrote: >>>> Hi Adrien, >>>> >>>>> >>>>> An assertion failure occurs in __rte_mbuf_raw_free() (called by a few >>>>> PMDs) >>>>> when compiling DPDK with CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG and starting >>>>> applications with a log level high enough to trigger it. >>>>> >>>>> While rte_mbuf_raw_alloc() sets refcount to 1, __rte_mbuf_raw_free() >>>>> expects it to be 0. >>>>> Considering users are not expected to reset the >>>>> reference count to satisfy assert() and that raw functions are designed on >>>>> purpose without safety belts, remove these checks. >>>> >>>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg(). >>>> Wright now, it is a user responsibility to make sure refcnt==0 before >>>> pushing >>>> mbuf back to the pool. >>>> Not sure why do you consider that wrong? >>> >>> I do not consider this wrong and I'm all for using assert() to catch >>> programming errors, however in this specific case, I think they are >>> inconsistent and misleading. >> >> Honestly, I don't understand why. >> Right now the rule of thumb is - when mbuf is in the pool, it's refcnt >> should be equal zero. What is the purpose of this? Is there some code relying on this? >> Yes, as you pointed below - that rule probably can be changed to: >> when mbuf is in the pool, it's refcnt should equal one, and that would >> probably allow us >> to speedup things a bit, but I suppose that's the matter of another >> aptch/discussion. > > Agreed. I agree this is somehow another discussion, but let's dive into :) But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all PMDs were calling an internal function before). We could argue that rte_mbuf_raw_free() should also be made public for PMDs. As you said below, no-one promised that the free() reverts the malloc(), but given the function names, one can legitimately expect that the following code is valid: m = __rte_mbuf_raw_alloc(); /* do nothing here */ __rte_mbuf_raw_free(m); If no code relies on having the refcnt set to 0 when a mbuf is in the pool, I suggest to relax this constraint as Adrien proposed. Then, my opinion is that the refcount should be set to 1 in rte_pktmbuf_reset(). And rte_pktmbuf_free() should not be allowed on an uninitialized mbuf (yes, this would require some changes in PMDs). This would open the door for bulk allocation/free in the mbuf api. This could be done in several steps: 1/ remove these assert(), add introduce a public rte_mbuf_raw_free() 2/ announce that rte_pktmbuf_free() won't work on uninitialized mbufs in a future version, and ensure that all PMD are inline with this requirement 3/ later, move refcount initialization in rte_pktmbuf_reset() What do you think? Another option is to remove the rte_mbuf_raw_alloc()/rte_mbuf_raw_free() and use mempool calls. But having a mbuf wrapper does not seem a bad thing to me. By the way, __rte_pktmbuf_prefree_seg() is also an internal function. I think we should try to update the mbuf API so that the PMDs do not need to call these internal functions. [1] http://dpdk.org/browse/dpdk/commit/?id=fbfd99551ca370266f4bfff58ce441cf5cb1203a Regards, Olivier > >>>> If the user calls __rte_mbuf_raw_free() manualy it is his responsibility >>>> to make >>>> sure mbuf's refcn==0. >>> >>> Sure, however what harm does it cause (besides assert() to fail), since the >>> allocation function sets refcount to 1? >>> >>> Why having the allocation function set the refcount if we are sure it is >>> already 0 (assert() proves it). Removing rte_mbuf_refcnt_set(m, 1) can >>> surely improve performance. >> >> That's' just an assert() enabled when MBUF_DEBUG is on. >> Its sole purpose is to help troubleshoot the bugs and help to catch >> situations >> when someone silently updates mbufs supposed to be free. > > I perfectly understand and I cannot agree more with this explanation, > however the fact these functions are not symmetrical remains an issue that > needs to be addressed somehow in my opinion. > >>>> BTW, why are you doing it? >>>> The comment clearly states that the function is for internal use: >>>> /** >>>> * @internal Put mbuf back into its original mempool. >>>> * The use of that function is reserved for RTE internal needs. >>>> * Please use rte_pktmbuf_free(). >>>> * >>>> * @param m >>>> * The mbuf to be freed. >>>> */ >>>> static inline void __attribute__((always_inline)) >>>> __rte_mbuf_raw_free(struct rte_mbuf *m) >>> >>> Several PMDs are using it anyway (won't name names, but I know one of them >>> quite well). >> >> Then it probably is a bug in these PMDs that need to be fixed. >> >>> I chose to modify this code instead of its users for the >>> following reasons: >>> >>> - Considering their names, these functions should be opposites and able to >>> work together like malloc()/free(). >> >> These are internal functions. >> Comments in mbuf clearly state that library users shouldn't call them >> directly. >> They are written to fit internal librte_mbuf needs, and no-one ever promised >> malloc/free() compatibility here. > > So it cannot be provided for the sake of not providing it or is there a good > reason? > > What I meant is that since PMDs already made the mistake of using these > functions to benefit from the improved performance, DPDK being all about > performance and stuff, let them use it as intended. Perhaps we should drop > those "__" like for rte_mbuf_raw_alloc(). > >>> >>> - PMDs are relying on these functions for performance reasons, we can assume >>> they took the extra care necessary to make sure it would work properly. >> >> That just doesn't seem correct to me. >> The proper way to do free fo mbuf segment is: >> >> static inline void __attribute__((always_inline)) >> rte_pktmbuf_free_seg(struct rte_mbuf *m) >> { >> if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m)))) { >> m->next = NULL; >> __rte_mbuf_raw_free(m); >> } >> } >> >> If by some reason you choose not to use this function, then it is your >> responsibility to perform similar actions on your own before pushing mbuf >> into the pool. >> That's what some TX functions for some Intel NICs do to improve performance: >> they call _prefree_seg() manually and try to put mbufs into the pool in >> groups. > > Not anymore it seems, but in the current code base both ena and mpipe PMDs > (soon mlx5 as well) seem to get this wrong. > >>> - Preventing it would make these PMDs slower and is not acceptable either. >> >> I can hardly imagine that __rte_pktmbuf_prefree_seg() impact would be that >> severe... >> But ok, probably you do have some very specific case, but then why you PMD >> just doesn't call: >> rte_mempool_put(m->pool, m); >> directly? > > To survive the upcoming transition to mbufs backed by libc malloc() without > having to fix them? Joke aside, I guess the reason is to use functions with > "mbuf" in their names when dealing with mbufs. > >> Why instead you choose to change common functions and compromise >> librte_mbuf debug ability? > > No, I'm fine with keeping the debug ability, however I did not find a way to > both keep it and fix the consistency issue without breaking something > (performance or refcount assumptions I'm not familiar with elsewhere). > >>> What remains is the consistency issue, I think these statements were only >>> added to catch multiple frees, >> >> Yes these asserts() here to help catch bugs, >> and I think it is a good thing to have them here. >> >>> and those should be caught at a higher >>> level, where other consistency checks are also performed. >> >> Like where? > > Possibly rte_pktmbuf_free_seg(). > >>>>> Signed-off-by: Adrien Mazarguil <adrien.mazarguil at 6wind.com> >>>>> --- >>>>> lib/librte_mbuf/rte_mbuf.h | 2 -- >>>>> 1 file changed, 2 deletions(-) >>>>> >>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h >>>>> index 11fa06d..7070bb8 100644 >>>>> --- a/lib/librte_mbuf/rte_mbuf.h >>>>> +++ b/lib/librte_mbuf/rte_mbuf.h >>>>> @@ -1108,7 +1108,6 @@ static inline struct rte_mbuf >>>>> *rte_mbuf_raw_alloc(struct rte_mempool *mp) >>>>> if (rte_mempool_get(mp, &mb) < 0) >>>>> return NULL; >>>>> m = (struct rte_mbuf *)mb; >>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0); >>>>> rte_mbuf_refcnt_set(m, 1); >>>>> __rte_mbuf_sanity_check(m, 0); >>>>> >>>>> @@ -1133,7 +1132,6 @@ __rte_mbuf_raw_alloc(struct rte_mempool *mp) >>>>> static inline void __attribute__((always_inline)) >>>>> __rte_mbuf_raw_free(struct rte_mbuf *m) >>>>> { >>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0); >>>>> rte_mempool_put(m->pool, m); >>>>> } >>>>> >>>>> -- >>>>> 2.1.4 >>>> >>> >>> -- >>> Adrien Mazarguil >>> 6WIND >