> -----Original Message-----
> From: Ola Liljedahl [mailto:ola.liljed...@arm.com]
> Sent: Tuesday, January 29, 2019 6:57 AM
> To: Eads, Gage <gage.e...@intel.com>; dev@dpdk.org
> Cc: jer...@marvell.com; mcze...@marvell.com; nd <n...@arm.com>;
> Richardson, Bruce <bruce.richard...@intel.com>; Ananyev, Konstantin
> <konstantin.anan...@intel.com>; step...@networkplumber.org;
> olivier.m...@6wind.com; arybche...@solarflare.com
> Subject: Re: [PATCH v4 1/5] ring: add 64-bit headtail structure
> 
> On Mon, 2019-01-28 at 12:14 -0600, Gage Eads wrote:
> > 64-bit head and tail index widths greatly increases the time it takes
> > for them to wrap-around (with current CPU speeds, it won't happen
> > within the author's lifetime). This is fundamental to avoiding the ABA
> > problem -- in which a thread mistakes reading the same tail index in
> > two accesses to mean that the ring was not modified in the intervening
> > time -- in the upcoming non-blocking ring implementation. Using a
> > 64-bit index makes the possibility of this occurring effectively zero.
> Just an observation.
> The following invariant holds (using ring_size instead of mask):
> ∀ index: ring[index % ring_size].index % ring_size == index % ring_size i.e. 
> the N
> (N=log2 ring size) lsb of ring[].index will always be the same (for a 
> specific slot)
> so serve no purpose.
> 
> This means we don't have to store the whole index in each slot, it is enough 
> to
> store "index / ring_size" (which I call the lap counter). This could be 
> useful for an
> implementation for 32-bit platforms which support 64-bit CAS (to write the 
> slot
> ptr & index (lap counter) atomically) and uses 64-bit head & tail indexes (to 
> avoid
> the quick wrap around you would have with 32-bit ring indexes).
> 
> So
> ring[index % ring_size].lap = index / ring_size;
> 
> An implementation could of course use bitwise-and instead of modulo and
> bitwise- right shift instead of division. The 2-logaritm of ring_size should 
> also be
> pre- calcucated and stored in the ring buffer metadata.
> 

That's a pretty interesting idea. The question is, with such a design, what 
should DPDK's minimum NB ring size be on 32-bit platforms?

If a ring entry is written on average every M cycles, then a lap occurs every 
M*N cycles and each counter repeats every M*N*2^32 cycles. If M=100 on a 2GHz 
system, then the counter repeats every

N=1: 3.33 minutes
...
N=256: 14.22 hours
N=512: 28.44 hours
N=1024: 2.37 days
...
N=16384: 37.92 days

I think a minimum size of 1024 strikes a good balance between not too 
burdensome and sufficiently low odds of ABA occurring.

Thanks,
Gage

[snip]

Reply via email to