Re: Suggestions on iterating eBPF maps

Alexei Starovoitov Tue, 01 May 2018 19:34:12 -0700

On Wed, May 02, 2018 at 11:05:19AM +0900, Lorenzo Colitti wrote:
> On Sat, Apr 28, 2018 at 10:04 AM, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> > Another approach could be to use map-in-map and have almost atomic
> > replace of the whole map with new potentially empty map. The prog
> > can continue using the new map, while user space walks no longer
> > accessed old map.
> 
> That sounds like a promising approach. I assume this would be
> functionally equivalent to an approach where there is a map containing
> a boolean that says whether to write to map A or map B? We'd then do
> the following:
> 
> 0. Kernel program is writing to map A.
> 1. Userspace pushes config that says to write to map B.
> 2. Kernel program starts to write to map B.
> 3. Userspace scans map A, collecting stats and deleting everything it finds.
> 
> One problem with this is: if the effects of #1 are not immediately
> visible to the programs running on all cores, the program could still
> be writing to map A and the deletes in #3 would result in loss of
> data. Are there any guarantees around this? I know that hash map
> writes are atomic, but I'm not aware of any other guarantees here. Are
> there memory barriers around map writes and reads?
> 
> In the absence of guarantees, userspace could put a sleep between #1
> and #3 and things would be correct Most Of The Time(TM), but if the
> kernel is busy doing other things that might not be sufficient.
> Thoughts?


If you use map-in-map, you don't need the extra boolean map:
0. The bpf prog does
   inner_map = lookup(map_in_map, key=0);
   lookup(inner_map, your_real_key);
1. User space writes map_in_map[0] <- FD of the new map.
2. For a short while, some CPUs are still using the old inner map and
   some the new one.
3. User space calls sys_membarrier(MEMBARRIER_CMD_GLOBAL), which does
   synchronize_sched(). On CONFIG_PREEMPT_NONE=y servers that is the same
   as synchronize_rcu(), so it guarantees that every prog that could still
   see the old inner map has finished.
4. User space scans the old inner map.
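
The user-space half of steps 1, 3 and 4 above might look roughly like the
sketch below. It is only an illustration under stated assumptions: it uses
the raw bpf(2) and membarrier(2) syscalls rather than a library, the helper
names (sys_bpf, swap_inner_map, wait_for_running_progs, drain_old_map) are
made up for this example, and the inner maps are assumed to be hash maps
with u32 keys and u64 counter values. Creating the maps and loading the
prog are left out.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <linux/membarrier.h>

/* glibc has no bpf() wrapper, so go through syscall(2). */
static int sys_bpf(int cmd, union bpf_attr *attr)
{
	return (int)syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/* Step 1: map_in_map[0] <- FD of the new inner map. */
static int swap_inner_map(int outer_fd, int new_inner_fd)
{
	uint32_t key = 0;
	uint32_t value = (uint32_t)new_inner_fd; /* map-in-map values are FDs */
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_fd = (uint32_t)outer_fd;
	attr.key = (uint64_t)(uintptr_t)&key;
	attr.value = (uint64_t)(uintptr_t)&value;
	attr.flags = BPF_ANY;
	return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr);
}

/* Step 3: returns 0 once the barrier has completed, -1 on error. */
static int wait_for_running_progs(void)
{
	return (int)syscall(__NR_membarrier, MEMBARRIER_CMD_GLOBAL, 0, 0);
}

/*
 * Step 4: sum and delete every entry of the old inner map.
 * Assumes u32 keys and u64 values (hypothetical layout).
 * Returns the number of entries drained, or -1 on error.
 */
static long drain_old_map(int old_fd, uint64_t *total)
{
	uint32_t key;
	uint64_t value;
	long n = 0;
	union bpf_attr attr;

	for (;;) {
		/* attr.key == 0 (NULL) asks for the first key still present. */
		memset(&attr, 0, sizeof(attr));
		attr.map_fd = (uint32_t)old_fd;
		attr.next_key = (uint64_t)(uintptr_t)&key;
		if (sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr) < 0)
			return errno == ENOENT ? n : -1; /* ENOENT: map is empty */

		memset(&attr, 0, sizeof(attr));
		attr.map_fd = (uint32_t)old_fd;
		attr.key = (uint64_t)(uintptr_t)&key;
		attr.value = (uint64_t)(uintptr_t)&value;
		if (sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr) < 0)
			return -1;
		*total += value;
		n++;

		/* Unused attr fields must be zero, so start from a clean attr. */
		memset(&attr, 0, sizeof(attr));
		attr.map_fd = (uint32_t)old_fd;
		attr.key = (uint64_t)(uintptr_t)&key;
		if (sys_bpf(BPF_MAP_DELETE_ELEM, &attr) < 0)
			return -1;
	}
}
```

A caller would run swap_inner_map(outer_fd, new_fd), then
wait_for_running_progs(), and only then drain_old_map(old_fd, &total).
Deleting in step 4 is safe only because, after the barrier, no prog can
still hold a pointer to the old inner map.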
