On Wed, May 02, 2018 at 11:05:19AM +0900, Lorenzo Colitti wrote:
> On Sat, Apr 28, 2018 at 10:04 AM, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> > Another approach could be to use map-in-map and have almost atomic
> > replace of the whole map with new potentially empty map. The prog
> > can continue using the new map, while user space walks no longer
> > accessed old map.
>
> That sounds like a promising approach. I assume this would be
> functionally equivalent to an approach where there is a map containing
> a boolean that says whether to write to map A or map B? We'd then do
> the following:
>
> 0. Kernel program is writing to map A.
> 1. Userspace pushes config that says to write to map B.
> 2. Kernel program starts to write to map B.
> 3. Userspace scans map A, collecting stats and deleting everything it finds.
>
> One problem with this is: if the effects of #1 are not immediately
> visible to the programs running on all cores, the program could still
> be writing to map A and the deletes in #3 would result in loss of
> data. Are there any guarantees around this? I know that hash map
> writes are atomic, but I'm not aware of any other guarantees here. Are
> there memory barriers around map writes and reads?
>
> In the absence of guarantees, userspace could put a sleep between #1
> and #3 and things would be correct Most Of The Time(TM), but if the
> kernel is busy doing other things that might not be sufficient.
> Thoughts?

If you use map-in-map you don't need an extra boolean map.

0. bpf prog can do
     inner_map = lookup(map_in_map, key=0);
     lookup(inner_map, your_real_key);
1. user space writes into map_in_map[0] <- FD of new map
2. some cpus are still using the old inner map and some the new one
3. user space does sys_membarrier(CMD_GLOBAL), which will do
   synchronize_sched(), which on CONFIG_PREEMPT_NONE=y servers is
   the same as synchronize_rcu(), which guarantees that any progs
   that were still using the old inner map have finished.
4. scan old inner map
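
The ordering of steps 0-4 can be modeled in plain userspace C. This is
only a sketch of the swap-then-drain sequence, not real BPF code: the
inner map is a plain array, slot 0 of the map-in-map is an atomic
pointer, and the grace-period wait is a caller-supplied callback. All
names here (inner_map, outer_slot0, prog_account, swap_and_drain,
placeholder_grace_period, NKEYS) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NKEYS 4 /* hypothetical number of counters per inner map */

/* Stand-in for an inner BPF map: a fixed array of counters. */
struct inner_map {
	unsigned long val[NKEYS];
};

/* Stand-in for map_in_map[0]: points at the inner map currently in use. */
_Atomic(struct inner_map *) outer_slot0;

/* Step 0: the prog looks up the inner map first, then the real key. */
void prog_account(size_t key, unsigned long bytes)
{
	struct inner_map *inner = atomic_load(&outer_slot0);

	if (inner && key < NKEYS)
		inner->val[key] += bytes;
}

/* Placeholder for step 3. A real program would block here until every
 * prog that might still hold the old inner map has finished. */
void placeholder_grace_period(void)
{
}

/* Steps 1-4: install a fresh inner map, wait, then collect and clear
 * the old one. Returns the sum of the old map's counters. */
unsigned long swap_and_drain(struct inner_map *fresh,
			     void (*wait_for_progs)(void))
{
	struct inner_map *old;
	unsigned long total = 0;
	size_t i;

	assert(fresh != NULL);

	/* Step 1: replace the inner map. Step 2 is implicit: from here
	 * on some readers may still hold 'old'. */
	old = atomic_exchange(&outer_slot0, fresh);
	if (!old)
		return 0; /* first install, nothing to drain */

	/* Step 3: wait until nobody can still be writing to 'old'. */
	if (wait_for_progs)
		wait_for_progs();

	/* Step 4: scan the old map, collecting and deleting entries. */
	for (i = 0; i < NKEYS; i++) {
		total += old->val[i];
		old->val[i] = 0;
	}
	return total;
}
```

In a real setup, step 1 would presumably be an update of the outer map
entry with the new inner map's FD (for example via libbpf's
bpf_map_update_elem()), and the callback would issue the membarrier
syscall described in step 3. The model only shows why the scan has to
come after the wait: without it, a prog that loaded the old pointer
before the swap could add to a counter after it was read and cleared.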