That said, my theory is that you are in the loop of bihash searches in one worker (the one that eventually crashes) while deleting the ACLs from another one. With a larger amount of data, the probability is higher that at the moment of the hash lookup the vector is still there, but by the time the vector's element is accessed the vector is already gone, so you get the result you get. That would explain why it has never been observed so far (as I said, I considered ACL updates to be control plane operations)... but it doesn't explain why it never happens if you do the updates from only one worker. Different traffic patterns/burstiness that do not result in this race?

As a test to see whether this theory has any legs, try moving the assignment to applied_hash_aces into the loop, just before where it is used, and see if it makes the behavior somewhat better. (It is still not a correct fix, but it will reduce the time window for a potential race, if there is one.)
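To make the idea concrete, below is a minimal self-contained sketch of what I mean. The names in it (get_applied_hash_aces, do_bihash_search, the applied_hash_ace_entry_t fields, and so on) are simplified stand-ins rather than the actual acl plugin code; the only thing it illustrates is moving the vector-pointer assignment from before the search loop to right before the element access.

/* race_window_sketch.c - minimal, self-contained illustration only.
 * All names here (get_applied_hash_aces, do_bihash_search, ...) are
 * stand-ins, NOT the actual acl plugin code; the point is only the
 * placement of the vector-pointer assignment relative to its use. */

#include <stdint.h>
#include <stdio.h>

typedef struct
{
  uint32_t acl_index;
  uint32_t ace_index;
} applied_hash_ace_entry_t;

typedef struct
{
  /* stand-in for the per-lookup-context vector of applied entries */
  applied_hash_ace_entry_t *applied_hash_aces;
} acl_main_t;

/* stand-in accessor for the per-lookup-context vector */
static applied_hash_ace_entry_t *
get_applied_hash_aces (acl_main_t *am, uint32_t lc_index)
{
  (void) lc_index;
  return am->applied_hash_aces;
}

/* stand-in for the bihash search: pretend every key hits entry 0;
 * returns 0 on "found", like the clib_bihash_search_*() family does */
static int
do_bihash_search (uint32_t key, uint64_t *result_value)
{
  (void) key;
  *result_value = 0;
  return 0;
}

/* Current shape (roughly): pointer taken once, before the search loop.
 * If another thread deletes/re-applies the ACL mid-loop, this pointer
 * can go stale before the element is accessed. */
static void
lookup_loop_current (acl_main_t *am, uint32_t lc_index, int n_lookups)
{
  applied_hash_ace_entry_t *aces = get_applied_hash_aces (am, lc_index);
  for (int i = 0; i < n_lookups; i++)
    {
      uint64_t v;
      if (do_bihash_search (i, &v) == 0)
        printf ("match -> ace %u\n", aces[v].ace_index);
    }
}

/* Suggested test: take the pointer just before each use, which shrinks
 * (but does not eliminate) the window for the race. */
static void
lookup_loop_test (acl_main_t *am, uint32_t lc_index, int n_lookups)
{
  for (int i = 0; i < n_lookups; i++)
    {
      uint64_t v;
      if (do_bihash_search (i, &v) == 0)
        {
          applied_hash_ace_entry_t *aces = get_applied_hash_aces (am, lc_index);
          printf ("match -> ace %u\n", aces[v].ace_index);
        }
    }
}

int
main (void)
{
  applied_hash_ace_entry_t e = { .acl_index = 0, .ace_index = 42 };
  acl_main_t am = { .applied_hash_aces = &e };
  lookup_loop_current (&am, 0, 3);
  lookup_loop_test (&am, 0, 3);
  return 0;
}

If that change makes the crash noticeably rarer (or moves it elsewhere), it would be a fairly strong hint that a vector freed while still in use is indeed what you are hitting.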
Let me know if this helps any... —a

>> On 15 Jul 2020, at 21:21, Andrew Yourtchenko via lists.fd.io <ayourtch=gmail....@lists.fd.io> wrote:
>
> “One more thing is this crash is happening only when we add acl rules and lookup contexts from multiple workers.” - you should not be doing that from any workers, let alone multiple.
>
> The manipulations with ACLs and lookup contexts are considered to be control plane activity, thus should be performed from the control plane thread.
>
> --a
>
>>> On 15 Jul 2020, at 20:56, Satya Murthy <satyamurthy1...@gmail.com> wrote:
>>
>> One more thing is this crash is happening only when we add acl rules and lookup contexts from multiple workers.