Folks, I have a stale cache of interface data in the layer above my VPP API calls and I need to refresh it. So I wrote a vpp_intf_refresh_all() function. It looks roughly like this:
    vpp_intf_refresh_all()
    {
        if (intf data is not dirty)
            return;

        for each is_ipv6 in {0,1}
            vpp_ip_dump(is_ipv6);

        sleep(2);   /* See commentary below */

        for each is_ipv6 in {0,1}
            vpp_ip_address_dump_all(is_ipv6);  /* hits all IFs */

        vpp_sw_interface_dump();

        /* intf data is now clean */
    }

My "details handlers" build a few vectors of information in almost exactly the same way the code in api_format.c does. That is to say:

ip_dump/ip_details_t_handler -- builds a vector of ip_details with an entry for each if-index that is returned. Note that there is no way to know how many interfaces the ip_details_t_handler function will handle. Let me say that differently: we have no way of knowing when it has finished and will not be called again on behalf of the original IP_DUMP request.

ip_address_dump/ip_address_details -- using the vector of ip_details built during the ip_dump pass, iterates over each IF and requests its ip_address_dump to build another vector of the addresses on that specific interface.

Here's the thing: if I remove the sleep(2), this code fails. If I leave the sleep(2) in, it works. On the one hand, if there is enough time for all of the ip_details to be handled, and the vector of ip_details to be built, then the next set of API calls, ip_address_dump, works correctly. On the other hand, if the API-driving code is allowed to proceed before the async replies to all of the ip_dump requests have arrived, then it will not have a proper ip_details vector and will fail.

I've just described a classic asynchronous failure mode. Solutions abound in other worlds. What is the recommended approach in this one?

So, why does VAT work? Because it effectively serializes these steps with enough time between each one that the async behavior goes unnoticed and does not affect the next step. And beyond that, it even tries to detect this situation and tells the user to do things differently.
From vl_api_address_details_t_handler():

    if (!details
        || vam->current_sw_if_index >= vec_len (details)
        || !details[vam->current_sw_if_index].present)
      {
        errmsg ("ip address details arrived but not stored");
        errmsg ("ip_dump should be called first");
        return;
      }

Sending a CONTROL_PING to flush the write side of the API isn't good enough, and placing an arbitrary sleep() in the code is an incredibly fragile approach. OK, so it's the wrong solution. Is there some form of API synchronization that I missed somewhere?

Could we introduce an actual WAIT_FOR_COMPLETION event into the API message-handling pipeline? I'm thinking of something that would be issued as an API call where my sleep(2) is now, and would cause the API-handling side to stall until the reply side is drained. Reading code, e.g. vl_api_ip_dump_t_handler(), I see that it just iterates and drops messages into the shmem queue. So, yes, knowing when that reply send queue has drained will be hard.

OK, so what if we added an "is_last_detail" (bool) flag to all of the *_details_t messages? That way we could know when we are done waiting for the results to come back. I could at least write a spin-until-last-message-seen-or-timeout sort of watcher.

Thoughts?

jdl
_______________________________________________ vpp-dev mailing list vpp-dev@lists.fd.io https://lists.fd.io/mailman/listinfo/vpp-dev