Hi all, I have a few applications that use the shared memory API. I’m running these on CentOS 7.4, and starting VPP using systemd. If VPP happens to crash or be intentionally restarted, those applications never seem to recover their API connection. They notice that the original VPP process died and try to call vl_client_disconnect_from_vlib(). That call tries to send API messages to cleanly shut down its connection. The application will time out waiting for a response, write a message like:
'vl_client_disconnect:301: peer unresponsive, give up', and eventually consider itself disconnected. When it tries to reconnect, it hangs for a while (100 seconds on the last occurrence I checked) and then prints messages like:

    vl_map_shmem:619: region init fail
    connect_to_vlib_internal:394: vl_client_api map rv -2

The client keeps trying and keeps seeing those same errors. If the client is restarted, it sees the same errors after restart. It doesn’t recover until VPP is restarted with the client stopped. Once that happens, the client can be started again and connects successfully.

The VPP systemd service file that is installed with RPMs built via 'make pkg-rpm' has the following:

    [Service]
    ExecStartPre=-/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api

When systemd starts VPP, it removes these files, which the still-running client applications have already opened with shm_open and mapped with mmap. I am guessing that when those clients try to disconnect with vl_client_disconnect_from_vlib(), they are stomping on something in shared memory that subsequently keeps them from being able to connect.

If I comment out that command in the systemd service definition, the problem behavior I described above disappears. The applications write one 'peer unresponsive' message and then they reconnect to the API successfully and all is (relatively) well. This is also the case if I don’t start VPP with systemd/systemctl and just run /usr/bin/vpp directly.

Does anyone have any thoughts on whether it would be ok to remove that command from the systemd service file? Or is there some better way to deal with VPP crashing from the perspective of a client of the shared memory API?

Thanks!
-Matt

_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
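
P.S. For anyone who wants to look at the file-level behaviour in isolation, here is a minimal sketch in Python. It is not VPP code and it does not show what the API client library actually does. It only illustrates the general POSIX behaviour that a mapping survives removal of its file, and that a file later created under the same name is a different object. The function name demo_unlink_while_mapped and the file name demo-region-<pid> are made up for the illustration, and the fallback to a temp directory is an assumption for machines without a writable /dev/shm.

```python
import mmap
import os
import tempfile


def demo_unlink_while_mapped(directory=None):
    """Return (old_inode, new_inode, bytes_seen_by_new_mapping).

    Models a client that has mapped a region, an 'rm -f' of that region's
    file, and a restarted server that recreates a file with the same name.
    """
    # Prefer /dev/shm, where the files discussed in this thread live.
    # Otherwise fall back to a temp dir; unlink semantics are the same.
    if directory is None:
        if os.path.isdir("/dev/shm") and os.access("/dev/shm", os.W_OK):
            directory = "/dev/shm"
        else:
            directory = tempfile.gettempdir()

    # Hypothetical name, unique per process so nothing real is touched.
    path = os.path.join(directory, "demo-region-%d" % os.getpid())
    size = mmap.PAGESIZE

    # "Client": create, size, and map the region (shared, read/write).
    fd_old = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    os.ftruncate(fd_old, size)
    old_map = mmap.mmap(fd_old, size)
    old_inode = os.fstat(fd_old).st_ino

    # "ExecStartPre": remove the name. The client's mapping stays valid.
    os.unlink(path)

    # "Restarted server": create a fresh region under the same name.
    fd_new = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    os.ftruncate(fd_new, size)
    new_map = mmap.mmap(fd_new, size)
    new_inode = os.fstat(fd_new).st_ino

    # The client writes through its old mapping...
    old_map[:5] = b"stale"
    # ...and the new region does not see that write.
    seen = bytes(new_map[:5])

    # Clean up mappings, descriptors, and the demo file.
    old_map.close()
    new_map.close()
    os.close(fd_old)
    os.close(fd_new)
    os.unlink(path)
    return old_inode, new_inode, seen


if __name__ == "__main__":
    old_inode, new_inode, seen = demo_unlink_while_mapped()
    print("old inode:", old_inode)
    print("new inode:", new_inode)
    print("new mapping sees:", seen)
```

The two inode numbers differ and the new mapping stays zero-filled, so a client still holding the old mapping is no longer looking at the same memory as a freshly started server. Whether that is what leads to the 'region init fail' errors above is still only a guess.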