Vipin from AMD expressed demand for telemetry support on Windows in order to collect port statistics. Implementing a PoC, he stumbled several issues. Together we have designed a solution that eliminates these issues. It affects telemetry operation for all platforms, at least code-wise, We would like community input if this is acceptable or if there are some better suggestions.
Telemetry today (talking only about v2): * Telemetry library starts a thread to listen for connections. It is affinitized to the main lcore. A thread is spawned to serve each client connection. * A listening AF_UNIX socket is created in DPDK runtime directory. This guarantees there will be no clash between sockets created by different DPDK processes. It has a fixed name within this directory, so it's trivial to discover. * The socket is SOCK_SEQPACKET. This allows the code to simply use write() and let the socket preserve message bounds and order. dpdk-telemetry.py relies on this, and probably external clients too. #define MAX_OUTPUT_LEN (1024 * 16). * The protocol is JSON objects, one per packet. * Telemetry can be enabled (default) or disabled with an EAL option. Windows issues: * Threading API is implemented in EAL. Currently this is a pthread.h shim, there are plans to replace it with rte_thread API [1]. Hence, there's a circular dependency: EAL -> telemetry -> threading API in EAL. There's a similar issue logging called by this shim. Bruce pointed out that a similar issue is with EAL logs and maybe logging should be moved its own library, on which EAL would depend. * There is no AF_UNIX and no simple way to select a unique free endpoint. * There is no SOCK_SEQPACKET and SOCK_DGRAM size limitation is too small. The only viable option is SOCK_STREAM, but it does not preserve message boundaries. Proposal: * Move threading outside of telemetry, let EAL manage the threads. eal_telemetry_client_thread(conn) { rte_telemetry_serve(conn); } eal_telemetry_server_thread(tm) { pthread_set_affinity_np(...); while (rte_telemetry_listen(tm)) { conn = rte_telemetry_accept(tm); pthread_create(eal_telemetry_client_thread, conn); } } rte_eal_init() { tm_logtype = rte_log_type_register(); tm = rte_telemetry_init(internal_conf.tm_kvargs, tm_logtype); pthread_create(eal_telementry_server_thread, tm); } Among Vipin, Micorosft engineers, and me there is a consensus that libraries should be passive and not create threads by themselves. * The logging issue can be solved differently: a) by factoring logging into a library, either as Bruce suggested; b) by replacing logging with simple print in Windows shim; c) by integrating and using the new threading API [1, 2]. * Allow to select listening endpoint and protocol. Different options may be supported on each system and be the default. rte_kvargs syntax and data structure can be used. --telemetry Default settings. Also kept for compatibility. --telemetry transport=seqpacket,protocol=message AF_UNIX+SOCK_SEQPACKET, no delimiters between objects (same as default for Unices). --telemetry transport=tcp,protocol=line,endpoint=:64000 Line-oriented JSON over a TCP socket at local port 64000 (may be same as --telemetry <no argument> on Windows). --no-telemetry Disable telemetry. Default on Windows. Parameter names and set are subject to discussion. An last example if a possible future extension as one of the reasons not to fix combinations of "transport=" and "protocol=" (similar to socket API): --telemetry transport=pipe,protocol=line,endpoint=some-name Stream of objects over a named pipe (fifo), for example. Initial support should cover: 1) "transport=seqpacket,protocol=message" on Unices; 2) "transport=tcp,protocol=line,endpoint=" at least on Windows, but this variant is really applicable everywhere. * dpdk-telemetry.py must support all combinations DPDK supports. Defaults don't change, so backward compatibility is preserved. * Because Windows cannot automatically select and endpoint that does not conflict with anything, the default would be to disable telemetry. We also propose to always log telemetry socket path. * It is technically possible to accept remote connections. However, it's a new attack surface and a security risk. OTOH, there's no need to allow remote connections: anyone who needs this can setup a tunnel or something. Local socket must be used by default. [1]: http://patchwork.dpdk.org/project/dpdk/list/?series=22319 [2]: http://patchwork.dpdk.org/project/dpdk/list/?series=20472