Vipin from AMD expressed demand for telemetry support on Windows
in order to collect port statistics.
Implementing a PoC, he stumbled several issues.
Together we have designed a solution that eliminates these issues.
It affects telemetry operation for all platforms, at least code-wise,
We would like community input if this is acceptable
or if there are some better suggestions.

Telemetry today (talking only about v2):

* Telemetry library starts a thread to listen for connections.
  It is affinitized to the main lcore.
  A thread is spawned to serve each client connection.

* A listening AF_UNIX socket is created in DPDK runtime directory.
  This guarantees there will be no clash between sockets
  created by different DPDK processes.
  It has a fixed name within this directory, so it's trivial to discover.

* The socket is SOCK_SEQPACKET.
  This allows the code to simply use write()
  and let the socket preserve message bounds and order.
  dpdk-telemetry.py relies on this, and probably external clients too.
  #define MAX_OUTPUT_LEN (1024 * 16).

* The protocol is JSON objects, one per packet.

* Telemetry can be enabled (default) or disabled with an EAL option.

Windows issues:

* Threading API is implemented in EAL.
  Currently this is a pthread.h shim,
  there are plans to replace it with rte_thread API [1].
  Hence, there's a circular dependency:
  EAL -> telemetry -> threading API in EAL.
  There's a similar issue logging called by this shim.

  Bruce pointed out that a similar issue is with EAL logs
  and maybe logging should be moved its own library,
  on which EAL would depend.

* There is no AF_UNIX and no simple way to select a unique free endpoint.

* There is no SOCK_SEQPACKET and SOCK_DGRAM size limitation is too small.
  The only viable option is SOCK_STREAM,
  but it does not preserve message boundaries.

Proposal:

* Move threading outside of telemetry, let EAL manage the threads.

        eal_telemetry_client_thread(conn) {
                rte_telemetry_serve(conn);      
        }

        eal_telemetry_server_thread(tm) {
                pthread_set_affinity_np(...);
                while (rte_telemetry_listen(tm)) {
                        conn = rte_telemetry_accept(tm);
                        pthread_create(eal_telemetry_client_thread,
                                       conn);
                }
        }

        rte_eal_init() {
                tm_logtype = rte_log_type_register();
                tm = rte_telemetry_init(internal_conf.tm_kvargs,
                                        tm_logtype);
                pthread_create(eal_telementry_server_thread, tm);
        }

  Among Vipin, Micorosft engineers, and me there is a consensus
  that libraries should be passive and not create threads by themselves.

* The logging issue can be solved differently:

  a) by factoring logging into a library, either as Bruce suggested;
  b) by replacing logging with simple print in Windows shim;
  c) by integrating and using the new threading API [1, 2].

* Allow to select listening endpoint and protocol.
  Different options may be supported on each system and be the default.
  rte_kvargs syntax and data structure can be used.

        --telemetry
                Default settings. Also kept for compatibility.
        --telemetry transport=seqpacket,protocol=message
                AF_UNIX+SOCK_SEQPACKET, no delimiters between objects
                (same as default for Unices).
        --telemetry transport=tcp,protocol=line,endpoint=:64000
                Line-oriented JSON over a TCP socket at local port 64000
                (may be same as --telemetry <no argument> on Windows).
        --no-telemetry
                Disable telemetry. Default on Windows.

  Parameter names and set are subject to discussion.

  An last example if a possible future extension
  as one of the reasons not to fix combinations of "transport="
  and "protocol=" (similar to socket API):

        --telemetry transport=pipe,protocol=line,endpoint=some-name
                Stream of objects over a named pipe (fifo), for example.

  Initial support should cover:

  1) "transport=seqpacket,protocol=message" on Unices;
  2) "transport=tcp,protocol=line,endpoint=" at least on Windows,
     but this variant is really applicable everywhere.

* dpdk-telemetry.py must support all combinations DPDK supports.
  Defaults don't change, so backward compatibility is preserved.

* Because Windows cannot automatically select and endpoint that does not
  conflict with anything, the default would be to disable telemetry.
  We also propose to always log telemetry socket path.

* It is technically possible to accept remote connections.
  However, it's a new attack surface and a security risk.
  OTOH, there's no need to allow remote connections:
  anyone who needs this can setup a tunnel or something.
  Local socket must be used by default.

[1]: http://patchwork.dpdk.org/project/dpdk/list/?series=22319
[2]: http://patchwork.dpdk.org/project/dpdk/list/?series=20472

Reply via email to