On Tue, Apr 16, 2024 at 3:47 PM Robin Jarry <rja...@redhat.com> wrote:
>
> For now the telemetry socket is local to the machine running a DPDK
> application. Also, there is no official "schema" for the exposed
> metrics. Add a framework and a script to collect and expose these
> metrics to telemetry and observability aggregators such as Prometheus,
> Carbon or InfluxDB. The exposed data must be designed with end users in
> mind, since some DPDK terminology or internals may not make sense to
> everyone.
>
> The script only serves as an entry point and knows nothing about
> specific metrics or the JSON data structures exposed in the telemetry
> socket.
>
> It uses dynamically loaded endpoint exporters which are basic python
> files that must implement two functions:
>
>  def info() -> dict[MetricName, MetricInfo]:
>      Mapping of metric names to their description and type.
>
>  def metrics(sock: TelemetrySocket) -> list[MetricValue]:
>      Request data from sock and return it as metric values. A metric
>      value is a 3-tuple: (name: str, value: any, labels: dict). Each
>      name must be present in info().
>
> The sock argument passed to metrics() has a single method:
>
>  def cmd(self, uri: str, arg: any = None) -> dict | list:
>      Request JSON data from the telemetry socket and parse it into
>      python values.
>
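
To make the contract above concrete, here is a minimal sketch of what an
endpoint file could look like. It is not one of the built-in endpoints:
the "/example/pool" command, its reply layout, the FakeSocket stand-in
and the (description, type) shape used for MetricInfo are all
assumptions made up for illustration.

```python
# Hypothetical endpoint sketch. The URI, the reply layout and the
# (description, type) shape of MetricInfo are assumptions.


class FakeSocket:
    """Stand-in for the TelemetrySocket object passed to metrics()."""

    def __init__(self, replies):
        self.replies = replies

    def cmd(self, uri, arg=None):
        # Return canned data instead of querying a real telemetry socket.
        return self.replies[uri]


EXAMPLE_TOTAL = "example_total_bytes"
EXAMPLE_USED = "example_used_bytes"


def info():
    # Mapping of metric names to their description and type.
    return {
        EXAMPLE_TOTAL: ("Total size of the example pool in bytes.", "gauge"),
        EXAMPLE_USED: ("Used size of the example pool in bytes.", "gauge"),
    }


def metrics(sock):
    # Each metric value is a 3-tuple: (name, value, labels).
    data = sock.cmd("/example/pool")
    return [
        (EXAMPLE_TOTAL, data["total"], {}),
        (EXAMPLE_USED, data["used"], {}),
    ]


sock = FakeSocket({"/example/pool": {"total": 1024, "used": 256}})
values = metrics(sock)
```

Every name returned by metrics() is also a key of info(), which is the
one constraint the description states explicitly.
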
> The main script invokes endpoints and exports the data into an output
> format. For now, only two formats are implemented:
>
> * openmetrics/prometheus: text based format exported via a local HTTP
>   server.
> * carbon/graphite: binary (python pickle) format exported to a remote
>   carbon TCP server.
>
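
For the carbon/graphite output, the pickle protocol described in the
Graphite documentation linked below frames a pickled list of
(path, (timestamp, value)) tuples behind a 4-byte big-endian length
header. A minimal sketch of that framing follows. The helper name, the
metric path and the timestamp are hypothetical, and this is not
necessarily how the script builds its messages.

```python
import pickle
import struct


def carbon_pickle_frame(samples):
    """Encode [(path, timestamp, value), ...] as one pickle-protocol message."""
    payload = pickle.dumps(
        [(path, (timestamp, value)) for path, timestamp, value in samples],
        protocol=2,
    )
    # Carbon reads a 4-byte big-endian length header, then the payload.
    return struct.pack("!L", len(payload)) + payload


# Hypothetical metric path and timestamp, for illustration only.
frame = carbon_pickle_frame([("dpdk.memory.used_bytes", 1713275220, 794197376)])
```

The resulting bytes would then be written to the TCP connection opened
towards the carbon server.
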
> As a starting point, 3 built-in endpoints are implemented:
>
> * counters: ethdev hardware counters
> * cpu: lcore usage
> * memory: overall memory usage
>
> The goal is to keep all built-in endpoints in the DPDK repository so
> that they can be updated along with the telemetry JSON data structures.
>
> Example output for the openmetrics:// format:
>
>  ~# dpdk-telemetry-exporter.py -o openmetrics://:9876 &
>  INFO using endpoint: counters (from .../telemetry-endpoints/counters.py)
>  INFO using endpoint: cpu (from .../telemetry-endpoints/cpu.py)
>  INFO using endpoint: memory (from .../telemetry-endpoints/memory.py)
>  INFO listening on port 9876
>  [1] 838829
>
>  ~$ curl http://127.0.0.1:9876/
>  # HELP dpdk_cpu_total_cycles Total number of CPU cycles.
>  # TYPE dpdk_cpu_total_cycles counter
>  # HELP dpdk_cpu_busy_cycles Number of busy CPU cycles.
>  # TYPE dpdk_cpu_busy_cycles counter
>  dpdk_cpu_total_cycles{cpu="73", numa="0"} 4353385274702980
>  dpdk_cpu_busy_cycles{cpu="73", numa="0"} 6215932860
>  dpdk_cpu_total_cycles{cpu="9", numa="0"} 4353385274745740
>  dpdk_cpu_busy_cycles{cpu="9", numa="0"} 6215932860
>  dpdk_cpu_total_cycles{cpu="8", numa="0"} 4353383451895540
>  dpdk_cpu_busy_cycles{cpu="8", numa="0"} 6171923160
>  dpdk_cpu_total_cycles{cpu="72", numa="0"} 4353385274817320
>  dpdk_cpu_busy_cycles{cpu="72", numa="0"} 6215932860
>  # HELP dpdk_memory_total_bytes The total size of reserved memory in bytes.
>  # TYPE dpdk_memory_total_bytes gauge
>  # HELP dpdk_memory_used_bytes The currently used memory in bytes.
>  # TYPE dpdk_memory_used_bytes gauge
>  dpdk_memory_total_bytes 1073741824
>  dpdk_memory_used_bytes 794197376
>
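
The text format shown above can be approximated from the info() mapping
and the metric 3-tuples with a few lines of python. This is only a
sketch: render_openmetrics is a hypothetical helper, MetricInfo is
assumed to be a (description, type) pair, label values are not escaped,
and the line ordering does not try to match what the script emits.

```python
def render_openmetrics(info, values):
    """Render HELP/TYPE headers followed by one sample line per value."""
    lines = []
    for name, (help_text, metric_type) in info.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
    for name, value, labels in values:
        if labels:
            # Labels are rendered as key="value" pairs inside braces.
            body = ", ".join(f'{k}="{v}"' for k, v in labels.items())
            lines.append(f"{name}{{{body}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


text = render_openmetrics(
    {"dpdk_cpu_busy_cycles": ("Number of busy CPU cycles.", "counter")},
    [("dpdk_cpu_busy_cycles", 6215932860, {"cpu": "73", "numa": "0"})],
)
```
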
> Link: https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
> Link: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#text-format
> Link: https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
> Link: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus
> Signed-off-by: Robin Jarry <rja...@redhat.com>

Applied, thanks.


-- 
David Marchand
