Hi Robin,

Thanks for this patch. I tested it and it works as expected. Nonetheless, maybe we can improve some parts.

In 'class TelemetrySocket', there is:

    ...
    self.sock.connect(path)
    data = json.loads(self.sock.recv(1024).decode())
    ...

Maybe we can improve it with something like:

    try:
        rcv_data = self.sock.recv(1024)
        if rcv_data:
            data = json.loads(rcv_data.decode())
        else:
            print("No data received from socket.")
    except json.JSONDecodeError as e:
        print("Error decoding JSON:", e)
    except Exception as e:
        print("An error occurred:", e)

so that it handles the error cases a bit better.

In the same way, we could implement more robust error handling in:

    def load_endpoints ...
        ...
        except Exception as e:
            LOG.error("Failed to load endpoint module '%s' from '%s': %s", name, f, e)
        ...

For example, you might catch FileNotFoundError, ImportError, or SyntaxError separately. That could help with debugging!

About TelemetryEndpoint, I would see something like:

    class TelemetryEndpoint:
        """
        Placeholder class only used for typing annotations.
        """

        @staticmethod
        def info() -> typing.Dict[MetricName, MetricInfo]:
            """
            Mapping of metric names to their description and type.
            """
            raise NotImplementedError()

        @staticmethod
        def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
            """
            Request data from sock and return it as metric values. Each metric
            name must be present in info().
            """
            try:
                metrics = []
                metrics_data = sock.fetch_metrics_data()
                for metric_name, metric_value in metrics_data.items():
                    metrics.append((metric_name, metric_value, {}))
                return metrics
            except Exception as e:
                LOG.error("Failed to fetch metrics data: %s", e)
                # If unable to fetch metrics data, return an empty list
                return []

With these changes, the metrics method of the TelemetryEndpoint class could handle errors better, and the exporter can continue functioning even if there are issues with fetching metrics data.

I don't know if all of that makes sense or if it's just nitpicking! I can also propose an enhanced version of your patch if you prefer.

Regards,
Anthony
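
P.S. To make the load_endpoints suggestion more concrete, here is a minimal sketch of what I mean by catching the exceptions separately. I am guessing at how the modules are loaded, so the helper name load_endpoint_module, its signature, and the importlib-based loading are assumptions of mine and not taken from your patch. Only the idea of one except clause per failure mode matters.

```python
import importlib.util
import logging
import types
import typing

LOG = logging.getLogger(__name__)


def load_endpoint_module(name: str, path: str) -> typing.Optional[types.ModuleType]:
    """Hypothetical helper: load one endpoint module, return None on failure."""
    try:
        spec = importlib.util.spec_from_file_location(name, path)
        if spec is None or spec.loader is None:
            # Happens e.g. when the file suffix is not a known module suffix.
            raise ImportError(f"cannot build an import spec for {path!r}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module
    except FileNotFoundError as e:
        LOG.error("Endpoint module '%s' not found at '%s': %s", name, path, e)
    except SyntaxError as e:
        LOG.error("Syntax error in endpoint module '%s' ('%s', line %s): %s",
                  name, path, e.lineno, e.msg)
    except ImportError as e:
        LOG.error("Endpoint module '%s' ('%s') has a failing import: %s",
                  name, path, e)
    except Exception as e:
        # Last resort, same message as in the current patch.
        LOG.error("Failed to load endpoint module '%s' from '%s': %s",
                  name, path, e)
    return None
```

The caller can then skip any module for which None is returned, and the log tells us right away whether the file is missing, does not parse, or depends on something that is not installed.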