Hi Robin, 

Thanks for this patch. I tested it and it works as expected.
Still, I think a few parts could be improved.

In 'class TelemetrySocket', there is:
...
self.sock.connect(path)
data = json.loads(self.sock.recv(1024).decode())
...

Maybe we can improve it with something like:

        data = None  # stays None if nothing usable was received
        try:
            rcv_data = self.sock.recv(1024)

            if rcv_data:
                data = json.loads(rcv_data.decode())
            else:
                print("No data received from socket.")
        except json.JSONDecodeError as e:
            print("Error decoding JSON:", e)
        except Exception as e:
            print("An error occurred:", e)

That way, the error cases are handled a bit better.

In the same way, error handling could be made more robust in:
def load_endpoints
...
except Exception as e:
    LOG.error("Failed to load endpoint module '%s' from '%s': %s", name, f, e)
...

For example, you might catch FileNotFoundError, ImportError, or SyntaxError.
That could help to debug!
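
To make that concrete, here is a rough sketch of what I have in mind. It is
not taken from your patch: the helper name load_endpoint_module and the
importlib-based loading are my own assumptions, so adapt it to whatever
load_endpoints really does.

```python
import importlib.util
import logging

LOG = logging.getLogger(__name__)


def load_endpoint_module(name, path):
    """Hypothetical helper: load one endpoint module, return None on failure."""
    try:
        spec = importlib.util.spec_from_file_location(name, path)
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot build an import spec for {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module
    except FileNotFoundError:
        # The file disappeared or the path is wrong.
        LOG.error("Endpoint module '%s' not found at '%s'", name, path)
    except SyntaxError as e:
        # Point at the offending line of the endpoint file.
        LOG.error("Syntax error in endpoint module '%s' (%s:%s): %s",
                  name, e.filename, e.lineno, e.msg)
    except ImportError as e:
        # The endpoint itself imports something that is missing.
        LOG.error("Import error in endpoint module '%s' from '%s': %s",
                  name, path, e)
    except Exception as e:
        # Anything else raised while executing the module body.
        LOG.error("Failed to load endpoint module '%s' from '%s': %s",
                  name, path, e)
    return None
```

The generic handler stays as a last resort, so one broken endpoint file
still cannot stop the others from loading.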


For TelemetryEndpoint, I would suggest something like:

class TelemetryEndpoint:
    """
    Placeholder class only used for typing annotations.
    """

    @staticmethod
    def info() -> typing.Dict[MetricName, MetricInfo]:
        """
        Mapping of metric names to their description and type.
        """
        raise NotImplementedError()

    @staticmethod
    def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
        """
        Request data from sock and return it as metric values. Each metric
        name must be present in info().
        """
        try:
            metrics = []
            metrics_data = sock.fetch_metrics_data()
            for metric_name, metric_value in metrics_data.items():
                metrics.append((metric_name, metric_value, {}))
            return metrics
        except Exception as e:
            LOG.error("Failed to fetch metrics data: %s", e)
            # If unable to fetch metrics data, return an empty list
            return []

With these changes, the metrics method of the TelemetryEndpoint class could
handle errors better and the exporter can continue functioning even if there
are issues with fetching metrics data.
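
To illustrate the behaviour I am describing, here is a small standalone
sketch. The two socket classes and the scrape() loop are hypothetical
stand-ins that I made up for the example (as is fetch_metrics_data), so
take it as an illustration rather than the real exporter code.

```python
import logging

LOG = logging.getLogger(__name__)


class BrokenSocket:
    """Hypothetical stand-in for a socket whose peer went away."""

    def fetch_metrics_data(self):
        raise ConnectionError("telemetry socket closed")


class WorkingSocket:
    """Hypothetical stand-in for a healthy socket."""

    def fetch_metrics_data(self):
        return {"rx_packets": 42, "tx_packets": 7}


def metrics(sock):
    """Same shape as the metrics() proposal: never raises, may return []."""
    try:
        data = sock.fetch_metrics_data()
        return [(name, value, {}) for name, value in data.items()]
    except Exception as e:
        LOG.error("Failed to fetch metrics data: %s", e)
        return []


def scrape(socks):
    """Hypothetical exporter loop: collect values from every socket."""
    values = []
    for sock in socks:
        values.extend(metrics(sock))
    return values
```

With one broken and one working socket, scrape() still returns the values
from the working one, and the failure only shows up in the log.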

I don't know if all of that makes sense or if it's just nitpicking!
I can also propose an enhanced version of your patch if you prefer.

Regards,
Anthony
