Anthony Harivel, Mar 27, 2024 at 16:18:
Hi Robin,

Thanks for this patch. I did test it and it works as expected. Nonetheless, maybe we can improve on some parts.

Hey Anthony, thanks a lot for testing!

In 'class  TelemetrySocket', there is:
...
self.sock.connect(path)
data = json.loads(self.sock.recv(1024).decode())
...

Maybe we can improve with something like:

        try:
            rcv_data = self.sock.recv(1024)
            if rcv_data:
                data = json.loads(rcv_data.decode())
            else:
                print("No data received from socket.")
        except json.JSONDecodeError as e:
                print("Error decoding JSON:", e)
        except Exception as e:
                print("An error occurred:", e)

So that it handles a bit better the error cases.

This is undesired as it would actually mask the errors from the calling code. Unless you can do something about the error, printing it should be left to the calling code. I will handle these errors better in v2.

In the same way to implement more robust error handling mechanisms in:
def load_endpoints
...
except Exception as e:
    LOG.error("Failed to load endpoint module '%s' from '%s': %s", name, f, e)
...

For example, you might catch FileNotFoundError, ImportError, or SyntaxError.
That could help to debug!

We could print the whole stack trace but I don't see what special handling could be done depending on the exception. I will try to make this better for v2.

About TelemetryEndpoint I would see something like:

class TelemetryEndpoint:
    """
    Placeholder class only used for typing annotations.
    """

    @staticmethod
    def info() -> typing.Dict[MetricName, MetricInfo]:
        """
        Mapping of metric names to their description and type.
        """
        raise NotImplementedError()

    @staticmethod
    def metrics(sock: TelemetrySocket) -> typing.List[MetricValue]:
        """
        Request data from sock and return it as metric values. Each metric
        name must be present in info().
        """
        try:
            metrics = []
            metrics_data = sock.fetch_metrics_data()
            for metric_name, metric_value in metrics_data.items():
                metrics.append((metric_name, metric_value, {}))
            return metrics
        except Exception as e:
            LOG.error("Failed to fetch metrics data: %s", e)
            # If unable to fetch metrics data, return an empty list
            return []

With these changes, the metrics method of the TelemetryEndpoint class could handle errors better and the exporter can continue functioning even if there are issues with fetching metrics data.

I don't know if all of that makes sens or if it's just nitpicking! I can also propose an enhanced version of your patch if you prefer.

As indicated in the docstring, this class is merely a placeholder. Its code is never executed. It only stands to represent what functions must be implemented in endpoints.

However, your point is valid. I will update my code to handle errors in endpoints more gracefully and avoid failing the whole request if there is only one error.

Thanks for the thorough review!

Reply via email to