Tue, Jan 01, 2019 at 02:47:27AM CET, jakub.kicin...@netronome.com wrote: >On Mon, 31 Dec 2018 16:31:54 +0200, Eran Ben Elisha wrote: >> The health mechanism is targeted for Real Time Alerting, in order to know >> when >> something bad had happened to a PCI device >> - Provide alert debug information >> - Self healing >> - If problem needs vendor support, provide a way to gather all needed >> debugging >> information. >> >> The main idea is to unify and centralize driver health reports in the >> generic devlink instance and allow the user to set different >> attributes of the health reporting and recovery procedures. >> >> The devlink health reporter: >> Device driver creates a "health reporter" per each error/health type. >> Error/Health type can be a known/generic (eg pci error, fw error, rx/tx >> error) >> or unknown (driver specific). >> For each registered health reporter a driver can issue error/health reports >> asynchronously. All health reports handling is done by devlink. >> Device driver can provide specific callbacks for each "health reporter", e.g. >> - Recovery procedures >> - Diagnostics and object dump procedures >> - OOB initial attributes >> Different parts of the driver can register different types of health >> reporters >> with different handlers. >> >> Once an error is reported, devlink health will do the following actions: >> * A log is being send to the kernel trace events buffer >> * Health status and statistics are being updated for the reporter instance >> * Object dump is being taken and saved at the reporter instance (as long as >> there is no other Objdump which is already stored) >> * Auto recovery attempt is being done. Depends on: >> - Auto-recovery configuration >> - Grace period vs. time passed since last recover >> >> The user interface: >> User can access/change each reporter attributes and driver specific callbacks >> via devlink, e.g per error type (per health reporter) >> - Configure reporter's generic attributes (like: Disable/enable auto >> recovery) >> - Invoke recovery procedure >> - Run diagnostics >> - Object dump > >Thanks for continuing this work! > >> The devlink health interface (via netlink): >> DEVLINK_CMD_HEALTH_REPORTER_GET >> Retrieves status and configuration info per DEV and reporter. >> DEVLINK_CMD_HEALTH_REPORTER_SET >> Allows reporter-related configuration setting. >> DEVLINK_CMD_HEALTH_REPORTER_RECOVER >> Triggers a reporter's recovery procedure. >> DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE >> Retrieves diagnostics data from a reporter on a device. >> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_GET > >The addition of "objdump" and its marshalling is a bit disappointing. >It seemed to me when region snapshots were added that they would serve >this exact purpose. Taking a region snapshot or "core dump", if you >will, after something went south. Now it's confusing to have two >mechanisms serving the same purpose. > >It's unclear from quick reading of the code how if the buffers get >timestamped. Can you report more than one? > >About the marshalling, I'm not sure it belongs in the kernel. There is >precedent for adding interpretation of FW blobs in user space (ethtool). >IMHO it's a more scalable approach, if we want to avoid having 100 kLoC >drivers. Amount of code you add to print the simple example from last >patch is not inspiring confidence. > >And on the bike shedding side :) -> I think you should steer clear of >calling this objdump, as it has very little to do with the >functionality of well-known objdump tool. Its not even clear what the >object the name is referring to is. > >Long story short the overlap with region snapshots makes it unclear >what purpose either serves, and IMHO we should avoid the marshalling by >teaching user space how to interpret snapshots. Preferably we only >have one dump mechanism, and user space can be taught the interpretation >once.
Unlike regions, which are binary blobs, the "objdump" we have here is a tree of values (json like). It is not taken from FW/HW. It is generated in driver and passed down to user to be shown. Originally, Eran just pushed that info into a string buffer and the user had to parse it again. I asked to format this in Netlink attributes JSON-style so the user gets all hierarchy without need of parsing and can easily do Netlink->human_stdout or Netlink->JSON. > >> Retrieves the last stored objdump. Devlink health >> saves a single objdump. If an objdump is not already stored by the devlink >> for this reporter, devlink generates a new objdump. >> Objdump output is defined by the reporter. >> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_CLEAR >> Clears the last saved objdump file for the specified reporter. >> >> >> netlink >> +--------------------------+ >> | | >> | + | >> | | | >> +--------------------------+ >> |request for ops >> |(diagnose, >> mlx5_core devlink |recover, >> |dump) >> +--------+ +--------------------------+ >> | | | reporter| | >> | | | +---------v----------+ | >> | | ops execution | | | | >> | <----------------------------------+ | | >> | | | | | | >> | | | + ^------------------+ | >> | | | | request for ops | >> | | | | (recover, dump) | >> | | | | | >> | | | +-+------------------+ | >> | | health report | | health handler | | >> | +-------------------------------> | | >> | | | +--------------------+ | >> | | health reporter create | | >> | +----------------------------> | >> +--------+ +--------------------------+ >> >> Available reporters: >> In this patchset, three reporters of mlx5e driver are included. The FW >> reporters implement devlink_health_reporter diagnostic, object dump and >> recovery procedures. The TX reporter implements devlink_health_reporter >> diagnostic and recovery procedures. >> >> This RFC is based on the same concepts as previous RFC I sent earlier this >> year: "[RFC PATCH iproute2-next] System specification health API". The API >> was >> changed, also devlink code and mlx5e reporters were not available at the >> previous RFC.