Fri, Aug 30, 2019 at 05:54:57AM CEST, michael.c...@broadcom.com wrote: >From: Vasundhara Volam <vasundhara-v.vo...@broadcom.com> > >Create new FW devlink_health_reporter, to know the current health >status of FW. > >Command example and output: >$ devlink health show pci/0000:af:00.0 reporter fw > >pci/0000:af:00.0: > name fw > state healthy error 0 recover 0 > > FW status: Healthy; Reset count: 1
I'm puzzled how did you get this output, since you put "FW status" into "diagnose" callback fmsg and that is called upon "devlink health diagnose". [...] >+static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter *reporter, >+ struct devlink_fmsg *fmsg) >+{ >+ struct bnxt *bp = devlink_health_reporter_priv(reporter); >+ struct bnxt_fw_health *health = bp->fw_health; >+ u32 val, health_status; >+ int rc; >+ >+ if (!health || test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) >+ return 0; >+ >+ val = bnxt_fw_health_readl(bp, BNXT_FW_HEALTH_REG); >+ health_status = val & 0xffff; >+ >+ if (health_status == BNXT_FW_STATUS_HEALTHY) { >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >+ "Healthy;"); First of all, the ";" is just wrong. You should put plain string if anything. You are trying to format user output here. Don't do that please. Please see json output: $ devlink health show pci/0000:af:00.0 reporter fw -j -p Please remove ";" from the strings. Second, I do not understand why you need this "FW status" at all. The reporter itself has state healthy/error: pci/0000:af:00.0: name fw state healthy error 0 recover 0 ^^^^^^^ "FW" is redundant of course as the reporter name is "fw". Please remove "FW status" and replace with some pair indicating the actual error state. In mlx5 they call it "Description". >+ if (rc) >+ return rc; >+ } else if (health_status < BNXT_FW_STATUS_HEALTHY) { >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >+ "Not yet completed >initialization;"); >+ if (rc) >+ return rc; >+ } else if (health_status > BNXT_FW_STATUS_HEALTHY) { >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >+ "Encountered fatal error and >cannot recover;"); >+ if (rc) >+ return rc; >+ } >+ >+ if (val >> 16) { >+ rc = devlink_fmsg_u32_pair_put(fmsg, "Error", val >> 16); Perhaps rather call this "Error code"? >+ if (rc) >+ return rc; >+ } >+ >+ val = bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG); >+ rc = devlink_fmsg_u32_pair_put(fmsg, "Reset count", val); What is this counter counting? Number of recoveries? If so, that is also already counted internally by devlink. [...]