Wed, Oct 09, 2019 at 06:55:45AM CEST, vasundhara-v.vo...@broadcom.com wrote: >On Mon, Oct 7, 2019 at 3:26 PM Jiri Pirko <j...@resnulli.us> wrote: >> >> Fri, Aug 30, 2019 at 05:54:57AM CEST, michael.c...@broadcom.com wrote: >> >From: Vasundhara Volam <vasundhara-v.vo...@broadcom.com> >> > >> >Create new FW devlink_health_reporter, to know the current health >> >status of FW. >> > >> >Command example and output: >> >$ devlink health show pci/0000:af:00.0 reporter fw >> > >> >pci/0000:af:00.0: >> > name fw >> > state healthy error 0 recover 0 >> > >> > FW status: Healthy; Reset count: 1 >> >> I'm puzzled how did you get this output, since you put "FW status" into >> "diagnose" callback fmsg and that is called upon "devlink health diagnose". >> >> [...] >Jiri, you are right last line is output of diagnose command. Command >is missing here. > >$ devlink health diagnose pci/0000:af:00.0 reporter fw > FW status: Healthy; Reset count: 0 > >> >> >+static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter >> >*reporter, >> >+ struct devlink_fmsg *fmsg) >> >+{ >> >+ struct bnxt *bp = devlink_health_reporter_priv(reporter); >> >+ struct bnxt_fw_health *health = bp->fw_health; >> >+ u32 val, health_status; >> >+ int rc; >> >+ >> >+ if (!health || test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) >> >+ return 0; >> >+ >> >+ val = bnxt_fw_health_readl(bp, BNXT_FW_HEALTH_REG); >> >+ health_status = val & 0xffff; >> >+ >> >+ if (health_status == BNXT_FW_STATUS_HEALTHY) { >> >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >> >+ "Healthy;"); >> >> First of all, the ";" is just wrong. You should put plain string if >> anything. You are trying to format user output here. Don't do that >> please. >> >> Please see json output: >> $ devlink health show pci/0000:af:00.0 reporter fw -j -p >> >> Please remove ";" from the strings. >Okay, I will send a patch for removing ";" > >> >> >> Second, I do not understand why you need this "FW status" at all. The >> reporter itself has state healthy/error: >> pci/0000:af:00.0: >> name fw >> state healthy error 0 recover 0 >> ^^^^^^^ >> >> "FW" is redundant of course as the reporter name is "fw". >> >> Please remove "FW status" and replace with some pair indicating the >> actual error state. >Okay, I can rename to "Status description" so that "FW" name will not >be repeated.
Also please remove the redundant "Healthy" information. That is an attribute of the reporter itself. > >> >> In mlx5 they call it "Description". >> >> >> >+ if (rc) >> >+ return rc; >> >+ } else if (health_status < BNXT_FW_STATUS_HEALTHY) { >> >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >> >+ "Not yet completed >> >initialization;"); >> >+ if (rc) >> >+ return rc; >> >+ } else if (health_status > BNXT_FW_STATUS_HEALTHY) { >> >+ rc = devlink_fmsg_string_pair_put(fmsg, "FW status", >> >+ "Encountered fatal error >> >and cannot recover;"); >> >+ if (rc) >> >+ return rc; >> >+ } >> >+ >> >+ if (val >> 16) { >> >+ rc = devlink_fmsg_u32_pair_put(fmsg, "Error", val >> 16); >> >> Perhaps rather call this "Error code"? >Okay. > >> >> >> >+ if (rc) >> >+ return rc; >> >+ } >> >+ >> >+ val = bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG); >> >+ rc = devlink_fmsg_u32_pair_put(fmsg, "Reset count", val); >> >> What is this counter counting? Number of recoveries? >> If so, that is also already counted internally by devlink. >"Reset count" is the counter that displays the number of times >firmware has gone for >reset through different mechanisms and devlink is one of it. Firmware >could have gone >for a reset through other tools as well. > >Driver gets the information from firmware health register, when >diagnose command is invoked. okay, sounds sane. Thanks! > >> >> [...]