On 7/14/20 6:01 PM, Markus Armbruster wrote: > Philippe Mathieu-Daudé <f4...@amsat.org> writes: > >> +Markus >> >> On 7/14/20 2:44 AM, Havard Skinnemoen wrote: >>> On Mon, Jul 13, 2020 at 8:02 AM Cédric Le Goater <c...@kaod.org> wrote: >>>> >>>> On 7/9/20 2:36 AM, Havard Skinnemoen wrote: >>>>> The Nuvoton NPCM7xx SoC family are used to implement Baseboard >>>>> Management Controllers in servers. While the family includes four SoCs, >>>>> this patch implements limited support for two of them: NPCM730 (targeted >>>>> for Data Center applications) and NPCM750 (targeted for Enterprise >>>>> applications). >>>>> >>>>> This patch includes little more than the bare minimum needed to boot a >>>>> Linux kernel built with NPCM7xx support in direct-kernel mode: >>>>> >>>>> - Two Cortex-A9 CPU cores with built-in periperhals. >>>>> - Global Configuration Registers. >>>>> - Clock Management. >>>>> - 3 Timer Modules with 5 timers each. >>>>> - 4 serial ports. >>>>> >>>>> The chips themselves have a lot more features, some of which will be >>>>> added to the model at a later stage. >>>>> >>>>> Reviewed-by: Tyrone Ting <kft...@nuvoton.com> >>>>> Reviewed-by: Joel Stanley <j...@jms.id.au> >>>>> Signed-off-by: Havard Skinnemoen <hskinnem...@google.com> >>>>> --- >> ... >> >>>>> +static void npcm7xx_realize(DeviceState *dev, Error **errp) >>>>> +{ >>>>> + NPCM7xxState *s = NPCM7XX(dev); >>>>> + NPCM7xxClass *nc = NPCM7XX_GET_CLASS(s); >>>>> + int i; >>>>> + >>>>> + /* CPUs */ >>>>> + for (i = 0; i < nc->num_cpus; i++) { >>>>> + object_property_set_int(OBJECT(&s->cpu[i]), >>>>> + arm_cpu_mp_affinity(i, >>>>> NPCM7XX_MAX_NUM_CPUS), >>>>> + "mp-affinity", &error_abort); >>>>> + object_property_set_int(OBJECT(&s->cpu[i]), >>>>> NPCM7XX_GIC_CPU_IF_ADDR, >>>>> + "reset-cbar", &error_abort); >>>>> + object_property_set_bool(OBJECT(&s->cpu[i]), true, >>>>> + "reset-hivecs", &error_abort); >>>>> + >>>>> + /* Disable security extensions. */ >>>>> + object_property_set_bool(OBJECT(&s->cpu[i]), false, "has_el3", >>>>> + &error_abort); >>>>> + >>>>> + qdev_realize(DEVICE(&s->cpu[i]), NULL, &error_abort); >>>> >>>> I would check the error: >>>> >>>> if (!qdev_realize(DEVICE(&s->cpu[i]), NULL, errp)) { >>>> return; >>>> } >>>> >>>> same for the sysbus_realize() below. >>> >>> Hmm, I used to propagate these errors until Philippe told me not to >>> (or at least that's how I understood it). >> >> It was before Markus simplification API were merged, you had to >> propagate after each call, since this is a non hot-pluggable SoC >> I suggested to use &error_abort to simplify. >> >>> I'll be happy to do it >>> either way (and the new API makes it really easy to propagate errors), >>> but I worry that I don't fully understand when to propagate errors and >>> when not to. >> >> Markus explained it on the mailing list recently (as I found the doc >> not obvious). I can't find the thread. I suppose once the work result >> after the "Questionable aspects of QEMU Error's design" discussion is >> merged, the documentation will be clarified. > > The Error API evolved recently. Please peruse the big comment in > include/qapi/error.h. If still unsure, don't hesitate to ask here. > >> My rule of thumb so far is: >> - programming error (can't happen) -> &error_abort > > Correct. Quote the big comment: > > * Call a function aborting on errors: > * foo(arg, &error_abort); > * This is more concise and fails more nicely than > * Error *err = NULL; > * foo(arg, &err); > * assert(!err); // don't do this > >> - everything triggerable by user or management layer (via QMP command) >> -> &error_fatal, as we can't risk loose the user data, we need to >> shutdown gracefully. > > Quote the big comment: > > * Call a function treating errors as fatal: > * foo(arg, &error_fatal); > * This is more concise than > * Error *err = NULL; > * foo(arg, &err); > * if (err) { // don't do this > * error_report_err(err); > * exit(1); > * } > > Terminating the process is generally fine during initial startup, > i.e. before the guest runs. > > It's generally not fine once the guest runs. Errors need to be handled > more gracefully then. A QMP command, for instance, should fail cleanly, > propagating the error to the monitor core, which then sends it to the > QMP client, and loops to process the next command. > >>> It makes sense to me to propagate errors from *_realize() and >>> error_abort on failure to set simple properties, but I'd like to know >>> if Philippe is on board with that. > > Realize methods must not use &error_fatal. Instead, they should clean > up and fail. > > "Clean up" is the part we often neglect. The big advantage of > &error_fatal is that you don't have to bother :) > > Questions?
One on my side. So in this realize(), all &error_abort uses has to be replaced by local_err + propagate ...: static void npcm7xx_realize(DeviceState *dev, Error **errp) { NPCM7xxState *s = NPCM7XX(dev); NPCM7xxClass *nc = NPCM7XX_GET_CLASS(s); int i; /* CPUs */ for (i = 0; i < nc->num_cpus; i++) { object_property_set_int(OBJECT(&s->cpu[i]), arm_cpu_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS), "mp-affinity", &error_abort); object_property_set_int(OBJECT(&s->cpu[i]), NPCM7XX_GIC_CPU_IF_ADDR, "reset-cbar", &error_abort); object_property_set_bool(OBJECT(&s->cpu[i]), true, "reset-hivecs", &error_abort); /* Disable security extensions. */ object_property_set_bool(OBJECT(&s->cpu[i]), false, "has_el3", &error_abort); qdev_realize(DEVICE(&s->cpu[i]), NULL, &error_abort); } [...] ... but the caller does: static void quanta_gsj_init(MachineState *machine) { NPCM7xxState *soc; soc = npcm7xx_create_soc(machine, QUANTA_GSJ_POWER_ON_STRAPS); npcm7xx_connect_dram(soc, machine->ram); qdev_realize(DEVICE(soc), NULL, &error_abort); ^^^^^^^^^^^^ npcm7xx_load_kernel(machine, soc); } So we overload the code... My question: Do you confirm this is worth it to propagate?