On 2020-07-30 at 10:13 CEST, Markus Armbruster wrote... > Christophe de Dinechin <dinec...@redhat.com> writes: > >> On 2020-07-29 at 13:53 CEST, Markus Armbruster wrote... >>> Christophe de Dinechin <dinec...@redhat.com> writes: >>> >>>> On 2020-07-27 at 10:23 CEST, Markus Armbruster wrote... >>>>> Christophe de Dinechin <dinec...@redhat.com> writes: >>>>> >>>>>> On 2020-07-23 at 16:06 CEST, Markus Armbruster wrote... >>>>>>> Christophe de Dinechin <dinec...@redhat.com> writes: >>> [...] >>>>>>>> I'm certainly not against adding a command-line option to activate >>>>>>>> recorder >>>>>>>> options specifically, but as I understand, the option -trace already >>>>>>>> exists, >>>>>>>> and its semantics is sufficiently different from the one in recorder >>>>>>>> patterns that I decided to not connect the two for now. For example, to >>>>>>>> disable trace foo, you'd pass "-foo" to the -trace option, but "foo=0" >>>>>>>> to >>>>>>>> RECORDER_TRACES. The parsing of graphing options and other related >>>>>>>> recorder-specific stuff is a bit difficult to integrate with -trace >>>>>>>> too. >>>>>>> >>>>>>> We need proper integration with the existing trace UI. >>>>>> >>>>>> I agree, but I don't think this belongs to this particular patch series. >>>>>> See below why. >>>>>> >>>>>>> >>>>>>> In particular, the ability to enable and disable trace events >>>>>>> dynamically provided by QMP commands trace-event-get-state, >>>>>>> trace-event-set-state, and HMP command trace-event is really useful. >>>>>> >>>>>> That ability exists, but given the many differences between the >>>>>> recorder and other tracing mechanisms, I found it useful to add a >>>>>> specific >>>>>> "recorder" command. >>>>> >>>>> Precedence for commands specific to a trace backend: trace-file. >>>>> >>>>> Name yours trace-recorder? >>>> >>>> But then "recorder dump" does not fit with any "trace" concept. >>> >>> Once you make the recorder a trace backend, whatever the recorder does >>> becomes a trace concept :) >> >> I understand your point, but I want to make a distinction between recorder >> tracing and other recorder features. Does that make sense? > > Maybe :) My thoughts haven't settled, yet. > >>>>>> For example, assuming I built with both recorder and log trace backends, >>>>>> from the monitor, I can type: >>>>>> >>>>>> trace-event kvm_run_exit on >>>>>> >>>>>> What I get then is something like that: >>>>>> >>>>>> 2091091@1595518935.441273:kvm_run_exit cpu_index 0, reason 2 >>>>>> 2091091@1595518935.441292:kvm_run_exit cpu_index 0, reason 2 >>>>>> 2091091@1595518935.441301:kvm_run_exit cpu_index 0, reason 2 >>>>>> 2091091@1595518935.441309:kvm_run_exit cpu_index 0, reason 2 >>>>>> 2091091@1595518935.441254:kvm_run_exit cpu_index 0, reason 2 >>>>>> >>>>>> It would not be very useful to activate recorder traces as well when that >>>>>> happens, which would have the undesired side effect of purging any >>>>>>> corresponding entry from a following recorder dump. >>>>> >>>>> I'm afraid I don't understand, because the gaps in my understanding of >>>>> what the recorder can do are just too big. >>>> >>>> There is a video at the top of https://github.com/c3d/recorder, or direct >>>> link https://www.youtube.com/watch?v=kEnQY1zFa0Y. Hope this helps. >>>> >>>> >>>>> >>>>> From your cover letter: >>>>> >>>>> 1. Flight recorder: Dump information on recent events in case of crash >>>>> >>>>> Define "recent events", please. Is it all trace events (except for the >>>>> ones disabled at build time, of course)? >>>> >>>> For event categories only known through qemu trace definitions, by default, >>> >>> Right now, there are no others, aren't there? >> >> There is one in the second patch, the example. That was actually the whole >> point of that second patch, which is not otherwise particularly exciting. >> >>> >>>> it's the last 8 events. If you manually declare a recorder, then you can >>>> specify any size. >>>> >>>> (The difference between this manual recorder usage and the recorder backend >>>> generated code is similar to the difference between the log backend and >>>> "DPRINTF") >>> >>> I'm not sure I get the parenthesis. >> >> I took that example because it's mentioned in docs/devel/tracing.txt: >> >> === Log === >> >> The "log" backend sends trace events directly to standard error. This >> effectively turns trace events into debug printfs. >> >> This is the simplest backend and can be used together with existing code >> that >> uses DPRINTF(). >> >> Just the same way, the recorder back-end can both use existing trace points, >> using the recorder back-end, but you can also add explicit record() >> statements much like you can add DPRINTF() statements. > > Got it. > >> Whether this is a good idea or not is debatable. I would argue that in some >> cases, it's a good idea, or at least, probably a better idea than DPRINTF ;-) > > No need to convince me a low-overhead flight recorder can be useful. > I've improvised special-purpose flight recorders many times when > printf-like tracing interfered enough with the timing to make the > interesting behavior go away. > > I doubt having both trace points and record() makes sense in QEMU except > for ad hoc debugging. > >>>>> "Flight recorder" hints at recording in some ring buffer. True? >>>> >>>> Yes. >>>> >>>>> >>>>> Can you explain the difference to trace backend "ftrace"? >>>> >>>> - It's not recording function calls nor stack traces. >>>> - It's faster (no fd write, no % parameter expansion) >>> >>> I guess "simple" doesn't format either, but it still writes to disk. >>> >>>> - It works on non-Linux platforms >>>> - It's always on >>> >>> The combination of "always on" and "minimal overhead" is obviously >>> useful. >>> >>>>> >>>>> 2. Tracing: Individual traces can be enabled using environment >>>>> variables >>>>> >>>>> I guess "individual traces" means "individual trace events". >>>> >>>> Maybe I'm unwittingly opposing some well-established qemu habits here, as >>>> indicated by the "trace-event" command, but to me, a "trace" reports >>>> multiple >>>> "events" of the same class (distinguished e.g. by their timestamps). So I >>>> would say that I am indeed enabling a trace in order to be shown all the >>>> events associated to this trace. >>>> >>>> Does that make sense? >>>> >>>> But then I understand that the existing command is called "trace-event". >>>> So there's that. I don't mind changing the text that much. >>> >>> I'm hardly an expert on tracing in QEMU, I merely use it from time to >>> time. I am the QMP maintainer, though. Adding new commands for every >>> new thing is easy. Proper integration takes thought. And that's what >>> we're doing. >>> >>> As far as I can tell, QEMU has no concept of a "trace" in the sense of >>> your definition. There are only individual trace events. Enabled trace >>> events emit via the enabled trace backends. Simple & stupid. >> >> I understand this terminology, and I was pointing out that it sounds really >> a strange use of the words. To me, an event is definitely a single point in >> time. However, I don't mind aligning the patch terminology where necessary >> for consistency. > > I think we use "trace event" (one trace record) and "trace point" (the > point in code where a certain kind of event gets emitted) the same way, > and the terminology isn't strange at all. > > You additionally use "trace". Also not strange, once you defined it. > >>>>> What does it mean to enable a trace event? How does this affect the >>>>> recorder's behavior? >>>> >>>> For a given recorder instrumentation point, three things can happen: >>>> - It is recorded in a circular buffer (always) >>>> - If tracing is active, then a trace message is displayed >>>> - If sharing is active, then it is copied in shared memory for use by some >>>> external program, e.g. recorder_scope. >>> >>> Enabled trace events get emitted to the configured backends, which can >>> do whatever they want with the events. >>> >>> Current trace backends emit to a single destination (log: stderr, >>> simple: a file, ftrace: the kernel, nop: the bit bucket, you get the >>> idea), but that's detail. >>> >>> My (possibly naive) expecation for a recorder backend: as always, only >>> the enabled trace events get emitted to the recorder backend. All of >>> "circular buffer (always), display if tracing is active, copy if sharing >>> is active" happens within the backend. In particular, enabling a trace >>> event cannot control the "tracing is active" bit. It effectively >>> controls the "is recorded in a circular buffer" bit. Correct? >> >> Actually, not in the patches as sent, no. IMO, this would defeat the whole >> purpose of always-on instrumentation. It's possible to disable recording, >> but by default, it's on. > > I misread the tracing infrastructure. Honoring the enabled bit is > actually up to the backend. Some backends open-code it in the generated > _nocheck__trace_FOO() for speed. > >>> I guess much of the confusion comes from the fact that trace backends >>> are just that, but the recorder is more. >>> >>> Trace backends emit trace events. >>> >>> The recorder can *also* emit trace events. But it primarily records. I >>> figure users commonly record more events than they trace. >> >> Yes. The normal state for a trace point is to be off, the normal state for a >> record point is to be recording but not tracing. >> >>> >>> If you do the recorder as a trace backend, then the existing UI to >>> control what's being traced becomes UI to control what's being recorded >>> by the recorder and traced by everything else. Confusing, and we're >>> short one UI to control what's being traced by the recorder. >>> >>> Perhaps doing the recorder as a trace backend is simply a bad idea. >>> Perhaps it should be integrated in the trace subsystem differently, so >>> that trace-enable continues to control what's being traced for all >>> backends. >> >> That means you don't benefit from any of the existing trace points. >> >> As an aside, the existing tracing system has the trace points in a different >> file than the actual code. In other words, we have something like: >> >> # ioport.c >> cpu_in(unsigned int addr, char size, unsigned int val) "addr 0x%x(%c) >> value %u" >> cpu_out(unsigned int addr, char size, unsigned int val) "addr 0x%x(%c) >> value %u" >> >> and in ioport.c we have: >> >> trace_cpu_in(addr, 'b', val); >> >> The recorder native approach would be something like: >> >> trace(cpu_in, "addr 0x%x(%c) value %u", addr, 'b', val); >> >> Is the current syntax an artifact of how tracetool works, or is it >> intentional? > > I'm not sure. A bit of both? > > The separate trace-events file makes generating code for events simple > and easy. > > Some backends heavily rely on code generation. The recorder backend > seems not to. Code generation includes the .dtrace generated with > backend dtrace. > >>> Perhaps the recorder should emit the events it traces to a trace >>> backend. >> >> Ah, interesting. The recorder has a configurable output, so it is somewhat >> trivial to connect to some generated function that calls the various trace >> backends. >> >>> >>>>> 3. Real-time graphing / control, using the recorder_scope application >>>>> >>>>> I'm ignoring this one for now. >>>>> >>>>>> There are, however, new monitor commands, so now I can do: >>>>>> >>>>>> recorder trace kvm_run_exit >>>>>> >>>>>> I now get an output where each trace shows up twice. >>>>>> >>>>>> [120139718 137.127269] kvm_run_exit: cpu_index 0, reason 2 >>>>>> 2091091@1595518954.209803:kvm_run_exit cpu_index 0, reason 2 >>>>>> [120139734 137.127281] kvm_run_exit: cpu_index 0, reason 2 >>>>>> 2091091@1595518954.209814:kvm_run_exit cpu_index 0, reason 2 >>>>>> [120139751 137.127293] kvm_run_exit: cpu_index 0, reason 2 >>>>>> >>>>>> I don't find particularly useful, so I don't think having that as the >>>>>> normal behaviour would qualify as what you called proper integration. >>>>> >>>>> Well, I asked for it by configuring two trace backends, didn't I? >>>> >>>> Not really IMO. By configuring two trace backends, what you requested was >>>> to >>>> have two different kinds of instrumentation active at the same time. It >>>> does >>>> not imply (at least to me) that these two kinds of instrumentation have to >>>> react to *run-time* configuration identically, in particular if that means >>>> in a way that happens to not be helpful. >>>> >>>> If you configure log and ftrace, you are not requesting them to behave >>>> identically at run time. >>> >>> No, but by configuring two trace backends, you ask each one to do its >>> thing. If "its thing" includes printing to stderr for both, then >>> getting events printed twice is exactly what you asked for. >> >> I think that you are arguing that since trace-event currently activates or >> disables all trace back-ends at once, then that's necessarily what I asked >> for. >> >> There is a nice ambiguity about the word "asked for" here, but that's >> definitely not what I would like. I described this behavior as unhelpful. >> >> However, addressing that correctly may indeed belong to the common code >> rather than to individual backends, I'll certainly grant you that. >> >> >>> >>>>> If I configured just the recorder backend, would I get a useful trace? >>>> >>>> Yes, but not the full functionality that "recorder" gives you. >>>> Notably, there is nothing in the existing trace-event command that would >>>> allow me to configure recorder sharing or trace output format, whereas the >>>> proposed patch allows me to do that. >>> >>> Christophe, leave that windmill alone! I'm over here! ;) >> >> ;-) >> >> >>> I'm not trying to shoot down commands specific to trace backends. I'm >>> trying to understand how the proposed recorder backend interacts with >>> the existing tracing UI and infrastructure, so I can figure out how to >>> best fit it with the existing UI and infrastructure. >>> >>> QEMU's tracing subsystem consists of a common, backend-independent part >>> and backends. >>> >>> Command trace-event controls one aspect of the common part: whether to >>> hand the trace event to the backend or not. The backend has no say in >>> that. > > I was confused. Honoring the enabled bit is actually up to the backend. > >> This is the part I changed in the patch, because there is already a >> mechanism to activate / deactivate tracing in the recorder, so having two >> 'if(condition)' testing essentially the same condition using two different >> flags seems useless. >> >>> >>> Command trace-file controls one aspect of the backend: writing output to >>> a file. Only the simple backend makes use of it, but that's detail. >>> >>> You tell me the recorder backend needs to expose more configuration to >>> really make sense. The question isn't whether that's true (I trust you >>> on that), it's how to best integrate it with what we have. >> >> You gave me a number of interesting ideas. Let me get back to you with a new >> spin. You can tell me which one you prefer. I also made a couple of changes >> to the recorder itself to make future qemu integration easier. >> >>> >>>>> If yes, what's wrong with letting me control what's being traced with >>>>> the common trace-event command in addition to a backend-specific >>>>> command? >>>> >>>> This would force that code to a lower common denominator that would remove >>>> a >>>> lot of useful functionality specific to the recorder, even assuming we keep >>>> both the "recorder" and "trace-event" commands to configure the recorder. >>>> >>>> There is a problem with the semantics of the underlying API. For example, >>>> disabling trace foo is "-foo" in qemu, "foo=0" in the recorder API. I don't >>>> really like the idea of adding some string manipulation code to transform >>>> "-foo" into "foo=0". I would prefer to extend the recorder API to accept >>>> -foo, but that's not the case today. >>> >>> You're talking about the configuration file specified with trace >>> events=... >>> >>> The existing UI for the recorder library is different. Hardly a >>> surprise. >>> >>> Consistency between the existing UI for the common part and the new UI >>> for the recorder backend is quite desirable. I'm not making it a hard >>> requirement. I would like to see concrete ideas on how to get it, >>> though. >> >> Here what can be done, but it's a larger undertaking: >> >> - Convert named trace entry points to anonymous trace(...) calls following a >> syntax similar to record(...), so that traces are readable in one place. >> >> - Have a mechanism for trace activation / deactivation that delegates to the >> trace back-end. That way, any ugly recorder-specific string manipulation >> converting regexps and such will be in the back-end, not in common code. >> >> - When multiple trace back-ends are configured, make it possible to >> configure them independently, ideally using back-end specific options >> >> - Add some notion of back-end specific option, to implement back-end >> specific configurations like trace-file or recorder_dump_on_signal. >> >> - Split the notion of "trace" and "record", and add some "record" >> semantics. Some existing backends already are closer to record semantics, >> e.g. ftrace. >> >> - Convert the existing DPRINTF stuff to trace(). > > Plenty of ideas. I'd recommend to wait for Stefan's opinion before you > run with them.
Yes. And not just Stefan's. My point is I agree with you better integration is desirable, but it's a relatively long process. It looks to me, for example, like other back-ends also have recording features. So we could really expose the two features, and slowly shift towards tracing/recording back=ends. -- Cheers, Christophe de Dinechin (IRC c3d)