Hi Adrian, On Wed, Nov 18, 2020 at 05:57:09PM +0200, Adrian Hunter wrote:
[...] > >>> The follow diagram depicts the flow for how the AUX event is arriving > >>> ahead than MMAP2 event: > >>> > >>> T1: T3: T4: > >>> perf process Open PMU device Perf is scheduled out; > >>> invoke perf_aux_output_end() > >>> and generate AUX event > >>> ^ ^ ^ > >>> | | | > >>> CPU0 ---------------------------------------------------> (T) > >>> \ > >>> \ Forked process is placed on another CPU > >>> V > >>> CPU1 ---------------------------------------------------> (T) > >>> | | > >>> V V > >>> T2: T5: > >>> Invoke execvp() for profiled Record MMAP2 event > >>> program > >> > >> I don't understand. An MMAP at T5 cannot affect data recorded before T5 > >> (e.g. T4). > >> Are you saying the timestamps from different CPUs are not in time order? > >> Ot > >> are you saying the data is actually recorded after T5? > > > > Here refers to the later one. > > > > I observed the PMU device is opened and closed for multiple times, so > > the AUX events and MMAP2 events are interleave with each other. Below > > is the dumping with debugging option "--debug ordered-events=1" (and I > > added code for printing out event type value). > > > > When perf tool starts to decode the trace data for the first AUX event > > (its timestamp is 160.767623) and try to synthesize samples, it doesn't > > give chance for handling the sequential MMAP2 events (ts: 160.770223, > > 160.770451, 160.773111). > > > > [ 160.765623] queue_event nr_events 1 > > [ 160.765623] new event on queue: type=12 > > [ 160.765714] queue_event nr_events 2 > > [ 160.765714] new event on queue: type=3 > > [ 160.765822] queue_event nr_events 3 > > [ 160.765822] new event on queue: type=10 ==> MMAP2 event > > [ 160.766180] queue_event nr_events 4 > > [ 160.766180] new event on queue: type=14 > > [ 160.767623] queue_event nr_events 5 > > [ 160.767623] new event on queue: type=11 ==> AUX event > > [ 160.769710] queue_event nr_events 6 > > [ 160.769710] new event on queue: type=12 > > [ 160.769762] queue_event nr_events 7 > > [ 160.769762] new event on queue: type=14 > > [ 160.769810] queue_event nr_events 8 > > [ 160.769810] new event on queue: type=14 > > [ 160.769943] queue_event nr_events 9 > > [ 160.769943] new event on queue: type=11 ==> AUX event > > [ 160.770028] queue_event nr_events 10 > > [ 160.770028] new event on queue: type=12 > > [ 160.770073] queue_event nr_events 11 > > [ 160.770073] new event on queue: type=14 > > [ 160.770223] queue_event nr_events 12 > > [ 160.770223] new event on queue: type=10 ==> MMAP2 event > > [ 160.770451] queue_event nr_events 13 > > [ 160.770451] new event on queue: type=10 ==> MMAP2 event > > [ 160.770628] queue_event nr_events 14 > > [ 160.770628] new event on queue: type=14 > > [ 160.771650] queue_event nr_events 15 > > [ 160.771650] new event on queue: type=11 ==> AUX event > > [ 160.771752] queue_event nr_events 16 > > [ 160.771752] new event on queue: type=12 > > [ 160.771798] queue_event nr_events 17 > > [ 160.771798] new event on queue: type=14 > > [ 160.773111] queue_event nr_events 18 > > [ 160.773111] new event on queue: type=10 ==> MMAP2 event > > [ 160.782613] queue_event nr_events 19 > > [ 160.782613] new event on queue: type=14 > > > > Thanks for the reviewing, > > > > Leo > > > > When you say AUX event, do you mean PERF_RECORD_AUX or PERF_RECORD_AUXTRACE? The AUX event is PERF_RECORD_AUX. > The kernel creates PERF_RECORD_AUX and the data range aux_offset/aux_size > should be for things prior to the sample time. > > perf tool creates PERF_RECORD_AUXTRACE when copying data out of the aux > area. PERF_RECORD_AUXTRACE does not have a timestamp and can contain data > corresponding to several PERF_RECORD_AUX. Data is queued based on > PERF_RECORD_AUXTRACE because that is what the data is attached to i.e. one > buffer can contain data from several PERF_RECORD_AUX events. > > If you are processing data based on PERF_RECORD_AUX timestamp then you need > to pay attention to the offset. PERF_RECORD_AUX gives you > aux_offset/aux_size and auxtrace_buffer (which may contain data from several > PERF_RECORD_AUX) gives you offset/size. Yeah, as you suggested, in this case the key point is for the first PERF_RECORD_AUX event handling, in theory it should contain very short trace data and exit the decoding quickly. The issue is the first PERF_RECORD_AUX event will decode all AUX trace data; and near the end of events handling, the tool finally has chance to process MMAP2 events. So this RFC patch is not the right way, I will dig more for PERF_RECORD_AUX event handling for cs-etm. Thanks for the detailed explanation and suggetions. Leo