Em Wed, Jun 18, 2014 at 09:42:54PM +0400, Stanislav Fomichev escreveu: > > I try, when possible, to not use short options that are used in > > 'strace', so what if we use 'F' here? > Agreed, will change. > > > And: > > > > trace --pgfaults --pgfaults > > > > for major and min page faults looks ugly, what if we instead use --pf > > for both, and allow passing min or maj as args? > > > > I.e.: > > > > For both major and minor: > > > > trace --pf > > > > for just major page faults: > > > > trace --pf maj > Not sure I like it. Currently, when we need to trace page faults its > major faults in 99.9% of cases, we are not interested in minor ones (and
If 99.9% of the cases is for for major faults, then make --pf default to that :-) While allowing to change this behaviour via: For just major faults: trace --pf For all kinds: trace --pf all For just minor faults: trace --pf min Sure you can have the shorthand that -FF means "--pf all" What I'm trying to avoid is to have to use with long options: trace --pf --pf Also, with your scheme, how would I ask for just minor faults, if somebody happens to want that? > there are thousands of them in a second). So we just do 'perf trace -F'. > If we need minor faults, we are probably interested in all fault events, > so we do -F twice. > > With 'trace --pf' we by default hit our 0.01% use case, so we always > need to type 'trace --pf maj'. --pf may be clear from the documentation > standpoint, but I don't like the fact that --pf defaults to all faults. So make it default to the most common case :-) > > > + int trace_pgfaults; > > uint8_t should be enough > By using int I state: 'I don't care about underlying type, give me > something to count'. If we use uint8_t it would imply (to me) that > for some reason we care about struct layout, size, endianness, etc. > IOW, I don't think we need to care about +-3 bytes here. Guess I've been using pahole too much... ;-) Also, uint8_t is something that can count, as is u8, that is more commonly used in tools/perf/ to make it look like kernel code :-) But nah, not a biggie, just trying to be judicious in using types. > > > -typedef int (*tracepoint_handler)(struct trace *trace, struct perf_evsel > > > *evsel, > > > +typedef int (*tracepoint_handler)(struct trace *trace, > > > + union perf_event *event, > > > + struct perf_evsel *evsel, > > > struct perf_sample *sample); > > > > You'll reduce patch size if you leave the first line alone and just add > > the new parameter (event) after evsel. > > > > It'll become then: > > > > typedef int (*tracepoint_handler)(struct trace *trace, struct perf_evsel > > *evsel, > > + union perf_event *event, > > struct perf_sample *sample); > > > > Then please make one separate patch adding this new parameter, stating > > that it will be used by pagefault tracing, that will come in a followup > > patch in this series. > Agreed, thanks. > > > > +static bool valid_dso(struct addr_location *al, struct perf_sample > > > *sample) > > > > Humm, can't this be reduced to just: > > > > return al->map && al->map->dso; > > > > ? I.e. if the helper returned a dso, it is because the address that was > > looked up is in that dso, no? > > > > I even guess that if there is al->map, that should be enough to check, > > byt haven't thought this thru nor looked at the relevant sources, in a > > hurry now. > Yes, we don't need to check for ->dso, it's always non-null, I'll remove > the check. > But we do need to check for the range. Because > thread__find_addr_map searches for the closest map using only ->start and > doesn't check if address is within map range (->end). > Maybe we need to fix it in thread__find_addr_map so it always returns valid > map? That is my expectation, yes, if I ask for the map where address N is, it should return just that, where have you found this bug? The thread__find_addr_map will end up calling maps__find() and it does this: if (ip < m->start) p = &(*p)->rb_left; else if (ip > m->end) p = &(*p)->rb_right; else return m; Where is the problem? > > Please separate adding page fault tracing recording on a file in a > > separate patch from adding it to live mode, then the changelog message > > can concentrate on examples for each feature. > Ok. > > > > - if (sample.raw_data == NULL) { > > > + if (evsel->attr.type != PERF_TYPE_SOFTWARE && > > > + sample.raw_data == NULL) { > > > > This looks like a separate patch with relevant associated changelog > > message explaining why this is needed. > No, this belongs here. When we enable page faults, they don't have any > raw data (while syscalls have). So we still want to keep this check for > syscalls but don't want it for fault events. Understood the intent, but the test here is really: if (evsel->attr.type == PERF_TYPE_TRACEPOINT && sample.raw_data == NULL) Ok? > > > + if (evsel && > > > + (perf_evsel__init_syscall_tp(evsel, trace__sys_enter) < 0 || > > > + perf_evsel__init_sc_tp_ptr_field(evsel, args))) { > > > pr_err("Error during initialize raw_syscalls:sys_enter > > > event\n"); > > > goto out; > > > > the above can be ditched, not needed. Care to explain if you think > > otherwise? > This doesn't belong to this patch, but it's required because we can do > 'trace --no-syscalls -F' and get only fault events without syscall events; > we don't want to fail here. > Will move to appropriate patch. Thanks > > > + evlist__for_each(session->evlist, evsel) { > > > + if (evsel->attr.type == PERF_TYPE_SOFTWARE && > > > + (evsel->attr.config == PERF_COUNT_SW_PAGE_FAULTS_MAJ || > > > + evsel->attr.config == PERF_COUNT_SW_PAGE_FAULTS_MIN || > > > + evsel->attr.config == PERF_COUNT_SW_PAGE_FAULTS)) > > > + evsel->handler = trace__pgfault; > > > > the above looks ugly, can't we set the handler when adding the events? > > Or is this just for the perf.data handling case? Have to dig deeper, > > just a first feeling. > It's for perf.data parsing. If you know a better way to do it without > iterating over all events, pls let me know. We had something related in a perf_evlist__set_tracepoint_handlers(), but this case is for a software event, so we would need a perf_evlist__set_attr_handlers(), I can do that later, for now this is the only user, as this evsel->handler thing was introduced with 'trace', so keep it like this, I can revisit this later. > > > } > > > > > > err = parse_target_str(trace); > > > @@ -2290,6 +2416,8 @@ int cmd_trace(int argc, const char **argv, const > > > char *prefix __maybe_unused) > > > .user_interval = ULLONG_MAX, > > > .no_buffering = true, > > > .mmap_pages = 1024, > > > + .sample_address = true, > > > + .sample_time = true, > > > > The above should be made conditional, i.e. if --pf is used? > Yes, fixed, thanks. Also, have you considered using: [root@zoo ~]# perf list exceptions:page_fault* exceptions:page_fault_user [Tracepoint event] exceptions:page_fault_kernel [Tracepoint event] [root@zoo ~]# Instead? I need to check if they're completely equivalent to what we need here... - Arnaldo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/