Hi, Peter,
I have a question regarding one of your comments below.
On 3/12/18 3:01 PM, Peter Zijlstra wrote:
On Mon, Mar 12, 2018 at 01:39:56PM -0700, Song Liu wrote:
+static void stack_map_get_build_id_offset(struct bpf_map *map,
+ struct stack_map_bucket *bucket,
+ u64 *ips, u32 trace_nr)
+{
+ int i;
+ struct vm_area_struct *vma;
+ struct bpf_stack_build_id *id_offs;
+
+ bucket->nr = trace_nr;
+ id_offs = (struct bpf_stack_build_id *)bucket->data;
+
+ if (!current || !current->mm ||
+ down_read_trylock(&current->mm->mmap_sem) == 0) {
You probably want an in_nmi() before the down_read_trylock(). Doing
up_read() is an absolute no-no from NMI context.
Below is the final code from Song:
/*
* We cannot do up_read() in nmi context, so build_id lookup is
* only supported for non-nmi events. If at some point, it is
* possible to run find_vma() without taking the semaphore, we
* would like to allow build_id lookup in nmi context.
*
* Same fallback is used for kernel stack (!user) on a stackmap
* with build_id.
*/
if (!user || !current || !current->mm || in_nmi() ||
down_read_trylock(&current->mm->mmap_sem) == 0) {
/* cannot access current->mm, fall back to ips */
for (i = 0; i < trace_nr; i++) {
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
id_offs[i].ip = ips[i];
}
return;
}
....
And IIUC its 'trivial' to use this stuff with hardware counters.
Here, if I read it correctly, you mentioned that it is 'trivial' to use
the build_id lookup with hardware counters. However, a hardware counter
overflow triggers an NMI, so in_nmi() is true and, based on the above
logic, the code falls back to the old IP-only behavior.
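To spell out my reading of that condition, here is a minimal userspace sketch, not kernel code. The names struct ctx_model and choose_lookup_mode() are hypothetical stand-ins I made up for illustration; the booleans only model what current->mm, in_nmi() and down_read_trylock() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the state the kernel check looks at. */
struct ctx_model {
	bool user;       /* a user-space stack was requested */
	bool has_mm;     /* models: current && current->mm */
	bool in_nmi;     /* models: in_nmi() */
	bool lock_free;  /* models: down_read_trylock() succeeding */
};

enum lookup_mode {
	LOOKUP_IP_ONLY,   /* fall back to raw instruction pointers */
	LOOKUP_BUILD_ID,  /* build_id + offset lookup is attempted */
};

/* Mirrors the shape of the fallback condition in the final code. */
static enum lookup_mode choose_lookup_mode(const struct ctx_model *c)
{
	assert(c != NULL);

	if (!c->user || !c->has_mm || c->in_nmi || !c->lock_free)
		return LOOKUP_IP_ONLY;
	return LOOKUP_BUILD_ID;
}
```

With in_nmi set, which is how I understand the hardware counter overflow case, this model always returns LOOKUP_IP_ONLY, regardless of the other fields.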
Could you explain a little more how to get buildid with hardware
counter overflow events?
Thanks!
+ /* cannot access current->mm, fall back to ips */
+ for (i = 0; i < trace_nr; i++) {
+ id_offs[i].status = BPF_STACK_BUILD_ID_IP;
+ id_offs[i].ip = ips[i];
+ }
+ return;
+ }
+
+ for (i = 0; i < trace_nr; i++) {
+ vma = find_vma(current->mm, ips[i]);
+ if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
+ /* per entry fall back to ips */
+ id_offs[i].status = BPF_STACK_BUILD_ID_IP;
+ id_offs[i].ip = ips[i];
+ continue;
+ }
+ id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
+ - vma->vm_start;
+ id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
+ }
+ up_read(&current->mm->mmap_sem);
+}
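
For reference, the offset arithmetic in the last loop of the patch can be sketched in userspace as below. This is only an illustration: file_offset() and MODEL_PAGE_SHIFT are hypothetical names, and the 4 KiB page size is an assumption, not something the patch fixes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT is arch-dependent. */
#define MODEL_PAGE_SHIFT 12

/*
 * Models: (vma->vm_pgoff << PAGE_SHIFT) + ip - vma->vm_start
 * vm_pgoff is the mapping's offset into the file, in pages.
 */
static uint64_t file_offset(uint64_t vm_start, uint64_t vm_pgoff, uint64_t ip)
{
	assert(ip >= vm_start);  /* ip must lie inside the mapping */
	return (vm_pgoff << MODEL_PAGE_SHIFT) + ip - vm_start;
}
```

For example, a mapping starting at 0x400000 with vm_pgoff 1 and an ip of 0x400123 gives a file offset of 0x1123.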