On Sun, 14 Jan 2024 11:23:24 -0500
Steven Rostedt <rost...@goodmis.org> wrote:

> On Sun, 14 Jan 2024 23:26:43 +0900
> Masami Hiramatsu (Google) <mhira...@kernel.org> wrote:
> 
> > Hi Vincent,
> > 
> > On Thu, 11 Jan 2024 16:17:11 +0000
> > Vincent Donnefort <vdonnef...@google.com> wrote:
> > 
> > > It is now possible to mmap() a ring-buffer to stream its content. Add
> > > some documentation and a code example.
> > > 
> > > Signed-off-by: Vincent Donnefort <vdonnef...@google.com>
> > > 
> > > diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
> > > index 5092d6c13af5..0b300901fd75 100644
> > > --- a/Documentation/trace/index.rst
> > > +++ b/Documentation/trace/index.rst
> > > @@ -29,6 +29,7 @@ Linux Tracing Technologies
> > >     timerlat-tracer
> > >     intel_th
> > >     ring-buffer-design
> > > +   ring-buffer-map
> > >     stm
> > >     sys-t
> > >     coresight/index
> > > diff --git a/Documentation/trace/ring-buffer-map.rst 
> > > b/Documentation/trace/ring-buffer-map.rst
> > > new file mode 100644
> > > index 000000000000..2ba7b5339178
> > > --- /dev/null
> > > +++ b/Documentation/trace/ring-buffer-map.rst
> > > @@ -0,0 +1,105 @@
> > > +.. SPDX-License-Identifier: GPL-2.0
> > > +
> > > +==================================
> > > +Tracefs ring-buffer memory mapping
> > > +==================================
> > > +
> > > +:Author: Vincent Donnefort <vdonnef...@google.com>
> > > +
> > > +Overview
> > > +========
> > > +Tracefs ring-buffer memory map provides an efficient method to stream 
> > > data
> > > +as no memory copy is necessary. The application mapping the ring-buffer 
> > > becomes
> > > +then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
> > > +
> > > +Memory mapping setup
> > > +====================
> > > +The mapping works with a mmap() of the trace_pipe_raw interface.
> > > +
> > > +The first system page of the mapping contains ring-buffer statistics and
> > > +description. It is referred as the meta-page. One of the most important 
> > > field of
> > > +the meta-page is the reader. It contains the subbuf ID which can be 
> > > safely read
> > > +by the mapper (see ring-buffer-design.rst).
> > > +
> > > +The meta-page is followed by all the subbuf, ordered by ascendant ID. It 
> > > is
> > > +therefore effortless to know where the reader starts in the mapping:
> > > +
> > > +.. code-block:: c
> > > +
> > > +        reader_id = meta->reader->id;
> > > +        reader_offset = meta->meta_page_size + reader_id * 
> > > meta->subbuf_size;
> > > +
> > > +When the application is done with the current reader, it can get a new 
> > > one using
> > > +the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also 
> > > updates
> > > +the meta-page fields.
> > > +
> > > +Limitations
> > > +===========
> > > +When a mapping is in place on a Tracefs ring-buffer, it is not possible 
> > > to
> > > +either resize it (either by increasing the entire size of the 
> > > ring-buffer or
> > > +each subbuf). It is also not possible to use snapshot or splice.  
> > 
> > I've played with the sample code.
> > 
> > - "free_buffer" just doesn't work when the process is mmap the ring buffer.
> > - After mmap the buffers, when the snapshot took, the IOCTL returns an 
> > error.
> > 
> > OK, but I rather like to fail snapshot with -EBUSY when the buffer is 
> > mmaped.
> > 
> > > +
> > > +Concurrent readers (either another application mapping that ring-buffer 
> > > or the
> > > +kernel with trace_pipe) are allowed but not recommended. They will 
> > > compete for
> > > +the ring-buffer and the output is unpredictable.
> > > +
> > > +Example
> > > +=======
> > > +
> > > +.. code-block:: c
> > > +
> > > +        #include <fcntl.h>
> > > +        #include <stdio.h>
> > > +        #include <stdlib.h>
> > > +        #include <unistd.h>
> > > +
> > > +        #include <linux/trace_mmap.h>
> > > +
> > > +        #include <sys/mman.h>
> > > +        #include <sys/ioctl.h>
> > > +
> > > +        #define TRACE_PIPE_RAW 
> > > "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
> > > +
> > > +        int main(void)
> > > +        {
> > > +                int page_size = getpagesize(), fd, reader_id;
> > > +                unsigned long meta_len, data_len;
> > > +                struct trace_buffer_meta *meta;
> > > +                void *map, *reader, *data;
> > > +
> > > +                fd = open(TRACE_PIPE_RAW, O_RDONLY);
> > > +                if (fd < 0)
> > > +                        exit(EXIT_FAILURE);
> > > +
> > > +                map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 
> > > 0);
> > > +                if (map == MAP_FAILED)
> > > +                        exit(EXIT_FAILURE);
> > > +
> > > +                meta = (struct trace_buffer_meta *)map;
> > > +                meta_len = meta->meta_page_size;
> > > +
> > > +                printf("entries:        %lu\n", meta->entries);
> > > +                printf("overrun:        %lu\n", meta->overrun);
> > > +                printf("read:           %lu\n", meta->read);
> > > +                printf("subbufs_touched:%lu\n", meta->subbufs_touched);
> > > +                printf("subbufs_lost:   %lu\n", meta->subbufs_lost);
> > > +                printf("subbufs_read:   %lu\n", meta->subbufs_read);
> > > +                printf("nr_subbufs:     %u\n", meta->nr_subbufs);
> > > +
> > > +                data_len = meta->subbuf_size * meta->nr_subbufs;
> > > +                data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, 
> > > data_len);
> 
> The above is buggy. It should be:
> 
>               data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, 
> meta_len);
> 
> The last parameter is where to start the mapping from, which is just
> after the meta page. The code is currently starting the map far away
> from that.

Ah, indeed! I confirmed that fixed the bus error.

Thank you!

> 
> -- Steve
> 
> > > +                if (data == MAP_FAILED)
> > > +                        exit(EXIT_FAILURE);
> > > +
> > > +                if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> > > +                        exit(EXIT_FAILURE);
> > > +
> > > +                reader_id = meta->reader.id;
> > > +                reader = data + meta->subbuf_size * reader_id;  
> > 
> > Also, this caused a bus error if I add below 2 lines here.
> > 
> >         printf("reader_id: %d, addr: %p\n", reader_id, reader);
> >         printf("read data head: %lx\n", *(unsigned long *)reader);
> > 
> > -----
> > / # cd /sys/kernel/tracing/
> > /sys/kernel/tracing # echo 1 > events/enable 
> > [   17.941894] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked 
> > and stat_runtime require the kernel parameter schedstats=enable or 
> > kernel.sched_schedstats=1
> > /sys/kernel/tracing # 
> > /sys/kernel/tracing # echo 1 > buffer_percent 
> > /sys/kernel/tracing # /mnt/rbmap2 
> > entries:        245291
> > overrun:        203741
> > read:           0
> > subbufs_touched:2041
> > subbufs_lost:   1688
> > subbufs_read:   0
> > nr_subbufs:     355
> > reader_id: 1, addr: 0x7f0cde51a000
> > Bus error
> > -----
> > 
> > Is this expected behavior? how can I read the ring buffer?
> > 
> > Thank you,
> > 
> > > +
> > > +                munmap(data, data_len);
> > > +                munmap(meta, meta_len);
> > > +                close (fd);
> > > +
> > > +                return 0;
> > > +        }
> > > -- 
> > > 2.43.0.275.g3460e3d667-goog
> > >   
> > 
> > 
> 


-- 
Masami Hiramatsu (Google) <mhira...@kernel.org>

Reply via email to