Hi Vincent, I had a chance to look into this and came up with the following fix: https://review.lttng.org/c/lttng-tools/+/7478/4
Would you have a chance to try it on your end before I merge it? Thanks for the great bug report! Jérémie ----- Original Message ----- > From: "Vincent Whitchurch" <vincent.whitchu...@axis.com> > To: "Jeremie Galarneau" <jeremie.galarn...@efficios.com> > Cc: "lttng-dev" <lttng-dev@lists.lttng.org>, "kernel" <ker...@axis.com> > Sent: Wednesday, March 2, 2022 4:27:30 AM > Subject: Re: [lttng-dev] [PATCH lttng-tools] Fix: consumer-stream: > use-after-free of metadata bucket > On Tue, Mar 01, 2022 at 06:19:23PM +0100, Jérémie Galarneau wrote: >> Thanks a lot for reporting the problem. If I understand the ASAN >> report correctly, the stream itself will also be double free'd, so >> I don't think this is the complete fix. > > Yeah, it looked odd that consumer_stream_destroy() is called recursively > on the same stream but AFAICS the code's been like this for a while so I > assumed it was on purpose, and only the metadata bucket stuff was > relatively new. ASAN doesn't detect any double frees of the stream > itself, but I guess calling call_rcu(..., free_stream_rcu) twice on the > same stream is not expected behaviour and could lead to other problems. > >> There definitely seems to be a problem with regards to the ownership >> of the metadata channel vs stream. Let me look into it. > > Great, thank you! > >> I see that you fall into a case where the metadata setup fails, >> can you share more info about how this can be reproduced? > > In the core dump I received (on v2.12.4), consumer_stream_destroy() was > called from the error label in setup_metadata and ret was set to > LTTCOMM_CONSUMERD_ERROR_METADATA. So consumer_send_relayd_stream() had > returned an error. I only had the core dump and no other logs, so I did > not know which of the paths inside consumer_send_relayd_stream() had > failed, but since I was primarily interested in fixing the crash itself > I simply forced this code path to be taken: > > diff --git a/src/common/ust-consumer/ust-consumer.c > b/src/common/ust-consumer/ust-consumer.c > index fa1c71299..97ed59632 100644 > --- a/src/common/ust-consumer/ust-consumer.c > +++ b/src/common/ust-consumer/ust-consumer.c > @@ -908,8 +908,7 @@ static int setup_metadata(struct lttng_consumer_local_data > *ctx, uint64_t key) > > /* Send metadata stream to relayd if needed. */ > if (metadata->metadata_stream->net_seq_idx != (uint64_t) -1ULL) { > - ret = consumer_send_relayd_stream(metadata->metadata_stream, > - metadata->pathname); > + ret = -1; > if (ret < 0) { > ret = LTTCOMM_CONSUMERD_ERROR_METADATA; > goto error; > > With the above patch, I could easily reproduce the use-after-free using > the following steps on the latest release v2.13.4, and it was clear that > this use-after-free was the cause of the original core dump on the older > release too. > > Build with ASAN: > > lttng-tools$ LDFLAGS=-fsanitize=address CFLAGS=-fsanitize=address ./configure > > Shell #1: > > lttng-ust$ tests/compile/api0/hello/hello 10000 > > Shell #2: > > lttng-tools$ ASAN_OPTIONS=detect_odr_violation=0 > ./src/bin/lttng-sessiond/lttng-sessiond > > Shell #3: > > lttng-tools$ export ASAN_OPTIONS=detect_odr_violation=0 > lttng-tools$ ./src/bin/lttng/lttng create --live && ./src/bin/lttng/lttng > enable-event --userspace 1 && ./src/bin/lttng/lttng start && sleep 1 && > ./src/bin/lttng/lttng stop > > The ASAN splat should be seen in shell #2. Note that you may have to > run the command in shell #3 a couple of times since > LTTNG_CONSUMER_SETUP_METADATA doesn't seem to be sent every time. _______________________________________________ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev